Runji Lin
Title
Cited by
Year
Qwen technical report
J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng, Y Fan, W Ge, Y Han, F Huang, ...
arXiv preprint arXiv:2309.16609, 2023
Cited by: 1080, Year: 2023
Qwen2 technical report
A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, ...
arXiv preprint arXiv:2407.10671, 2024
Cited by: 183, Year: 2024
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
M Wen, JG Kuba, R Lin, W Zhang, Y Wen, J Wang, Y Yang
NeurIPS 2022, 2022
Cited by: 159, Year: 2022
#InsTag: Instruction tagging for analyzing supervised fine-tuning of large language models
K Lu, H Yuan, Z Yuan, R Lin, J Lin, C Tan, C Zhou, J Zhou
The Twelfth International Conference on Learning Representations, 2023
Cited by: 43, Year: 2023
Routing to the expert: Efficient reward-guided ensemble of large language models
K Lu, H Yuan, R Lin, J Lin, Z Yuan, C Zhou, J Zhou
arXiv preprint arXiv:2311.08692, 2023
Cited by: 22, Year: 2023
Large Sequence Models for Sequential Decision-Making: A Survey
M Wen, R Lin, H Wang, Y Yang, Y Wen, L Mai, J Wang, H Zhang, ...
Frontiers of Computer Science, 2023
Cited by: 21, Year: 2023
Large language models play starcraft ii: Benchmarks and a chain of summarization approach
W Ma, Q Mi, X Yan, Y Wu, R Lin, H Zhang, J Wang
arXiv preprint arXiv:2312.11865, 2023
Cited by: 20, Year: 2023
Contextual Transformer for Offline Meta Reinforcement Learning
R Lin, Y Li, X Feng, Z Zhang, XHW Fung, H Zhang, J Wang, Y Du, Y Yang
NeurIPS 2022 Workshop: Foundation Models for Decision Making, 2022
Cited by: 10, Year: 2022
Online merging optimizers for boosting rewards and mitigating tax in alignment
K Lu, B Yu, F Huang, Y Fan, R Lin, C Zhou
arXiv preprint arXiv:2405.17931, 2024
Cited by: 9, Year: 2024
Scalable Model-based Policy Optimization for Decentralized Networked Systems
Y Du, C Ma, Y Liu, R Lin, H Dong, J Wang, Y Yang
IROS 2022, 2022
Cited by: 7*, Year: 2022
Learn to flap: foil non-parametric path planning via deep reinforcement learning
ZP Wang, RJ Lin, ZY Zhao, X Chen, PM Guo, N Yang, ZC Wang, DX Fan
Journal of Fluid Mechanics 984, A9, 2024
Cited by: 6, Year: 2024
Qwen2.5-Math technical report: Toward mathematical expert model via self-improvement
A Yang, B Zhang, B Hui, B Gao, B Yu, C Li, D Liu, J Tu, J Zhou, J Lin, K Lu, ...
arXiv preprint arXiv:2409.12122, 2024
Cited by: 2, Year: 2024
Increasing the Data Rate for Reflected Optical Camera Communication Using Uniform LED Light
Z Chen, R Lin, H Duan, Y Chen, Y Yang, R Wu, L Chen
IEEE INFOCOM 2020-IEEE Conference on Computer Communications Workshops …, 2020
Cited by: 1, Year: 2020
Online Decision MetaMorphFormer: A Casual Transformer-Based Reinforcement Learning Framework of Universal Embodied Intelligence
L Ji, R Lin
arXiv preprint arXiv:2409.07341, 2024
Year: 2024
The Reason behind Good or Bad: Towards a Better Mathematical Verifier with Natural Language Feedback
B Gao, Z Cai, R Xu, P Wang, C Zheng, R Lin, K Lu, J Lin, C Zhou, T Liu, ...
arXiv preprint arXiv:2406.14024, 2024
Year: 2024