Runlong Zhou

Citations per year: 2021: 3, 2022: 12, 2023: 27, 2024: 8

Open access (based on funding mandates): 2 articles available, 0 articles not available

Simon Shaolei Du - Assistant Professor, School of Computer Science and Engineering, University of Washington - Verified email at cs.washington.edu
Michal Valko - Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind - Verified email at meta.com
Matteo Pirotta - Research Scientist, Meta (FAIR) - Verified email at fb.com
Jean Tarbouriech - Google DeepMind - Verified email at google.com
Alessandro Lazaric - Research Scientist, Facebook Artificial Intelligence Research - Verified email at inria.fr
Ruosong Wang - PhD Student, Carnegie Mellon University - Verified email at andrew.cmu.edu
Yuandong Tian - Research Scientist, Meta AI (FAIR) - Verified email at fb.com
Yi Wu - Institute for Interdisciplinary Information Sciences, Tsinghua University - Verified email at mail.tsinghua.edu.cn
Zhang Zihan - Tsinghua University - Verified email at mails.tsinghua.edu.cn

Runlong Zhou

Paul G. Allen School of Computer Science & Engineering, University of Washington

Verified email at cs.washington.edu - Homepage


Title	Cited by	Year
Stochastic shortest path: Minimax, parameter-free and towards horizon-free regret. J Tarbouriech, R Zhou, SS Du, M Pirotta, M Valko, A Lazaric. Advances in Neural Information Processing Systems 34, 6843-6855, 2021	31	2021
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes. R Zhou, R Wang, SS Du. International Conference on Machine Learning, 42698-42723, 2023	7*	2023
Sharp variance-dependent bounds in reinforcement learning: Best of both worlds in stochastic and deterministic environments. R Zhou, Z Zhang, SS Du. International Conference on Machine Learning, 42878-42914, 2023	7	2023
Understanding curriculum learning in policy optimization for solving combinatorial optimization problems. R Zhou, Y Tian, Y Wu, SS Du. arXiv preprint arXiv:2202.05423, 2022	4*	2022
Free from Bellman completeness: Trajectory stitching via model-based return-conditioned supervised learning. Z Zhou, C Zhu, R Zhou, Q Cui, A Gupta, SS Du. arXiv preprint arXiv:2310.19308, 2023	1	2023
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs. R Zhou, SS Du, B Li. arXiv preprint arXiv:2402.12621, 2024		2024
