Tianhao Wu - Google Scholar

Cited by

	All	Since 2019
Citations	133	133
h-index	5	5
i10-index	4	4

Citations per year (chart): 2021: 16; 2022: 32; 2023: 49; 2024: 35

Open access

2 articles available
0 articles not available
Based on funding mandates

Tianhao Wu
University of California, Berkeley
Verified email at berkeley.edu


Title	Authors	Venue	Cited by	Year
Sanity-checking pruning methods: Random tickets can win the jackpot	J Su, Y Chen, T Cai, T Wu, R Gao, L Wang, JD Lee	Advances in Neural Information Processing Systems 33, 20390-20401, 2020	72	2020
Starling-7b: Improving llm helpfulness & harmlessness with rlaif	B Zhu, E Frick, T Wu, H Zhu, J Jiao	November, 2023	15	2023
On reinforcement learning with adversarial corruption and its application to block mdp	T Wu, Y Yang, S Du, L Wang	International Conference on Machine Learning, 11296-11306, 2021	14	2021
Nearly optimal policy optimization with stable at any time guarantee	T Wu, Y Yang, H Zhong, L Wang, S Du, J Jiao	International Conference on Machine Learning, 24243-24265, 2022	11	2022
Pairwise proximal policy optimization: Harnessing relative feedback for llm alignment	T Wu, B Zhu, R Zhang, Z Wen, K Ramchandran, J Jiao	arXiv preprint arXiv:2310.00212, 2023	9	2023
A reduction-based framework for conservative bandits and reinforcement learning	Y Yang, T Wu, H Zhong, E Garcelon, M Pirotta, A Lazaric, L Wang, SS Du	arXiv preprint arXiv:2106.11692, 2021	4	2021
A reduction-based framework for sequential decision making with delayed feedback	Y Yang, H Zhong, T Wu, B Liu, L Wang, SS Du	Advances in Neural Information Processing Systems 36, 2024	3	2024
Statistical inference on multi-armed bandits with delayed feedback	L Shi, J Wang, T Wu	International Conference on Machine Learning, 31328-31352, 2023	3	2023
A unified framework for conservative exploration	Y Yang, T Wu, H Zhong, E Garcelon, M Pirotta, A Lazaric, L Wang, SS Du	arXiv preprint arXiv:2106.11692, 2021	2	2021
