Tadashi Kozuno
Omron Sinic X
Verified email at sinicx.com
Title
Cited by
Year
Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning
N Vieillard, T Kozuno, B Scherrer, O Pietquin, R Munos, M Geist
The 34th Conference on Neural Information Processing Systems, 2020
Cited by: 99*, Year: 2020
Theoretical analysis of efficiency and robustness of softmax and gap-increasing operators in reinforcement learning
T Kozuno, E Uchibe, K Doya
The 22nd International Conference on Artificial Intelligence and Statistics …, 2019
Cited by: 40, Year: 2019
Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall
T Kozuno, P Ménard, R Munos, M Valko
Advances in Neural Information Processing Systems 34, 2021
Cited by: 30*, Year: 2021
Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
H Furuta, T Matsushima, T Kozuno, Y Matsuo, S Levine, O Nachum, ...
The 38th International Conference on Machine Learning, 2021
Cited by: 18, Year: 2021
Revisiting Peng's Q(λ) for Modern Reinforcement Learning
T Kozuno, Y Tang, M Rowland, R Munos, S Kapturowski, W Dabney, ...
The 38th International Conference on Machine Learning, 2021
Cited by: 17, Year: 2021
Greedification operators for policy optimization: Investigating forward and reverse KL divergences
A Chan, H Silva, S Lim, T Kozuno, AR Mahmood, M White
The Journal of Machine Learning Research 23 (1), 11474-11552, 2022
Cited by: 16, Year: 2022
Identifying Co-Adaptation of Algorithmic and Implementational Innovations in Deep Reinforcement Learning: A Taxonomy and Case Study of Inference-based Algorithms
H Furuta, T Kozuno, T Matsushima, Y Matsuo, SS Gu
Advances in Neural Information Processing Systems 34, 2021
Cited by: 12*, Year: 2021
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation
Y Tang, T Kozuno, M Rowland, R Munos, M Valko
Advances in Neural Information Processing Systems 34, 2021
Cited by: 10, Year: 2021
Confident Approximate Policy Iteration for Efficient Local Planning in q^π-realizable MDPs
G Weisz, A György, T Kozuno, C Szepesvári
Advances in Neural Information Processing Systems 35, 25547-25559, 2022
Cited by: 6, Year: 2022
Study of White-LED Using Amorphous Carbon Nitride Grown by RF-sputtering and ECR-plasma CVD
T Kozuno, S Kishimoto, K Tachibana, K Itoh, Y Iwano, S Kunitsugu, ...
Journal of Light & Visual Environment 35 (1), 86-89, 2011
Cited by: 6, Year: 2011
KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal
T Kozuno, W Yang, N Vieillard, T Kitamura, Y Tang, J Mei, P Ménard, ...
arXiv preprint arXiv:2205.14211, 2022
Cited by: 5, Year: 2022
Variational oracle guiding for reinforcement learning
D Han, T Kozuno, X Luo, ZY Chen, K Doya, Y Yang, D Li
International Conference on Learning Representations, 2021
Cited by: 5, Year: 2021
Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for Robotics Control with Action Constraints
K Kasaura, S Miura, T Kozuno, R Yonetani, K Hoshino, Y Hosoe
IEEE Robotics and Automation Letters, 2023
Cited by: 4, Year: 2023
Adapting to game trees in zero-sum imperfect information games
C Fiegel, P Ménard, T Kozuno, R Munos, V Perchet, M Valko
International Conference on Machine Learning, 10093-10135, 2023
Cited by: 3, Year: 2023
Avoiding Model Estimation in Robust Markov Decision Processes with a Generative Model
W Yang, H Wang, T Kozuno, SM Jordan, Z Zhang
arXiv preprint arXiv:2302.01248, 2023
Cited by: 3, Year: 2023
No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL
H Wang, A Sakhadeo, A White, J Bell, V Liu, X Zhao, P Liu, T Kozuno, ...
Transactions on Machine Learning Research, 2022
Cited by: 3, Year: 2022
Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning
T Kozuno, D Han, K Doya
arXiv preprint arXiv:1906.07586, 2019
Cited by: 3, Year: 2019
Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming
T Kozuno, E Uchibe, K Doya
arXiv preprint arXiv:1710.10866, 2017
Cited by: 3, Year: 2017
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice
T Kitamura, T Kozuno, Y Tang, N Vieillard, M Valko, W Yang, J Mei, ...
International Conference on Machine Learning, 17135-17175, 2023
Cited by: 2, Year: 2023
Local and adaptive mirror descents in extensive-form games
C Fiegel, P Ménard, T Kozuno, R Munos, V Perchet, M Valko
arXiv preprint arXiv:2309.00656, 2023
Cited by: 1, Year: 2023