Aviv Rosenberg
Title
Cited by
Year
Online convex optimization in adversarial Markov decision processes
A Rosenberg, Y Mansour
International Conference on Machine Learning, 5478-5486, 2019
Cited by: 137, Year: 2019
Optimistic policy optimization with bandit feedback
L Shani, Y Efroni, A Rosenberg, S Mannor
International Conference on Machine Learning, 8604-8613, 2020
Cited by: 88, Year: 2020
Online stochastic shortest path with bandit feedback and unknown transition function
A Rosenberg, Y Mansour
Advances in Neural Information Processing Systems, 2212-2221, 2019
Cited by: 56, Year: 2019
Near-optimal regret bounds for stochastic shortest path
A Cohen, H Kaplan, Y Mansour, A Rosenberg
International Conference on Machine Learning, 8210-8219, 2020
Cited by: 52, Year: 2020
Stochastic Shortest Path with Adversarially Changing Costs
A Rosenberg, Y Mansour
Thirtieth International Joint Conference on Artificial Intelligence (IJCAI …, 2021
Cited by: 34, Year: 2021
Minimax regret for stochastic shortest path
A Cohen, Y Efroni, Y Mansour, A Rosenberg
Thirty-Fifth Conference on Neural Information Processing Systems, 2021
Cited by: 25, Year: 2021
Learning adversarial Markov decision processes with delayed feedback
T Lancewicki, A Rosenberg, Y Mansour
Proceedings of the AAAI Conference on Artificial Intelligence 36 (7), 7281-7289, 2022
Cited by: 21, Year: 2022
Near-optimal regret for adversarial MDP with delayed bandit feedback
T Jin, T Lancewicki, H Luo, Y Mansour, A Rosenberg
Advances in Neural Information Processing Systems 35, 33469-33481, 2022
Cited by: 18, Year: 2022
Oracle-efficient regret minimization in factored MDPs with unknown structure
A Rosenberg, Y Mansour
Advances in Neural Information Processing Systems 34, 11148-11159, 2021
Cited by: 18*, Year: 2021
Policy optimization for stochastic shortest path
L Chen, H Luo, A Rosenberg
Conference on Learning Theory, 982-1046, 2022
Cited by: 12, Year: 2022
Planning and learning with adaptive lookahead
A Rosenberg, A Hallak, S Mannor, G Chechik, G Dalal
Proceedings of the AAAI Conference on Artificial Intelligence 37 (8), 9606-9613, 2023
Cited by: 6, Year: 2023
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback
T Lancewicki, A Rosenberg, D Sotnikov
International Conference on Machine Learning, 18482-18534, 2023
Cited by: 2, Year: 2023
Cooperative online learning in stochastic and adversarial MDPs
T Lancewicki, A Rosenberg, Y Mansour
International Conference on Machine Learning, 11918-11968, 2022
Cited by: 2, Year: 2022
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
A Cassel, H Luo, A Rosenberg, D Sotnikov
arXiv preprint arXiv:2405.07637, 2024
Cited by: 1, Year: 2024
A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
D van der Hoeven, L Zierahn, T Lancewicki, A Rosenberg, ...
Conference on Learning Theory, 1285-1321, 2023
Cited by: 1, Year: 2023
Multi-turn Reinforcement Learning from Preference Human Feedback
L Shani, A Rosenberg, A Cassel, O Lang, D Calandriello, A Zipori, ...
arXiv preprint arXiv:2405.14655, 2024
Year: 2024