Thompson sampling: An asymptotically optimal finite-time analysis E Kaufmann, N Korda, R Munos International Conference on Algorithmic Learning Theory, 199-213, 2012 | 782 | 2012 |
On the complexity of best-arm identification in multi-armed bandit models E Kaufmann, O Cappé, A Garivier The Journal of Machine Learning Research 17 (1), 1-42, 2016 | 624 | 2016 |
On Bayesian upper confidence bounds for bandit problems E Kaufmann, O Cappé, A Garivier Artificial Intelligence and Statistics, 592-600, 2012 | 466 | 2012 |
Optimal best arm identification with fixed confidence A Garivier, E Kaufmann Conference on Learning Theory, 998-1027, 2016 | 393 | 2016 |
Machine learning applications in drug development C Réda, E Kaufmann, A Delahaye-Duriez Computational and Structural Biotechnology Journal 18, 241-252, 2020 | 226 | 2020 |
Information complexity in bandit subset selection E Kaufmann, S Kalyanakrishnan Conference on Learning Theory, 228-251, 2013 | 212 | 2013 |
Thompson sampling for 1-dimensional exponential family bandits N Korda, E Kaufmann, R Munos Advances in Neural Information Processing Systems 26, 2013 | 197 | 2013 |
On explore-then-commit strategies A Garivier, T Lattimore, E Kaufmann Advances in Neural Information Processing Systems 29, 2016 | 124 | 2016 |
Mixture martingales revisited with applications to sequential tests and confidence intervals E Kaufmann, WM Koolen Journal of Machine Learning Research 22 (246), 1-44, 2021 | 122 | 2021 |
Multi-player bandits revisited L Besson, E Kaufmann Algorithmic Learning Theory, 56-92, 2018 | 121 | 2018 |
Episodic reinforcement learning in finite MDPs: Minimax lower bounds revisited OD Domingues, P Ménard, E Kaufmann, M Valko Algorithmic Learning Theory, 578-598, 2021 | 118 | 2021 |
What doubling tricks can and can't do for multi-armed bandits L Besson, E Kaufmann arXiv preprint arXiv:1803.06971, 2018 | 113 | 2018 |
Multi-Armed Bandit Learning in IoT Networks: Learning helps even in non-stationary settings R Bonnefoi, L Besson, C Moy, E Kaufmann, J Palicot International Conference on Cognitive Radio Oriented Wireless Networks, 173-185, 2017 | 107 | 2017 |
Adaptive reward-free exploration E Kaufmann, P Ménard, OD Domingues, A Jonsson, E Leurent, M Valko Algorithmic Learning Theory, 865-891, 2021 | 90 | 2021 |
Fast active learning for pure exploration in reinforcement learning P Ménard, OD Domingues, A Jonsson, E Kaufmann, E Leurent, M Valko International Conference on Machine Learning, 7599-7608, 2021 | 82 | 2021 |
A practical algorithm for multiplayer bandits when arm means vary among players A Mehrabian, E Boursier, E Kaufmann, V Perchet International Conference on Artificial Intelligence and Statistics, 1211-1221, 2020 | 80 | 2020 |
On Bayesian index policies for sequential resource allocation E Kaufmann The Annals of Statistics 46 (2), 842-865, 2018 | 78 | 2018 |
On the complexity of A/B testing E Kaufmann, O Cappé, A Garivier Conference on Learning Theory, 461-481, 2014 | 76 | 2014 |
On multi-armed bandit designs for dose-finding trials M Aziz, E Kaufmann, MK Riviere Journal of Machine Learning Research 22 (14), 1-38, 2021 | 73 | 2021 |
Fixed-confidence guarantees for Bayesian best-arm identification X Shang, R Heide, P Ménard, E Kaufmann, M Valko International Conference on Artificial Intelligence and Statistics, 1823-1832, 2020 | 70 | 2020 |