Pierre Ménard

CoRR, 2023

Model-free Posterior Sampling via Learning Rate Randomization.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fast Rates for Maximum Entropy Exploration.

Proceedings of the International Conference on Machine Learning, 2023

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice.

Mohammad Gheshlaghi Azar

Proceedings of the International Conference on Machine Learning, 2023

Adapting to game trees in zero-sum imperfect information games.

Proceedings of the International Conference on Machine Learning, 2023

2022

KL-UCB-Switch: Optimal Regret Bounds for Stochastic Bandits from Both a Distribution-Dependent and a Distribution-Free Viewpoints.

J. Mach. Learn. Res., 2022

KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal.

Mohammad Gheshlaghi Azar

CoRR, 2022

Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses.

Proceedings of the International Conference on Machine Learning, 2022

Adaptive Multi-Goal Exploration.

Jean Tarbouriech

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall.

CoRR, 2021

Indexed Minimum Empirical Divergence for Unimodal Bandits.

Hassan Saber

Odalric-Ambrym Maillard

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning in two-player zero-sum partially observable Markov games with perfect recall.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Bandits with many optimal arms.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

UCB Momentum Q-learning: Correcting the bias without forgetting.

Xuedong Shang

Proceedings of the 38th International Conference on Machine Learning, 2021

Fast active learning for pure exploration in reinforcement learning.

Proceedings of the 38th International Conference on Machine Learning, 2021

Kernel-Based Reinforcement Learning: A Finite-Time Analysis.

Proceedings of the 38th International Conference on Machine Learning, 2021

Problem Dependent View on Structured Thresholding Bandit Problems.

James Cheshire

Alexandra Carpentier

Proceedings of the 38th International Conference on Machine Learning, 2021

Adaptive Reward-Free Exploration.

Emilie Kaufmann

Anders Jonsson

Edouard Leurent

Proceedings of the Algorithmic Learning Theory, 2021

Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited.

Emilie Kaufmann

Proceedings of the Algorithmic Learning Theory, 2021

A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces.

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

Optimal Strategies for Graph-Structured Bandits.

Hassan Saber

Odalric-Ambrym Maillard

CoRR, 2020

Forced-exploration free Strategies for Unimodal Bandits.

Hassan Saber

Odalric-Ambrym Maillard

CoRR, 2020

Regret Bounds for Kernel-Based Reinforcement Learning.

CoRR, 2020

Planning in Markov Decision Processes with Gap-Dependent Sample Complexity.

Anders Jonsson

Emilie Kaufmann

Edouard Leurent

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Gamification of Pure Exploration for Linear Bandits.

Proceedings of the 37th International Conference on Machine Learning, 2020

The Influence of Shape Constraints on the Thresholding Bandit Problem.

James Cheshire

Alexandra Carpentier

Proceedings of the Conference on Learning Theory, 2020

Fixed-confidence guarantees for Bayesian best-arm identification.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A single algorithm for both restless and rested rotting bandits.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019

Explore First, Exploit Next: The True Shape of Regret in Bandit Problems.

Aurélien Garivier

Gilles Stoltz

Math. Oper. Res., 2019

Gradient Ascent for Active Exploration in Bandit Problems.

CoRR, 2019

Planning in entropy-regularized Markov decision processes and games.

Jean-Bastien Grill

Rémi Munos

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Non-Asymptotic Pure Exploration by Solving Games.

Rémy Degenne

Wouter M. Koolen

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2017

Fano's inequality for random variables.

Sébastien Gerchinovitz

Gilles Stoltz

CoRR, 2017

A minimax and asymptotically optimal algorithm for stochastic bandits.
