Pierre Ménard

According to our database, Pierre Ménard authored at least 39 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.
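
Both figures are shortest-path lengths in the coauthorship graph: two authors are adjacent if they have written a paper together, and the Erdős (respectively Dijkstra) number is the length of the shortest chain of coauthors connecting a researcher to Paul Erdős (respectively Edsger W. Dijkstra). Below is a minimal, illustrative sketch of that computation, assuming a hypothetical coauthorship graph stored as an adjacency list; the graph data is made up and not drawn from the database above.

```python
from collections import deque

def collaboration_distance(graph, source, target):
    """Breadth-first search over a coauthorship graph.

    `graph` maps an author name to the set of their coauthors.
    Returns the number of coauthorship edges on a shortest path
    from `source` to `target`, or None if they are not connected.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        author, dist = queue.popleft()
        for coauthor in graph.get(author, ()):
            if coauthor == target:
                return dist + 1
            if coauthor not in seen:
                seen.add(coauthor)
                queue.append((coauthor, dist + 1))
    return None

# Illustrative toy graph (not real coauthorship data).
toy_graph = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}
print(collaboration_distance(toy_graph, "A", "D"))  # -> 3
```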

Bibliography

2024
Optimal Design for Reward Modeling in RLHF.
CoRR, 2024

A New Bound on the Cumulant Generating Function of Dirichlet Processes.
CoRR, 2024

Demonstration-Regularized RL.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Local and adaptive mirror descents in extensive-form games.
CoRR, 2023

Learning Generative Models with Goal-conditioned Reinforcement Learning.
CoRR, 2023

Model-free Posterior Sampling via Learning Rate Randomization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fast Rates for Maximum Entropy Exploration.
Proceedings of the International Conference on Machine Learning, 2023

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice.
Proceedings of the International Conference on Machine Learning, 2023

Adapting to game trees in zero-sum imperfect information games.
Proceedings of the International Conference on Machine Learning, 2023

2022
KL-UCB-Switch: Optimal Regret Bounds for Stochastic Bandits from Both a Distribution-Dependent and a Distribution-Free Viewpoints.
J. Mach. Learn. Res., 2022

KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal.
CoRR, 2022

Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses.
Proceedings of the International Conference on Machine Learning, 2022

Adaptive Multi-Goal Exploration.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall.
CoRR, 2021

Indexed Minimum Empirical Divergence for Unimodal Bandits.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning in two-player zero-sum partially observable Markov games with perfect recall.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Bandits with many optimal arms.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

UCB Momentum Q-learning: Correcting the bias without forgetting.
Proceedings of the 38th International Conference on Machine Learning, 2021

Fast active learning for pure exploration in reinforcement learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Kernel-Based Reinforcement Learning: A Finite-Time Analysis.
Proceedings of the 38th International Conference on Machine Learning, 2021

Problem Dependent View on Structured Thresholding Bandit Problems.
Proceedings of the 38th International Conference on Machine Learning, 2021

Adaptive Reward-Free Exploration.
Proceedings of the Algorithmic Learning Theory, 2021

Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited.
Proceedings of the Algorithmic Learning Theory, 2021

A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Optimal Strategies for Graph-Structured Bandits.
CoRR, 2020

Forced-exploration free Strategies for Unimodal Bandits.
CoRR, 2020

Regret Bounds for Kernel-Based Reinforcement Learning.
CoRR, 2020

Planning in Markov Decision Processes with Gap-Dependent Sample Complexity.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Gamification of Pure Exploration for Linear Bandits.
Proceedings of the 37th International Conference on Machine Learning, 2020

The Influence of Shape Constraints on the Thresholding Bandit Problem.
Proceedings of the Conference on Learning Theory, 2020

Fixed-confidence guarantees for Bayesian best-arm identification.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A single algorithm for both restless and rested rotting bandits.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Explore First, Exploit Next: The True Shape of Regret in Bandit Problems.
Math. Oper. Res., 2019

Gradient Ascent for Active Exploration in Bandit Problems.
CoRR, 2019

Planning in entropy-regularized Markov decision processes and games.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Non-Asymptotic Pure Exploration by Solving Games.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2017
Fano's inequality for random variables.
CoRR, 2017

A minimax and asymptotically optimal algorithm for stochastic bandits.
Proceedings of the International Conference on Algorithmic Learning Theory, 2017

