Christoph Dann

Claudio Gentile

Aldo Pacchiano

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023

Data-Driven Regret Balancing for Online Model Selection in Bandits.

Aldo Pacchiano

Claudio Gentile

CoRR, 2023

Best of Both Worlds Policy Optimization.

Proceedings of the International Conference on Machine Learning, 2023

Reinforcement Learning Can Be More Efficient with Multiple Rewards.

Yishay Mansour

Mehryar Mohri

Proceedings of the International Conference on Machine Learning, 2023

Learning in POMDPs is Sample-Efficient with Hindsight Observability.

Proceedings of the International Conference on Machine Learning, 2023

A Blackbox Approach to Best of Both Worlds in Bandits and Beyond.

Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

A Unified Algorithm for Stochastic Path Problems.

Proceedings of the International Conference on Algorithmic Learning Theory, 2023

Pseudonorm Approachability and Applications to Regret Minimization.

Balasubramanian Sivan

Proceedings of the International Conference on Algorithmic Learning Theory, 2023

Multiple-policy High-confidence Policy Evaluation.

Mohammad Ghavamzadeh

Teodor V. Marinov

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

Best of Both Worlds Model Selection.

Aldo Pacchiano

Claudio Gentile

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation.

Proceedings of the International Conference on Machine Learning, 2022

Same Cause; Different Effects in the Brain.

Proceedings of the 1st Conference on Causal Learning and Reasoning, 2022

A Model Selection Approach for Corruption Robust Reinforcement Learning.

Proceedings of the International Conference on Algorithmic Learning Theory, 29 March, 2022

Leveraging Initial Hints for Free in Stochastic Linear Bandits.

Ashok Cutkosky

Abhimanyu Das

Qiuyi (Richard) Zhang

Proceedings of the International Conference on Algorithmic Learning Theory, 29 March, 2022

2021

Neural Active Learning with Performance Guarantees.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning.

Teodor Vanislavov Marinov

Mehryar Mohri

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Dynamic Balancing for Model Selection in Bandits and RL.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Strategic Exploration in Reinforcement Learning - New Algorithms and Learning Guarantees.

PhD thesis, 2020

Regret Bound Balancing and Elimination for Model Selection in Bandits and RL.

CoRR, 2020

Reinforcement Learning with Feedback Graphs.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Policy Certificates: Towards Accountable Reinforcement Learning.

Proceedings of the 36th International Conference on Machine Learning, 2019

2018

On Polynomial Time PAC Reinforcement Learning with Rich Observations.

CoRR, 2018

On Oracle-Efficient PAC RL with Rich Observations.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Decoupling Gradient-Like Learning Rules from Representations.

Philip S. Thomas

Proceedings of the 35th International Conference on Machine Learning, 2018

2017

Automated matching of pipeline corrosion features from in-line inspection data.

Markus R. Dann

Reliab. Eng. Syst. Saf., 2017

Bayesian Time-of-Flight for Realtime Shape, Illumination and Albedo.

IEEE Trans. Pattern Anal. Mach. Intell., 2017

Decoupling Learning Rules from Representations.

Philip S. Thomas

CoRR, 2017

UBEV - A More Practical Algorithm for Episodic RL with Near-Optimal PAC and Regret Guarantees.

Tor Lattimore

CoRR, 2017

Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning.

Tor Lattimore

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Sample Efficient Policy Search for Optimal Stopping Domains.

Karan Goel

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

2016

Memory Lens: How Much Memory Does an Agent Use?

Katja Hofmann

Sebastian Nowozin

CoRR, 2016

Energetic Natural Gradient Descent.

Philip S. Thomas

Bruno Castro da Silva

Proceedings of the 33rd International Conference on Machine Learning, 2016

2015

RLPy: a value-function-based reinforcement learning framework for education and research.

J. Mach. Learn. Res., 2015

Thoughts on Massively Scalable Gaussian Processes.

Andrew Gordon Wilson

Hannes Nickisch

CoRR, 2015

The Human Kernel.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Policy Evaluation with Temporal Differences: A Survey and Comparison (Extended Abstract).

Gerhard Neumann

Jan Peters

Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling, 2015

2014

Policy evaluation with temporal differences: a survey and comparison.
