Tor Lattimore

According to our database, Tor Lattimore authored at least 95 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Online Newton Method for Bandit Convex Optimisation.
CoRR, 2024

Bandit Convex Optimisation.
CoRR, 2024

Online Newton Method for Bandit Convex Optimisation Extended Abstract.
Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

2023
Linear Partial Monitoring for Sequential Decision Making: Algorithms, Regret Bounds and Applications.
J. Mach. Learn. Res., 2023

Sequential Best-Arm Identification with Application to Brain-Computer Interface.
CoRR, 2023

Probabilistic Inference in Reinforcement Learning Done Right.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Context-lumpable stochastic bandits.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Leveraging Demonstrations to Improve Online Learning: Quality Matters.
Proceedings of the International Conference on Machine Learning, 2023

Distributed Contextual Linear Bandits with Minimax Optimal Communication Cost.
Proceedings of the International Conference on Machine Learning, 2023

A Lower Bound for Linear and Kernel Regression with Adaptive Covariates.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

A Second-Order Method for Stochastic Bandit Convex Optimisation.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

2022
Regret Bounds for Information-Directed Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Contextual Information-Directed Sampling.
Proceedings of the International Conference on Machine Learning, 2022

Return of the bias: Almost minimax optimal high probability bounds for adversarial linear bandits.
Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

Minimax Regret for Partial Monitoring: Infinite Outcomes and Rustichini's Regret.
Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

2021
Minimax Regret for Bandit Convex Optimisation of Ridge Functions.
CoRR, 2021

Geometric Entropic Exploration.
CoRR, 2021

Matrix games with bandit feedback.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

Variational Bayesian Optimistic Sampling.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Bandit Phase Retrieval.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Information Directed Sampling for Sparse Linear Bandits.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On the Optimality of Batch Policy Optimization Algorithms.
Proceedings of the 38th International Conference on Machine Learning, 2021

Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient.
Proceedings of the 38th International Conference on Machine Learning, 2021

Mirror Descent and the Information Ratio.
Proceedings of the Conference on Learning Theory, 2021

Improved Regret for Zeroth-Order Stochastic Convex Bandits.
Proceedings of the Conference on Learning Theory, 2021

Asymptotically Optimal Information-Directed Sampling.
Proceedings of the Conference on Learning Theory, 2021

Online Sparse Reinforcement Learning.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Gated Linear Networks.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Stochastic matrix games with bandit feedback.
CoRR, 2020

Improved Regret for Zeroth-Order Adversarial Bandit Convex Optimisation.
CoRR, 2020

Model Selection in Contextual Stochastic Bandit Problems.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

High-Dimensional Sparse Linear Bandits.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Gaussian Gated Linear Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Linear bandits with Stochastic Delayed Feedback.
Proceedings of the 37th International Conference on Machine Learning, 2020

Learning with Good Feature Representations in Bandits and in RL with a Generative Model.
Proceedings of the 37th International Conference on Machine Learning, 2020

Behaviour Suite for Reinforcement Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

Exploration by Optimisation in Partial Monitoring.
Proceedings of the Conference on Learning Theory, 2020

Information Directed Sampling for Linear Partial Monitoring.
Proceedings of the Conference on Learning Theory, 2020

Adaptive Exploration in Linear Contextual Bandit.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Learning with Good Feature Representations in Bandits and in RL with a Generative Model.
CoRR, 2019

Gated Linear Networks.
CoRR, 2019

Zooming Cautiously: Linear-Memory Heuristic Search With Node Expansion Guarantees.
CoRR, 2019

Adaptivity, Variance and Separation for Adversarial Bandits.
CoRR, 2019

On First-Order Bounds, Variance and Gap-Dependent Bounds for Adversarial Bandits.
Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, 2019

BubbleRank: Safe Online Learning to Re-Rank via Implicit Click Feedback.
Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, 2019

Connections Between Mirror Descent, Thompson Sampling and the Information Ratio.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

A Geometric Perspective on Optimal Representations for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Iterative Budgeted Exponential Search.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Online Learning to Rank with Features.
Proceedings of the 36th International Conference on Machine Learning, 2019

Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits.
Proceedings of the 36th International Conference on Machine Learning, 2019

An Information-Theoretic Approach to Minimax Regret in Partial Monitoring.
Proceedings of the Conference on Learning Theory, 2019

Cleaning up the neighborhood: A full classification for adversarial partial monitoring.
Proceedings of the Algorithmic Learning Theory, 2019

Degenerate Feedback Loops in Recommender Systems.
Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 2019

2018
Refining the Confidence Level for Optimistic Bandit Strategies.
J. Mach. Learn. Res., 2018

Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits.
CoRR, 2018

BubbleRank: Safe Online Learning to Rerank.
CoRR, 2018

Single-Agent Policy Tree Search With Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

TopRank: A practical algorithm for online stochastic ranking.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

2017
Following the Leader and Fast Rates in Online Linear Prediction: Curved Constraint Sets and Other Regularities.
J. Mach. Learn. Res., 2017

Online Learning with Gated Linear Networks.
CoRR, 2017

UBEV - A More Practical Algorithm for Episodic RL with Near-Optimal PAC and Regret Guarantees.
CoRR, 2017

A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

On Thompson Sampling and Asymptotic Optimality.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Soft-Bayes: Prod for Mixtures of Experts with Log-Loss.
Proceedings of the International Conference on Algorithmic Learning Theory, 2017

The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016
Regret Analysis of the Anytime Optimally Confident UCB Algorithm.
CoRR, 2016

Thompson Sampling is Asymptotically Optimal in General Environments.
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Causal Bandits: Learning Good Interventions via Causal Inference.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Refined Lower Bounds for Adversarial Bandits.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

On Explore-Then-Commit strategies.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Conservative Bandits.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Regret Analysis of the Finite-Horizon Gittins Index Strategy for Multi-Armed Bandits.
Proceedings of the 29th Conference on Learning Theory, 2016

2015
On Martin-Löf (non-)convergence of Solomonoff's universal mixture.
Theor. Comput. Sci., 2015

Optimally Confident UCB : Improved Regret for Finite-Armed Bandits.
CoRR, 2015

Linear Multi-Resource Allocation with Semi-Bandit Feedback.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

The Pareto Regret Frontier for Bandits.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

2014
Near-optimal PAC bounds for discounted MDPs.
Theor. Comput. Sci., 2014

General time consistent discounting.
Theor. Comput. Sci., 2014

Asymptotics of Continuous Bayes for Non-i.i.d. Sources.
CoRR, 2014

Optimal Resource Allocation with Semi-Bandit Feedback.
Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 2014

Bounded Regret for Finite-Armed Structured Bandits.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Free Lunch for optimisation under the universal distribution.
Proceedings of the IEEE Congress on Evolutionary Computation, 2014

Bayesian Reinforcement Learning with Exploration.
Proceedings of the Algorithmic Learning Theory - 25th International Conference, 2014

On Learning the Optimal Waiting Time.
Proceedings of the Algorithmic Learning Theory - 25th International Conference, 2014

2013
On Martin-Löf Convergence of Solomonoff's Mixture.
Proceedings of the Theory and Applications of Models of Computation, 2013

The Sample-Complexity of General Reinforcement Learning.
Proceedings of the 30th International Conference on Machine Learning, 2013

Universal Knowledge-Seeking Agents for Stochastic Environments.
Proceedings of the Algorithmic Learning Theory - 24th International Conference, 2013

Concentration and Confidence for Discrete Bayesian Sequence Predictors.
Proceedings of the Algorithmic Learning Theory - 24th International Conference, 2013

2012
PAC Bounds for Discounted MDPs.
Proceedings of the Algorithmic Learning Theory - 23rd International Conference, 2012

2011
No Free Lunch versus Occam's Razor in Supervised Learning.
Proceedings of the Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence, 2011

Universal Prediction of Selected Bits.
Proceedings of the Algorithmic Learning Theory - 22nd International Conference, 2011

Time Consistent Discounting.
Proceedings of the Algorithmic Learning Theory - 22nd International Conference, 2011

Asymptotically Optimal Agents.
Proceedings of the Algorithmic Learning Theory - 22nd International Conference, 2011

