2025

Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Parameters for Reasoning.

[DOI]

Charlie Victor Snell

Jaehoon Lee

Kelvin Xu

Aviral Kumar

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models.

[DOI]

Trans. Mach. Learn. Res., 2024

Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries.

[DOI]

CoRR, 2024

Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability.

[DOI]

CoRR, 2024

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters.

[DOI]

CoRR, 2024

Small-scale proxies for large-scale Transformer training instabilities.

[DOI]

Jascha Sohl-Dickstein

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Multi-objective Optimization Model for Momentum Change Based on Genetic Algorithm.

[DOI]

Proceedings of the Advanced Intelligent Computing Technology and Applications, 2024

2023

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models.

[DOI]

CoRR, 2023

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

[DOI]

CoRR, 2023

Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance.

[DOI]

Proceedings of the IEEE International Conference on Robotics and Automation, 2023

2022

Towards Adaptive, Continual Embodied Agents.

[DOI]

Kelvin Xu

PhD thesis, 2022

Autonomous Reinforcement Learning: Formalism and Benchmarking.

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention.

[DOI]

Proceedings of the IEEE International Conference on Robotics and Automation, 2021

2020

Continual Learning of Control Primitives: Skill Discovery via Reset-Games.

[DOI]

Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples.

[DOI]

Pierre-Antoine Manzagol

Hugo Larochelle

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples.

[DOI]

Pierre-Antoine Manzagol

Hugo Larochelle

CoRR, 2019

Privacy-Preserving Fall Detection with Deep Learning on mmWave Radar Signal.

[DOI]

Proceedings of the 2019 IEEE Visual Communications and Image Processing, 2019

Learning a Prior over Intent via Meta-Inverse Reinforcement Learning.

[DOI]

Proceedings of the 36th International Conference on Machine Learning, 2019

2018

Probabilistic Model-Agnostic Meta-Learning.

[DOI]

Chelsea Finn

Kelvin Xu

Sergey Levine

Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Trust-PCL: An Off-Policy Trust Region Method for Continuous Control.

[DOI]

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

On integrating a language model into neural machine translation.

[DOI]

Comput. Speech Lang., 2017

Bridging the Gap Between Value and Policy Based Reinforcement Learning.

[DOI]

Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Unsupervised Perceptual Rewards for Imitation Learning.

[DOI]

Pierre Sermanet

Kelvin Xu

Sergey Levine

Proceedings of the 5th International Conference on Learning Representations, 2017

An Actor-Critic Algorithm for Sequence Prediction.

[DOI]

Proceedings of the 5th International Conference on Learning Representations, 2017

2016

Theano: A Python framework for fast computation of mathematical expressions.

[DOI]

Nicolas Boulanger-Lewandowski

Xavier Bouthillier

Alexandre de Brébisson

Samira Ebrahimi Kahou

Pierre-Antoine Manzagol

Christopher Joseph Pal

S. Ramana Subramanyam

CoRR, 2016

2015

A Controller Recognizer Framework: How necessary is recognition for control?

[DOI]

CoRR, 2015

On Using Monolingual Corpora in Neural Machine Translation.

[DOI]

CoRR, 2015

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.

[DOI]

Proceedings of the 32nd International Conference on Machine Learning, 2015