Nino Vieillard

According to our database, Nino Vieillard authored at least 19 papers between 2019 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2024

Imitating Language via Scalable Inverse Reinforcement Learning.

Sarah Maria Elisabeth Bechtle

Jost Tobias Springenberg

CoRR, 2024

Gemma 2: Improving Open Language Models at a Practical Size.

Christopher A. Choquette-Choo

Hanna Klimczak-Plucinska

CoRR, 2024

BOND: Aligning LLMs with Best-of-N Distillation.

CoRR, 2024

WARP: On the Benefits of Weight Averaged Rewarded Policies.

CoRR, 2024

WARM: On the Benefits of Weight Averaged Reward Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

GKD: Generalized Knowledge Distillation for Auto-regressive Sequence Models.

CoRR, 2023

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice.

Mohammad Gheshlaghi Azar

Proceedings of the International Conference on Machine Learning, 2023

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal.

Mohammad Gheshlaghi Azar

CoRR, 2022

Implicitly Regularized RL with Implicit Q-values.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Offline Reinforcement Learning as Anti-exploration.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Offline Reinforcement Learning with Pseudometric Learning.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Leverage the Average: an Analysis of Regularization in RL.

CoRR, 2020

Munchausen Reinforcement Learning.

Nino Vieillard

Olivier Pietquin

Matthieu Geist

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Momentum in Reinforcement Learning.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Deep Conservative Policy Iteration.

Nino Vieillard

Olivier Pietquin

Matthieu Geist

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

On Connections between Constrained Optimization and Reinforcement Learning.

Nino Vieillard

Olivier Pietquin

Matthieu Geist

CoRR, 2019
