Johan Ferret

According to our database, Johan Ferret authored at least 23 papers between 2019 and 2024.

Bibliography

2024
A Survey of Temporal Credit Assignment in Deep Reinforcement Learning.
Trans. Mach. Learn. Res., 2024

Diversity-Rewarded CFG Distillation.
CoRR, 2024

Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL.
CoRR, 2024

Gemma 2: Improving Open Language Models at a Practical Size.
CoRR, 2024

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning.
CoRR, 2024

BOND: Aligning LLMs with Best-of-N Distillation.
CoRR, 2024

WARP: On the Benefits of Weight Averaged Rewarded Policies.
CoRR, 2024

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models.
CoRR, 2024

Gemma: Open Models Based on Gemini Research and Technology.
CoRR, 2024

Direct Language Model Alignment from Online AI Feedback.
CoRR, 2024

WARM: On the Benefits of Weight Averaged Reward Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Conditional Language Policy: A General Framework For Steerable Multi-Objective Finetuning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
On actions that matter: credit assignment and interpretability in reinforcement learning. (De l'importance des actions: assignation de crédit et interprétabilité pour l'apprentissage par renforcement).
PhD thesis, 2022

Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act.
CoRR, 2022

Lazy-MDPs: Towards Interpretable RL by Learning When to Act.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

2021
More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences.
CoRR, 2021

There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Adversarially Guided Actor-Critic.
Proceedings of the 9th International Conference on Learning Representations, 2021

Self-Imitation Advantage Learning.
Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

2020
Self-Attentional Credit Assignment for Transfer in Reinforcement Learning.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

2019
Credit Assignment as a Proxy for Transfer in Reinforcement Learning.
CoRR, 2019

