Matthieu Geist

According to our database, Matthieu Geist authored at least 168 papers between 2008 and 2024.

Bibliography

2024
A Survey of Temporal Credit Assignment in Deep Reinforcement Learning.
Trans. Mach. Learn. Res., 2024

Solving robust MDPs as a sequence of static RL problems.
CoRR, 2024

Imitating Language via Scalable Inverse Reinforcement Learning.
CoRR, 2024

Averaging log-likelihoods in direct alignment.
CoRR, 2024

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion.
CoRR, 2024

RRLS: Robust Reinforcement Learning Suite.
CoRR, 2024

Time-Constrained Robust MDPs.
CoRR, 2024

Bootstrapping Expectiles in Reinforcement Learning.
CoRR, 2024

Self-Improving Robust Preference Optimization.
CoRR, 2024

Leveraging Procedural Generation for Learning Autonomous Peg-in-Hole Assembly in Space.
CoRR, 2024


MusicRL: Aligning Music Generation to Human Preferences.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning.
Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

Learning Discrete-Time Major-Minor Mean Field Games.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Nash Learning from Human Feedback.
CoRR, 2023

DRIFT: Deep Reinforcement Learning for Intelligent Floating Platforms Trajectories.
CoRR, 2023

A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning.
CoRR, 2023

GKD: Generalized Knowledge Distillation for Auto-regressive Sequence Models.
CoRR, 2023

Get Back Here: Robust Imitation by Return-to-Distribution Planning.
CoRR, 2023

Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization.
CoRR, 2023

Towards Minimax Optimality of Model-based Robust Reinforcement Learning.
CoRR, 2023

Policy Gradient for s-Rectangular Robust Markov Decision Processes.
CoRR, 2023

Offline Reinforcement Learning with On-Policy Q-Function Regularization.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

The Curious Price of Distributional Robustness in Reinforcement Learning with a Generative Model.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

On Imitation in Mean-field Games.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Policy Gradient for Rectangular Robust Markov Decision Processes.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Pose-graph SLAM Using Multi-order Ultrasonic Echoes and Beamforming for Long-range Inspection Robots.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games.
Proceedings of the International Conference on Machine Learning, 2023

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice.
Proceedings of the International Conference on Machine Learning, 2023

A Connection between One-Step RL and Critic Regularization in Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Extreme Q-Learning: MaxEnt RL without Entropy.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
C3PO: Learning to Achieve Arbitrary Goals via Massively Entropic Pretraining.
CoRR, 2022

Learning Correlated Equilibria in Mean-Field Games.
CoRR, 2022

KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal.
CoRR, 2022

Learning Mean Field Games: A Survey.
CoRR, 2022

Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act.
CoRR, 2022

Learning Energy Networks with Generalized Fenchel-Young Losses.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Combined Grid and Feature-based Mapping of Metal Structures with Ultrasonic Guided Waves.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Scalable Deep Reinforcement Learning Algorithms for Mean Field Games.
Proceedings of the International Conference on Machine Learning, 2022

Large Batch Experience Replay.
Proceedings of the International Conference on Machine Learning, 2022

Continuous Control with Action Quantization from Demonstrations.
Proceedings of the International Conference on Machine Learning, 2022

Polygonal Shapes Reconstruction from Acoustic Echoes Using a Mobile Sensor and Beamforming.
Proceedings of the 30th European Signal Processing Conference, 2022

Scaling Mean Field Games by Online Mirror Descent.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

Lazy-MDPs: Towards Interpretable RL by Learning When to Act.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

Concave Utility Reinforcement Learning: The Mean-field Game Viewpoint.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

Implicitly Regularized RL with Implicit Q-values.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

A general class of surrogate functions for stable and efficient reinforcement learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Offline Reinforcement Learning as Anti-exploration.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Generalization in Mean Field Games by Learning Master Policies.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
How to Train Your HERON.
IEEE Robotics Autom. Lett., 2021

A FastSLAM Approach Integrating Beamforming Maps for Ultrasound-Based Robotic Inspection of Metal Structures.
IEEE Robotics Autom. Lett., 2021

Evaluation of Prioritized Deep System Identification on a Path Following Task.
J. Intell. Robotic Syst., 2021

A functional mirror ascent view of policy gradient methods with function approximation.
CoRR, 2021

Scaling up Mean Field Games with Online Mirror Descent.
CoRR, 2021

What Matters for Adversarial Imitation Learning?
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Twice regularized MDPs and the equivalence between robustness and regularization.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Mean Field Games Flock! The Reinforcement Learning Way.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Hyperparameter Selection for Imitation Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Offline Reinforcement Learning with Pseudometric Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Adversarially Guided Actor-Critic.
Proceedings of the 9th International Conference on Learning Representations, 2021

Primal Wasserstein Imitation Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study.
Proceedings of the 9th International Conference on Learning Representations, 2021

Learning Behaviors through Physics-driven Latent Imagination.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

Show Me the Way: Intrinsic Motivation from Demonstrations.
Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

Self-Imitation Advantage Learning.
Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

2020
What Matters In On-Policy Reinforcement Learning? A Large-Scale Empirical Study.
CoRR, 2020

Leverage the Average: an Analysis of Regularization in RL.
CoRR, 2020

Filling Gaps in Micro-meteorological Data.
Proceedings of the Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track, 2020

Munchausen Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Monte-Carlo Localization on Metal Plates Based on Ultrasonic Guided Waves.
Proceedings of the Experimental Robotics - The 17th International Symposium, 2020

Self-Attentional Credit Assignment for Transfer in Reinforcement Learning.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Image-Based Place Recognition on Bucolic Environment Across Seasons From Semantic Edge Description.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Modified Actor-Critics.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

CopyCAT: Taking Control of Neural Policies with Constant Attacks.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Momentum in Reinforcement Learning.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Foolproof Cooperative Learning.
Proceedings of The 12th Asian Conference on Machine Learning, 2020

Deep Conservative Policy Iteration.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

On the Convergence of Model Free Learning in Mean Field Games.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Stable and Efficient Policy Evaluation.
IEEE Trans. Neural Networks Learn. Syst., 2019

Image-Based Place Recognition on Bucolic Environment Across Seasons From Semantic Edge Description.
CoRR, 2019

On Connections between Constrained Optimization and Reinforcement Learning.
CoRR, 2019

Credit Assignment as a Proxy for Transfer in Reinforcement Learning.
CoRR, 2019

Approximate Fictitious Play for Mean Field Games.
CoRR, 2019

MULEX: Disentangling Exploitation from Exploration in Deep RL.
CoRR, 2019

Targeted Attacks on Deep Reinforcement Learning Agents through Adversarial Observations.
CoRR, 2019

Image-Based Text Classification using 2D Convolutional Neural Networks.
Proceedings of the 2019 IEEE SmartWorld, 2019

Learning Sensor Placement from Demonstration for UAV networks.
Proceedings of the 2019 IEEE Symposium on Computers and Communications, 2019

Semi-supervised Domain Adaptation with Representation Learning for Semantic Segmentation Across Time.
Proceedings of the Neural Information Processing - 26th International Conference, 2019

Learning from a Learner.
Proceedings of the 36th International Conference on Machine Learning, 2019

A Theory of Regularized Markov Decision Processes.
Proceedings of the 36th International Conference on Machine Learning, 2019

ELF: Embedded Localisation of Features in Pre-Trained CNN.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Importance Sampling for Deep System Identification.
Proceedings of the 19th International Conference on Advanced Robotics, 2019

Deep Reinforcement Learning-based Continuous Control for Multicopter Systems.
Proceedings of the 6th International Conference on Control, 2019

2018
Image-based Natural Language Understanding Using 2D Convolutional Neural Networks.
CoRR, 2018

Anderson Acceleration for Reinforcement Learning.
CoRR, 2018

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation.
CoRR, 2018

A Deep Learning Approach for Privacy Preservation in Assisted Living.
Proceedings of the 2018 IEEE International Conference on Pervasive Computing and Communications Workshops, 2018

2017
Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning.
IEEE Trans. Neural Networks Learn. Syst., 2017

Reconstruct & Crush Network.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Is the Bellman residual a bad proxy?
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Human Activity Recognition Using Recurrent Neural Networks.
Proceedings of the Machine Learning and Knowledge Extraction, 2017

2016
Difference of Convex Functions Programming Applied to Control with Expert Data.
CoRR, 2016

Should one minimize the expected Bellman residual or maximize the mean value?
CoRR, 2016

Softened Approximate Policy Iteration for Markov Games.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Score-based Inverse Reinforcement Learning.
Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

2015
Recherche locale de politique dans un espace convexe.
Rev. d'Intelligence Artif., 2015

Soft-max boosting.
Mach. Learn., 2015

Approximate modified policy iteration and its application to the game of Tetris.
J. Mach. Learn. Res., 2015

Inverse Reinforcement Learning in Relational Domains.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Imitation Learning Applied to Embodied Conversational Agents.
Proceedings of the 4th Workshop on Machine Learning for Interactive Systems, 2015

Convolutional and Recurrent Neural Networks for Activity Recognition in Smart Environment.
Proceedings of the Towards Integrative Machine Learning and Knowledge Extraction, 2015

2014
Off-policy learning with eligibility traces: a survey.
J. Mach. Learn. Res., 2014

Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

Boosted Bellman Residual Minimization Handling Expert Demonstrations.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

Difference of Convex Functions Programming for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Predicting when to laugh with structured classification.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Boosted and reward-regularized classification for apprenticeship learning.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2014

2013
Algorithmic Survey of Parametric Value Function Approximation.
IEEE Trans. Neural Networks Learn. Syst., 2013

Classification structurée pour l'apprentissage par renforcement inverse.
Rev. d'Intelligence Artif., 2013

A C++ template-based reinforcement learning library: fitting the code to the mathematics.
J. Mach. Learn. Res., 2013

Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee.
CoRR, 2013

Model-free POMDP optimisation of tutoring systems with echo-state networks.
Proceedings of the SIGDIAL 2013 Conference, 2013

Learning from Demonstrations: Is It Worth Estimating a Reward Function?
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2013

A Cascaded Supervised Learning Approach to Inverse Reinforcement Learning.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2013

Particle swarm optimisation of spoken dialogue system strategies.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Random projections: A remedy for overfitting issues in time series prediction with echo state networks.
Proceedings of the IEEE International Conference on Acoustics, 2013

Laugh-aware virtual agent and its impact on user amusement.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013

2012
A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimization.
IEEE J. Sel. Top. Signal Process., 2012

Optimisation d'un tuteur intelligent à partir d'un jeu de données fixé (Optimization of a tutoring system from a fixed set of data) [in French].
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

Inverse Reinforcement Learning through Structured Classification.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Co-adaptation in Spoken Dialogue Systems.
Proceedings of the Natural Interaction with Robots, 2012

Approximate Modified Policy Iteration.
Proceedings of the 29th International Conference on Machine Learning, 2012

A Dantzig Selector Approach to Temporal Difference Learning.
Proceedings of the 29th International Conference on Machine Learning, 2012

Off-policy learning in large-scale POMDP-based dialogue systems.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Clustering behaviors of Spoken Dialogue Systems users.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Monte-Carlo Swarm Policy Search.
Proceedings of the Swarm and Evolutionary Computation, 2012

Behavior Specific User Simulation in Spoken Dialogue Systems.
Proceedings of the 10th ITG Conference on Speech Communication, 2012

2011
Sample-efficient batch reinforcement learning for dialogue management optimization.
ACM Trans. Speech Lang. Process., 2011

Managing Uncertainty within KTD.
Proceedings of the Active Learning and Experimental Design workshop, 2011

Optimization of a tutoring system from a fixed set of data.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2011

Uncertainty Management for On-Line Optimisation of a POMDP-Based Large-Scale Spoken Dialogue System.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

User Simulation in Dialogue Systems Using Inverse Reinforcement Learning.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Sample Efficient On-Line Learning of Optimal Dialogue Policies with Kalman Temporal Differences.
Proceedings of the IJCAI 2011, 2011

A Non-parametric Approach to Approximate Dynamic Programming.
Proceedings of the 10th International Conference on Machine Learning and Applications and Workshops, 2011

Performance evaluation for particle filters.
Proceedings of the 14th International Conference on Information Fusion, 2011

Recursive Least-Squares Learning with Eligibility Traces.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Batch, Off-Policy and Model-Free Apprenticeship Learning.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

ℓ1-Penalized Projected Bellman Residual.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Dynamic neural field optimization using the unscented Kalman filter.
Proceedings of the 2011 IEEE Symposium on Computational Intelligence, 2011

Parametric value function approximation: A unified view.
Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning, 2011

2010
Différences temporelles de Kalman. Cas déterministe.
Rev. d'Intelligence Artif., 2010

Kalman Temporal Differences.
J. Artif. Intell. Res., 2010

Sparse Approximate Dynamic Programming for Dialog Management.
Proceedings of the SIGDIAL 2010 Conference, 2010

Revisiting Natural Actor-Critics with Value Function Approximation.
Proceedings of the Modeling Decisions for Artificial Intelligence, 2010

Optimizing spoken dialogue management with fitted value iteration.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Eligibility traces through colored noises.
Proceedings of the International Conference on Ultra Modern Telecommunications, 2010

Statistically linearized least-squares temporal differences.
Proceedings of the International Conference on Ultra Modern Telecommunications, 2010

2009
Tracking in Reinforcement Learning.
Proceedings of the Neural Information Processing, 16th International Conference, 2009

Kernelizing Vector Quantization Algorithms.
Proceedings of the 17th European Symposium on Artificial Neural Networks, 2009

Kalman Temporal Differences: The deterministic case.
Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009

2008
Bayesian Reward Filtering.
Proceedings of the Recent Advances in Reinforcement Learning, 8th European Workshop, 2008

