2025
Reinforcement Learning from Human Feedback with High-Confidence Safety Constraints.
CoRR, June, 2025
A Descriptive and Normative Theory of Human Beliefs in RLHF.
CoRR, June, 2025
Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation.
CoRR, April, 2025
Fast Adaptation with Behavioral Foundation Models.
CoRR, April, 2025
Supervised Reward Inference.
CoRR, February, 2025
Influencing Humans to Conform to Preference Models for RLHF.
CoRR, January, 2025
An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
2024
Models of human preference for learning reward functions.
Trans. Mach. Learn. Res., 2024
Granger Causal Interaction Skill Chains.
Trans. Mach. Learn. Res., 2024
RL Zero: Zero-Shot Language to Behaviors without any Supervision.
CoRR, 2024
Pareto-Optimal Learning from Preferences with Hidden Context.
CoRR, 2024
Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning.
CoRR, 2024
D2PO: Discriminator-Guided DPO with Response Evaluation Models.
CoRR, 2024
Automated Discovery of Functional Actual Causes in Complex Environments.
CoRR, 2024
Learning Action-based Representations Using Invariance.
RLJ, 2024
SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Predicting Future Actions of Reinforcement Learning Agents.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Gaze Supervision for Mitigating Causal Confusion in Driving Agents.
Proceedings of the IEEE Intelligent Vehicles Symposium, 2024
Dual RL: Unification and New Methods for Reinforcement and Imitation Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Score Models for Offline Goal-Conditioned Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Contrastive Preference Learning: Learning from Human Feedback without Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
A Dual Approach to Imitation Learning from Observations with Offline Datasets.
Proceedings of the Conference on Robot Learning, 6-9 November 2024, Munich, Germany, 2024
Learning Optimal Advantage from Preferences and Mistaking It for Reward.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
A Ranking Game for Imitation Learning.
Trans. Mach. Learn. Res., 2023
Contrastive Preference Learning: Learning from Human Feedback without RL.
CoRR, 2023
Hierarchical Empowerment: Towards Tractable Empowerment-Based Skill-Learning.
CoRR, 2023
Granger-Causal Hierarchical Skill Discovery.
CoRR, 2023
Imitation from Arbitrary Experience: A Dual Unification of Reinforcement and Imitation Learning Methods.
CoRR, 2023
Language-guided Task Adaptation for Imitation Learning.
CoRR, 2023
The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Know Your Boundaries: The Necessity of Explicit Behavioral Cloning in Offline RL.
CoRR, 2022
Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?
Proceedings of the Learning for Dynamics and Control Conference, 2022
Understanding Acoustic Patterns of Human Teachers Demonstrating Manipulation Tasks to Robots.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022
Fairness Guarantees under Demographic Shift.
Proceedings of the Tenth International Conference on Learning Representations, 2022
2021
Importance sampling in reinforcement learning with an estimated behavior policy.
Mach. Learn., 2021
A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms.
J. Mach. Learn. Res., 2021
Robust Generative Adversarial Imitation Learning via Local Lipschitzness.
CoRR, 2021
Zero-shot Task Adaptation using Natural Language.
CoRR, 2021
SOPE: Spectrum of Off-Policy Estimators.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Adversarial Intrinsic Motivation for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Universal Off-Policy Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Self-Supervised Online Reward Shaping in Sparse-Reward Environments.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021
Understanding the Relationship between Interactions and Outcomes in Human-in-the-Loop Machine Learning.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
ScrewNet: Category-Independent Articulation Model Estimation From Depth Images Using Screw Theory.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021
Value Alignment Verification.
Proceedings of the 38th International Conference on Machine Learning, 2021
SCAPE: Learning Stiffness Control from Augmented Position Control Experiences.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021
Distributional Depth-Based Estimation of Object Articulation Models.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021
You Only Evaluate Once: a Simple Baseline Algorithm for Offline RL.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021
Efficiently Guiding Imitation Learning Agents with Human Gaze.
Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021
Demonstration of the EMPATHIC Framework for Task Learning from Implicit Human Feedback.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Value Alignment Verification.
CoRR, 2020
ScrewNet: Category-Independent Articulation Model Estimation From Depth Images Using Screw Theory.
CoRR, 2020
Efficiently Guiding Imitation Learning Algorithms with Human Gaze.
CoRR, 2020
Local Nonparametric Meta-Learning.
CoRR, 2020
Bayesian Robust Optimization for Imitation Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Learning Hybrid Object Kinematics for Efficient Hierarchical Planning Under Uncertainty.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020
Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020
Human Gaze Assisted Artificial Intelligence: A Review.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences.
Proceedings of the 37th International Conference on Machine Learning, 2020
PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards.
Proceedings of the 4th Conference on Robot Learning, 2020
The EMPATHIC Framework for Task Learning from Implicit Human Feedback.
Proceedings of the 4th Conference on Robot Learning, 2020
2019
Deep Bayesian Reward Learning from Preferences.
CoRR, 2019
Ranking-Based Reward Extrapolation without Rankings.
CoRR, 2019
Using Natural Language for Reward Shaping in Reinforcement Learning.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
One-Shot Learning of Multi-Step Tasks from Observation via Activity Localization in Auxiliary Video.
Proceedings of the International Conference on Robotics and Automation, 2019
Uncertainty-Aware Data Aggregation for Deep Imitation Learning.
Proceedings of the International Conference on Robotics and Automation, 2019
Importance Sampling Policy Evaluation with an Estimated Behavior Policy.
Proceedings of the 36th International Conference on Machine Learning, 2019
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations.
Proceedings of the 36th International Conference on Machine Learning, 2019
Enhancing Robot Learning with Human Social Cues.
Proceedings of the 14th ACM/IEEE International Conference on Human-Robot Interaction, 2019
Learning from Corrective Demonstrations.
Proceedings of the 14th ACM/IEEE International Conference on Human-Robot Interaction, 2019
Understanding Teacher Gaze Patterns for Robot Learning.
Proceedings of the 3rd Annual Conference on Robot Learning, 2019
Better-than-Demonstrator Imitation Learning via Automatically-Ranked Demonstrations.
Proceedings of the 3rd Annual Conference on Robot Learning, 2019
Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
LAAIR: A Layered Architecture for Autonomous Interactive Robots.
CoRR, 2018
Towards Online Learning from Corrective Demonstrations.
CoRR, 2018
Learning Multi-Step Robotic Tasks from Observation.
CoRR, 2018
Human Gaze Following for Human-Robot Interaction.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018
Incremental Task Modification via Corrective Demonstrations.
Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018
Active Reward Learning from Critiques.
Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018
Asking for Help Effectively via Modeling of Human Beliefs.
Proceedings of the Companion of the 2018 ACM/IEEE International Conference on Human-Robot Interaction, 2018
Efficient Hierarchical Robot Motion Planning Under Uncertainty and Hybrid Dynamics.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018
Risk-Aware Active Inverse Reinforcement Learning.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018
Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018
Safe Reinforcement Learning via Shielding.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018
2017
Viewpoint selection for visual failure detection.
Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017
Classification error correction: A case study in brain-computer interfacing.
Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017
Data-Efficient Policy Evaluation Through Behavior Policy Search.
Proceedings of the 34th International Conference on Machine Learning, 2017
Toward Probabilistic Safety Bounds for Robot Learning from Demonstration.
Proceedings of the 2017 AAAI Fall Symposia, Arlington, Virginia, USA, November 9-11, 2017, 2017
Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017
2016
High Confidence Off-Policy Evaluation with Models.
CoRR, 2016
On the Analysis of Complex Backup Strategies in Monte Carlo Tree Search.
Proceedings of the 33rd International Conference on Machine Learning, 2016
2015
Learning grounded finite-state representations from unstructured demonstrations.
Int. J. Robotics Res., 2015
Policy Evaluation Using the Ω-Return.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
Online Bayesian changepoint detection for articulated motion models.
Proceedings of the IEEE International Conference on Robotics and Automation, 2015
Active articulation model estimation through interactive perception.
Proceedings of the IEEE International Conference on Robotics and Automation, 2015
2014
Learning pouring skills from demonstration and practice.
Proceedings of the 14th IEEE-RAS International Conference on Humanoid Robots, 2014
2013
Incremental Semantically Grounded Learning from Demonstration.
Proceedings of the Robotics: Science and Systems IX, Technische Universität Berlin, Berlin, Germany, June 24, 2013
An Integrated System for Learning Multi-Step Robotic Tasks from Unstructured Demonstrations.
Proceedings of the Designing Intelligent Robots: Reintegrating AI II, 2013
2012
Learning and generalization of complex tasks from unstructured demonstrations.
Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012
Complex Task Learning from Unstructured Demonstrations.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012
2011
TD_γ: Re-evaluating Complex Backups in Temporal Difference Learning.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011
Evolution of reward functions for reinforcement learning.
Proceedings of the 13th Annual Genetic and Evolutionary Computation Conference, 2011
Clustering via Dirichlet Process Mixture Models for Portable Skill Discovery.
Proceedings of the Lifelong Learning, 2011
2010
Genetic Programming for Reward Function Search.
IEEE Trans. Auton. Ment. Dev., 2010
Evolved Intrinsic Reward Functions for Reinforcement Learning.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010