Aviral Kumar

According to our database, Aviral Kumar authored at least 71 papers between 2017 and 2024.

Bibliography

2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance.
CoRR, 2024

Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning.
CoRR, 2024

RRM: Robust Reward Model Training Mitigates Reward Hacking.
CoRR, 2024

Training Language Models to Self-Correct via Reinforcement Learning.
CoRR, 2024

Generative Verifiers: Reward Modeling as Next-Token Prediction.
CoRR, 2024

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters.
CoRR, 2024

Recursive Introspection: Teaching Language Model Agents How to Self-Improve.
CoRR, 2024

RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold.
CoRR, 2024

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning.
CoRR, 2024

Is Value Learning Really the Main Bottleneck in Offline RL?
CoRR, 2024

Unfamiliar Finetuning Examples Control How Language Models Hallucinate.
CoRR, 2024

Vision-Language Models Provide Promptable Representations for Reinforcement Learning.
CoRR, 2024

D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning.
RLJ, 2024

Robotic Offline RL from Internet Videos via Value-Function Learning.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Zero-Shot Robotic Manipulation with Pre-Trained Image-Editing Diffusion Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models.
CoRR, 2023

Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction.
CoRR, 2023

Robotic Offline RL from Internet Videos via Value-Function Pre-Training.
CoRR, 2023

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning.
CoRR, 2023

Pre-Training for Robots: Offline RL Enables Learning New Tasks in a Handful of Trials.
Proceedings of the Robotics: Science and Systems XIX, Daegu, 2023

ReDS: Offline RL With Heteroskedastic Datasets via Support Constraints.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Efficient Deep Reinforcement Learning Requires Regulating Overfitting.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Confidence-Conditioned Value Functions for Offline Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning.
Proceedings of the Conference on Robot Learning, 2023


2022
Dual Generator Offline Reinforcement Learning.
CoRR, 2022

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints.
CoRR, 2022

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials.
CoRR, 2022

When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?
CoRR, 2022

Off-Policy Actor-critic for Recommender Systems.
Proceedings of the RecSys '22: Sixteenth ACM Conference on Recommender Systems, Seattle, WA, USA, September 18, 2022

DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Data-Driven Offline Decision-Making via Invariant Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

How to Leverage Unlabeled Data in Offline Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization.
Proceedings of the International Conference on Machine Learning, 2022

Data-Driven Offline Optimization for Architecting Hardware Accelerators.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Should I Run Offline Reinforcement Learning or Behavioral Cloning?
Proceedings of the Tenth International Conference on Learning Representations, 2022

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning.
Proceedings of the Conference on Robot Learning, 2022

2021
COMBO: Conservative Offline Model-Based Policy Optimization.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Conservative Data Sharing for Multi-Task Offline Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Conservative Objective Models for Effective Offline Model-Based Optimization.
Proceedings of the 38th International Conference on Machine Learning, 2021

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

Benchmarks for Deep Off-Policy Evaluation.
Proceedings of the 9th International Conference on Learning Representations, 2021

Conservative Safety Critics for Exploration.
Proceedings of the 9th International Conference on Learning Representations, 2021

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

A Workflow for Offline Model-Free Robotic Reinforcement Learning.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

2020
COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning.
CoRR, 2020

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems.
CoRR, 2020

D4RL: Datasets for Deep Data-Driven Reinforcement Learning.
CoRR, 2020

Conservative Q-Learning for Offline Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Model Inversion Networks for Model-Based Optimization.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Chaining Behaviors from Data with Model-Free Reinforcement Learning.
Proceedings of the 4th Conference on Robot Learning, 2020

2019
Reward-Conditioned Policies.
CoRR, 2019

Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning.
CoRR, 2019

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction.
CoRR, 2019

Calibration of Encoder Decoder Models for Neural Machine Translation.
CoRR, 2019

Graph Normalizing Flows.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Diagnosing Bottlenecks in Deep Q-learning Algorithms.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Trainable Calibration Measures For Neural Networks From Kernel Mean Embeddings.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Challenges and Tool Implementation of Hybrid Rapidly-Exploring Random Trees.
Proceedings of the Numerical Software Verification - 10th International Workshop, 2017

The Reach-Avoid Problem for Constant-Rate Multi-mode Systems.
Proceedings of the Automated Technology for Verification and Analysis, 2017

