Yangchen Pan

Philip H. S. Torr

Jindong Gu

CoRR, 2023

Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods.

Avery Ma

CoRR, 2023

Conditionally optimistic exploration for cooperative deep multi-agent reinforcement learning.

Janarthanan Rajendran

Proceedings of Uncertainty in Artificial Intelligence, 2023

An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient.

Proceedings of Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The In-Sample Softmax for Offline Reinforcement Learning.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

Label Alignment Regularization for Distribution Shift.

CoRR, 2022

Memory-efficient Reinforcement Learning with Knowledge Consolidation.

CoRR, 2022

Understanding and mitigating the limitations of prioritized experience replay.

Jincheng Mei

Proceedings of Uncertainty in Artificial Intelligence, 2022

TOPS: Transition-Based Volatility-Reduced Policy Search.

Proceedings of Autonomous Agents and Multiagent Systems. Best and Visionary Papers, 2022

An Alternate Policy Gradient Estimator for Softmax Policies.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online.

Kirby Banman

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Beyond Prioritized Replay: Sampling States in Model-Based RL via Simulated Priorities.

Jincheng Mei

Hengshuai Yao

CoRR, 2020

An implicit function learning approach for parametric modal regression.

Ehsan Imani

Proceedings of Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Frequency-based Search-control in Dyna.

Jincheng Mei

Proceedings of the 8th International Conference on Learning Representations, 2020

Maxmin Q-learning: Controlling the Estimation Bias of Q-learning.

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Deep Tile Coder: an Efficient Sparse Representation Learning Approach with applications in Reinforcement Learning.

CoRR, 2019

Hill Climbing on Value Estimates for Search-control in Dyna.

Hengshuai Yao

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

2018

Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces.

CoRR, 2018

Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains.

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control.

Proceedings of the 35th International Conference on Machine Learning, 2018

2017

Effective sketching methods for value function approximation.

Erfan Sadeqi Azer

Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, 2017

Adapting Kernel Representations Online Using Submodular Maximization.

Proceedings of the 34th International Conference on Machine Learning, 2017

Accelerated Gradient Temporal Difference Learning.

Adam White

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Incremental Truncated LSTD.

Clement Gehring