Ziniu Li

ORCID: 0000-0003-0449-002X

According to our database, Ziniu Li authored at least 25 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Sensing Jamming Strategy From Limited Observations: An Imitation Learning Perspective.
IEEE Trans. Signal Process., 2024

Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity.
CoRR, 2024

Adam-mini: Use Fewer Learning Rates To Gain More.
CoRR, 2024

BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation.
CoRR, 2024

On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization.
CoRR, 2024

Why Transformers Need Adam: A Hessian Perspective.
CoRR, 2024

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

When is RL better than DPO in RLHF? A Representation and Optimization Perspective.
Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024

Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023
Policy Optimization in RLHF: The Impact of Out-of-preference Data.
CoRR, 2023

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.
CoRR, 2023

Deploying Offline Reinforcement Learning with Human Feedback.
CoRR, 2023

Theoretical Analysis of Offline Imitation With Supplementary Dataset.
CoRR, 2023

Provably Efficient Adversarial Imitation Learning with Unknown Transitions.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Imitation Learning from Imperfection: Theoretical Justifications and Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
Error Bounds of Imitating Policies and Environments for Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis.
CoRR, 2022

A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle.
CoRR, 2022

Rethinking ValueDice: Does It Really Improve Performance?
CoRR, 2022

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Nearly Minimax Optimal Adversarial Imitation Learning with Known and Unknown Transitions.
CoRR, 2021

2020
Solving the Inverse Design Problem of Electrical Fuse With Machine Learning.
IEEE Access, 2020

Error Bounds of Imitating Policies and Environments.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Efficient Exploration by Novelty-Pursuit.
Proceedings of the Distributed Artificial Intelligence - Second International Conference, 2020

2019
On Value Discrepancy of Imitation Learning.
CoRR, 2019

