Ziniu Li

ORCID: 0000-0003-0449-002X

According to our database, Ziniu Li authored at least 25 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Sensing Jamming Strategy From Limited Observations: An Imitation Learning Perspective.
IEEE Trans. Signal Process., 2024

Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity.
CoRR, 2024

Adam-mini: Use Fewer Learning Rates To Gain More.
CoRR, 2024

BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation.
CoRR, 2024

On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization.
CoRR, 2024

Why Transformers Need Adam: A Hessian Perspective.
CoRR, 2024

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

When is RL better than DPO in RLHF? A Representation and Optimization Perspective.
Proceedings of the Second Tiny Papers Track at ICLR 2024, 2024

Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023
Policy Optimization in RLHF: The Impact of Out-of-preference Data.
CoRR, 2023

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.
CoRR, 2023

Deploying Offline Reinforcement Learning with Human Feedback.
CoRR, 2023

Theoretical Analysis of Offline Imitation With Supplementary Dataset.
CoRR, 2023

Provably Efficient Adversarial Imitation Learning with Unknown Transitions.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Imitation Learning from Imperfection: Theoretical Justifications and Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
Error Bounds of Imitating Policies and Environments for Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Understanding Adversarial Imitation Learning in Small Sample Regime: A Stage-coupled Analysis.
CoRR, 2022

A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle.
CoRR, 2022

Rethinking ValueDice: Does It Really Improve Performance?
CoRR, 2022

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Nearly Minimax Optimal Adversarial Imitation Learning with Known and Unknown Transitions.
CoRR, 2021

2020
Solving the Inverse Design Problem of Electrical Fuse With Machine Learning.
IEEE Access, 2020

Error Bounds of Imitating Policies and Environments.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Efficient Exploration by Novelty-Pursuit.
Proceedings of the Distributed Artificial Intelligence - Second International Conference, 2020

2019
On Value Discrepancy of Imitation Learning.
CoRR, 2019

