Yuanzhao Zhai

Orcid: 0000-0003-1385-0074

According to our database, Yuanzhao Zhai authored at least 23 papers between 2021 and 2024.

Collaborative distances:

Dijkstra number of five.
Erdős number of four.

Bibliography

2024

Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning.

IEEE Trans. Artif. Intell., November, 2024

Nuclear Norm Maximization-Based Curiosity-Driven Reinforcement Learning.

IEEE Trans. Artif. Intell., May, 2024

Dynamic Memory-Based Curiosity: A Bootstrap Approach for Exploration in Reinforcement Learning.

IEEE Trans. Emerg. Top. Comput. Intell., April, 2024

C3F: Constant Collaboration and Communication Framework for Graph-Representation Dynamic Multi-Robotic Systems.

IEEE Robotics Autom. Lett., January, 2024

Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models.

CoRR, 2024

Online Self-Preferring Language Models.

CoRR, 2024

COPR: Continual Human Preference Learning via Optimal Policy Regularization.

CoRR, 2024

Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles.

CoRR, 2024

Iterative Regularized Policy Optimization with Imperfect Demonstrations.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Nuclear-Norm Maximization for Low-Rank Updates.

Proceedings of the IEEE International Conference on Acoustics, 2024

Optimistic Model Rollouts for Pessimistic Offline Policy Optimization.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

COPF: Continual Learning Human Preference through Optimal Policy Fitting.

CoRR, 2023

Diversifying Message Aggregation in Multi-Agent Communication Via Normalized Tensor Nuclear Norm Regularization.

Proceedings of the IEEE International Conference on Acoustics, 2023

Progressive Diversifying Policy for Multi-Agent Reinforcement Learning.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

CRMRL: Collaborative Relationship Meta Reinforcement Learning for Effectively Adapting to Type Changes in Multi-Robotic System.

IEEE Robotics Autom. Lett., 2022

A Fast and Robust Solution for Common Knowledge Formation in Decentralized Swarm Robots.

J. Intell. Robotic Syst., 2022

Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning.

CoRR, 2022

Dynamic Memory-based Curiosity: A Bootstrap Approach for Exploration.

CoRR, 2022

Exploring Policy Diversity in Parallel Actor-Critic Learning.

Proceedings of the 34th IEEE International Conference on Tools with Artificial Intelligence, 2022

Pseudo Reward and Action Importance Classification for Sparse Reward Problem.

Proceedings of the ICMLC 2022: 14th International Conference on Machine Learning and Computing, Guangzhou, China, February 18, 2022

2021

Cloudroid Swarm: A QoS-Aware Framework for Multirobot Cooperation Offloading.

Wirel. Commun. Mob. Comput., 2021

Decentralized Multi-Robot Collision Avoidance in Complex Scenarios With Selective Communication.

IEEE Robotics Autom. Lett., 2021

Accelerating Robot Reinforcement Learning with Samples of Different Simulation Precision.

Proceedings of the 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, 2021
