Zongzhang Zhang

Orcid: 0000-0002-9238-4747

Affiliations:
  • Nanjing University, National Key Laboratory for Novel Software Technology, Nanjing, China
  • Soochow University, School of Computer Science and Technology, Suzhou, China (former)
  • University of Science and Technology of China, School of Computer Science and Technology, Hefei, China (former, PhD)


According to our database, Zongzhang Zhang authored at least 80 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Communication-robust multi-agent learning by adaptable auxiliary multi-agent adversary generation.
Frontiers Comput. Sci., December, 2024

Model gradient: unified model and policy learning in model-based reinforcement learning.
Frontiers Comput. Sci., August, 2024

ODRL: A Benchmark for Off-Dynamics Reinforcement Learning.
CoRR, 2024

Hindsight Preference Learning for Offline Preference-based Reinforcement Learning.
CoRR, 2024

Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models.
CoRR, 2024

Q-Adapter: Training Your LLM Adapter as a Residual Q-Function.
CoRR, 2024

Alpha²: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning.
CoRR, 2024

Reinforced In-Context Black-Box Optimization.
CoRR, 2024

Efficient and Stable Offline-to-online Reinforcement Learning via Continual Policy Revitalization.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Deep Demonstration Tracing: Learning Generalizable Imitator Policy for Runtime Imitation from a Single Demonstration.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Language Model Self-improvement by Reinforcement Learning Contemplation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Policy Rehearsing: Training Generalizable Policies for Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Attention-Guided Contrastive Role Representations for Multi-agent Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation.
Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

Deep Anomaly Detection via Active Anomaly Search.
Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

Multi-Expert Distillation for Few-Shot Coordination (Student Abstract).
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Generalizable Policy Improvement via Reinforcement Sampling (Student Abstract).
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Focus-Then-Decide: Segmentation-Assisted Reinforcement Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Imitator Learning: Achieve Out-of-the-Box Imitation Ability in Variable Environments.
CoRR, 2023

Robust Multi-agent Communication via Multi-view Message Certification.
CoRR, 2023

Efficient Communication via Self-supervised Information Aggregation for Online and Offline Multi-agent Reinforcement Learning.
CoRR, 2023

Internal Logical Induction for Pixel-Symbolic Reinforcement Learning.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Policy Regularization with Dataset Constraint for Offline Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Retrosynthetic Planning with Dual Value Networks.
Proceedings of the International Conference on Machine Learning, 2023

Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Model-Based Offline Weighted Policy Optimization (Student Abstract).
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Anti-drifting Feature Selection via Deep Reinforcement Learning (Student Abstract).
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Policy-Independent Behavioral Metric-Based Representation for Deep Reinforcement Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Learning Generalizable Batch Active Learning Strategies via Deep Q-networks (Student Abstract).
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Expert Data Augmentation in Imitation Learning (Student Abstract).
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Towards Deployment-Efficient and Collision-Free Multi-Agent Path Finding (Student Abstract).
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Deep Anomaly Detection and Search via Reinforcement Learning (Student Abstract).
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Multi-Agent Policy Transfer via Task Relationship Modeling.
CoRR, 2022

Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Efficient Multi-agent Communication via Self-supervised Information Aggregation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Multi-agent Dynamic Algorithm Configuration.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Multi-Agent Concentrative Coordination with Decentralized Task Representation.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Efficient Multi-Agent Communication via Shapley Message Value.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Multi-Agent Incentive Communication via Decentralized Teammate Modeling.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Adapt to Environment Sudden Changes by Learning a Context Sensitive Policy.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Efficient policy detecting and reusing for non-stationarity in Markov games.
Auton. Agents Multi Agent Syst., 2021

Adaptive Online Packing-guided Search for POMDPs.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Enhancing Context-Based Meta-Reinforcement Learning Algorithms via An Efficient Task Encoder (Student Abstract).
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

LB-DESPOT: Efficient Online POMDP Planning Considering Lower Bound in Action Selection (Student Abstract).
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Efficient Multiagent Policy Optimization Based on Weighted Estimators in Stochastic Cooperative Environments.
J. Comput. Sci. Technol., 2020

Efficient Deep Reinforcement Learning via Adaptive Policy Transfer.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative Adversarial Nets.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Double Replay Buffers with Restricted Gradient.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

Recency-Weighted Acceleration for Continuous Control Through Deep Reinforcement Learning.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

Efficient Deep Reinforcement Learning through Policy Transfer.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Generative Adversarial Imitation Learning from Failed Experiences (Student Abstract).
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Third-Person Imitation Learning via Image Difference and Variational Discriminator Bottleneck (Student Abstract).
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Efficient reinforcement learning in continuous state and action spaces with Dyna and policy approximation.
Frontiers Comput. Sci., 2019

Monte Carlo Tree Search for Policy Optimization.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Experience Selection in Multi-agent Deep Reinforcement Learning.
Proceedings of the 31st IEEE International Conference on Tools with Artificial Intelligence, 2019

Deep Recurrent Policy Networks for Planning Under Partial Observability.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2019: Theoretical Neural Computation, 2019

2018
Hierarchical Deep Multiagent Reinforcement Learning.
CoRR, 2018

Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments.
CoRR, 2018

Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments.
Proceedings of the PRICAI 2018: Trends in Artificial Intelligence, 2018

ACGAIL: Imitation Learning About Multiple Intentions with Auxiliary Classifier GANs.
Proceedings of the PRICAI 2018: Trends in Artificial Intelligence, 2018

A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Asynchronous Value Iteration Network.
Proceedings of the Neural Information Processing - 25th International Conference, 2018

2017
Weighted Double Q-learning.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

2016
Reasoning and predicting POMDP planning complexity via covering numbers.
Frontiers Comput. Sci., 2016

Policy graph pruning and optimization in Monte Carlo Value Iteration for continuous-state POMDPs.
Proceedings of the 2016 IEEE Symposium Series on Computational Intelligence, 2016

Deep Q-Learning with Prioritized Sampling.
Proceedings of the Neural Information Processing - 23rd International Conference, 2016

Covering Number: Analyses for Approximate Continuous-state POMDP Planning (Extended Abstract).
Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

2015
PLEASE: Palm Leaf Search for POMDPs with Large Observation Spaces.
Proceedings of the Eighth Annual Symposium on Combinatorial Search, 2015

Intelligent Model Learning Based on Variance for Bayesian Reinforcement Learning.
Proceedings of the 27th IEEE International Conference on Tools with Artificial Intelligence, 2015

Trajectory Sampling Value Iteration: Improved Dyna Search for MDPs.
Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015

2014
Covering Number for Efficient Heuristic-based POMDP Planning.
Proceedings of the 31st International Conference on Machine Learning, 2014

Thompson Sampling Based Monte-Carlo Planning in POMDPs.
Proceedings of the Twenty-Fourth International Conference on Automated Planning and Scheduling, 2014

2012
FHHOP: A Factored Hybrid Heuristic Online Planning Algorithm for Large POMDPs.
Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, 2012

Covering Number as a Complexity Measure for POMDP Planning and Learning.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2010
Accelerating Point-Based POMDP Algorithms via Greedy Strategies.
Proceedings of the Simulation, Modeling, and Programming for Autonomous Robots, 2010

