Yaodong Yang

ORCID: 0000-0001-8132-5613

Affiliations:
  • Peking University, Institute for AI, Beijing, China
  • King's College London, UK (former)
  • Huawei Technologies, Noah's Ark Lab, UK (former)
  • University College London, UK (PhD)


According to our database, Yaodong Yang authored at least 159 papers between 2017 and 2024.

Bibliography

2024
Self-Supervised MAFENN for Classifying Low-Labeled Distorted Images Over Mobile Fading Channels.
IEEE Trans. Mob. Comput., August, 2024

ASP: Learn a Universal Neural Solver!
IEEE Trans. Pattern Anal. Mach. Intell., June, 2024

Grasp Multiple Objects With One Hand.
IEEE Robotics Autom. Lett., May, 2024

Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

RoMAT: Role-based multi-agent transformer for generalizable heterogeneous cooperation.
Neural Networks, 2024

Adaptive pessimism via target Q-value for offline reinforcement learning.
Neural Networks, 2024

Heterogeneous-Agent Reinforcement Learning.
J. Mach. Learn. Res., 2024

Neural Attention Field: Emerging Point Relevance in 3D Scenes for One-Shot Dexterous Grasping.
CoRR, 2024

Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games.
CoRR, 2024

Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback.
CoRR, 2024

A Survey on Self-play Methods in Reinforcement Learning.
CoRR, 2024

ProgressGym: Alignment with a Millennium of Moral Progress.
CoRR, 2024

PKU-SafeRLHF: A Safety Alignment Preference Dataset for Llama Family Models.
CoRR, 2024

SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset.
CoRR, 2024

Language Models Resist Alignment.
CoRR, 2024

Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles.
CoRR, 2024

Efficient Model-agnostic Alignment via Bayesian Persuasion.
CoRR, 2024

Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation.
CoRR, 2024

Correlated Mean Field Imitation Learning.
CoRR, 2024

UniDexFPM: Universal Dexterous Functional Pre-grasp Manipulation Via Diffusion Policy.
CoRR, 2024

Leveraging Team Correlation for Approximating Equilibrium in Two-Team Zero-Sum Games.
CoRR, 2024

Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects.
CoRR, 2024

Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective.
CoRR, 2024

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction.
CoRR, 2024

Panacea: Pareto Alignment via Preference Adaptation for LLMs.
CoRR, 2024

CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents.
CoRR, 2024

Remember the Past for Better Future: Memory-Augmented Offline RL.
Proceedings of the International Joint Conference on Neural Networks, 2024

Off-Agent Trust Region Policy Optimization.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Maximum Entropy Heterogeneous-Agent Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Safe RLHF: Safe Reinforcement Learning from Human Feedback.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

SafeDreamer: Safe Reinforcement Learning with World Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

A Summary of Online Markov Decision Processes with Non-oblivious Strategic Adversary.
Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

GIPUT: Maximizing Photo Coverage Efficiency for UAV Trajectory.
Proceedings of the Web and Big Data - 8th International Joint Conference, 2024

ProAgent: Building Proactive Cooperative Agents with Large Language Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

STAS: Spatial-Temporal Return Decomposition for Solving Sparse Rewards Problems in Multi-agent Reinforcement Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Large sequence models for sequential decision-making: a survey.
Frontiers Comput. Sci., December, 2023

Safe multi-agent reinforcement learning for multi-robot control.
Artif. Intell., June, 2023

Online Markov decision processes with non-oblivious strategic adversary.
Auton. Agents Multi Agent Syst., June, 2023

Offline Pre-trained Multi-agent Decision Transformer.
Mach. Intell. Res., April, 2023

JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games.
Trans. Mach. Learn. Res., 2023

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning.
J. Mach. Learn. Res., 2023

TorchOpt: An Efficient Library for Differentiable Optimization.
J. Mach. Learn. Res., 2023

MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library.
J. Mach. Learn. Res., 2023

JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models.
CoRR, 2023

AI Alignment: A Comprehensive Survey.
CoRR, 2023

Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark.
CoRR, 2023

Masked Pretraining for Multi-Agent Decision Making.
CoRR, 2023

MIR2: Towards Provably Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization.
CoRR, 2023

Measuring Value Understanding in Language Models through Discriminator-Critique Gap.
CoRR, 2023

Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models.
CoRR, 2023

Mixup-Augmented Meta-Learning for Sample-Efficient Fine-Tuning of Protein Simulators.
CoRR, 2023

ProAgent: Building Proactive Cooperative AI with Large Language Models.
CoRR, 2023

Safe DreamerV3: Safe Reinforcement Learning with World Models.
CoRR, 2023

BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset.
CoRR, 2023

Maximum Entropy Heterogeneous-Agent Mirror Learning.
CoRR, 2023

Deep Reinforcement Learning with Multitask Episodic Memory Based on Task-Conditioned Hypernetwork.
CoRR, 2023

Heterogeneous Value Evaluation for Large Language Models.
CoRR, 2023

Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game.
CoRR, 2023

OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research.
CoRR, 2023

Heterogeneous-Agent Reinforcement Learning.
CoRR, 2023

STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning.
CoRR, 2023

A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors.
CoRR, 2023

MANSA: Learning Fast and Slow in Multi-Agent Systems.
CoRR, 2023

MSRL: Distributed Reinforcement Learning with Dataflow Fragments.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023

Multi-Agent First Order Constrained Optimization in Policy Space.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Policy Space Diversity for Non-Transitive Games.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Hierarchical Multi-Agent Skill Discovery.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Team-PSRO for Learning Approximate TMECor in Large Team Games via Cooperative Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

GenDexGrasp: Generalizable Dexterous Grasping.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

RLAfford: End-to-End Affordance Learning for Robotic Manipulation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models.
Proceedings of the International Conference on Machine Learning, 2023

Regret-Minimizing Double Oracle for Extensive-Form Games.
Proceedings of the International Conference on Machine Learning, 2023

A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems.
Proceedings of the International Conference on Machine Learning, 2023

MANSA: Learning Fast and Slow in Multi-Agent Systems.
Proceedings of the International Conference on Machine Learning, 2023

Quality-Similar Diversity via Population Based Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

Dynamic Handover: Throw and Catch with Bimanual Hands.
Proceedings of the Conference on Robot Learning, 2023

Is Nash Equilibrium Approximator Learnable?
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Subspace-Aware Exploration for Sparse-Reward Multi-Agent Tasks.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Learning to Shape Rewards Using a Game of Two Partners.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

ACE: Cooperative Multi-Agent Q-learning with Bidirectional Action-Dependency.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Online Double Oracle.
Trans. Mach. Learn. Res., 2022

Illiquidity Comovement and Market Crisis.
J. Syst. Sci. Complex., 2022

Contextual Transformer for Offline Meta Reinforcement Learning.
CoRR, 2022

MARLlib: Extending RLlib for Multi-agent Reinforcement Learning.
CoRR, 2022

End-to-End Affordance Learning for Robotic Manipulation.
CoRR, 2022

Heterogeneous-Agent Mirror Learning: A Continuum of Solutions to Cooperative MARL.
CoRR, 2022

Fully Decentralized Model-based Policy Optimization for Networked Systems.
CoRR, 2022

Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning.
CoRR, 2022

Learning Risk-Averse Equilibria in Multi-Agent Systems.
CoRR, 2022

A Review of Safe Reinforcement Learning: Methods, Theory and Applications.
CoRR, 2022

Understanding Value Decomposition Algorithms in Deep Cooperative Multi-Agent Reinforcement Learning.
CoRR, 2022

Settling the Communication Complexity for Distributed Offline Reinforcement Learning.
CoRR, 2022

Efficient Policy Space Response Oracles.
CoRR, 2022

Measuring the Non-Transitivity in Chess.
Algorithms, 2022

Debias the Black-Box: A Fair Ranking Framework via Knowledge Distillation.
Proceedings of the Web Information Systems Engineering - WISE 2022, 2022

Constrained Update Projection Approach to Safe Policy Optimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Scalable Model-based Policy Optimization for Decentralized Networked Systems.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

On the Convergence of Fictitious Play: A Decomposition Approach.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

A Game-Theoretic Approach to Multi-agent Trust Region Optimization.
Proceedings of the Distributed Artificial Intelligence - 4th International Conference, 2022

2021
Many-agent reinforcement learning.
PhD thesis, 2021

On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games.
Electron. Colloquium Comput. Complex., 2021

Settling the Bias and Variance of Meta-Gradient Estimation for Meta-Reinforcement Learning.
CoRR, 2021

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks.
CoRR, 2021

A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers.
CoRR, 2021

DESTA: A Framework for Safe Reinforcement Learning with Markov Games of Intervention.
CoRR, 2021

Multi-Agent Constrained Policy Optimisation.
CoRR, 2021

Revisiting the Characteristics of Stochastic Gradient Noise and Dynamics.
CoRR, 2021

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning.
CoRR, 2021

Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games.
CoRR, 2021

Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games.
CoRR, 2021

Learning to Shape Rewards using a Game of Switching Controls.
CoRR, 2021

Modelling Behavioural Diversity for Learning in Open-Ended Games.
CoRR, 2021

Online Double Oracle.
CoRR, 2021

Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Settling the Variance of Multi-Agent Policy Gradients.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Neural Auto-Curricula in Two-Player Zero-Sum Games.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021


Modelling Behavioural Diversity for Learning in Open-Ended Games.
Proceedings of the 38th International Conference on Machine Learning, 2021

Learning in Nonzero-Sum Stochastic Games with Potentials.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Order Execution Probability and Order Queue in Limit Order Markets.
J. Syst. Sci. Complex., 2020

Can deep learning predict risky retail investors? A case study in financial risk behavior forecasting.
Eur. J. Oper. Res., 2020

An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective.
CoRR, 2020

Replica-Exchange Nosé-Hoover Dynamics for Bayesian Learning on Large Datasets.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Multi-Agent Determinantal Q-Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Learning to Infer User Hidden States for Online Sequential Advertising.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

αα-Rank: Practically Scaling α-Rank through Stochastic Optimisation.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Sequential Advertising Agent with Interpretable User Hidden Intents.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Bi-Level Actor-Critic for Multi-Agent Coordination.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning.
CoRR, 2019

Multi-Agent Generalized Recursive Reasoning.
CoRR, 2019

Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning.
Proceedings of the World Wide Web Conference, 2019

Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning.
Proceedings of the 7th International Conference on Learning Representations, 2019

Adversarial Variational Bayes Methods for Tweedie Compound Poisson Mixed Models.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Factorized Q-learning for large-scale multi-agent systems.
Proceedings of the First International Conference on Distributed Artificial Intelligence, 2019

2018
Benchmarking Deep Sequential Models on Volatility Predictions for Financial Time Series.
CoRR, 2018

Factorized Q-Learning for Large-Scale Multi-Agent Systems.
CoRR, 2018

Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Mean Field Multi-Agent Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

A Study of AI Population Dynamics with Million-agent Reinforcement Learning.
Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

2017
An Empirical Study of AI Population Dynamics with Million-agent Reinforcement Learning.
CoRR, 2017

Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games.
CoRR, 2017

