Yi Wu

ORCID: 0000-0001-9057-5817

Affiliations:

Tsinghua University, Institute of Interdisciplinary Information Sciences (IIIS), Beijing, China
University of California, Berkeley, CA, USA (PhD 2019)
Microsoft Research Asia

According to our database, Yi Wu authored at least 88 papers between 2012 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2024

OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control.

IEEE Robotics Autom. Lett., 2024

Quarl: A Learning-Based Quantum Circuit Optimizer.

Proc. ACM Program. Lang., 2024

Few-shot In-Context Preference Learning Using Large Language Models.

CoRR, 2024

On Designing Effective RL Reward at Training Time for LLM Reasoning.

CoRR, 2024

ReaLHF: Optimized RLHF Training for Large Language Models through Parameter Reallocation.

CoRR, 2024

FlightBench: A Comprehensive Benchmark of Spatial Planning Methods for Quadrotors.

CoRR, 2024

Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models.

CoRR, 2024

Leveraging Symmetry in RL-based Legged Locomotion Control.

Zhi Su

Xiaoyu Huang

Daniel Felipe Ordoñez Apraez

CoRR, 2024

Robot Synesthesia: In-Hand Manipulation with Visuotactile Sensing.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

LAGOON: Language-Guided Motion Control.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Stylized Offline Reinforcement Learning: Extracting Diverse High-Quality Behaviors from Heterogeneous Datasets.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure.

Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination.

Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization.

Trans. Mach. Learn. Res., 2023

Beyond Information Gain: An Empirical Benchmark for Low-Switching-Cost Reinforcement Learning.

Trans. Mach. Learn. Res., 2023

Learning Agile Bipedal Motions on a Quadrupedal Robot.

CoRR, 2023

BitNet: Scaling 1-bit Transformers for Large Language Models.

CoRR, 2023

DeRisk: An Effective Deep Learning Framework for Credit Risk Prediction over Real-World Financial Data.

CoRR, 2023

SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores.

CoRR, 2023

Language-Guided Generation of Physically Realistic Robot Motion and Control.

CoRR, 2023

Grounding Object Relations in Language-Conditioned Robotic Manipulation with Semantic-Spatial Reasoning.

Qian Luo

Yunfei Li

Yi Wu

CoRR, 2023

Iteratively Learn Diverse Strategies with State Distance Information.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

PhyloTransformer: A Self-supervised Discriminative Model for SARS-CoV-2 Viral Mutation Prediction Based on a Multi-head Self-attention Mechanism.

Proceedings of the 6th International Workshop on Knowledge Discovery from Healthcare Data co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

Automatic Truss Design with Reinforcement Learning.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Efficient Bimanual Handover and Rearrangement via Symmetry-Aware Actor-Critic Learning.

Proceedings of the IEEE International Conference on Robotics and Automation, 2023

SpeedyZero: Mastering Atari with Limited Data and Time.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time Multi-Robot Cooperative Exploration.

Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Fictitious Cross-Play: Learning Global Nash Equilibrium in Mixed Cooperative-Competitive Games.

Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Differentiable Arbitrating in Zero-sum Markov Games.

Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Maximum Entropy Population-Based Training for Zero-Shot Human-AI Coordination.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

AlphaSnake: Policy Iteration on a Nondeterministic NP-Hard Markov Decision Process (Student Abstract).

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

AlphaSnake: Policy Iteration on a Nondeterministic NP-hard Markov Decision Process.

CoRR, 2022

Understanding Curriculum Learning in Policy Optimization for Solving Combinatorial Optimization Problems.

CoRR, 2022

Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Grounded Reinforcement Learning: Learning to Win the Game under Human Commands.

Shusheng Xu

Huaijie Wang

Yi Wu

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning Design and Construction with Varying-Sized Materials via Prioritized Memory Resets.

Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2022

Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2022

Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Learning Efficient Multi-agent Cooperative Visual Exploration.

Proceedings of the Computer Vision - ECCV 2022, 2022

Sequence Level Contrastive Learning for Text Summarization.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Near-Linear Time Local Polynomial Nonparametric Estimation with Box Kernels.

Yining Wang

Yi Wu

Simon S. Du

INFORMS J. Comput., 2021

Maximum Entropy Population Based Training for Zero-Shot Human-AI Coordination.

CoRR, 2021

A Benchmark for Low-Switching-Cost Reinforcement Learning.

CoRR, 2021

Multi-Agent Vulnerability Discovery for Autonomous Driving with Hazard Arbitration Reward.

CoRR, 2021

PhyloTransformer: A Discriminative Model for Mutation Prediction Based on a Multi-head Self-attention Mechanism.

CoRR, 2021

Disentangled Attention as Intrinsic Regularization for Bimanual Multi-Object Manipulation.

CoRR, 2021

The Surprising Effectiveness of MAPPO in Cooperative, Multi-Agent Games.

CoRR, 2021

NovelD: A Simple yet Effective Exploration Criterion.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Native Chinese Reader: A Dataset Towards Native-Level Chinese Machine Reading Comprehension.

Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning to Design and Construct Bridge without Blueprint.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Temporal Induced Self-Play for Stochastic Bayesian Games.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization.

Proceedings of the 9th International Conference on Learning Representations, 2021

Solving Compositional Reinforcement Learning Problems via Task Reduction.

Proceedings of the 9th International Conference on Learning Representations, 2021

Unlocking the Potential of MAPPO with Asynchronous Optimization.

Proceedings of the Artificial Intelligence - First CAAI International Conference, 2021

2020

BeBold: Exploration Beyond the Boundary of Explored Regions.

CoRR, 2020

Multi-Agent Collaboration via Reward Attribution Decomposition.

CoRR, 2020

Multi-Task Reinforcement Learning with Soft Modularization.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning.

Proceedings of the 8th International Conference on Learning Representations, 2020

Emergent Tool Use From Multi-Agent Autocurricula.

Proceedings of the 8th International Conference on Learning Representations, 2020

Influence-Based Multi-Agent Exploration.

Proceedings of the 8th International Conference on Learning Representations, 2020

Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

2019

Bayesian Relational Memory for Semantic Visual Navigation.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Deep Reinforcement Learning for Green Security Games with Real-Time Information.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Learning and Planning with a Semantic Model.

CoRR, 2018

Near-Linear Time Local Polynomial Nonparametric Estimation.

Yining Wang

Yi Wu

Simon S. Du

CoRR, 2018

Meta-Learning MCMC Proposals.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Discrete-Continuous Mixtures in Probabilistic Programming: Generalized Semantics and Inference Algorithms.

Proceedings of the 35th International Conference on Machine Learning, 2018

Building Generalizable Agents with a Realistic and Rich 3D Environment.

Proceedings of the 6th International Conference on Learning Representations, 2018

Deep Reinforcement Learning for Green Security Game with Online Information.

Proceedings of the Workshops of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Neural Block Sampling.

CoRR, 2017

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Adversarial Training for Relation Extraction.

Yi Wu

David Bamman

Stuart Russell

Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

A Nearly-Black-Box Online Algorithm for Joint Parameter and State Estimation in Temporal Models.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Towards Practical Bayesian Parameter and State Estimation.

CoRR, 2016

Value Iteration Networks.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Swift: Compiled Inference for Probabilistic Programming Languages.

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

2015

Understanding and Evaluating Sparse Linear Discriminant Analysis.

Yi Wu

David P. Wipf

Jeong-Min Yun

Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, 2015

2012

Dual-Space Analysis of the Sparse Linear Model.

David P. Wipf

Yi Wu

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012
