Kaiqing Zhang

Orcid: 0000-0002-7446-7581

According to our database, Kaiqing Zhang authored at least 97 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Last-Iterate Convergence of Payoff-Based Independent Learning in Zero-Sum Stochastic Games.
CoRR, 2024

RLHF from Heterogeneous Feedback via Personalization and Preference Aggregation.
CoRR, 2024

Do LLM Agents Have Regret? A Case Study in Online Learning and Games.
CoRR, 2024

Robot Fleet Learning via Policy Merging.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Toward a Theoretical Foundation of Policy Optimization for Learning Control Policies.
Annu. Rev. Control. Robotics Auton. Syst., May, 2023

Towards Understanding Asynchronous Advantage Actor-Critic: Convergence and Linear Speedup.
IEEE Trans. Signal Process., 2023

Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity.
J. Mach. Learn. Res., 2023

Two-Timescale Q-Learning with Function Approximation in Zero-Sum Stochastic Games.
CoRR, 2023

Fleet Policy Learning via Weight Merging and An Application to Robotic Tool-Use.
CoRR, 2023

Multi-Player Zero-Sum Markov Games with Networked Separable Interactions.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Self-Supervised Reinforcement Learning that Transfers using Random Features.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control?
Proceedings of the Learning for Dynamics and Control Conference, 2023

Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation.
Proceedings of the International Conference on Machine Learning, 2023

Partially Observable Multi-agent RL with (Quasi-)Efficiency: The Blessing of Information Sharing.
Proceedings of the International Conference on Machine Learning, 2023

Does Learning from Decentralized Non-IID Unlabeled Data Benefit from Self Supervision?
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Learning to Extrapolate: A Transductive Approach.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

The Power of Regularization in Solving Extensive-Form Games.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

The Complexity of Markov Equilibrium in Stochastic Games.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Toward Understanding State Representation Learning in MuZero: A Case Study in Linear Quadratic Gaussian Control.
Proceedings of the 62nd IEEE Conference on Decision and Control, 2023

Symmetric (Optimistic) Natural Policy Gradient for Multi-Agent Learning with Parameter Convergence.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Byzantine-Robust Online and Offline Distributed Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning.
IEEE Trans. Control. Netw. Syst., 2022

Does Decentralized Learning with Non-IID Unlabeled Data Benefit from Self Supervision?
CoRR, 2022

Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies.
CoRR, 2022

Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs.
CoRR, 2022

Globally Convergent Policy Search over Dynamic Filters for Output Estimation.
CoRR, 2022

Do Differentiable Simulators Give Better Policy Gradients?
CoRR, 2022

Fully asynchronous policy evaluation in distributed reinforcement learning over networks.
Autom., 2022

Fictitious Play in Markov Games with Single Controller.
Proceedings of the EC '22: The 23rd ACM Conference on Economics and Computation, Boulder, CO, USA, July 11, 2022

Globally Convergent Policy Search for Output Estimation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

What is a Good Metric to Study Generalization of Minimax Learners?
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Do Differentiable Simulators Give Better Policy Gradients?
Proceedings of the International Conference on Machine Learning, 2022

On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence.
Proceedings of the International Conference on Machine Learning, 2022

Discrete Approximate Information States in Partially Observable Environments.
Proceedings of the American Control Conference, 2022

Convergence and optimality of policy gradient primal-dual method for constrained Markov decision processes.
Proceedings of the American Control Conference, 2022

2021
Reinforcement learning for multi-agent and robust control systems.
PhD thesis, 2021

The Effect of Low-Intensity Transcranial Ultrasound Stimulation on Neural Oscillation and Hemodynamics in the Mouse Visual Cortex Depends on Anesthesia Level and Ultrasound Intensity.
IEEE Trans. Biomed. Eng., 2021

Finite-Sample Analysis for Decentralized Batch Multiagent Reinforcement Learning With Networked Agents.
IEEE Trans. Autom. Control., 2021

Policy Optimization for ℋ_2 Linear Control with ℋ_∞ Robustness Guarantee: Implicit Regularization and Global Convergence.
SIAM J. Control. Optim., 2021

Influence of behavioral state on the neuromodulatory effect of low-intensity transcranial ultrasound stimulation on hippocampal CA1 in mouse.
NeuroImage, 2021

Decentralized multi-agent reinforcement learning with networked agents: recent advances.
Frontiers Inf. Technol. Electron. Eng., 2021

Independent Learning in Stochastic Games.
CoRR, 2021

Decentralized Cooperative Multi-Agent Reinforcement Learning with Exploration.
CoRR, 2021

Derivative-Free Policy Optimization for Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity.
CoRR, 2021

Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Decentralized Q-learning in Zero-sum Markov Games.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Reinforcement Learning for Cost-Aware Markov Decision Processes.
Proceedings of the 38th International Conference on Machine Learning, 2021

Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs.
Proceedings of the 38th International Conference on Machine Learning, 2021

Learning Safe Multi-agent Control with Decentralized Neural Barrier Certificates.
Proceedings of the 9th International Conference on Learning Representations, 2021

Decentralized Policy Gradient Descent Ascent for Safe Multi-Agent Reinforcement Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies.
SIAM J. Control. Optim., 2020

Asynchronous Advantage Actor Critic: Non-asymptotic Analysis and Linear Speedup.
CoRR, 2020

Near-Optimal Regret Bounds for Model-Free RL in Non-Stationary Episodic MDPs.
CoRR, 2020

Asynchronous Policy Evaluation in Distributed Reinforcement Learning over Networks.
CoRR, 2020

Distributed learning of average belief over networks using sequential observations.
Autom., 2020

Robust Multi-Agent Reinforcement Learning with Model Uncertainty.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

On the Stability and Convergence of Robust Adversarial Reinforcement Learning: A Case Study on Linear Quadratic Systems.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

POLY-HOOT: Monte-Carlo Planning in Continuous Space MDPs with Non-Asymptotic Analysis.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Natural Policy Gradient Primal-Dual Method for Constrained Markov Decision Processes.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Reinforcement Learning in Non-Stationary Discrete-Time Linear-Quadratic Mean-Field Games.
Proceedings of the 59th IEEE Conference on Decision and Control, 2020

Information State Embedding in Partially Observable Cooperative Multi-Agent Reinforcement Learning.
Proceedings of the 59th IEEE Conference on Decision and Control, 2020

Approximate Equilibrium Computation for Discrete-Time Linear-Quadratic Mean-Field Games.
Proceedings of the 2020 American Control Conference, 2020

2019
Projected Stochastic Primal-Dual Method for Constrained Online Learning With Kernels.
IEEE Trans. Signal Process., 2019

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms.
CoRR, 2019

Stochastic Convergence Results for Regularized Actor-Critic Methods.
CoRR, 2019

A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning.
CoRR, 2019

Non-Cooperative Inverse Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Policy Search in Infinite-Horizon Discounted Reinforcement Learning: Advances through Connections to Non-Convex Optimization: Invited Presentation.
Proceedings of the 53rd Annual Conference on Information Sciences and Systems, 2019

Convergence and Iteration Complexity of Policy Gradient Method for Infinite-horizon Reinforcement Learning.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

Online Planning for Decentralized Stochastic Control with Partial History Sharing.
Proceedings of the 2019 American Control Conference, 2019

2018
Dynamic Power Distribution System Management With a Locally Connected Communication Network.
IEEE J. Sel. Top. Signal Process., 2018

Communication-Efficient Distributed Reinforcement Learning.
CoRR, 2018

Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning.
CoRR, 2018

Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents.
Proceedings of the 35th International Conference on Machine Learning, 2018

Networked Multi-Agent Reinforcement Learning in Continuous Spaces.
Proceedings of the 57th IEEE Conference on Decision and Control, 2018

A Finite Sample Analysis of the Actor-Critic Algorithm.
Proceedings of the 57th IEEE Conference on Decision and Control, 2018

Distributed Equilibrium-Learning for Power Network Voltage Control With a Locally Connected Communication Network.
Proceedings of the 2018 Annual American Control Conference, 2018

Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
Consumption Behavior Analytics-Aided Energy Forecasting and Dispatch.
IEEE Intell. Syst., 2017

Parameter Sensitivity and Dependency Analysis for the WECC Dynamic Composite Load Model.
Proceedings of the 50th Hawaii International Conference on System Sciences, 2017

A game-theoretic approach for communication-free voltage-VAR optimization.
Proceedings of the 2017 IEEE Global Conference on Signal and Information Processing, 2017

2016
On the performance of map-aware cooperative localization.
Proceedings of the 2016 IEEE International Conference on Communications, 2016

2015
Indoor Localization Algorithm For Smartphones.
CoRR, 2015

Enhanced multi-parameter cognitive architecture for future wireless communications.
IEEE Commun. Mag., 2015

Spectrum prediction and channel selection for sensing-based spectrum sharing scheme using online learning techniques.
Proceedings of the 26th IEEE Annual International Symposium on Personal, 2015

An area state-aided indoor localization algorithm and its implementation.
Proceedings of the IEEE International Conference on Communication, 2015

Sequential Detection Aided Modulation Classification in Cognitive Radio Networks.
Proceedings of the 2015 IEEE Global Communications Conference, 2015

2014
Enhanced Multi-Parameter Cognitive Architecture for Future Wireless Communications.
CoRR, 2014

Machine learning techniques for spectrum sensing when primary user has multiple transmit powers.
Proceedings of the IEEE International Conference on Communication Systems, 2014

