Zhuoran Yang

ORCID: 0000-0001-5269-9958

According to our database, Zhuoran Yang authored at least 202 papers between 2015 and 2024.

Bibliography

2024
False Correlation Reduction for Offline Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., February, 2024

Pessimistic value iteration for multi-task data sharing in Offline Reinforcement Learning.
Artif. Intell., January, 2024

Neural Temporal Difference and Q Learning Provably Converge to Global Optima.
Math. Oper. Res., 2024

Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers.
CoRR, 2024

Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods.
CoRR, 2024

Provable Statistical Rates for Consistency Diffusion Models.
CoRR, 2024

STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making.
CoRR, 2024

A Mean-Field Analysis of Neural Gradient Descent-Ascent: Applications to Functional Conditional Moment Equations.
CoRR, 2024

Unveil Conditional Diffusion Models with Classifier-free Guidance: A Sharp Statistical Theory.
CoRR, 2024

On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games.
CoRR, 2024

Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality.
CoRR, 2024

How Does Goal Relabeling Improve Sample Efficiency?
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

A General Framework for Sequential Decision-Making under Adaptivity Constraints.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Theory of Consistency Diffusion Models: Distribution Estimation Meets Fast Sampling.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Sample-Efficient Multi-Agent RL: An Optimization Perspective.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality (extended abstract).
Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

2023
Provably Efficient Reinforcement Learning with Linear Function Approximation.
Math. Oper. Res., August, 2023

Being Trustworthy is Not Enough: How Untrustworthy Artificial Intelligence (AI) Can Deceive the End-Users and Gain Their Trust.
Proc. ACM Hum. Comput. Interact., April, 2023

A Two-Timescale Stochastic Algorithm Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic.
SIAM J. Optim., March, 2023

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium.
Math. Oper. Res., February, 2023

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning.
J. Mach. Learn. Res., 2023

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers?
J. Mach. Learn. Res., 2023

Empowering Autonomous Driving with Large Language Models: A Safety Perspective.
CoRR, 2023

Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks.
CoRR, 2023

Learning Regularized Graphon Mean-Field Games with Unknown Graphons.
CoRR, 2023

Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks.
CoRR, 2023

Contextual Dynamic Pricing with Strategic Buyers.
CoRR, 2023

What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization.
CoRR, 2023

Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism.
CoRR, 2023

One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration.
CoRR, 2023

A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations.
CoRR, 2023

Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning.
CoRR, 2023

Partial Discharge Characteristics and Growth Stage Recognition of Electrical Tree in XLPE Insulation.
IEEE Access, 2023

The Sample Complexity of Online Contract Design.
Proceedings of the 24th ACM Conference on Economics and Computation, 2023

Online Performative Gradient Descent for Learning Nash Equilibria in Decision-Dependent Games.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning Regularized Monotone Graphon Mean-Field Games.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning.
Proceedings of the Learning for Dynamics and Control Conference, 2023

Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP.
Proceedings of the International Conference on Machine Learning, 2023

Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model.
Proceedings of the International Conference on Machine Learning, 2023

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments.
Proceedings of the International Conference on Machine Learning, 2023

Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Can We Find Nash Equilibria at a Linear Rate in Markov Games?
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning.
Proceedings of the ACM/IEEE 14th International Conference on Cyber-Physical Systems, 2023

Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models via Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
Offline Policy Optimization in RL with Variance Regularization.
CoRR, 2022

Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information.
CoRR, 2022

Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality.
CoRR, 2022

GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond.
CoRR, 2022

A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design.
CoRR, 2022

Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes.
CoRR, 2022

Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments.
CoRR, 2022

Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency.
CoRR, 2022

Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations.
CoRR, 2022

The Best of Both Worlds: Reinforcement Learning with Logarithmic Regret and Policy Switches.
CoRR, 2022

Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach.
CoRR, 2022

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning.
CoRR, 2022

Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning.
Proceedings of the EC '22: The 23rd ACM Conference on Economics and Computation, Boulder, CO, USA, July 11, 2022

Accelerate online reinforcement learning for building HVAC control with heterogeneous expert guidances.
Proceedings of the 9th ACM International Conference on Systems for Energy-Efficient Buildings, 2022

Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Unifying Framework of Off-Policy General Value Function Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Reinforcement Learning with Logarithmic Regret and Policy Switches.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Exponential Family Model-Based Reinforcement Learning via Score Matching.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets.
Proceedings of the International Conference on Machine Learning, 2022

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Learning from Demonstration: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation.
Proceedings of the International Conference on Machine Learning, 2022

Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy.
Proceedings of the International Conference on Machine Learning, 2022

Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes.
Proceedings of the International Conference on Machine Learning, 2022

Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation.
Proceedings of the International Conference on Machine Learning, 2022

Adaptive Model Design for Markov Decision Process.
Proceedings of the International Conference on Machine Learning, 2022

Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency.
Proceedings of the International Conference on Machine Learning, 2022

Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Towards General Function Approximation in Zero-Sum Markov Games.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Gap-Dependent Bounds for Two-Player Markov Games.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Finite-Sample Analysis for Decentralized Batch Multiagent Reinforcement Learning With Networked Agents.
IEEE Trans. Autom. Control., 2021

Decentralized multi-agent reinforcement learning with networked agents: recent advances.
Frontiers Inf. Technol. Electron. Eng., 2021

On Finite-Time Convergence of Actor-Critic Algorithm.
IEEE J. Sel. Areas Inf. Theory, 2021

Efficient and doubly-robust methods for variable selection and parameter estimation in longitudinal data analysis.
Comput. Stat., 2021

Generalized estimating equations for analyzing multivariate survival data.
Commun. Stat. Simul. Comput., 2021

Exponential Family Model-Based Reinforcement Learning via Score Matching.
CoRR, 2021

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?
CoRR, 2021

ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning.
CoRR, 2021

SCORE: Spurious COrrelation REduction for Offline Reinforcement Learning.
CoRR, 2021

Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs.
CoRR, 2021

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Finds Global Optima.
CoRR, 2021

Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation.
CoRR, 2021

Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning.
CoRR, 2021

A Unified Off-Policy Evaluation Approach for General Value Function.
CoRR, 2021

Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning.
CoRR, 2021

A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization.
CoRR, 2021

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

BooVI: Provably Efficient Bootstrapped Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Provably Sample Efficient Reinforcement Learning in Competitive Linear Quadratic Systems.
Proceedings of the 3rd Annual Conference on Learning for Dynamics and Control, 2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality.
Proceedings of the 38th International Conference on Machine Learning, 2021

Learning While Playing in Mean-Field Games: Convergence and Optimality.
Proceedings of the 38th International Conference on Machine Learning, 2021

Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time.
Proceedings of the 38th International Conference on Machine Learning, 2021

Reinforcement Learning for Cost-Aware Markov Decision Processes.
Proceedings of the 38th International Conference on Machine Learning, 2021

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game.
Proceedings of the 38th International Conference on Machine Learning, 2021

Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions.
Proceedings of the 38th International Conference on Machine Learning, 2021

Infinite-Dimensional Optimization for Zero-Sum Games via Variational Transport.
Proceedings of the 38th International Conference on Machine Learning, 2021

Is Pessimism Provably Efficient for Offline RL?
Proceedings of the 38th International Conference on Machine Learning, 2021

Randomized Exploration in Reinforcement Learning with General Value Function Approximation.
Proceedings of the 38th International Conference on Machine Learning, 2021

Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games.
Proceedings of the 38th International Conference on Machine Learning, 2021

Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach.
Proceedings of the 38th International Conference on Machine Learning, 2021

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy.
Proceedings of the 9th International Conference on Learning Representations, 2021

Provably Efficient Actor-Critic for Risk-Sensitive and Robust Adversarial RL: A Linear-Quadratic Case.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Sample Elicitation.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Provably Efficient Safe Exploration via Primal-Dual Policy Optimization.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
A Novel Model Integrating Deep Learning for Land Use/Cover Change Reconstruction: A Case Study of Zhenlai County, Northeast China.
Remote. Sens., 2020

Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy.
CoRR, 2020

Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization.
CoRR, 2020

Bridging Exploration and General Function Approximation in Reinforcement Learning: Provably Efficient Kernel and Neural Value Iterations.
CoRR, 2020

Provable Fictitious Play for General Mean-Field Games.
CoRR, 2020

Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning.
CoRR, 2020

Understanding Implicit Regularization in Over-Parameterized Nonlinear Statistical Model.
CoRR, 2020

A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic.
CoRR, 2020

Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach.
CoRR, 2020

Neural Certificates for Safe Control Policies.
CoRR, 2020

Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate.
CoRR, 2020

Upper Confidence Primal-Dual Optimization: Stochastically Constrained Markov Decision Processes with Adversarial Losses and Unknown Transitions.
CoRR, 2020

Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Neural GTD for Off-Policy Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Dynamic Regret of Policy Optimization in Non-Stationary Environments.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

A Theoretical Analysis of Deep Q-Learning.
Proceedings of the 2nd Annual Conference on Learning for Dynamics and Control, 2020

Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate.
Proceedings of the 37th International Conference on Machine Learning, 2020

Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

On the Global Optimality of Model-Agnostic Meta-Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Robust One-Bit Recovery via ReLU Generative Networks: Near-Optimal Statistical Rate and Global Landscape Analysis.
Proceedings of the 37th International Conference on Machine Learning, 2020

Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees.
Proceedings of the 37th International Conference on Machine Learning, 2020

Provably Efficient Exploration in Policy Optimization.
Proceedings of the 37th International Conference on Machine Learning, 2020

Neural Policy Gradient Methods: Global Optimality and Rates of Convergence.
Proceedings of the 8th International Conference on Learning Representations, 2020

Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games.
Proceedings of the 8th International Conference on Learning Representations, 2020

On Computation and Generalization of Generative Adversarial Imitation Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Misspecified nonconvex statistical optimization for sparse phase retrieval.
Math. Program., 2019

High-dimensional Varying Index Coefficient Models via Stein's Identity.
J. Mach. Learn. Res., 2019

Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator.
CoRR, 2019

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms.
CoRR, 2019

Credible Sample Elicitation by Deep Learning, for Deep Learning.
CoRR, 2019

Robust One-Bit Recovery via ReLU Generative Networks: Improved Statistical Rates and Global Landscape Analysis.
CoRR, 2019

Fast Multi-Agent Temporal-Difference Learning via Homotopy Stochastic Primal-Dual Optimization.
CoRR, 2019

On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost.
CoRR, 2019

Stochastic Convergence Results for Regularized Actor-Critic Methods.
CoRR, 2019

Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy.
CoRR, 2019

A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning.
CoRR, 2019

A Theoretical Analysis of Deep Q-Learning.
CoRR, 2019

Surface Charge Transport Characteristics of ZnO/Silicone Rubber Composites Under Impulse Superimposed on DC Voltage.
IEEE Access, 2019

Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Convergent Policy Optimization for Safe Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Statistical-Computational Tradeoff in Single Index Models.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Variance Reduced Policy Evaluation with Smooth Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural Temporal-Difference Learning Converges to Global Optima.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

On the statistical rate of nonlinear recovery in generative models with heavy-tailed data.
Proceedings of the 36th International Conference on Machine Learning, 2019

Research Character Analyzation of Urban Security Based on Urban Resilience Using Big Data Method.
Proceedings of the Big Data and Security - First International Conference, 2019

Learning Partially Observable Markov Decision Processes Using Coupled Canonical Polyadic Decomposition.
Proceedings of the IEEE Data Science Workshop, 2019

A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

Design of Single Channel Speech Separation System Based on Deep Clustering Model.
Proceedings of the 18th IEEE/ACIS International Conference on Computer and Information Science, 2019

2018
On Semiparametric Exponential Family Graphical Models.
J. Mach. Learn. Res., 2018

Finite-Sample Analyses for Fully Decentralized Multi-Agent Reinforcement Learning.
CoRR, 2018

Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space.
CoRR, 2018

Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval.
CoRR, 2018

Provable Gaussian Embedding with One Observation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Contrastive Learning from Pairwise Measurements.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents.
Proceedings of the 35th International Conference on Machine Learning, 2018

Networked Multi-Agent Reinforcement Learning in Continuous Spaces.
Proceedings of the 57th IEEE Conference on Decision and Control, 2018

A Finite Sample Analysis of the Actor-Critic Algorithm.
Proceedings of the 57th IEEE Conference on Decision and Control, 2018

Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
Misspecified Nonconvex Statistical Optimization for Phase Retrieval.
CoRR, 2017

Estimating High-dimensional Non-Gaussian Multiple Index Models via Stein's Lemma.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation.
Proceedings of the 34th International Conference on Machine Learning, 2017

2016
More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Sparse Nonlinear Regression: Parameter Estimation under Nonconvexity.
Proceedings of the 33rd International Conference on Machine Learning, 2016

2015
Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference.
CoRR, 2015

Human Memory Search as Initial-Visit Emitting Random Walk.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015
