Zhaoran Wang

Orcid: 0000-0002-1824-2580

Affiliations:
  • Northwestern University, Evanston, IL, USA


According to our database, Zhaoran Wang authored at least 227 papers between 2013 and 2024.

Bibliography

2024
Monotonic Quantile Network for Worst-Case Offline Reinforcement Learning.
IEEE Trans. Neural Networks Learn. Syst., July, 2024

Dynamic datasets and market environments for financial reinforcement learning.
Mach. Learn., May, 2024

False Correlation Reduction for Offline Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., February, 2024

Wardrop Equilibrium Can Be Boundedly Rational: A New Behavioral Theory of Route Choice.
Transp. Sci., 2024

Neural Temporal Difference and Q Learning Provably Converge to Global Optima.
Math. Oper. Res., 2024

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs.
CoRR, 2024

Just say what you want: only-prompting self-rewarding online preference optimization.
CoRR, 2024

Safe MPC Alignment with Human Directional Feedback.
CoRR, 2024

Toward Optimal LLM Alignments Using Two-Player Games.
CoRR, 2024

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment.
CoRR, 2024

Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer.
CoRR, 2024

A Mean-Field Analysis of Neural Gradient Descent-Ascent: Applications to Functional Conditional Moment Equations.
CoRR, 2024

Advancing Object Goal Navigation Through LLM-enhanced Object Affinities Transfer.
CoRR, 2024

Can Large Language Models Play Games? A Case Study of A Self-Play Approach.
CoRR, 2024

How Can LLM Guide RL? A Value-Based Approach.
CoRR, 2024

Human-Instruction-Free LLM Self-Alignment with Limited Samples.
CoRR, 2024

How Does Goal Relabeling Improve Sample Efficiency?
Proceedings of the Forty-first International Conference on Machine Learning, 2024

A General Framework for Sequential Decision-Making under Adaptivity Constraints.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Sample-Efficient Multi-Agent RL: An Optimization Perspective.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Let Models Speak Ciphers: Multiagent Debate through Embeddings.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning.
IEEE Trans. Neural Networks Learn. Syst., August, 2023

Provably Efficient Reinforcement Learning with Linear Function Approximation.
Math. Oper. Res., August, 2023

A Two-Timescale Stochastic Algorithm Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic.
SIAM J. Optim., March, 2023

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium.
Math. Oper. Res., February, 2023

Addressing Hindsight Bias in Multigoal Reinforcement Learning.
IEEE Trans. Cybern., 2023

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning.
J. Mach. Learn. Res., 2023

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers?
J. Mach. Learn. Res., 2023

Empowering Autonomous Driving with Large Language Models: A Safety Perspective.
CoRR, 2023

Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks.
CoRR, 2023

A Principled Framework for Knowledge-enhanced Large Language Model.
CoRR, 2023

Learning Regularized Graphon Mean-Field Games with Unknown Graphons.
CoRR, 2023

Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency.
CoRR, 2023

Contextual Dynamic Pricing with Strategic Buyers.
CoRR, 2023

What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization.
CoRR, 2023

One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration.
CoRR, 2023

Wardrop Equilibrium Can Be Boundedly Rational: A New Behavioral Theory of Route Choice.
CoRR, 2023

A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations.
CoRR, 2023

Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning.
CoRR, 2023

Achieving Hierarchy-Free Approximation for Bilevel Programs With Equilibrium Constraints.
CoRR, 2023

Learning Regularized Monotone Graphon Mean-Field Games.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning.
Proceedings of the Learning for Dynamics and Control Conference, 2023

Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics.
Proceedings of the International Conference on Machine Learning, 2023

Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints.
Proceedings of the International Conference on Machine Learning, 2023

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments.
Proceedings of the International Conference on Machine Learning, 2023

Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Latent Variable Representation for Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning.
Proceedings of the ACM/IEEE 14th International Conference on Cyber-Physical Systems, 2023

Differentiable Arbitrating in Zero-sum Markov Games.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models via Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models.
CoRR, 2022

Offline Policy Optimization in RL with Variance Regularizaton.
CoRR, 2022

Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information.
CoRR, 2022

Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality.
CoRR, 2022

GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond.
CoRR, 2022

A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design.
CoRR, 2022

Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes.
CoRR, 2022

Differentiable Bilevel Programming for Stackelberg Congestion Games.
CoRR, 2022

Federated Offline Reinforcement Learning.
CoRR, 2022

Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency.
CoRR, 2022

Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations.
CoRR, 2022

Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach.
CoRR, 2022

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning.
CoRR, 2022

Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning.
Proceedings of the EC '22: The 23rd ACM Conference on Economics and Computation, Boulder, CO, USA, July 11, 2022

Accelerate online reinforcement learning for building HVAC control with heterogeneous expert guidances.
Proceedings of the 9th ACM International Conference on Systems for Energy-Efficient Buildings, 2022

Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

RORL: Robust Offline Reinforcement Learning via Conservative Smoothing.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Unifying Framework of Off-Policy General Value Function Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Exponential Family Model-Based Reinforcement Learning via Score Matching.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets.
Proceedings of the International Conference on Machine Learning, 2022

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Learning from Demonstration: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation.
Proceedings of the International Conference on Machine Learning, 2022

Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy.
Proceedings of the International Conference on Machine Learning, 2022

Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes.
Proceedings of the International Conference on Machine Learning, 2022

Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation.
Proceedings of the International Conference on Machine Learning, 2022

Adaptive Model Design for Markov Decision Process.
Proceedings of the International Conference on Machine Learning, 2022

Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency.
Proceedings of the International Conference on Machine Learning, 2022

Towards General Function Approximation in Zero-Sum Markov Games.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Design-while-verify: correct-by-construction control learning with verification in the loop.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Gap-Dependent Bounds for Two-Player Markov Games.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Spectrum Truncation Power Iteration for Agnostic Matrix Phase Retrieval.
IEEE Trans. Signal Process., 2021

On Finite-Time Convergence of Actor-Critic Algorithm.
IEEE J. Sel. Areas Inf. Theory, 2021

Exponential Family Model-Based Reinforcement Learning via Score Matching.
CoRR, 2021

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?
CoRR, 2021

FinRL-Meta: A Universe of Near-Real Market Environments for Data-Driven Deep Reinforcement Learning in Quantitative Finance.
CoRR, 2021

ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning.
CoRR, 2021

SCORE: Spurious COrrelation REduction for Offline Reinforcement Learning.
CoRR, 2021

Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs.
CoRR, 2021

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Finds Global Optima.
CoRR, 2021

Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation.
CoRR, 2021

Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning.
CoRR, 2021

A Unified Off-Policy Evaluation Approach for General Value Function.
CoRR, 2021

Verification in the Loop: Correct-by-Construction Control Learning with Reach-avoid Guarantees.
CoRR, 2021

Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach.
CoRR, 2021

Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning.
CoRR, 2021

A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization.
CoRR, 2021

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

BooVI: Provably Efficient Bootstrapped Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Dynamic Bottleneck for Robust Self-Supervised Exploration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Provably Sample Efficient Reinforcement Learning in Competitive Linear Quadratic Systems.
Proceedings of the 3rd Annual Conference on Learning for Dynamics and Control, 2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality.
Proceedings of the 38th International Conference on Machine Learning, 2021

Learning While Playing in Mean-Field Games: Convergence and Optimality.
Proceedings of the 38th International Conference on Machine Learning, 2021

Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time.
Proceedings of the 38th International Conference on Machine Learning, 2021

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game.
Proceedings of the 38th International Conference on Machine Learning, 2021

Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions.
Proceedings of the 38th International Conference on Machine Learning, 2021

Infinite-Dimensional Optimization for Zero-Sum Games via Variational Transport.
Proceedings of the 38th International Conference on Machine Learning, 2021

Is Pessimism Provably Efficient for Offline RL?
Proceedings of the 38th International Conference on Machine Learning, 2021

Randomized Exploration in Reinforcement Learning with General Value Function Approximation.
Proceedings of the 38th International Conference on Machine Learning, 2021

Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games.
Proceedings of the 38th International Conference on Machine Learning, 2021

Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach.
Proceedings of the 38th International Conference on Machine Learning, 2021

Principled Exploration via Optimistic Bootstrapping and Backward Induction.
Proceedings of the 38th International Conference on Machine Learning, 2021

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy.
Proceedings of the 9th International Conference on Learning Representations, 2021

FinRL-podracer: high performance and scalable deep reinforcement learning for quantitative finance.
Proceedings of the ICAIF'21: 2nd ACM International Conference on AI in Finance, Virtual Event, November 3, 2021

Cocktail: Learn a Better Neural Network Controller from Multiple Experts via Adaptive Mixing and Robust Distillation.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

Provably Efficient Actor-Critic for Risk-Sensitive and Robust Adversarial RL: A Linear-Quadratic Case.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Sample Elicitation.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Provably Efficient Safe Exploration via Primal-Dual Policy Optimization.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Tensor Graphical Model: Non-Convex Optimization and Statistical Inference.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Agnostic Estimation for Phase Retrieval.
J. Mach. Learn. Res., 2020

Provably Training Neural Network Classifiers under Fairness Constraints.
CoRR, 2020

Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy.
CoRR, 2020

Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization.
CoRR, 2020

Bridging Exploration and General Function Approximation in Reinforcement Learning: Provably Efficient Kernel and Neural Value Iterations.
CoRR, 2020

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning.
CoRR, 2020

Provable Fictitious Play for General Mean-Field Games.
CoRR, 2020

Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection.
CoRR, 2020

Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning.
CoRR, 2020

A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic.
CoRR, 2020

Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach.
CoRR, 2020

Neural Certificates for Safe Control Policies.
CoRR, 2020

Deep Reinforcement Learning with Smooth Policy.
CoRR, 2020

Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate.
CoRR, 2020

Upper Confidence Primal-Dual Optimization: Stochastically Constrained Markov Decision Processes with Adversarial Losses and Unknown Transitions.
CoRR, 2020

Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Neural GTD for Off-Policy Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

End-to-End Learning and Intervention in Games.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Dynamic Regret of Policy Optimization in Non-Stationary Environments.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

A Theoretical Analysis of Deep Q-Learning.
Proceedings of the 2nd Annual Conference on Learning for Dynamics and Control, 2020

Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate.
Proceedings of the 37th International Conference on Machine Learning, 2020

Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

On the Global Optimality of Model-Agnostic Meta-Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Deep Reinforcement Learning with Robust and Smooth Policy.
Proceedings of the 37th International Conference on Machine Learning, 2020

Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees.
Proceedings of the 37th International Conference on Machine Learning, 2020

Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model.
Proceedings of the 37th International Conference on Machine Learning, 2020

Provably Efficient Exploration in Policy Optimization.
Proceedings of the 37th International Conference on Machine Learning, 2020

Neural Policy Gradient Methods: Global Optimality and Rates of Convergence.
Proceedings of the 8th International Conference on Learning Representations, 2020

Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games.
Proceedings of the 8th International Conference on Learning Representations, 2020

On Computation and Generalization of Generative Adversarial Imitation Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Symmetry, Saddle Points, and Global Optimization Landscape of Nonconvex Matrix Factorization.
IEEE Trans. Inf. Theory, 2019

Misspecified nonconvex statistical optimization for sparse phase retrieval.
Math. Program., 2019

High-dimensional Varying Index Coefficient Models via Stein's Identity.
J. Mach. Learn. Res., 2019

Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator.
CoRR, 2019

Credible Sample Elicitation by Deep Learning, for Deep Learning.
CoRR, 2019

Fast Multi-Agent Temporal-Difference Learning via Homotopy Stochastic Primal-Dual Optimization.
CoRR, 2019

On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost.
CoRR, 2019

Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy.
CoRR, 2019

A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning.
CoRR, 2019

On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator.
CoRR, 2019

A Theoretical Analysis of Deep Q-Learning.
CoRR, 2019

Convergent Policy Optimization for Safe Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Statistical-Computational Tradeoff in Single Index Models.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Variance Reduced Policy Evaluation with Smooth Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural Temporal-Difference Learning Converges to Global Optima.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

On the statistical rate of nonlinear recovery in generative models with heavy-tailed data.
Proceedings of the 36th International Conference on Machine Learning, 2019

Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy.
Proceedings of the 7th International Conference on Learning Representations, 2019

Accelerating Nonconvex Learning via Replica Exchange Langevin diffusion.
Proceedings of the 7th International Conference on Learning Representations, 2019

Learning Partially Observable Markov Decision Processes Using Coupled Canonical Polyadic Decomposition.
Proceedings of the IEEE Data Science Workshop, 2019

A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

2018
Nonconvex Statistical Optimization.
PhD thesis, 2018

A convex formulation for high-dimensional sparse sliced inverse regression.
CoRR, 2018

Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval.
CoRR, 2018

On Tighter Generalization Bound for Deep Neural Networks: CNNs, ResNets, and Beyond.
CoRR, 2018

Detecting Nonlinear Causality in Multivariate Time Series with Sparse Additive Models.
CoRR, 2018

Provable Gaussian Embedding with One Observation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Contrastive Learning from Pairwise Measurements.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference.
Proceedings of the 35th International Conference on Machine Learning, 2018

Dynamic Truth Discovery on Numerical Data.
Proceedings of the IEEE International Conference on Data Mining, 2018

Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
Misspecified Nonconvex Statistical Optimization for Phase Retrieval.
CoRR, 2017

Estimating High-dimensional Non-Gaussian Multiple Index Models via Stein's Lemma.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

2016
Symmetry, Saddle Points, and Global Geometry of Nonconvex Matrix Factorization.
CoRR, 2016

More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Agnostic Estimation for Misspecified Phase Retrieval Models.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Blind Attacks on Machine Learners.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

A Truth Discovery Approach with Theoretical Guarantee.
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

Sparse Nonlinear Regression: Parameter Estimation under Nonconvexity.
Proceedings of the 33rd International Conference on Machine Learning, 2016

On the Statistical Limits of Convex Relaxations.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Low-Rank and Sparse Structure Pursuit via Alternating Minimization.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

2015
Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference.
CoRR, 2015

A Nonconvex Optimization Framework for Low Rank Matrix Estimation.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Optimal Linear Estimation under Unknown Nonlinear Transform.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Non-convex Statistical Optimization for Sparse Tensor Graphical Model.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

2014
A Strictly Contractive Peaceman-Rachford Splitting Method for Convex Programming.
SIAM J. Optim., 2014

Nonconvex Statistical Optimization: Minimax-Optimal Sparse PCA in Polynomial Time.
CoRR, 2014

Tighten after Relax: Minimax-Optimal Sparse PCA in Polynomial Time.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Sparse PCA with Oracle Property.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
Sparse Principal Component Analysis for High Dimensional Multivariate Time Series.
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, 2013

