Zhaoran Wang

ORCID: 0000-0002-1824-2580

Affiliations:
  • Northwestern University, Evanston, IL, USA


According to our database, Zhaoran Wang authored at least 226 papers between 2013 and 2024.

Bibliography

2024
Monotonic Quantile Network for Worst-Case Offline Reinforcement Learning.
IEEE Trans. Neural Networks Learn. Syst., July, 2024

Dynamic datasets and market environments for financial reinforcement learning.
Mach. Learn., May, 2024

False Correlation Reduction for Offline Reinforcement Learning.
IEEE Trans. Pattern Anal. Mach. Intell., February, 2024

Wardrop Equilibrium Can Be Boundedly Rational: A New Behavioral Theory of Route Choice.
Transp. Sci., 2024

Neural Temporal Difference and Q Learning Provably Converge to Global Optima.
Math. Oper. Res., 2024

Language-Model-Assisted Bi-Level Programming for Reward Learning from Internet Videos.
CoRR, 2024

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs.
CoRR, 2024

Just say what you want: only-prompting self-rewarding online preference optimization.
CoRR, 2024

Safe MPC Alignment with Human Directional Feedback.
CoRR, 2024

Toward Optimal LLM Alignments Using Two-Player Games.
CoRR, 2024

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment.
CoRR, 2024

Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer.
CoRR, 2024

A Mean-Field Analysis of Neural Gradient Descent-Ascent: Applications to Functional Conditional Moment Equations.
CoRR, 2024

Advancing Object Goal Navigation Through LLM-enhanced Object Affinities Transfer.
CoRR, 2024

Can Large Language Models Play Games? A Case Study of A Self-Play Approach.
CoRR, 2024

How Can LLM Guide RL? A Value-Based Approach.
CoRR, 2024

Human-Instruction-Free LLM Self-Alignment with Limited Samples.
CoRR, 2024

How Does Goal Relabeling Improve Sample Efficiency?
Proceedings of the Forty-first International Conference on Machine Learning, 2024

A General Framework for Sequential Decision-Making under Adaptivity Constraints.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Sample-Efficient Multi-Agent RL: An Optimization Perspective.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Let Models Speak Ciphers: Multiagent Debate through Embeddings.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning.
IEEE Trans. Neural Networks Learn. Syst., August, 2023

Provably Efficient Reinforcement Learning with Linear Function Approximation.
Math. Oper. Res., August, 2023

A Two-Timescale Stochastic Algorithm Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic.
SIAM J. Optim., March, 2023

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium.
Math. Oper. Res., February, 2023

Addressing Hindsight Bias in Multigoal Reinforcement Learning.
IEEE Trans. Cybern., 2023

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning.
J. Mach. Learn. Res., 2023

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers?
J. Mach. Learn. Res., 2023

Empowering Autonomous Driving with Large Language Models: A Safety Perspective.
CoRR, 2023

Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks.
CoRR, 2023

A Principled Framework for Knowledge-enhanced Large Language Model.
CoRR, 2023

Learning Regularized Graphon Mean-Field Games with Unknown Graphons.
CoRR, 2023

Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency.
CoRR, 2023

Contextual Dynamic Pricing with Strategic Buyers.
CoRR, 2023

What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization.
CoRR, 2023

One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration.
CoRR, 2023

A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations.
CoRR, 2023

Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning.
CoRR, 2023

Learning Regularized Monotone Graphon Mean-Field Games.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning.
Proceedings of the Learning for Dynamics and Control Conference, 2023

Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics.
Proceedings of the International Conference on Machine Learning, 2023

Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints.
Proceedings of the International Conference on Machine Learning, 2023

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments.
Proceedings of the International Conference on Machine Learning, 2023

Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Latent Variable Representation for Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning.
Proceedings of the ACM/IEEE 14th International Conference on Cyber-Physical Systems, 2023

Differentiable Arbitrating in Zero-sum Markov Games.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models via Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models.
CoRR, 2022

Offline Policy Optimization in RL with Variance Regularizaton.
CoRR, 2022

Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information.
CoRR, 2022

Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality.
CoRR, 2022

GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond.
CoRR, 2022

A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design.
CoRR, 2022

Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes.
CoRR, 2022

Differentiable Bilevel Programming for Stackelberg Congestion Games.
CoRR, 2022

Federated Offline Reinforcement Learning.
CoRR, 2022

Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency.
CoRR, 2022

Sample-Efficient Reinforcement Learning for POMDPs with Linear Function Approximations.
CoRR, 2022

Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach.
CoRR, 2022

Joint Differentiable Optimization and Verification for Certified Reinforcement Learning.
CoRR, 2022

Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning.
Proceedings of the EC '22: The 23rd ACM Conference on Economics and Computation, Boulder, CO, USA, July 11, 2022

Accelerate online reinforcement learning for building HVAC control with heterogeneous expert guidances.
Proceedings of the 9th ACM International Conference on Systems for Energy-Efficient Buildings, 2022

Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

RORL: Robust Offline Reinforcement Learning via Conservative Smoothing.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Unifying Framework of Off-Policy General Value Function Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Exponential Family Model-Based Reinforcement Learning via Score Matching.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets.
Proceedings of the International Conference on Machine Learning, 2022

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Learning from Demonstration: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation.
Proceedings of the International Conference on Machine Learning, 2022

Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy.
Proceedings of the International Conference on Machine Learning, 2022

Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes.
Proceedings of the International Conference on Machine Learning, 2022

Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation.
Proceedings of the International Conference on Machine Learning, 2022

Adaptive Model Design for Markov Decision Process.
Proceedings of the International Conference on Machine Learning, 2022

Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency.
Proceedings of the International Conference on Machine Learning, 2022

Towards General Function Approximation in Zero-Sum Markov Games.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Design-while-verify: correct-by-construction control learning with verification in the loop.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Gap-Dependent Bounds for Two-Player Markov Games.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Spectrum Truncation Power Iteration for Agnostic Matrix Phase Retrieval.
IEEE Trans. Signal Process., 2021

On Finite-Time Convergence of Actor-Critic Algorithm.
IEEE J. Sel. Areas Inf. Theory, 2021

Exponential Family Model-Based Reinforcement Learning via Score Matching.
CoRR, 2021

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?
CoRR, 2021

FinRL-Meta: A Universe of Near-Real Market Environments for Data-Driven Deep Reinforcement Learning in Quantitative Finance.
CoRR, 2021

ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning.
CoRR, 2021

SCORE: Spurious COrrelation REduction for Offline Reinforcement Learning.
CoRR, 2021

Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs.
CoRR, 2021

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Finds Global Optima.
CoRR, 2021

Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation.
CoRR, 2021

Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning.
CoRR, 2021

A Unified Off-Policy Evaluation Approach for General Value Function.
CoRR, 2021

Verification in the Loop: Correct-by-Construction Control Learning with Reach-avoid Guarantees.
CoRR, 2021

Permutation Invariant Policy Optimization for Mean-Field Multi-Agent Reinforcement Learning: A Principled Approach.
CoRR, 2021

Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning.
CoRR, 2021

A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization.
CoRR, 2021

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

BooVI: Provably Efficient Bootstrapped Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Dynamic Bottleneck for Robust Self-Supervised Exploration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Provably Sample Efficient Reinforcement Learning in Competitive Linear Quadratic Systems.
Proceedings of the 3rd Annual Conference on Learning for Dynamics and Control, 2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality.
Proceedings of the 38th International Conference on Machine Learning, 2021

Learning While Playing in Mean-Field Games: Convergence and Optimality.
Proceedings of the 38th International Conference on Machine Learning, 2021

Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time.
Proceedings of the 38th International Conference on Machine Learning, 2021

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game.
Proceedings of the 38th International Conference on Machine Learning, 2021

Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions.
Proceedings of the 38th International Conference on Machine Learning, 2021

Infinite-Dimensional Optimization for Zero-Sum Games via Variational Transport.
Proceedings of the 38th International Conference on Machine Learning, 2021

Is Pessimism Provably Efficient for Offline RL?
Proceedings of the 38th International Conference on Machine Learning, 2021

Randomized Exploration in Reinforcement Learning with General Value Function Approximation.
Proceedings of the 38th International Conference on Machine Learning, 2021

Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games.
Proceedings of the 38th International Conference on Machine Learning, 2021

Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach.
Proceedings of the 38th International Conference on Machine Learning, 2021

Principled Exploration via Optimistic Bootstrapping and Backward Induction.
Proceedings of the 38th International Conference on Machine Learning, 2021

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy.
Proceedings of the 9th International Conference on Learning Representations, 2021

FinRL-podracer: high performance and scalable deep reinforcement learning for quantitative finance.
Proceedings of the ICAIF'21: 2nd ACM International Conference on AI in Finance, Virtual Event, November 3, 2021

Cocktail: Learn a Better Neural Network Controller from Multiple Experts via Adaptive Mixing and Robust Distillation.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

Provably Efficient Actor-Critic for Risk-Sensitive and Robust Adversarial RL: A Linear-Quadratic Case.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Sample Elicitation.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Provably Efficient Safe Exploration via Primal-Dual Policy Optimization.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Tensor Graphical Model: Non-Convex Optimization and Statistical Inference.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Agnostic Estimation for Phase Retrieval.
J. Mach. Learn. Res., 2020

Provably Training Neural Network Classifiers under Fairness Constraints.
CoRR, 2020

Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy.
CoRR, 2020

Variational Transport: A Convergent Particle-Based Algorithm for Distributional Optimization.
CoRR, 2020

Bridging Exploration and General Function Approximation in Reinforcement Learning: Provably Efficient Kernel and Neural Value Iterations.
CoRR, 2020

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning.
CoRR, 2020

Provable Fictitious Play for General Mean-Field Games.
CoRR, 2020

Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection.
CoRR, 2020

Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning.
CoRR, 2020

A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic.
CoRR, 2020

Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach.
CoRR, 2020

Neural Certificates for Safe Control Policies.
CoRR, 2020

Deep Reinforcement Learning with Smooth Policy.
CoRR, 2020

Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate.
CoRR, 2020

Upper Confidence Primal-Dual Optimization: Stochastically Constrained Markov Decision Processes with Adversarial Losses and Unknown Transitions.
CoRR, 2020

Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Neural GTD for Off-Policy Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

End-to-End Learning and Intervention in Games.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Dynamic Regret of Policy Optimization in Non-Stationary Environments.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

A Theoretical Analysis of Deep Q-Learning.
Proceedings of the 2nd Annual Conference on Learning for Dynamics and Control, 2020

Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate.
Proceedings of the 37th International Conference on Machine Learning, 2020

Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

On the Global Optimality of Model-Agnostic Meta-Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Deep Reinforcement Learning with Robust and Smooth Policy.
Proceedings of the 37th International Conference on Machine Learning, 2020

Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees.
Proceedings of the 37th International Conference on Machine Learning, 2020

Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model.
Proceedings of the 37th International Conference on Machine Learning, 2020

Provably Efficient Exploration in Policy Optimization.
Proceedings of the 37th International Conference on Machine Learning, 2020

Neural Policy Gradient Methods: Global Optimality and Rates of Convergence.
Proceedings of the 8th International Conference on Learning Representations, 2020

Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games.
Proceedings of the 8th International Conference on Learning Representations, 2020

On Computation and Generalization of Generative Adversarial Imitation Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Symmetry, Saddle Points, and Global Optimization Landscape of Nonconvex Matrix Factorization.
IEEE Trans. Inf. Theory, 2019

Misspecified nonconvex statistical optimization for sparse phase retrieval.
Math. Program., 2019

High-dimensional Varying Index Coefficient Models via Stein's Identity.
J. Mach. Learn. Res., 2019

Natural Actor-Critic Converges Globally for Hierarchical Linear Quadratic Regulator.
CoRR, 2019

Credible Sample Elicitation by Deep Learning, for Deep Learning.
CoRR, 2019

Fast Multi-Agent Temporal-Difference Learning via Homotopy Stochastic Primal-Dual Optimization.
CoRR, 2019

On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost.
CoRR, 2019

Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy.
CoRR, 2019

A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning.
CoRR, 2019

On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator.
CoRR, 2019

A Theoretical Analysis of Deep Q-Learning.
CoRR, 2019

Convergent Policy Optimization for Safe Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Statistical-Computational Tradeoff in Single Index Models.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Variance Reduced Policy Evaluation with Smooth Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural Temporal-Difference Learning Converges to Global Optima.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

On the statistical rate of nonlinear recovery in generative models with heavy-tailed data.
Proceedings of the 36th International Conference on Machine Learning, 2019

Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy.
Proceedings of the 7th International Conference on Learning Representations, 2019

Accelerating Nonconvex Learning via Replica Exchange Langevin diffusion.
Proceedings of the 7th International Conference on Learning Representations, 2019

Learning Partially Observable Markov Decision Processes Using Coupled Canonical Polyadic Decomposition.
Proceedings of the IEEE Data Science Workshop, 2019

A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

2018
Nonconvex Statistical Optimization
PhD thesis, 2018

A convex formulation for high-dimensional sparse sliced inverse regression.
CoRR, 2018

Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval.
CoRR, 2018

On Tighter Generalization Bound for Deep Neural Networks: CNNs, ResNets, and Beyond.
CoRR, 2018

Detecting Nonlinear Causality in Multivariate Time Series with Sparse Additive Models.
CoRR, 2018

Provable Gaussian Embedding with One Observation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Contrastive Learning from Pairwise Measurements.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference.
Proceedings of the 35th International Conference on Machine Learning, 2018

Dynamic Truth Discovery on Numerical Data.
Proceedings of the IEEE International Conference on Data Mining, 2018

Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

2017
Misspecified Nonconvex Statistical Optimization for Phase Retrieval.
CoRR, 2017

Estimating High-dimensional Non-Gaussian Multiple Index Models via Stein's Lemma.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

2016
Symmetry, Saddle Points, and Global Geometry of Nonconvex Matrix Factorization.
CoRR, 2016

More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Agnostic Estimation for Misspecified Phase Retrieval Models.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Blind Attacks on Machine Learners.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

A Truth Discovery Approach with Theoretical Guarantee.
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

Sparse Nonlinear Regression: Parameter Estimation under Nonconvexity.
Proceedings of the 33rd International Conference on Machine Learning, 2016

On the Statistical Limits of Convex Relaxations.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Low-Rank and Sparse Structure Pursuit via Alternating Minimization.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

2015
Sparse Nonlinear Regression: Parameter Estimation and Asymptotic Inference.
CoRR, 2015

A Nonconvex Optimization Framework for Low Rank Matrix Estimation.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Optimal Linear Estimation under Unknown Nonlinear Transform.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Non-convex Statistical Optimization for Sparse Tensor Graphical Model.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

2014
A Strictly Contractive Peaceman-Rachford Splitting Method for Convex Programming.
SIAM J. Optim., 2014

Nonconvex Statistical Optimization: Minimax-Optimal Sparse PCA in Polynomial Time.
CoRR, 2014

Tighten after Relax: Minimax-Optimal Sparse PCA in Polynomial Time.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Sparse PCA with Oracle Property.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013
Sparse Principal Component Analysis for High Dimensional Multivariate Time Series.
Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, 2013
