Tengyu Xu

According to our database, Tengyu Xu authored at least 27 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2024
Provably Efficient Offline Reinforcement Learning With Trajectory-Wise Reward.
IEEE Trans. Inf. Theory, September 2024

Faster algorithm and sharper analysis for constrained Markov decision process.
Oper. Res. Lett., 2024

Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following.
CoRR, 2024

The Perfect Blend: Redefining RLHF with Mixture of Judges.
CoRR, 2024

2023
Constraint-based multi-agent reinforcement learning for collaborative tasks.
Comput. Animat. Virtual Worlds, 2023

2022
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward.
CoRR, 2022

Deterministic policy gradient: Convergence analysis.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

A Unifying Framework of Off-Policy General Value Function Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Model-Based Offline Meta-Reinforcement Learning with Regularization.
Proceedings of the Tenth International Conference on Learning Representations, 2022

PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
A Unified Off-Policy Evaluation Approach for General Value Function.
CoRR, 2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality.
Proceedings of the 38th International Conference on Machine Learning, 2021

CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee.
Proceedings of the 38th International Conference on Machine Learning, 2021

Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry.
Proceedings of the 9th International Conference on Learning Representations, 2021

Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
A Primal Approach to Constrained Policy Optimization: Global Optimality and Finite-Time Analysis.
CoRR, 2020

Enhanced First and Zeroth Order Variance Reduced Algorithms for Min-Max Optimization.
CoRR, 2020

Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms.
CoRR, 2020

Improving Sample Complexity Bounds for Actor-Critic Algorithms.
CoRR, 2020

Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Reanalysis of Variance Reduced Temporal Difference Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Finite-Sample Analysis for SARSA and Q-Learning with Linear Function Approximation.
CoRR, 2019

Finite-Sample Analysis for SARSA with Linear Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2018
Convergence of SGD in Learning ReLU Models with Separable Data.
CoRR, 2018
