Yufei Zhang
ORCID: 0000-0001-9843-1404
Affiliations:
- Imperial College London, Department of Mathematics, Westminster, UK
- University of Oxford, Mathematical Institute, UK
- London School of Economics and Political Science, Department of Statistics, UK (2021-2023)
According to our database, Yufei Zhang authored at least 27 papers between 2019 and 2024.
Timeline: chart of publications per year, 2019 to 2024, by type (book, in proceedings, article, PhD thesis, dataset, other).
Bibliography
2024
Optimal Scheduling of Entropy Regularizer for Continuous-Time Linear-Quadratic Reinforcement Learning.
SIAM J. Control. Optim., February, 2024
A Fast Iterative PDE-Based Algorithm for Feedback Controls of Nonsmooth Mean-Field Control Problems.
SIAM J. Sci. Comput., 2024
Convergence of Policy Gradient Methods for Finite-Horizon Exploratory Linear-Quadratic Control Problems.
SIAM J. Control. Optim., 2024
CoRR, 2024
2023
Linear Convergence of a Policy Gradient Method for Some Finite Horizon Continuous Time Control Problems.
SIAM J. Control. Optim., December, 2023
Reinforcement Learning for Linear-Convex Models with Jumps via Stability Analysis of Feedback Controls.
SIAM J. Control. Optim., April, 2023
A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces.
CoRR, 2023
CoRR, 2023
CoRR, 2023
2022
Logarithmic Regret for Episodic Continuous-Time Linear-Quadratic Reinforcement Learning over a Finite-Time Horizon.
J. Mach. Learn. Res., 2022
Convergence of policy gradient methods for finite-horizon stochastic linear-quadratic control problems.
CoRR, 2022
Optimal scheduling of entropy regulariser for continuous-time linear-quadratic reinforcement learning.
CoRR, 2022
Linear convergence of a policy gradient method for finite horizon continuous time stochastic control problems.
CoRR, 2022
2021
A Neural Network-Based Policy Iteration Algorithm with Global H²-Superlinear Convergence for Stochastic Games on Domains.
Found. Comput. Math., 2021
Exploration-exploitation trade-off for continuous-time episodic reinforcement learning with linear-convex models.
CoRR, 2021
A penalty scheme and policy iteration for nonlocal HJB variational inequalities with monotone nonlinearities.
Comput. Math. Appl., 2021
2020
Error Estimates of Penalty Schemes for Quasi-Variational Inequalities Arising from Impulse Control Problems.
SIAM J. Control. Optim., 2020
Regularity and time discretization of extended mean field control problems: a McKean-Vlasov FBSDE approach.
CoRR, 2020
CoRR, 2020
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
2019
A Penalty Scheme for Monotone Systems with Interconnected Obstacles: Convergence and Error Estimates.
SIAM J. Numer. Anal., 2019
Rectified deep neural networks overcome the curse of dimensionality for nonsmooth value functions in zero-sum games of nonlinear stiff systems.
CoRR, 2019