Qinbo Bai

ORCID: 0000-0003-2933-1180

According to our database, Qinbo Bai authored at least 17 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms.
Found. Trends Optim., 2024

Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm.
CoRR, 2024

Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
A Reinforcement Learning Framework for Vehicular Network Routing Under Peak and Average Constraints.
IEEE Trans. Veh. Technol., May, 2023

Provably Sample-Efficient Model-Free Algorithm for MDPs with Peak Constraints.
J. Mach. Learn. Res., 2023

Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Concave Utility Reinforcement Learning with Zero-Constraint Violations.
Trans. Mach. Learn. Res., 2022

Joint Optimization of Concave Scalarized Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm.
J. Artif. Intell. Res., 2022

Regret guarantees for model-based reinforcement learning with long-term average constraints.
Proceedings of the Conference on Uncertainty in Artificial Intelligence, 2022

Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Markov Decision Processes with Long-Term Average Constraints.
CoRR, 2021

Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm.
CoRR, 2021

Reinforcement Learning for Constrained Markov Decision Processes.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Deep Learning-Based Channel Estimation Algorithm Over Time Selective Fading Channels.
IEEE Trans. Cogn. Commun. Netw., 2020

Model-Free Algorithm and Regret Analysis for MDPs with Peak Constraints.
CoRR, 2020

Escaping Saddle Points for Zeroth-order Non-convex Optimization using Estimated Gradient Descent.
Proceedings of the 54th Annual Conference on Information Sciences and Systems, 2020

2019
Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent.
CoRR, 2019

