Qingpeng Cai

ORCID: 0000-0001-6451-9299

Affiliations:
  • Kuaishou Technology, Beijing, China
  • Alibaba Group (former)
  • Tsinghua University, China (former)


According to our database, Qingpeng Cai authored at least 44 papers between 2016 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
DLCRec: A Novel Approach for Managing Diversity in LLM-Based Recommender Systems.
CoRR, 2024

Rectifying Reinforcement Learning for Reward Matching.
CoRR, 2024

Bifurcated Generative Flow Networks.
CoRR, 2024

M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework.
CoRR, 2024

Future Impact Decomposition in Request-level Recommendations.
CoRR, 2024

M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Sequential Recommendation for Optimizing Both Immediate Feedback and Long-term Retention.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

AgentIR: 1st Workshop on Agent-based Information Retrieval.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Future Impact Decomposition in Request-level Recommendations.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Modeling User Retention through Generative Flow Networks.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

2023
AdaRec: Adaptive Sequential Recommendation for Reinforcing Long-term User Engagement.
CoRR, 2023

A Large Language Model Enhanced Conversational Recommender System.
CoRR, 2023

Multi-Task Recommendations with Reinforcement Learning.
Proceedings of the ACM Web Conference 2023, 2023

Exploration and Regularization of the Latent Action Space in Recommendation.
Proceedings of the ACM Web Conference 2023, 2023

Two-Stage Constrained Actor-Critic for Short Video Recommendation.
Proceedings of the ACM Web Conference 2023, 2023

Reinforcing User Retention in a Billion Scale Short Video Recommender System.
Companion Proceedings of the ACM Web Conference 2023, 2023

KuaiSim: A Comprehensive Simulator for Recommender Systems.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

State Regularized Policy Optimization on Data with Dynamics Shift.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Generative Flow Network for Listwise Recommendation.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
PrefRec: Preference-based Recommender Systems for Reinforcing Long-term User Engagement.
CoRR, 2022

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor.
CoRR, 2022

Constrained Reinforcement Learning for Short Video Recommendation.
CoRR, 2022

2021
Exploration in policy optimization through multiple paths.
Auton. Agents Multi Agent Syst., 2021

2020
Generator and Critic: A Deep Reinforcement Learning Approach for Slate Re-ranking in E-commerce.
CoRR, 2020

Softmax Deep Double Deterministic Policy Gradients.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Reinforcement Learning with Dynamic Boltzmann Softmax Updates.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Multi-Path Policy Optimization.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Deterministic Value-Policy Gradients.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Reinforcement Learning Driven Heuristic Optimization.
CoRR, 2019

Reinforcement Learning with Dynamic Boltzmann Softmax Updates.
CoRR, 2019

Policy Gradients for Contextual Recommendations.
Proceedings of the World Wide Web Conference, 2019

Policy Optimization with Model-Based Explorations.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

A Deep Reinforcement Learning Framework for Rebalancing Dockless Bike Sharing Systems.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Generalized deterministic policy gradient algorithms.
CoRR, 2018

Rebalancing Dockless Bike Sharing Systems.
CoRR, 2018

Policy Gradients for Contextual Bandits.
CoRR, 2018

Reinforcement Mechanism Design for e-commerce.
Proceedings of the 2018 World Wide Web Conference on World Wide Web, 2018

Ranking Mechanism Design for Price-setting Agents in E-commerce.
Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Reinforcement Mechanism Design for Fraudulent Behaviour in e-Commerce.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Multi-armed Bandit Mechanism with Private Histories.
Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, 2017

2016
Mechanism Design for Personalized Recommender Systems.
Proceedings of the 10th ACM Conference on Recommender Systems, 2016

Facility Location with Minimax Envy.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

