Qingpeng Cai

Orcid: 0000-0001-6451-9299

Affiliations:

Kuaishou Technology, Beijing, China
Alibaba Group (former)
Tsinghua University, China (former)

According to our database, Qingpeng Cai authored at least 46 papers between 2016 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

GAS: Generative Auto-bidding with Post-training Search.

CoRR, 2024

LLM-Powered User Simulator for Recommender System.

CoRR, 2024

DLCRec: A Novel Approach for Managing Diversity in LLM-Based Recommender Systems.

CoRR, 2024

Rectifying Reinforcement Learning for Reward Matching.

Emmanuel Bengio

CoRR, 2024

Bifurcated Generative Flow Networks.

CoRR, 2024

M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework.

CoRR, 2024

Future Impact Decomposition in Request-level Recommendations.

CoRR, 2024

M³oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework.

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Sequential Recommendation for Optimizing Both Immediate Feedback and Long-term Retention.

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

AgentIR: 1st Workshop on Agent-based Information Retrieval.

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Future Impact Decomposition in Request-level Recommendations.

Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Modeling User Retention through Generative Flow Networks.

Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

2023

AdaRec: Adaptive Sequential Recommendation for Reinforcing Long-term User Engagement.

CoRR, 2023

A Large Language Model Enhanced Conversational Recommender System.

CoRR, 2023

Multi-Task Recommendations with Reinforcement Learning.

Proceedings of the ACM Web Conference 2023, 2023

Exploration and Regularization of the Latent Action Space in Recommendation.

Proceedings of the ACM Web Conference 2023, 2023

Two-Stage Constrained Actor-Critic for Short Video Recommendation.

Proceedings of the ACM Web Conference 2023, 2023

Reinforcing User Retention in a Billion Scale Short Video Recommender System.

Companion Proceedings of the ACM Web Conference 2023, 2023

KuaiSim: A Comprehensive Simulator for Recommender Systems.

Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

State Regularized Policy Optimization on Data with Dynamics Shift.

Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement.

Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Generative Flow Network for Listwise Recommendation.

Julian J. McAuley

Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

PrefRec: Preference-based Recommender Systems for Reinforcing Long-term User Engagement.

CoRR, 2022

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor.

CoRR, 2022

Constrained Reinforcement Learning for Short Video Recommendation.

CoRR, 2022

2021

Exploration in policy optimization through multiple paths.

Auton. Agents Multi Agent Syst., 2021

2020

Generator and Critic: A Deep Reinforcement Learning Approach for Slate Re-ranking in E-commerce.

CoRR, 2020

Softmax Deep Double Deterministic Policy Gradients.

Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Reinforcement Learning with Dynamic Boltzmann Softmax Updates.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Multi-Path Policy Optimization.

Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Deterministic Value-Policy Gradients.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Reinforcement Learning Driven Heuristic Optimization.

Azalia Mirhoseini

CoRR, 2019

Reinforcement Learning with Dynamic Boltzmann Softmax Updates.

CoRR, 2019

Policy Gradients for Contextual Recommendations.

Proceedings of the World Wide Web Conference, 2019

Policy Optimization with Model-Based Explorations.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

A Deep Reinforcement Learning Framework for Rebalancing Dockless Bike Sharing Systems.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Generalized deterministic policy gradient algorithms.

CoRR, 2018

Rebalancing Dockless Bike Sharing Systems.

CoRR, 2018

Policy Gradients for Contextual Bandits.

CoRR, 2018

Reinforcement Mechanism Design for e-commerce.

Aris Filos-Ratsikas

Proceedings of the 2018 World Wide Web Conference on World Wide Web, 2018

Ranking Mechanism Design for Price-setting Agents in E-commerce.

Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Reinforcement Mechanism Design for Fraudulent Behaviour in e-Commerce.

Aris Filos-Ratsikas

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Multi-armed Bandit Mechanism with Private Histories.

Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, 2017

2016

Mechanism Design for Personalized Recommender Systems.

Aris Filos-Ratsikas

Proceedings of the 10th ACM Conference on Recommender Systems, 2016

Facility Location with Minimax Envy.

Aris Filos-Ratsikas

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016
