An Empirical Study of Deep Reinforcement Learning in Continuing Tasks.
CoRR, January, 2025
ExpDrug: An explainable drug recommendation model based on space feature mapping.
Neurocomputing, 2025
Learning to bid and rank together in recommendation systems.
Mach. Learn., May, 2024
Correction to: Learning to bid and rank together in recommendation systems.
Mach. Learn., 2024
Epinet for Content Cold Start.
CoRR, 2024
Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank.
CoRR, 2024
Uncertainty of Joint Neural Contextual Bandit.
CoRR, 2024
Neural Collapse To Multiple Centers For Imbalanced Data.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Offline Reinforcement Learning for Optimizing Production Bidding Policies.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024
IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024
Pearl: A Production-ready Reinforcement Learning Agent.
CoRR, 2023
Non-Stationary Contextual Bandit Learning via Neural Predictive Ensemble Sampling.
CoRR, 2023
Two-tiered Online Optimization of Region-wide Datacenter Resource Allocation via Deep Reinforcement Learning.
CoRR, 2023
Optimism Based Exploration in Large-Scale Recommender Systems.
CoRR, 2023
Deep Exploration for Recommendation Systems.
Proceedings of the 17th ACM Conference on Recommender Systems, 2023
Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning.
Proceedings of the 17th ACM Conference on Recommender Systems, 2023
Scalable Neural Contextual Bandit for Recommender Systems.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023
Multi-Agent Safe Planning with Gaussian Processes.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020
A Modified Levy Jump-Diffusion Model Based on Market Sentiment Memory for Online Jump Prediction.
CoRR, 2017