Accelerating Synchronous Distributed Data Parallel Training with Small Batch Sizes.
Proceedings of the Database Systems for Advanced Applications, 2024
StreamRec: A Recommendation Inference System with CUDA Stream Acceleration.
Proceedings of the Database Systems for Advanced Applications, 2024
Accelerating Recommendation Inference via GPU Streams.
Proceedings of the Database Systems for Advanced Applications, 2023