Accelerating Neural Recommendation Training with Embedding Scheduling.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024
Towards Domain-Specific Network Transport for Distributed DNN Training.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024
AutoPipe: Automatic Configuration of Pipeline Parallelism in Shared GPU Cluster.
Proceedings of the 53rd International Conference on Parallel Processing, 2024
AutoByte: Automatic Configuration for Optimal Communication Scheduling in DNN Training.
Proceedings of IEEE INFOCOM 2022, 2022
Herald: An Embedding Scheduler for Distributed Embedding Model Training.
Proceedings of the 6th Asia-Pacific Workshop on Networking, 2022
Automatic Configuration for Optimal Communication Scheduling in DNN Training.
CoRR, 2021
TACC: A Full-stack Cloud Computing Infrastructure for Machine Learning Tasks.
CoRR, 2021
Domain-specific Communication Optimization for Distributed DNN Training.
CoRR, 2020
RAT - Resilient Allreduce Tree for Distributed Machine Learning.
Proceedings of the 4th Asia-Pacific Workshop on Networking (APNet '20), 2020