StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation.
CoRR, April, 2025
PipeWeaver: Addressing Data Dynamicity in Large Multimodal Model Training with Dynamic Interleaved Pipeline.
CoRR, April, 2025
Optimizing RLHF Training for Large Language Models with Stage Fusion.
Proceedings of the 22nd USENIX Symposium on Networked Systems Design and Implementation, 2025
RLHFuse: Efficient RLHF Training for Large Language Models with Inter- and Intra-Stage Fusion.
CoRR, 2024
DistTrain: Addressing Model and Data Heterogeneity with Disaggregated Training for Multimodal Large Language Models.
CoRR, 2024
CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024
dPRO: A Generic Profiling and Optimization System for Expediting Distributed DNN Training.
CoRR, 2022
dPRO: A Generic Performance Diagnosis and Optimization Toolkit for Expediting Distributed DNN Training.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022
Distributed Machine Learning through Heterogeneous Edge Systems.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020