Conglong Li
According to our database, Conglong Li authored at least 24 papers between 2013 and 2024.
Bibliography
2024
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs.
CoRR, 2024
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies.
CoRR, 2023
DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention.
CoRR, 2023
DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales.
CoRR, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
2022
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing.
CoRR, 2022
Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers.
CoRR, 2022
CoRR, 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale.
Proceedings of the International Conference on Machine Learning, 2022
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed.
Proceedings of the 29th IEEE International Conference on High Performance Computing, 2022
2021
Curriculum Learning: A Regularization Method for Efficient and Stable Billion-Scale GPT Model Pre-Training.
CoRR, 2021
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed.
Proceedings of the 38th International Conference on Machine Learning, 2021
2020
Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination.
Proceedings of the 2020 International Conference on Management of Data, 2020
2019
Proceedings of the Second Conference on Machine Learning and Systems (SysML), 2019
2018
Proceedings of the 2018 World Wide Web Conference on World Wide Web, 2018
2017
Proceedings of the 2017 Symposium on Cloud Computing (SoCC), 2017
Proceedings of the ACM/IEEE Symposium on Architectures for Networking and Communications Systems, 2017
2015
Proceedings of the Tenth European Conference on Computer Systems, 2015
Proceedings of the 11th ACM Conference on Emerging Networking Experiments and Technologies, 2015
2013
ACM Transactions on Architecture and Code Optimization, 2013