Cheng Li
ORCID: 0000-0001-7064-6120
Affiliations:
- University of Science and Technology of China (USTC), China
- Max Planck Institute for Software Systems, Kaiserslautern / Saarbrücken, Germany (former)
According to our database, Cheng Li authored at least 65 papers between 2010 and 2025.
Bibliography
2025
Perform. Evaluation, 2025
2024
IEEE Trans. Parallel Distributed Syst., December, 2024
Fastmove: A Comprehensive Study of On-Chip DMA and its Demonstration for Accelerating Data Movement in NVM-based Storage Systems.
ACM Trans. Storage, August, 2024
nnScaler: Constraint-Guided Parallelization Plan Generation for Deep Learning Training.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024
Designing Non-uniform Locally Repairable Codes for Wide Stripes under Skewed File Accesses.
Proceedings of the 53rd International Conference on Parallel Processing, 2024
Optimal Wide Stripe Generation in Locally Repairable Codes via Staged Stripe Merging.
Proceedings of the 44th IEEE International Conference on Distributed Computing Systems, 2024
Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Proceedings of the Nineteenth European Conference on Computer Systems, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Exploring Post-training Quantization in LLMs from Comprehensive Study to Low Rank Compensation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Understand Data Preprocessing for Effective End-to-End Training of Deep Neural Networks.
CoRR, 2023
CoRR, 2023
Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases.
CoRR, 2023
CoRR, 2023
Proceedings of the 29th Symposium on Operating Systems Principles, 2023
Proceedings of the 29th Symposium on Operating Systems Principles, 2023
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023
Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases.
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
MPress: Democratizing Billion-Scale Model Training on Multi-GPU Servers via Memory-Saving Inter-Operator Parallelism.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Revitalizing the Forgotten On-Chip DMA to Expedite Data Movement in NVM-based Storage Systems.
Proceedings of the 21st USENIX Conference on File and Storage Technologies, 2023
Proceedings of the Eighteenth European Conference on Computer Systems, 2023
CFS: Scaling Metadata Service for Distributed File System via Pruned Scope of Critical Sections.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2022
vPipe: A Virtualized Acceleration System for Achieving Efficient and Scalable Pipeline Parallel DNN Training.
IEEE Trans. Parallel Distributed Syst., 2022
IEEE Trans. Parallel Distributed Syst., 2022
A Data Layout and Fast Failure Recovery Scheme for Distributed Storage Systems With Mixed Erasure Codes.
IEEE Trans. Computers, 2022
Random-LTD: Random and Layerwise Token Dropping Brings Efficient Training for Large-scale Transformers.
CoRR, 2022
DeepSpeed-Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale.
Proceedings of the SC22: International Conference for High Performance Computing, 2022
Repair-Optimal Data Placement for Locally Repairable Codes with Optimal Minimum Hamming Distance.
Proceedings of the 51st International Conference on Parallel Processing, 2022
NASPipe: high performance and reproducible pipeline parallel supernet training via causal synchronous parallelism.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022
2021
IEEE Trans. Parallel Distributed Syst., 2021
ACM Trans. Storage, 2021
MTFC: A Multi-GPU Training Framework for Cube-CNN-Based Hyperspectral Image Classification.
IEEE Trans. Emerg. Top. Comput., 2021
Towards Cost-Effective and Elastic Cloud Database Deployment via Memory Disaggregation.
Proc. VLDB Endow., 2021
AutoGR: Automated Geo-Replication with Fast System Performance and Preserved Application Semantics.
Proc. VLDB Endow., 2021
Concurr. Comput. Pract. Exp., 2021
Proceedings of the SOSP '21: ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021
Proceedings of the International Conference for High Performance Computing, 2021
Proceedings of the 19th USENIX Conference on File and Storage Technologies, 2021
Achieving low tail-latency and high scalability for serializable transactions in edge computing.
Proceedings of the EuroSys '21: Sixteenth European Conference on Computer Systems, 2021
Lessons learned from migrating complex stateful applications onto serverless platforms.
Proceedings of the APSys '21: 12th ACM SIGOPS Asia-Pacific Workshop on Systems, 2021
2020
Not All Explorations Are Equal: Harnessing Heterogeneous Profiling Cost for Efficient MLaaS Training.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020
PDL: A Data Layout towards Fast Failure Recovery for Erasure-coded Distributed Storage Systems.
Proceedings of the 39th IEEE Conference on Computer Communications, 2020
Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2020
Proceedings of the SoCC '20: ACM Symposium on Cloud Computing, 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
Explicit Data Correlations-Directed Metadata Prefetching Method in Distributed File Systems.
IEEE Trans. Parallel Distributed Syst., 2019
ElasticBF: Elastic Bloom Filter with Hotness Awareness for Boosting Read Performance in Large Key-Value Stores.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019
Proceedings of the AI 2019: Advances in Artificial Intelligence, 2019
2018
Proceedings of the 2018 USENIX Annual Technical Conference, 2018
Proceedings of the 2018 IEEE International Conference on Networking, 2018
Proceedings of the Smart Multimedia - First International Conference, 2018
ElasticBF: Fine-grained and Elastic Bloom Filter Towards Efficient Read for LSM-tree-based KV Stores.
Proceedings of the 10th USENIX Workshop on Hot Topics in Storage and File Systems, 2018
2016
PhD thesis, 2016
IEEE Data Eng. Bull., 2016
2015
Proceedings of the Tenth European Conference on Computer Systems, 2015
Proceedings of the First Workshop on Principles and Practice of Consistency for Distributed Data, 2015
2014
Proceedings of the 2014 USENIX Annual Technical Conference, 2014
2012
Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, 2012
2011
Proceedings of the European Conference on Computer Systems, 2011
2010
Proceedings of the 2010 IEEE/IFIP International Conference on Dependable Systems and Networks, 2010