Donglin Zhuang
Orcid: 0000-0003-3355-407X
According to our database, Donglin Zhuang authored at least 13 papers between 2020 and 2024.
Bibliography
2024
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design.
CoRR, 2024
Quant-LLM: Accelerating the Serving of Large Language Models via FP6-Centric Algorithm-System Co-Design on Modern GPUs.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024
MonoNN: Enabling a New Monolithic Optimization Space for Neural Network Inference Tasks on Modern GPU-Centric Architectures.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024
2023
Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity.
Proc. VLDB Endow., 2023
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity.
CoRR, 2023
2022
DynamAP: Architectural Support for Dynamic Graph Traversal on the Automata Processor.
ACM Trans. Archit. Code Optim., 2022
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022
Bring orders into uncertainty: enabling efficient uncertain graph processing via novel path sampling on multi-accelerator systems.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
2021
Enabling Highly Efficient Capsule Networks Processing Through Software-Hardware Co-Design.
IEEE Trans. Computers, 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
η-LSTM: Co-Designing Highly-Efficient Large LSTM Training via Exploiting Memory-Saving and Architectural Design Opportunities.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
ClickTrain: efficient and accurate end-to-end deep learning training via fine-grained architecture-preserving pruning.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021
2020
An Efficient End-to-End Deep Learning Training Framework via Fine-Grained Pattern-Based Pruning.
CoRR, 2020