Xiafei Qiu
Orcid: 0009-0008-8803-928X
According to our database1,
Xiafei Qiu
authored at least 8 papers
between 2018 and 2024.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
On csauthors.net:
Bibliography
2024
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache.
CoRR, 2024
MonoNN: Enabling a New Monolithic Optimization Space for Neural Network Inference Tasks on Modern GPU-Centric Architectures.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024
2023
BladeDISC: Optimizing Dynamic Shape Machine Learning Workloads via Compiler Approach.
Proc. ACM Manag. Data, September, 2023
Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity.
Proc. VLDB Endow., 2023
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity.
CoRR, 2023
RECom: A Compiler Approach to Accelerating Recommendation Model Inference with Massive Embedding Columns.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2018
Proc. VLDB Endow., 2018