Xupeng Miao
Orcid: 0000-0002-9371-8358
According to our database, Xupeng Miao authored at least 59 papers between 2019 and 2024.
Bibliography
2024
Efficient Multi-Task Large Model Training via Data Heterogeneity-aware Model Management.
CoRR, 2024
Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs (Extended Version).
CoRR, 2024
CoRR, 2024
GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism.
CoRR, 2024
Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs.
CoRR, 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning.
CoRR, 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
CoRR, 2024
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024
Proceedings of the Companion of the 2024 International Conference on Management of Data, 2024
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024
X-former Elucidator: Reviving Efficient Attention for Long Context Language Modeling.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
MFIX: An Efficient and Reliable Index Advisor via Multi-Fidelity Bayesian Optimization.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Proc. VLDB Endow., December, 2023
P²CG: a privacy preserving collaborative graph neural network training framework.
VLDB J., July, 2023
Sci. China Inf. Sci., January, 2023
Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-Aware Deep Architecture.
IEEE Trans. Knowl. Data Eng., 2023
Proc. VLDB Endow., 2023
SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training.
Proc. VLDB Endow., 2023
FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement.
Proc. ACM Manag. Data, 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems.
CoRR, 2023
CoRR, 2023
FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference.
CoRR, 2023
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.
CoRR, 2023
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
IEEE Trans. Knowl. Data Eng., 2022
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism.
Proc. VLDB Endow., 2022
Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Update.
Proc. VLDB Endow., 2022
Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Updates.
CoRR, 2022
CoRR, 2022
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022
TSPLIT: Fine-grained GPU Memory Management for Efficient DNN Training via Tensor Splitting.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022
Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-aware Deep Architecture (Extended Abstract).
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022
HET-KG: Communication-Efficient Knowledge Graph Embedding Training via Hotness-Aware Cache.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022
2021
Memory-aware framework for fast and scalable second-order random walk over billion-edge natural graphs.
VLDB J., 2021
HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework.
Proc. VLDB Endow., 2021
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021
CuWide: Towards Efficient Flow-based Training for Sparse Wide Models on GPUs (Extended Abstract).
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021
2020
Proceedings of the 2020 International Conference on Management of Data, 2020
Proceedings of the 2020 International Conference on Management of Data, 2020
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020
2019
Proceedings of the 2019 International Conference on Management of Data, 2019