Xupeng Miao

Orcid: 0000-0002-9371-8358

According to our database, Xupeng Miao authored at least 60 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.



Bibliography

2024
Distributed Graph Neural Network Training: A Survey.
ACM Comput. Surv., August, 2024

Efficient Multi-Task Large Model Training via Data Heterogeneity-aware Model Management.
CoRR, 2024

Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs (Extended Version).
CoRR, 2024

PQCache: Product Quantization-based KVCache for Long Context LLM Inference.
CoRR, 2024

GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism.
CoRR, 2024

Optimal Kernel Orchestration for Tensor Programs with Korch.
CoRR, 2024

Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs.
CoRR, 2024

FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning.
CoRR, 2024

Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
CoRR, 2024

Enabling Parallelism Hot Switching for Efficient Training of Large Language Models.
Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024

Demystifying Data Management for Large Language Models.
Proceedings of the Companion of the 2024 International Conference on Management of Data, 2024

Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs.
Proceedings of the International Conference for High Performance Computing, 2024

Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances.
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024

X-former Elucidator: Reviving Efficient Attention for Long Context Language Modeling.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

MFIX: An Efficient and Reliable Index Advisor via Multi-Fidelity Bayesian Optimization.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Generative Dense Retrieval: Memory Can Be a Burden.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

SpotServe: Serving Generative Large Language Models on Preemptible Instances.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Optimal Kernel Orchestration for Tensor Programs with Korch.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Experimental Analysis of Large-scale Learnable Vector Storage Compression.
Proc. VLDB Endow., December, 2023

P²CG: a privacy preserving collaborative graph neural network training framework.
VLDB J., July, 2023

Hetu: a highly efficient automatic parallel distributed deep learning system.
Sci. China Inf. Sci., January, 2023

Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-Aware Deep Architecture.
IEEE Trans. Knowl. Data Eng., 2023

Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent.
Proc. VLDB Endow., 2023

SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training.
Proc. VLDB Endow., 2023

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement.
Proc. ACM Manag. Data, 2023

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems.
CoRR, 2023

Improving Automatic Parallel Training via Balanced Memory Workload Optimization.
CoRR, 2023

FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference.
CoRR, 2023

SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.
CoRR, 2023

EINNET: Optimizing Tensor Programs with Derivation-Based Transformations.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Model-enhanced Vector Index.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
CuWide: Towards Efficient Flow-Based Training for Sparse Wide Models on GPUs.
IEEE Trans. Knowl. Data Eng., 2022

Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism.
Proc. VLDB Endow., 2022

Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Update.
Proc. VLDB Endow., 2022

OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning.
CoRR, 2022

Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Updates.
CoRR, 2022

HetuMoE: An Efficient Trillion-scale Mixture-of-Expert Distributed Training System.
CoRR, 2022

HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

TSPLIT: Fine-grained GPU Memory Management for Efficient DNN Training via Tensor Splitting.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Lasagne: A Multi-Layer Graph Convolutional Network Framework via Node-aware Deep Architecture (Extended Abstract).
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Zoomer: Boosting Retrieval on Web-scale Graphs by Regions of Interest.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

HET-KG: Communication-Efficient Knowledge Graph Embedding Training via Hotness-Aware Cache.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

PointCLIP: Point Cloud Understanding by CLIP.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Scalable Graph Sampling on GPUs with Compressed Graph.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

2021
Memory-aware framework for fast and scalable second-order random walk over billion-edge natural graphs.
VLDB J., 2021

HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework.
Proc. VLDB Endow., 2021

Dense-to-Sparse Gate for Mixture-of-Experts.
CoRR, 2021

Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

ROD: Reception-aware Online Distillation for Sparse Graphs.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

DeGNN: Improving Graph Neural Networks with Graph Decomposition.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

CuWide: Towards Efficient Flow-based Training for Sparse Wide Models on GPUs (Extended Abstract).
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

2020
Reliable Data Distillation on Graph Convolutional Network.
Proceedings of the 2020 International Conference on Management of Data, 2020

Memory-Aware Framework for Efficient Second-Order Random Walk on Large Graphs.
Proceedings of the 2020 International Conference on Management of Data, 2020

PSGraph: How Tencent trains extremely large-scale graphs with Spark?
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

2019
PS2: Parameter Server on Spark.
Proceedings of the 2019 International Conference on Management of Data, 2019

