Zhihao Jia
ORCID: 0000-0002-1270-5185
According to our database, Zhihao Jia authored at least 72 papers between 2012 and 2024.
Bibliography
2024
Drone-NeRF: Efficient NeRF based 3D scene reconstruction for large-scale drone survey.
Image Vis. Comput., 2024
CoRR, 2024
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention.
CoRR, 2024
Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs (Extended Version).
CoRR, 2024
GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism.
CoRR, 2024
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices.
CoRR, 2024
Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs.
CoRR, 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning.
CoRR, 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
CoRR, 2024
Proceedings of the Companion of the 2024 International Conference on Management of Data, 2024
CLLP: Contrastive Learning Framework Based on Latent Preferences for Next POI Recommendation.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024
Proceedings of the International Conference for High Performance Computing, 2024
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024
X-former Elucidator: Reviving Efficient Attention for Long Context Language Modeling.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Intracranial Steno-Occlusive Lesion Detection on Magnetic Resonance Angiography Images.
Proceedings of the 2024 16th International Conference on Bioinformatics and Biomedical Technology, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
IEEE Trans. Computers, December, 2023
Dynamic Correlation Adjacency-Matrix-Based Graph Neural Networks for Traffic Flow Prediction.
Sensors, March, 2023
SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training.
Proc. VLDB Endow., 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems.
CoRR, 2023
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.
CoRR, 2023
Proceedings of the 29th Symposium on Operating Systems Principles, 2023
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023
TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs.
Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, 2023
Bamboo: Making Preemptible Instances Resilient for Affordable Training of Large DNNs.
Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation, 2023
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023
2022
Proc. VLDB Endow., 2022
CoRR, 2022
Proceedings of the PLDI '22: 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, San Diego, CA, USA, June 13, 2022
Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization.
Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Software-hardware co-design for fast and scalable training of deep learning recommendation models.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022
2021
CoRR, 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections.
Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, 2021
Dorylus: Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads.
Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, 2021
Proceedings of the Fourth Conference on Machine Learning and Systems, 2021
2020
Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc.
Proceedings of the Third Conference on Machine Learning and Systems, 2020
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020
2019
TASO: optimizing deep learning computation with automatic generation of graph substitutions.
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019
2018
J. Intell. Fuzzy Syst., 2018
Proceedings of the 32nd International Conference on Supercomputing, 2018
Proceedings of the 35th International Conference on Machine Learning, 2018
2017
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017
2016
Proceedings of the 2016 USENIX Annual Technical Conference, 2016
2015
Automatic and transparent I/O optimization with storage integrated application runtime support.
Proceedings of the 10th Parallel Data Storage Workshop, 2015
2012
Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, 2012
Proceedings of the Asia-Pacific Workshop on Systems, 2012