Zihao Ye

Orcid: 0000-0002-6450-8108

Affiliations:
  • University of Washington, School of Computer Science and Engineering, Seattle, WA, USA
  • Amazon Web Services, Shanghai AI Lab, Shanghai, China


According to our database, Zihao Ye authored at least 13 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
NanoFlow: Towards Optimal Large Language Model Serving Throughput.
CoRR, 2024

Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

Punica: Multi-Tenant LoRA Serving.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

2023
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning.
CoRR, 2023

SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

TensorIR: An Abstraction for Automatic Tensorized Program Optimization.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
Graphiler: Optimizing Graph Neural Networks with Message Passing Data Flow Graph.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

2020
Transformer on a Diet.
CoRR, 2020

DGL-KE: Training Knowledge Graph Embeddings at Scale.
Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020

FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems.
Proceedings of the International Conference for High Performance Computing, 2020

2019
BP-Transformer: Modelling Long-Range Context via Binary Partitioning.
CoRR, 2019

Deep Graph Library: Towards Efficient and Scalable Deep Learning on Graphs.
CoRR, 2019

