2024
Improving Throughput-oriented LLM Inference with CPU Computations.
Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024

2023
Improving Throughput-oriented Generative Inference with CPUs.
Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems, 2023

2022
A Black-Box Graph Partitioner for Generalized Deep Neural Network Parallelization.
Proceedings of the Economics of Grids, Clouds, Systems, and Services, 2022

2021
Can VM Live Migration Improve Job Throughput? Evidence from a Real World Cluster Trace.
Proceedings of the Economics of Grids, Clouds, Systems, and Services, 2021