2024
NanoFlow: Towards Optimal Large Language Model Serving Throughput.
CoRR, 2024

Can Storage Devices be Power Adaptive?
Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems, 2024

2022
Optimizing half precision Winograd convolution on ARM many-core processors.
Proceedings of APSys '22: 13th ACM SIGOPS Asia-Pacific Workshop on Systems, Virtual Event, Singapore, August 23, 2022