2024
Optimizing GPU Multiplexing for Efficient and Cost-Effective Access to Diverse Large Language Models in GPU Clusters.
Proceedings of the 32nd International Conference on Modeling, 2024

Characterizing Training Performance and Energy for Foundation Models and Image Classifiers on Multi-Instance GPUs.
Proceedings of the 4th Workshop on Machine Learning and Systems, 2024