2025
Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training.
CoRR, April 2025

MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from μWatts to MWatts for Sustainable AI.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025

2024
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI.
CoRR, 2024

2023
MTrainS: Improving DLRM training efficiency using heterogeneous memories.
CoRR, 2023

2022
Efficient Utilization of Heterogeneous Compute and Memory Systems.
PhD thesis, 2022

Power-optimized Deployment of Key-value Stores Using Storage Class Memory.
ACM Trans. Storage, 2022

HetSched: Quality-of-Mission Aware Scheduling for Autonomous Vehicle SoCs.
CoRR, 2022

2021
Heterogeneity-Aware Scheduling on SoCs for Autonomous Vehicles.
IEEE Comput. Archit. Lett., 2021

Improving Performance of Flash Based Key-Value Stores Using Storage Class Memory as a Volatile Memory Extension.
Proceedings of the 2021 USENIX Annual Technical Conference, 2021

ChipAdvisor: A Machine Learning Approach for Mapping Applications to Heterogeneous Systems.
Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021

2020
STOMP: A Tool for Evaluation of Scheduling Policies in Heterogeneous Multi-Processors.
CoRR, 2020

2018
Heterogeneous Memory Subsystem for Natural Graph Analytics.
Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018