Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training.
CoRR, April, 2025
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from μWatts to MWatts for Sustainable AI.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI.
CoRR, 2024
MTrainS: Improving DLRM training efficiency using heterogeneous memories.
CoRR, 2023
Efficient Utilization of Heterogeneous Compute and Memory Systems.
PhD thesis, 2022
Power-optimized Deployment of Key-value Stores Using Storage Class Memory.
ACM Trans. Storage, 2022
HetSched: Quality-of-Mission Aware Scheduling for Autonomous Vehicle SoCs.
CoRR, 2022
Heterogeneity-Aware Scheduling on SoCs for Autonomous Vehicles.
IEEE Comput. Archit. Lett., 2021
Improving Performance of Flash Based Key-Value Stores Using Storage Class Memory as a Volatile Memory Extension.
Proceedings of the 2021 USENIX Annual Technical Conference, 2021
ChipAdvisor: A Machine Learning Approach for Mapping Applications to Heterogeneous Systems.
Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021
STOMP: A Tool for Evaluation of Scheduling Policies in Heterogeneous Multi-Processors.
CoRR, 2020
Heterogeneous Memory Subsystem for Natural Graph Analytics.
Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018