2025
Meta's Second Generation AI Chip: Model-Chip Co-Design and Productionization Experiences.
Proceedings of the 52nd Annual International Symposium on Computer Architecture, 2025

2024
PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel.
Proceedings of the VLDB Endowment, 2023

MTIA: First Generation Silicon Targeting Meta's Recommendation Systems.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2022
Software-hardware co-design for fast and scalable training of deep learning recommendation models.
Proceedings of ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022