Amar Phanishayee

ORCID: 0009-0001-2777-1118

According to our database, Amar Phanishayee authored at least 58 papers between 2006 and 2025.

Collaborative distances:

Dijkstra number of three.
Erdős number of three.

Bibliography

2025

Forecasting GPU Performance for Deep Learning Training and Inference.

Seonho Lee

Amar Phanishayee

Divya Mahajan

Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025

2024

Data-driven Forecasting of Deep Learning Performance on GPUs.

Seonho Lee

Amar Phanishayee

Divya Mahajan

CoRR, 2024

Workload-Aware Hardware Accelerator Mining for Distributed Deep Learning Training.

CoRR, 2024

Integrated Hardware Architecture and Device Placement Search.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

MGit: A Model Versioning and Management System.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Blox: A Modular Toolkit for Deep Learning Schedulers.

Saurabh Agarwal

Amar Phanishayee

Shivaram Venkataraman

Proceedings of the Nineteenth European Conference on Computer Systems, 2024

2023

Packrat: Automatic Reconfiguration for Latency Minimization in CPU-based DNN Serving.

CoRR, 2023

MGit: A Model Versioning and Management System.

CoRR, 2023

2022

Harmony: Overcoming the hurdles of GPU memory capacity to train massive DNN models on commodity servers.

Proc. VLDB Endow., 2022

A Study on the Intersection of GPU Utilization and CNN Inference.

Jack Kosaian

Amar Phanishayee

CoRR, 2022

Looking Beyond GPUs for DNN Scheduling on Multi-Tenant Clusters.

Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

2021

Analyzing and Mitigating Data Stalls in DNN Training.

Proc. VLDB Endow., 2021

Synergy: Resource Sensitive DNN Scheduling in Multi-Tenant Clusters.

CoRR, 2021

Efficient Large-Scale Language Model Training on GPU Clusters.

CoRR, 2021

Efficient large-scale language model training on GPU clusters using Megatron-LM.

Proceedings of the International Conference for High Performance Computing, 2021

Piper: Multidimensional Planner for DNN Parallelization.

Jakub Tarnawski

Deepak Narayanan

Amar Phanishayee

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Memory-Efficient Pipeline-Parallel DNN Training.

Proceedings of the 38th International Conference on Machine Learning, 2021

Boosting the Throughput and Accelerator Utilization of Specialized CNN Inference Beyond Increasing Batch Size.

Proceedings of the 38th International Conference on Machine Learning, 2021

Doing more with less: training large DNN models on commodity servers for the masses.

Proceedings of the HotOS '21: Workshop on Hot Topics in Operating Systems, 2021

CheckFreq: Frequent, Fine-Grained DNN Checkpointing.

Jayashree Mohan

Amar Phanishayee

Vijay Chidambaram

Proceedings of the 19th USENIX Conference on File and Storage Technologies, 2021

2020

Daydream: Accurately Estimating the Efficacy of Optimizations for DNN Training.

Hongyu Zhu

Amar Phanishayee

Gennady Pekhimenko

Proceedings of the 2020 USENIX Annual Technical Conference, 2020

Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads.

Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Themis: Fair and Efficient GPU Cluster Scheduling.

Kshiteej Mahajan

Arjun Balasubramanian

Arjun Singhvi

Shivaram Venkataraman

Aditya Akella

Amar Phanishayee

Shuchi Chawla

Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation, 2020

Efficient Algorithms for Device Placement of DNN Graph Operators.

Fanny Nina Paravecino

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Blink: Fast and Generic Collectives for Distributed ML.

Guanhua Wang

Shivaram Venkataraman

Proceedings of the Third Conference on Machine Learning and Systems, 2020

The Non-IID Data Quagmire of Decentralized Machine Learning.

Proceedings of the 37th International Conference on Machine Learning, 2020

2019

Themis: Fair and Efficient GPU Cluster Scheduling for Machine Learning Workloads.

Kshiteej Mahajan

Arjun Singhvi

Arjun Balasubramanian

Varun Batra

Surya Teja Chavali

Shivaram Venkataraman

Aditya Akella

Amar Phanishayee

Shuchi Chawla

CoRR, 2019

Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads.

Myeongjae Jeon

Shivaram Venkataraman

Proceedings of the 2019 USENIX Annual Technical Conference, 2019

PipeDream: generalized pipeline parallelism for DNN training.

Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019

The Case for Unifying Data Loading in Machine Learning Clusters.

Aarati Kakaraparthy

Abhay Venkatesh

Amar Phanishayee

Shivaram Venkataraman

Proceedings of the 11th USENIX Workshop on Hot Topics in Cloud Computing, 2019

2018

Compositional programming and testing of dynamic distributed systems.

Proc. ACM Program. Lang., 2018

PipeDream: Fast and Efficient Pipeline Parallel DNN Training.

CoRR, 2018

TBD: Benchmarking and Analyzing Deep Neural Network Training.

CoRR, 2018

Parameter Hub: High Performance Parameter Servers for Efficient Distributed Deep Neural Network Training.

CoRR, 2018

Gist: Efficient Data Encoding for Deep Neural Network Training.

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

Benchmarking and Analyzing Deep Neural Network Training.

Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018

Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training.

Proceedings of the ACM Symposium on Cloud Computing, 2018

2017

RAIL: A Case for Redundant Arrays of Inexpensive Links in Data Center Networks.

Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation, 2017

Atomic In-place Updates for Non-volatile Main Memories with Kamino-Tx.

Amir Saman Memaripour

Proceedings of the Twelfth European Conference on Computer Systems, 2017

2016

Beam: Ending Monolithic Applications for Connected Devices.

Proceedings of the 2016 USENIX Annual Technical Conference, 2016

ProjecToR: Agile Reconfigurable Data Center Interconnect.

Pierre-Alexandre Blanche

Houman Rastegarfar

Madeleine Glick

Daniel C. Kilper

Proceedings of the ACM SIGCOMM 2016 Conference, Florianopolis, Brazil, August 22-26, 2016, 2016

Evaluation of elastic modulation gains in Microsoft's optical backbone in North America.

Proceedings of the Optical Fiber Communications Conference and Exhibition, 2016

2015

It's Time to End Monolithic Apps for Connected Devices.

A Case for Ending Monolithic Apps for Connected Devices.

Proceedings of the 15th Workshop on Hot Topics in Operating Systems, 2015

2014

Bolt: Data Management for Connected Homes.

Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation, 2014

2013

HomeLab: a platform for conducting experiments with connected devices in the home.

Proceedings of the ACM SIGCOMM 2013 Conference, 2013

Lab of things: a platform for conducting studies with connected devices in multiple homes.

Proceedings of the 2013 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 2013

2011

FAWN: a fast array of wimpy nodes.

Commun. ACM, 2011

2009

Safe and effective fine-grained TCP retransmissions for datacenter communication.

Proceedings of the ACM SIGCOMM 2009 Conference on Applications, 2009

FAWNdamentally Power-efficient Clusters.

Proceedings of HotOS'09: 12th Workshop on Hot Topics in Operating Systems, 2009

Scaling all-pairs overlay routing.

Proceedings of the 2009 ACM Conference on Emerging Networking Experiments and Technology, 2009

2008

Ditto: a system for opportunistic caching in multi-hop wireless networks.

Proceedings of the 14th Annual International Conference on Mobile Computing and Networking, 2008

Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems.

Proceedings of the 6th USENIX Conference on File and Storage Technologies, 2008

2007

On application-level approaches to avoiding TCP throughput collapse in cluster-based storage systems.

Proceedings of the 2nd International Petascale Data Storage Workshop (PDSW '07), 2007

Ricochet: Lateral Error Correction for Time-Critical Multicast.

Proceedings of the 4th Symposium on Networked Systems Design and Implementation (NSDI 2007), 2007

Scalable Multicast Platforms for a New Generation of Robust Distributed Applications.

Proceedings of the Second International Conference on COMmunication System softWAre and MiddlewaRE (COMSWARE 2007), 2007

2006

PLATO: Predictive Latency-Aware Total Ordering.

Mahesh Balakrishnan

Ken Birman

Amar Phanishayee

Proceedings of the 25th IEEE Symposium on Reliable Distributed Systems (SRDS 2006), 2006
