Deepak Narayanan

Orcid: 0000-0002-3020-2848

According to our database, Deepak Narayanan authored at least 36 papers between 2016 and 2024.

Bibliography

2024
Nemotron-4 340B Technical Report.
CoRR, 2024

An Empirical Study of Mamba-based Language Models.
CoRR, 2024

Nemotron-4 15B Technical Report.
CoRR, 2024

The Case for Co-Designing Model Architectures with Hardware.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

MGit: A Model Versioning and Management System.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
Holistic Evaluation of Language Models.
Trans. Mach. Learn. Res., 2023

Packrat: Automatic Reconfiguration for Latency Minimization in CPU-based DNN Serving.
CoRR, 2023

MGit: A Model Versioning and Management System.
CoRR, 2023

Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs.
CoRR, 2023

Kerveros: Efficient and Scalable Cloud Admission Control.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Cheaply Estimating Inference Efficiency Metrics for Autoregressive Transformer Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Holistic Evaluation of Text-to-Image Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

MegaBlocks: Efficient Sparse Training with Mixture-of-Experts.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

2022
Allocation of fungible resources via a fast, scalable price discovery method.
Math. Program. Comput., 2022

2021
Resource-efficient execution of deep learning computations.
PhD thesis, 2021

Don't Give Up on Large Optimization Problems; POP Them!
CoRR, 2021

Efficient Large-Scale Language Model Training on GPU Clusters.
CoRR, 2021

Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP.
Proceedings of the SOSP '21: ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021

Efficient large-scale language model training on GPU clusters using Megatron-LM.
Proceedings of the International Conference for High Performance Computing, 2021

Piper: Multidimensional Planner for DNN Parallelization.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Memory-Efficient Pipeline-Parallel DNN Training.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
A Demonstration of Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference.
Proc. VLDB Endow., 2020

Offload Annotations: Bringing Heterogeneous Computing to Existing Libraries and Workloads.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference.
Proceedings of the Third Conference on Machine Learning and Systems, 2020

2019
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark.
ACM SIGOPS Oper. Syst. Rev., 2019

MLPerf Training Benchmark.
CoRR, 2019

PipeDream: generalized pipeline parallelism for DNN training.
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019

2018
MacroBase: Prioritizing Attention in Fast Data.
ACM Trans. Database Syst., 2018

Evaluating End-to-End Optimization for Data Analytics Applications in Weld.
Proc. VLDB Endow., 2018

PipeDream: Fast and Efficient Pipeline Parallel DNN Training.
CoRR, 2018

2017
Weld: Rethinking the Interface Between Data-Intensive Applications.
CoRR, 2017

MacroBase: Prioritizing Attention in Fast Data.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

A Common Runtime for High Performance Data Analysis.
Proceedings of the 8th Biennial Conference on Innovative Data Systems Research, 2017

2016
MacroBase: Analytic Monitoring for the Internet of Things.
CoRR, 2016

