Jeff Rasley

According to our database, Jeff Rasley authored at least 19 papers between 2010 and 2024.

Bibliography

2024
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference.
CoRR, 2024

2023
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies.
CoRR, 2023

DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales.
CoRR, 2023

MCR-DL: Mix-and-Match Communication Runtime for Deep Learning.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

2022
DeepSpeed-Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale.
Proceedings of the International Conference on Machine Learning, 2022

2021
ZeRO-infinity: breaking the GPU memory wall for extreme scale deep learning.
Proceedings of the International Conference for High Performance Computing, 2021

2020
ZeRO: memory optimizations toward training trillion parameter models.
Proceedings of the International Conference for High Performance Computing, 2020

DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

2019
Application-Aware Cluster Resource Management.
PhD thesis, 2019

ZeRO: Memory Optimization Towards Training A Trillion Parameter Models.
CoRR, 2019

Accelerating Large Scale Deep Learning Inference through DeepCPU at Microsoft.
Proceedings of the 2019 USENIX Conference on Operational Machine Learning, 2019

2017
HyperDrive: exploring hyperparameters with POP scheduling.
Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference, Las Vegas, NV, USA, December 11, 2017

2016
Efficient queue management for cluster scheduling.
Proceedings of the Eleventh European Conference on Computer Systems, 2016

2015
Detecting latent cross-platform API violations.
Proceedings of the 26th IEEE International Symposium on Software Reliability Engineering, 2015

Crowdsourcing from Scratch: A Pragmatic Experiment in Data Collection by Novice Requesters.
Proceedings of the Third AAAI Conference on Human Computation and Crowdsourcing, 2015

2014
Planck: millisecond-scale monitoring and control for commodity networks.
Proceedings of the ACM SIGCOMM 2014 Conference, 2014

Low-latency Network Monitoring via Oversubscribed Port Mirroring.
Proceedings of the Open Networking Summit 2014 - Research Track, 2014

2010
Retaining sandbox containment despite bugs in privileged memory-safe code.
Proceedings of the 17th ACM Conference on Computer and Communications Security, 2010

