Quentin Anthony

Orcid: 0000-0002-6823-9080

According to our database, Quentin Anthony authored at least 39 papers between 2019 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Simple and Scalable Strategies to Continually Pre-train Large Language Models.

Quentin Gregory Anthony

Eugene Belilovsky

Timothée Lesort

Irina Rish

Trans. Mach. Learn. Res., 2024

The Zamba2 Suite: Technical Report.

CoRR, 2024

RedPajama: an Open Dataset for Training Large Language Models.

CoRR, 2024

Zyda-2: a 5 Trillion Token High-Quality Dataset.

CoRR, 2024

Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters.

CoRR, 2024

Zyda: A 1.3T Dataset for Open Language Modeling.

CoRR, 2024

Zamba: A Compact 7B SSM Hybrid Model.

CoRR, 2024

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence.

CoRR, 2024

BlackMamba: Mixture of Experts for State-Space Models.

CoRR, 2024

Infer-HiRes: Accelerating Inference for High-Resolution Images with Quantization and Distributed Deep Learning.

Proceedings of the Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, 2024

Comparative Study of Large Language Model Architectures on Frontier.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

The Case for Co-Designing Model Architectures with Hardware.

Proceedings of the 53rd International Conference on Parallel Processing, 2024

Demystifying the Communication Characteristics for Distributed Transformer Models.

Proceedings of the IEEE Symposium on High-Performance Interconnects, 2024

Accelerating Large Language Model Training with Hybrid GPU-based Compression.

Dhabaleswar K. Panda

Proceedings of the 24th IEEE International Symposium on Cluster, 2024

2023

Continual Pre-Training of Large Language Models: How to (re)warm your model?

CoRR, 2023

RWKV: Reinventing RNNs for the Transformer Era.

CoRR, 2023

Emergent and Predictable Memorization in Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Accelerating Distributed Deep Learning Training with Compression Assisted Allgather and Reduce-Scatter Communication.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

MCR-DL: Mix-and-Match Communication Runtime for Deep Learning.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling.

Stella Biderman

Hailey Schoelkopf

Quentin Gregory Anthony

Proceedings of the International Conference on Machine Learning, 2023

RWKV: Reinventing RNNs for the Transformer Era.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

trlX: A Framework for Large Scale Reinforcement Learning from Human Feedback.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

ScaMP: Scalable Meta-Parallelism for Deep Learning Search.

Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, 2023

2022

GPT-NeoX-20B: An Open-Source Autoregressive Language Model.

CoRR, 2022

Accelerating MPI All-to-All Communication with Online Compression on Modern GPU Clusters.

Qinghua Zhou

Pouya Kousha

Quentin Anthony

Kawthar Shafie Khorassani

Aamir Shafi

Hari Subramoni

Dhabaleswar K. Panda

Proceedings of the High Performance Computing - 37th International Conference, 2022

Hy-Fi: Hybrid Five-Dimensional Parallel DNN Training on High-Performance GPU Clusters.

Proceedings of the High Performance Computing - 37th International Conference, 2022

Highly Efficient Alltoall and Alltoallv Communication Algorithms for GPU Systems.

Chen-Chun Chen

Kawthar Shafie Khorassani

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Accelerating Broadcast Communication with GPU Compression for Deep Learning Workloads.

Proceedings of the 29th IEEE International Conference on High Performance Computing, 2022

2021

Cross-layer Visualization and Profiling of Network and I/O Communication for HPC Clusters.

CoRR, 2021

Evaluating Multi-Level Checkpointing for Distributed Deep Neural Network Training.

Quentin Anthony

Donglai Dai

Proceedings of the 2021 SC Workshops Supplementary Proceedings, 2021

Scaling Single-Image Super-Resolution Training on Modern HPC Clusters: Early Experiences.

Quentin Anthony

Lang Xu

Hari Subramoni

Dhabaleswar K. Panda

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

Adaptive and Hierarchical Large Message All-to-all Communication Algorithms for Large-scale Dense GPU Systems.

Kawthar Shafie Khorassani

Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021

2020

HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training with TensorFlow.

Proceedings of the High Performance Computing - 35th International Conference, 2020

GEMS: GPU-enabled memory-aware model-parallelism system for distributed DNN training.

Arpan Jain

Ammar Ahmad Awan

Asmaa M. Aljuhani

Jahanzeb Maqbool Hashmi

Proceedings of the International Conference for High Performance Computing, 2020

Accelerating GPU-based Machine Learning in Python using MPI Library: A Case Study with MVAPICH2-GDR.

Seyedeh Mahdieh Ghazimirsaeed

Quentin Anthony

Aamir Shafi

Hari Subramoni

Dhabaleswar K. Panda

Proceedings of the 6th IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2020

Efficient Training of Semantic Image Segmentation on Summit using Horovod and MVAPICH2-GDR.

Dhabaleswar K. Panda

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

2019

HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow.

CoRR, 2019

Performance Characterization of DNN Training using TensorFlow and PyTorch on Modern Clusters.

Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019
