Quentin Anthony
Orcid: 0000-0002-6823-9080
According to our database, Quentin Anthony authored at least 36 papers between 2019 and 2024.
Bibliography
2024
Trans. Mach. Learn. Res., 2024
CoRR, 2024
Infer-HiRes: Accelerating Inference for High-Resolution Images with Quantization and Distributed Deep Learning.
Proceedings of the Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, 2024
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Proceedings of the 53rd International Conference on Parallel Processing, 2024
Proceedings of the IEEE Symposium on High-Performance Interconnects, 2024
Proceedings of the 24th IEEE International Symposium on Cluster, 2024
2023
CoRR, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Accelerating Distributed Deep Learning Training with Compression Assisted Allgather and Reduce-Scatter Communication.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, 2023
2022
Accelerating MPI All-to-All Communication with Online Compression on Modern GPU Clusters.
Proceedings of the High Performance Computing - 37th International Conference, 2022
Hy-Fi: Hybrid Five-Dimensional Parallel DNN Training on High-Performance GPU Clusters.
Proceedings of the High Performance Computing - 37th International Conference, 2022
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
Accelerating Broadcast Communication with GPU Compression for Deep Learning Workloads.
Proceedings of the 29th IEEE International Conference on High Performance Computing, 2022
2021
Cross-layer Visualization and Profiling of Network and I/O Communication for HPC Clusters.
CoRR, 2021
Proceedings of the 2021 SC Workshops Supplementary Proceedings, 2021
Scaling Single-Image Super-Resolution Training on Modern HPC Clusters: Early Experiences.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021
Adaptive and Hierarchical Large Message All-to-all Communication Algorithms for Large-scale Dense GPU Systems.
Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021
2020
HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training with TensorFlow.
Proceedings of the High Performance Computing - 35th International Conference, 2020
GEMS: GPU-enabled memory-aware model-parallelism system for distributed DNN training.
Proceedings of the International Conference for High Performance Computing, 2020
Accelerating GPU-based Machine Learning in Python using MPI Library: A Case Study with MVAPICH2-GDR.
Proceedings of the 6th IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2020
Efficient Training of Semantic Image Segmentation on Summit using Horovod and MVAPICH2-GDR.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020
2019
HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow.
CoRR, 2019
Performance Characterization of DNN Training using TensorFlow and PyTorch on Modern Clusters.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019