Bharath Ramesh

Nick Contini

Nawras Alnaasan

Mustafa Abduljabbar

Aamir Shafi

Goutham Kalikrishna Reddy Kuncham

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

OHIO: Improving RDMA Network Scalability in MPI_Alltoall Through Optimized Hierarchical and Intra/Inter-Node Communication Overlap Design.

Tu Tran

Proceedings of the IEEE Symposium on High-Performance Interconnects, 2024

2023

High Performance MPI over the Slingshot Interconnect.

J. Comput. Sci. Technol., February, 2023

Network-Assisted Noncontiguous Transfers for GPU-Aware MPI Libraries.

IEEE Micro, 2023

A Novel Framework for Efficient Offloading of Communication Operations to Bluefield SmartNICs.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

In-Depth Evaluation of a Lower-Level Direct-Verbs API on InfiniBand-based Clusters: Early Experiences.

Benjamin Michalowicz

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Enabling Reconfigurable HPC through MPI-based Inter-FPGA Communication.

Nicholas Contini

Proceedings of the 37th International Conference on Supercomputing, 2023

Designing In-network Computing Aware Reduction Collectives in MPI.

Goutham Kalikrishna Reddy Kuncham

Proceedings of the IEEE Symposium on High-Performance Interconnects, 2023

2022

High Performance MPI over the Slingshot Interconnect: Early Experiences.

Proceedings of the PEARC '22: Practice and Experience in Advanced Research Computing, Boston, MA, USA, July 10, 2022

Designing Hierarchical Multi-HCA Aware Allgather in MPI.

Workshop Proceedings of the 51st International Conference on Parallel Processing, 2022

Network Assisted Non-Contiguous Transfers for GPU-Aware MPI Libraries.

Proceedings of the IEEE Symposium on High-Performance Interconnects, 2022

Efficient Personalized and Non-Personalized Alltoall Communication for Modern Multi-HCA GPU-Based Clusters.

Akshay Paniraja Guptha

Proceedings of the 29th IEEE International Conference on High Performance Computing, 2022

Designing Efficient Pipelined Communication Schemes using Compression in MPI Libraries.

Proceedings of the 29th IEEE International Conference on High Performance Computing, 2022

2021

Layout-aware Hardware-assisted Designs for Derived Data Types in MPI.

Chen-Chun Chen

Aamir Shafi

Proceedings of the 28th IEEE International Conference on High Performance Computing, 2021

Large-Message Nonblocking MPI_Iallgather and MPI_Ibcast Offload via BlueField-2 DPU.

Nick Sarkauskas

Proceedings of the 28th IEEE International Conference on High Performance Computing, 2021

Towards Architecture-aware Hierarchical Communication Trees on Modern HPC Systems.

Shulei Xu

Aamir Shafi

Proceedings of the 28th IEEE International Conference on High Performance Computing, 2021

2020

Communication-Aware Hardware-Assisted MPI Overlap Engine.

Sourav Chakraborty

Proceedings of the High Performance Computing - 35th International Conference, 2020

Scalable MPI Collectives using SHARP: Large Scale Performance Evaluation on the TACC Frontera System.

Nick Sarkauskas

Proceedings of the Workshop on Exascale MPI, 2020

Performance Characterization of Network Mechanisms for Non-Contiguous Data Transfers in MPI.

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Machine-agnostic and Communication-aware Designs for MPI on Emerging Architectures.

Shulei Xu

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

2019

Leveraging Network-level parallelism with Multiple Process-Endpoints for MPI Broadcast.

Proceedings of the IEEE/ACM Third Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware, 2019

Designing a Profiling and Visualization Tool for Scalable and In-depth Analysis of High-Performance GPU Clusters.

Pouya Kousha