2019
Improved MPI Multi-Threaded Performance using OFI Scalable Endpoints.
Proceedings of the 2019 IEEE Symposium on High-Performance Interconnects, 2019

2018
Scaling collectives on large clusters using Intel(R) architecture processors and fabric.
Proceedings of Workshops of HPC Asia 2018, 2018

2017
Host Software Stack Optimizations to Maximize Aggregate Fabric Throughput.
Proceedings of the 25th IEEE Annual Symposium on High-Performance Interconnects, 2017