Evaluating Asynchronous Parallel I/O on HPC Systems.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
Runway: In-transit Data Compression on Heterogeneous HPC Systems.
Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, 2023
Transparent Asynchronous Parallel I/O Using Background Threads.
IEEE Transactions on Parallel and Distributed Systems, 2022
HDF5 Cache VOL: Efficient and Scalable Parallel I/O through Caching Data on Node-local Storage.
Proceedings of the 22nd IEEE International Symposium on Cluster, 2022
PILOT: a Runtime System to Manage Multi-tenant GPU Unified Memory Footprint.
Proceedings of the 28th IEEE International Conference on High Performance Computing, 2021
GPU Direct I/O with HDF5.
Proceedings of the Fifth IEEE/ACM International Parallel Data Systems Workshop, 2020
Compiling SIMT Programs on Multi- and Many-Core Processors with Wide Vector Units: A Case Study with CUDA.
Proceedings of the 25th IEEE International Conference on High Performance Computing, 2018