Precision and Performance Analysis of C Standard Math Library Functions on GPUs.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023
OpenMP Reverse Offloading Using Shared Memory Remote Procedure Calls.
Proceedings of the OpenMP: Advanced Task-Based, Device and Compiler Programming, 2023
Direct GPU Compilation and Execution for Host Applications with OpenMP Parallelism.
Proceedings of the Eighth IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC, 2022
Just-in-Time Compilation and Link-Time Optimization for OpenMP Target Offloading.
Proceedings of the OpenMP in a Modern World: From Multi-device Support to Meta Programming, 2022
Co-Designing an OpenMP GPU Runtime and Optimizations for Near-Zero Overhead Execution.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
Efficient Execution of OpenMP on GPUs.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022
Breaking the Vendor Lock: Performance Portable Programming through OpenMP as Target Independent Runtime Layer.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022
A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation.
Proceedings of the OpenMP: Enabling Massive Node-Level Parallelism, 2021
Advancing OpenMP Offload Debugging Capabilities in LLVM.
Proceedings of the ICPP Workshops 2021: 50th International Conference on Parallel Processing, 2021