Johannes Doerfert

ORCID: 0000-0001-7870-8963

Affiliations:
  • Lawrence Livermore National Laboratory, CA, USA


According to our database, Johannes Doerfert authored at least 59 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Input-Gen: Guided Generation of Stateful Inputs for Testing, Tuning, and Training.
CoRR, 2024

Performance Portable Monte Carlo Particle Transport on Intel, NVIDIA, and AMD GPUs.
CoRR, 2024

CI/CD Efforts for Validation, Verification and Benchmarking OpenMP Implementations.
Proceedings of the Advancing OpenMP for Future Accelerators, 2024

Automatic Parallelization and OpenMP Offloading of Fortran Array Notation.
Proceedings of the Advancing OpenMP for Future Accelerators, 2024

2023
Quantum Task Offloading with the OpenMP API.
CoRR, 2023

ComPile: A Large IR Dataset from Production Sources.
CoRR, 2023

GPU First - Execution of Legacy CPU Codes on GPUs.
CoRR, 2023

OpenMP Kernel Language Extensions for Performance Portable GPU Codes.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Precision and Performance Analysis of C Standard Math Library Functions on GPUs.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Memory Transfer Decomposition: Exploring Smart Data Movement Through Architecture-Aware Strategies.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay.
Proceedings of the International Conference for High Performance Computing, 2023

High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload.
Proceedings of the OpenMP: Advanced Task-Based, Device and Compiler Programming, 2023

The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned.
Proceedings of the OpenMP: Advanced Task-Based, Device and Compiler Programming, 2023

Maximizing Parallelism and GPU Utilization For Direct GPU Compilation Through Ensemble Execution.
Proceedings of the 52nd International Conference on Parallel Processing Workshops, 2023

Implementing OpenMP's SIMD Directive in LLVM's GPU Runtime.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

ORAQL - Optimistic Responses to Alias Queries in LLVM.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

SPLENDID: Supporting Parallel LLVM-IR Enhanced Natural Decompilation for Interactive Development.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
OpenMP application experiences: Porting to accelerated nodes.
Parallel Comput., 2022

MARTINI: The Little Match and Replace Tool for Automatic Code Rewriting.
J. Open Source Softw., 2022

Remote OpenMP Offloading.
Proceedings of the High Performance Computing - 37th International Conference, 2022

Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

Piper: Pipelining OpenMP Offloading Execution Through Compiler Optimization For Performance.
Proceedings of the IEEE/ACM International Workshop on Performance, 2022

Direct GPU Compilation and Execution for Host Applications with OpenMP Parallelism.
Proceedings of the Eighth IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC, 2022

Automatic Asynchronous Execution of Synchronously Offloaded OpenMP Target Regions.
Proceedings of the Eighth IEEE/ACM Workshop on the LLVM Compiler Infrastructure in HPC, 2022

Just-in-Time Compilation and Link-Time Optimization for OpenMP Target Offloading.
Proceedings of the OpenMP in a Modern World: From Multi-device Support to Meta Programming, 2022

Towards Automatic OpenMP-Aware Utilization of Fast GPU Memory.
Proceedings of the OpenMP in a Modern World: From Multi-device Support to Meta Programming, 2022

Towards Efficient Remote OpenMP Offloading.
Proceedings of the OpenMP in a Modern World: From Multi-device Support to Meta Programming, 2022

Co-Designing an OpenMP GPU Runtime and Optimizations for Near-Zero Overhead Execution.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

A Pipeline Pattern Detection Technique in Polly.
Proceedings of the Workshop Proceedings of the 51st International Conference on Parallel Processing, 2022

MARTINI: The Little Match and Replace Tool for Automatic Application Rewriting with Code Examples.
Proceedings of the Euro-Par 2022: Parallel Processing, 2022

Efficient Execution of OpenMP on GPUs.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2022

Breaking the Vendor Lock: Performance Portable Programming through OpenMP as Target Independent Runtime Layer.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021
Reverse-mode automatic differentiation and optimization of GPU kernels via Enzyme.
Proceedings of the International Conference for High Performance Computing, 2021

Experience Report: Writing a Portable GPU Runtime with OpenMP 5.1.
Proceedings of the OpenMP: Enabling Massive Node-Level Parallelism, 2021

A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation.
Proceedings of the OpenMP: Enabling Massive Node-Level Parallelism, 2021

Spray: Sparse Reductions of Arrays in OpenMP.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

A Virtual GPU as Developer-Friendly OpenMP Offload Target.
Proceedings of the ICPP Workshops 2021: 50th International Conference on Parallel Processing, 2021

Towards Compile-Time-Reducing Compiler Optimization Selection via Machine Learning.
Proceedings of the ICPP Workshops 2021: 50th International Conference on Parallel Processing, 2021

Advancing OpenMP Offload Debugging Capabilities in LLVM.
Proceedings of the ICPP Workshops 2021: 50th International Conference on Parallel Processing, 2021

2020
Really Embedding Domain-Specific Languages into C++.
CoRR, 2020

Concurrent Execution of Deferred OpenMP Target Tasks with Hidden Helper Threads.
Proceedings of the Languages and Compilers for Parallel Computing, 2020

FAROS: A Framework to Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis.
Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020

Automated Partitioning of Data-Parallel Kernels using Polyhedral Compilation.
Proceedings of the ICPP Workshops '20: Workshops, Edmonton, AB, Canada, August 17-20, 2020, 2020

2019
Performance Exploration Through Optimistic Static Program Annotations.
Proceedings of the High Performance Computing - 34th International Conference, 2019

The TRegion Interface and Compiler Optimizations for OpenMP Target Regions.
Proceedings of the OpenMP: Conquering the Full Hardware Spectrum, 2019

2018
Applicable and sound polyhedral optimization of low-level programs.
PhD thesis, 2018

Compiler Optimizations for Parallel Programs.
Proceedings of the Languages and Compilers for Parallel Computing, 2018

Compiler Optimizations for OpenMP.
Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

Polyhedral expression propagation.
Proceedings of the 27th International Conference on Compiler Construction, 2018

2017
Optimistic loop optimization.
Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017

2016
Input space splitting for OpenCL.
Proceedings of the 25th International Conference on Compiler Construction, 2016

2015
Generalized Task Parallelism.
ACM Trans. Archit. Code Optim., 2015

Polly's Polyhedral Scheduling in the Presence of Reductions.
CoRR, 2015

Runtime pointer disambiguation.
Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, 2015

2014
Architecture-parametric timing analysis.
Proceedings of the 20th IEEE Real-Time and Embedded Technology and Applications Symposium, 2014

2013
Impact of Resource Sharing on Performance and Performance Prediction: A Survey.
Proceedings of the CONCUR 2013 - Concurrency Theory - 24th International Conference, 2013

