2025

Proteus: Portable Runtime Optimization of GPU Kernel Execution with Just-in-Time Compilation.

Giorgis Georgakoudis

Konstantinos Parasyris

David Beckingsale

Proceedings of the 23rd ACM/IEEE International Symposium on Code Generation and Optimization, 2025

2024

An Exploration of Global Optimization Strategies for Autotuning OpenMP-based Codes.

Gregory Bolet

Giorgis Georgakoudis

Konstantinos Parasyris

Kirk W. Cameron

David Beckingsale

Todd Gamblin

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

2023

Machine Learning-Driven Adaptive OpenMP For Portable Performance on Heterogeneous Systems.

Giorgis Georgakoudis

Konstantinos Parasyris

Chunhua Liao

David Beckingsale

Todd Gamblin

Bronis R. de Supinski

CoRR, 2023

2021

Extending OpenMP for Machine Learning-Driven Adaptation.

Chunhua Liao

Anjia Wang

Giorgis Georgakoudis

Bronis R. de Supinski

Yonghong Yan

David Beckingsale

Todd Gamblin

Proceedings of the Accelerator Programming Using Directives - 8th International Workshop, 2021

Artemis: Automatic Runtime Tuning of Parallel Execution Parameters Using Machine Learning.

Proceedings of the High Performance Computing - 36th International Conference, 2021

2020

Umpire: Application-focused management and coordination of complex hierarchical memory.

IBM J. Res. Dev., 2020

CodeSeer: input-dependent code variants selection via machine learning.

Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

2019

Preparation and optimization of a diverse workload for a large-scale heterogeneous system.

Ian Karlin

Yoonho Park

Bronis R. de Supinski

Sara Kokkila Schumacher

Guillaume Thomas-Collignon

Proceedings of the International Conference for High Performance Computing, 2019

RAJA: Portable Performance for Large-Scale Scientific Applications.

David Beckingsale

Thomas R. W. Scogland

Proceedings of the 2019 IEEE/ACM International Workshop on Performance, 2019

Performance portable C++ programming with RAJA.

Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

FuncyTuner: Auto-tuning Scientific Applications With Per-loop Compilation.

Proceedings of the 48th International Conference on Parallel Processing, 2019

2018

Introduction.

Tom Scogland

David Beckingsale

Int. J. High Perform. Comput. Appl., 2018

2017

Apollo: Reusable Models for Fast, Dynamic Tuning of Input-Dependent Code.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

TeaLeaf: A Mini-Application to Enable Design-Space Explorations for Iterative Sparse Linear Solvers.

Richard P. Smedley-Stevenson

David Beckingsale

Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

Flexible Data Aggregation for Performance Profiling.

David Böhme

David Beckingsale

Martin Schulz

Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

2016

Caliper: performance introspection for HPC software stacks.

Proceedings of the International Conference for High Performance Computing, 2016

Fast Multi-parameter Performance Modeling.

Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

2015

Towards scalable adaptive mesh refinement on future parallel architectures.

David Beckingsale

PhD thesis, 2015

Resident Block-Structured Adaptive Mesh Refinement on Thousands of Graphics Processing Units.

Proceedings of the 44th International Conference on Parallel Processing, 2015

2014

Achieving portability and performance through OpenACC.

Proceedings of the First Workshop on Accelerator Programming using Directives, 2014

2013

Towards Automated Memory Model Generation Via Event Tracing.

Comput. J., 2013

Analysing the influence of InfiniBand choice on OpenMPI memory consumption.

Proceedings of the International Conference on High Performance Computing & Simulation, 2013

2012

Accelerating Hydrocodes with OpenACC, OpenCL and CUDA.

Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Performance Modelling of Magnetohydrodynamics Codes.

Proceedings of the Computer Performance Engineering - 9th European Workshop, 2012

Optimisation of Patch Distribution Strategies for AMR Applications.

Proceedings of the Computer Performance Engineering - 9th European Workshop, 2012