Thomas Scogland

CoRR, 2024

Enabling RAJA on Intel GPUs with SYCL.

Proceedings of the 12th International Workshop on OpenCL and SYCL, 2024

Shared Virtual Memory: Its Design and Performance Implications for Diverse Applications.

Bennett Cooper

Rong Ge

Proceedings of the 38th ACM International Conference on Supercomputing, 2024

BLP: Block-Level Pipelining for GPUs.

Xuewen Cui

Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

2023

GPU First - Execution of Legacy CPU Codes on GPUs.

CoRR, 2023

OpenMP Kernel Language Extensions for Performance Portable GPU Codes.

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Fluxion: A Scalable Graph-Based Resource Model for HPC Scheduling Challenges.

Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

2022

OpenMP application experiences: Porting to accelerated nodes.

Parallel Comput., 2022

An analytical performance model of generalized hierarchical scheduling.

Michela Taufer

Int. J. High Perform. Comput. Appl., 2022

Reliabuild: Searching for High-Fidelity Builds Using Active Learning.

Harshitha Menon

Todd Gamblin

CoRR, 2022

Mapping Out the HPC Dependency Chaos.

Farid Zakaria

Todd Gamblin

Carlos Maltzahn

Proceedings of the SC22: International Conference for High Performance Computing, 2022

Piper: Pipelining OpenMP Offloading Execution Through Compiler Optimization For Performance.

Giorgis Georgakoudis

Johannes Doerfert

Ignacio Laguna

Proceedings of the IEEE/ACM International Workshop on Performance, 2022

Searching for High-Fidelity Builds Using Active Learning.

Harshitha Menon

Todd Gamblin

Proceedings of the 19th IEEE/ACM International Conference on Mining Software Repositories, 2022

Extending OpenMP to Support Automated Function Specialization Across Translation Units.

Giorgis Georgakoudis

Chunhua Liao

Proceedings of the OpenMP in a Modern World: From Multi-device Support to Meta Programming, 2022

Breaking the Vendor Lock: Performance Portable Programming through OpenMP as Target Independent Runtime Layer.

Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021

Beyond Explicit Transfers: Shared and Managed Memory in OpenMP.

Brandon Neth

Alejandro Duran

Proceedings of the OpenMP: Enabling Massive Node-Level Parallelism, 2021

Inter-loop optimization in RAJA using loop chains.

Brandon Neth

Michelle Mills Strout

Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

2020

Flux: Overcoming scheduling challenges for exascale workflows.

Becky Springmeyer

Michela Taufer

Future Gener. Comput. Syst., 2020

Unified Sequential Optimization Directives in OpenMP.

Brandon Neth

Michelle Mills Strout

Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020

FAROS: A Framework to Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis.

Giorgis Georgakoudis

Johannes Doerfert

Ignacio Laguna

Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020

2019

A massively parallel infrastructure for adaptive multiscale simulations: modeling RAS initiation pathway for cancer.

Proceedings of the International Conference for High Performance Computing, 2019

RAJA: Portable Performance for Large-Scale Scientific Applications.

David Beckingsale

Proceedings of the 2019 IEEE/ACM International Workshop on Performance, 2019

Performance portable C++ programming with RAJA.

Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2019

A Framework for Enabling OpenMP Autotuning.

Proceedings of the OpenMP: Conquering the Full Hardware Spectrum, 2019

Making OpenMP Ready for C++ Executors.

Proceedings of the OpenMP: Conquering the Full Hardware Spectrum, 2019

Extending OpenMP Metadirective Semantics for Runtime Adaptation.

Yonghong Yan

Anjia Wang

Chunhua Liao

Proceedings of the OpenMP: Conquering the Full Hardware Spectrum, 2019

2018

The Ongoing Evolution of OpenMP.

Proc. IEEE, 2018

Introduction.

David Beckingsale

Int. J. High Perform. Comput. Appl., 2018

Extending OpenMP to Facilitate Loop Optimization.

Ian J. Bertolacci

Michelle Mills Strout

Eddie C. Davis

Catherine Olschanowsky

Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

2017

The Design and Implementation of OpenMP 4.5 and OpenACC Backends for the RAJA C++ Performance Portability Layer.

Proceedings of the Accelerator Programming Using Directives - 4th International Workshop, 2017

Custom Data Mapping for Composable Data Management.

Chris Earl

Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Directive-Based Partitioning and Pipelining for Graphics Processing Units.

Xuewen Cui

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

2016

A Case for Extending Task Dependencies.

Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

Early Experiences Porting Three Applications to OpenMP 4.5.

Gheorghe-Teodor Bercea

Carlo Bertolli

Alexandre E. Eichenberger

Erik W. Draeger

Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

Scalable I/O-Aware Job Scheduling for Burst Buffer Enabled HPC Clusters.

Stephen Herbein

Dong H. Ahn

Don Lipari

Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016

Directive-Based Pipelining Extension for OpenMP.

Xuewen Cui

Wu-Chun Feng

Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

2015

CoreTSAR: Core Task-Size Adapting Runtime.

Barry Rountree

IEEE Trans. Parallel Distributed Syst., 2015

Design and Evaluation of Scalable Concurrent Queues for Many-Core Architectures.

Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, Austin, TX, USA, January 31, 2015

Node variability in large-scale power measurements: perspectives from the Green500, Top500 and EEHPCWG.

Proceedings of the International Conference for High Performance Computing, 2015

Supporting Indirect Data Mapping in OpenMP.

Jeff Keasler

John C. Gyllenhaal

Rich Hornung

Hal Finkel

Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Enabling Region Merging Optimizations in OpenMP.

John C. Gyllenhaal

Jeff Keasler

Rich Hornung

Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

2014

Runtime Adaptation for Autonomic Heterogeneous Computing.

PhD thesis, 2014

A power-measurement methodology for large-scale, high-performance computing.

Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2014

CoreTSAR: Adaptive Worksharing for Heterogeneous Systems.

Barry Rountree

Proceedings of the Supercomputing - 29th International Conference, 2014

Runtime Adaptation for Autonomic Heterogeneous Computing.

Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014

Locality-aware memory association for multi-target worksharing in OpenMP.

Wu-Chun Feng

Proceedings of the International Conference on Parallel Architectures and Compilation, 2014

2013

The Green500 list: escapades to exascale.

Balaji Subramaniam

Comput. Sci. Res. Dev., 2013

On the Programmability and Performance of Heterogeneous Platforms.

Konstantinos Krommydas

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Trends in energy-efficient computing: A perspective from the Green500.

Proceedings of the International Green Computing Conference, 2013

2012

OpenCL and the 13 dwarfs: a work in progress.

Proceedings of the Third Joint WOSP/SIPEW International Conference on Performance Engineering, 2012

Heterogeneous Task Scheduling for Accelerated OpenMP.

Barry Rountree

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

2011

Emerging Trends on the Evolving Green500: Year Three.

Balaji Subramaniam

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

StreamMR: An Optimized MapReduce Framework for AMD GPUs.

Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

Architecture-Aware Mapping and Optimization on a 1600-Core GPU.

Mayank Daga

Proceedings of the 17th IEEE International Conference on Parallel and Distributed Systems, 2011

Towards accelerating molecular modeling via multi-scale approximation on a GPU.

Mayank Daga

Proceedings of the IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences, 2011

2010

A first look at integrated GPUs for green high-performance computing.

Heshan Lin

Comput. Sci. Res. Dev., 2010

2009

The Green500 List: Year one.
