2024

Scalable Training of Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN.

[DOI]

Massimiliano Lupo Pasini

,

,

,

,

David M. Rogers

,

,

Khaled Z. Ibrahim

,

,

,

,

Prasanna Balaprakash

CoRR, 2024

ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability.

[DOI]

,

,

Aristeidis Tsaris

,

,

,

,

,

,

Moetasim Ashfaq

,

,

Prasanna Balaprakash

Proceedings of the International Conference for High Performance Computing, 2024

2023

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies.

[DOI]

Shuaiwen Leon Song

,

,

,

,

,

Chengming Zhang

,

Masahiro Tanaka

,

,

,

Ammar Ahmad Awan

,

,

,

,

,

,

,

,

Jonathan A. Weyn

,

,

Sylwester Klocek

,

Volodymyr Vragov

,

Mohammed AlQuraishi

,

,

Christina Floristean

,

,

,

Venkatram Vishwanath

,

Arvind Ramanathan

,

,

,

,

,

,

Alexander Brace

,

,

Cindy Orozco Bohorquez

,

,

,

Danilo Perez-Rivera

,

,

,

Michael W. Irvin

,

J. Gregory Pauloski

,

,

Valérie Hayot-Sasson

,

,

,

,

,

,

,

Michael E. Papka

,

Thomas S. Brettin

,

Prasanna Balaprakash

,

,

,

Heidi A. Hanson

,

Thomas E. Potok

,

Massimiliano Lupo Pasini

,

,

,

Dalton D. Lunga

,

,

,

,

Mallikarjun Shankar

,

,

,

,

,

,

,

,

,

,

,

,

Alexey Svyatkovskiy

,

,

,

,

Michael J. Schulte

,

,

,

,

,

Christian Dallago

,

,

,

,

Anima Anandkumar

,

CoRR, 2023

A Research Retrospective on AMD's Exascale Computing Journey.

[DOI]

,

Michael J. Schulte

,

Mike Ignatowski

,

Vignesh Adhinarayanan

,

,

,

,

,

Johnathan Alsop

,

,

Bradford M. Beckmann

,

Majed Valad Beigi

,

Sergey Blagodurov

,

,

,

William C. Brantley

,

,

,

,

,

,

Nicholas Curtis

,

,

,

,

,

Christopher Erb

,

,

Joseph L. Greathouse

,

Sudhanva Gurumurthi

,

Anthony Gutierrez

,

Khaled Hamidouche

,

Sachin Hossamani

,

,

Mahzabeen Islam

,

,

John Kalamatianos

,

,

,

,

,

,

Abhinandan Majumdar

,

Nicholas Malaya

,

,

,

Damon McDougall

,

,

Michael Mishkin

,

,

,

Matthew Poremba

,

,

Kishore Punniyamurthy

,

,

Steven E. Raasch

,

,

Gregory Rodgers

,

,

Mohammad Seyedzadeh

,

,

Vilas Sridharan

,

René van Oostrum

,

Eric Van Tassell

,

,

Samuel Wasmundt

,

,

,

,

Adithya Yalavarti

,

Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2019

Optimizing Hyperplane Sweep Operations Using Asynchronous Multi-grain GPU Tasks.

[DOI]

Anirudh Mohan Kaushik

,

,

Muhammad Amber Hassaan

,

,

,

,

,

Bradford M. Beckmann

Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Adaptive Task Aggregation for High-Performance Sparse Solvers on GPUs.

[DOI]

,

,

,

Bradford M. Beckmann

,

Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018

Investigating Data Layout Transformations in Chapel.

[DOI]

,

,

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Taming irregular applications via advanced dynamic parallelism on GPUs.

[DOI]

,

,

,

,

Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

2017

Characterizing data organization effects on heterogeneous memory architectures.

[DOI]

,

,

Gregory Rodgers

Proceedings of the 2017 International Symposium on Code Generation and Optimization, 2017

2016

MPI-ACC: Accelerator-Aware MPI for Scientific Applications.

[DOI]

,

Lokendra S. Panwar

,

,

,

,

,

Keith R. Bisset

,

,

,

John M. Mellor-Crummey

,

,

IEEE Trans. Parallel Distributed Syst., 2016

MultiCL: Enabling automatic scheduling for task-parallel workloads in OpenCL.

[DOI]

,

Antonio J. Peña

,

,

Parallel Comput., 2016

Implementing directed acyclic graphs with the heterogeneous system architecture.

[DOI]

,

,

,

,

,

Bradford M. Beckmann

,

Gregory Rodgers

Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit, 2016

2015

Programming High-Performance Clusters with Heterogeneous Computing Devices.

[DOI]

PhD thesis, 2015

Automatic Command Queue Scheduling for Task-Parallel Workloads in OpenCL.

[DOI]

Ashwin Mandayam Aji

,

Antonio J. Peña

,

,

Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

2013

Synchronization and Ordering Semantics in Hybrid MPI+GPU Programming.

[DOI]

,

,

,

,

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Online Performance Projection for Clusters with Heterogeneous GPUs.

[DOI]

Lokendra S. Panwar

,

,

,

,

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

pVOCL: Power-Aware Dynamic Placement and Migration in Virtualized GPU Environments.

[DOI]

,

,

,

,

,

,

,

,

,

Proceedings of the IEEE 33rd International Conference on Distributed Computing Systems, 2013

On the efficacy of GPU-integrated MPI for scientific applications.

[DOI]

,

Lokendra S. Panwar

,

,

,

,

,

Keith R. Bisset

,

,

,

John M. Mellor-Crummey

,

,

Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Contagion Diffusion with EpiSimdemics.

Keith R. Bisset

,

,

,

,

Madhav V. Marathe

,

,

Proceedings of the Parallel Science and Engineering Applications - The Charm++ Approach, 2013

2012

Efficient Intranode Communication in GPU-Accelerated Systems.

[DOI]

,

,

,

Darius Buntinas

,

,

,

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Simulating the Spread of Infectious Disease over Large Realistic Social Networks Using Charm++.

[DOI]

Keith R. Bisset

,

,

,

Laxmikant V. Kalé

,

,

Madhav V. Marathe

,

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

DMA-Assisted, Intranode Communication in GPU Accelerated Systems.

[DOI]

,

,

,

Darius Buntinas

,

,

,

,

Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

MPI-ACC: An Integrated and Extensible Approach to Data Movement in Accelerator-based Systems.

[DOI]

,

,

Darius Buntinas

,

,

,

Keith R. Bisset

,

Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

2011

Poster: large-scale computational epidemiology modeling using Charm++.

[DOI]

Keith R. Bisset

,

,

,

Proceedings of the Conference on High Performance Computing, Networking, Storage and Analysis, 2011

High-performance biocomputing for simulating the spread of contagion over large contact networks.

[DOI]

Keith R. Bisset

,

,

Madhav V. Marathe

,

Proceedings of the IEEE 1st International Conference on Computational Advances in Bio and Medical Sciences, 2011

Bounding the effect of partition camping in GPU kernels.

[DOI]

,

,

Proceedings of the 8th Conference on Computing Frontiers, 2011

2010

GPU-RMAP: Accelerating Short-Read Mapping on Graphics Processors.

[DOI]

,

,

Proceedings of the 13th IEEE International Conference on Computational Science and Engineering, 2010

2009

On the Robust Mapping of Dynamic Programming onto a Graphics Processing Unit.

[DOI]

,

,

Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

2008

Cell-SWat: modeling and scheduling wavefront computations on the Cell Broadband Engine.

[DOI]

,

,

Filip Blagojevic

,

Dimitrios S. Nikolopoulos

Proceedings of the 5th Conference on Computing Frontiers, 2008

Optimizing performance, cost, and sensitivity in pairwise sequence search on a cluster of PlayStations.

[DOI]

,

Proceedings of the 8th IEEE International Conference on Bioinformatics and Bioengineering, 2008