Klaus Nölp

Philipp Brauner

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

HIPS 2024 Preface and Committees.

[DOI]

Seyong Lee

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

2023

Simplifying non-contiguous data transfer with MPI for Python.

[DOI]

Klaus Nölp

J. Supercomput., November, 2023

Special issue on new trends in high-performance computing: Software systems and applications.

[DOI]

Sunita Chandrasekaran

Min Si

Jidong Zhai

Softw. Pract. Exp., 2023

2022

Improving cryptanalytic applications with stochastic runtimes on GPUs and multicores.

[DOI]

Jörg Keller

Parallel Comput., 2022

12th IEEE International Workshop on Accelerators and Hybrid Emerging Systems.

[DOI]

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Quality-aware scheduling of on-board and off-board data analysis in vehicle development.

[DOI]

Michaela Gaudszun

Bernhard Schlegel

Proceedings of the 5th IEEE International Conference on Industrial Cyber-Physical Systems, 2022

2021

Comparing Data Staging Techniques for Large Scale Brain Images.

[DOI]

IEEE Trans. Emerg. Top. Comput., 2021

Improving Cryptanalytic Applications with Stochastic Runtimes on GPUs.

[DOI]

Jörg Keller

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

2020

Fall-detection on a wearable micro controller using machine learning algorithms.

[DOI]

Thorsten Witt

Proceedings of the IEEE International Conference on Smart Computing, 2020

Lessons learned from comparing C-CUDA and Python-Numba for GPU-Computing.

[DOI]

Proceedings of the 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, 2020

Workshop 8: AsHES Accelerators and Hybrid Exascale Systems.

[DOI]

Min Si

Simon Garcia De Gonzalo

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Implementation and Evaluation of CUDA-Unified Memory in Numba.

[DOI]

Tarek Saidi

Proceedings of the Euro-Par 2020: Parallel Processing Workshops, 2020

2019

IO Challenges for Human Brain Atlasing Using Deep Learning Methods - An In-Depth Analysis.

[DOI]

Proceedings of the 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, 2019

Evaluating the Benefits of Key-Value Databases for Scientific Applications.

[DOI]

Proceedings of the Computational Science - ICCS 2019, 2019

2017

InfiniBand Verbs on GPU: a case study of controlling an InfiniBand network device from the GPU.

[DOI]

Int. J. High Perform. Comput. Appl., 2017

Why is MPI so slow?: analyzing the fundamental limits in implementing MPI-3.1.

[DOI]

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2017

Hexe: A Toolkit for Heterogeneous Memory Management.

[DOI]

Pavan Balaji

Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017

A Performance Study of UCX over InfiniBand.

[DOI]

Nikela Papadopoulou

Pavan Balaji

Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2017

2016

Analyzing GPU-controlled communication with dynamic parallelism in terms of performance and energy.

[DOI]

Parallel Comput., 2016

2015

Analyzing communication models for distributed thread-collaborative processors in terms of energy and time.

[DOI]

Proceedings of the 2015 IEEE International Symposium on Performance Analysis of Systems and Software, 2015

2014

Direct communication methods for distributed GPUs.

[DOI]

PhD thesis, 2014

Energy-efficient stencil computations on distributed GPUs using dynamic parallelism and GPU-controlled communication.

[DOI]

Proceedings of the 2nd International Workshop on Energy Efficient Supercomputing, 2014

Infiniband-Verbs on GPU: A Case Study of Controlling an Infiniband Network Device from the GPU.

[DOI]

Franz-Josef Pfreundt

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Analyzing Put/Get APIs for Thread-Collaborative Processors.

[DOI]

Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

Energy-Efficient Collective Reduce and Allreduce Operations on Distributed GPUs.

[DOI]

Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2014

2013

GPI2 for GPUs: A PGAS framework for efficient communication in hybrid clusters.

[DOI]

Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

GGAS: Global GPU address spaces for efficient communication in heterogeneous clusters.

[DOI]