Michael Klemm

Orcid: 0000-0002-8634-4634

According to our database, Michael Klemm authored at least 46 papers between 1979 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Detrimental task execution patterns in mainstream OpenMP runtimes.
CoRR, 2024

Detrimental Task Execution Patterns in Mainstream OpenMP® Runtimes.
Proceedings of the Advancing OpenMP for Future Accelerators, 2024

2023
Quantum Task Offloading with the OpenMP API.
CoRR, 2023

2022
Evaluating GPU Programming Models for the LUMI Supercomputer.
Proceedings of the Supercomputing Frontiers - 7th Asian Conference, 2022

2019
Toward a Standard Interface for User-Defined Scheduling in OpenMP.
Proceedings of the OpenMP: Conquering the Full Hardware Spectrum, 2019

2018
The Ongoing Evolution of OpenMP.
Proc. IEEE, 2018

Assessing Task-to-Data Affinity in the LLVM OpenMP Runtime.
Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

Visualization of OpenMP* Task Dependencies Using Intel® Advisor - Flow Graph Analyzer.
Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

2017
KART - A Runtime Compilation Library for Improving HPC Application Performance.
Proceedings of the High Performance Computing, 2017

Performance Evaluation of NWChem Ab-Initio Molecular Dynamics (AIMD) Simulations on the Intel® Xeon Phi™ Processor.
Proceedings of the High Performance Computing, 2017

A Pattern for Overlapping Communication and Computation with OpenMP* Target Directives.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

OpenMP* SIMD Vectorization and Threading of the Elmer Finite Element Software.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Performance Optimization of OpenFOAM* on Clusters of Intel® Xeon Phi™ Processors.
Proceedings of the 24th IEEE International Conference on High Performance Computing Workshops, 2017

2016
Using the pyMIC Offload Module in PyFR.
CoRR, 2016

Approaches for Task Affinity in OpenMP.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

Recent Processor Technologies and Co-Scheduling.
Proceedings of the Co-Scheduling of HPC Applications [extended versions of all papers from COSH@HiPEAC 2016], 2016

Portable SIMD Performance with OpenMP* 4.x Compiler Directives.
Proceedings of the Euro-Par 2016: Parallel Processing, 2016

2015
Performance Evaluation of OpenFOAM* with MPI-3 RMA Routines on Intel® Xeon® Processors and Intel® Xeon Phi™ Coprocessors.
Proceedings of the 22nd European MPI Users' Group Meeting, 2015

On the Algorithmic Aspects of Using OpenMP Synchronization Mechanisms II: User-Guided Speculative Locks.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Packet-Oriented Streamline Tracing on Modern SIMD Architectures.
Proceedings of the 15th Eurographics Symposium on Parallel Graphics and Visualization, 2015

2014
Efficient Implementation of Many-Body Quantum Chemical Methods on the Intel® Xeon Phi Coprocessor.
Proceedings of the International Conference for High Performance Computing, 2014

A User-Guided Locking API for the OpenMP* Application Program Interface.
Proceedings of the Using and Improving OpenMP for Devices, Tasks, and More, 2014

2013
A Proposal for Task-Generating Loops in OpenMP.
Proceedings of the OpenMP in the Era of Low Power Devices and Accelerators, 2013

2012
From GPGPU to Many-Core: Nvidia Fermi and Intel Many Integrated Core Architecture.
Comput. Sci. Eng., 2012

OpenMP Programming on Intel Xeon Phi Coprocessors: An Early Performance Comparison.
Proceedings of the Many-core Applications Research Community (MARC) Symposium at RWTH Aachen University, 2012

Extending OpenMP* with Vector Constructs for Modern Multicore SIMD Architectures.
Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

Performance of a Structure-Detecting SpMV Using the CSR Matrix Representation.
Proceedings of the 11th International Symposium on Parallel and Distributed Computing, 2012

The Intel® Many Integrated Core Architecture.
Proceedings of the 2012 International Conference on High Performance Computing & Simulation, 2012

2011
Towards High-Performance Implementations of a Custom HPC Kernel Using Intel® Array Building Blocks.
Proceedings of the Facing the Multicore - Challenge II, 2011

Extending a Highly Parallel Data Mining Algorithm to the Intel® Many Integrated Core Architecture.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

2010
Towards an Error Model for OpenMP.
Proceedings of the Beyond Loop Level Parallelism in OpenMP: Accelerators, 2010

A Proposal for User-Defined Reductions in OpenMP.
Proceedings of the Beyond Loop Level Parallelism in OpenMP: Accelerators, 2010

JCudaMP: OpenMP/Java on CUDA.
Proceedings of the 3rd International Workshop on Multicore Software Engineering, 2010

2009
Reparallelization and migration of OpenMP applications in grid environments.
PhD thesis, 2009

Reparallelization techniques for migrating OpenMP codes in computational grids.
Concurr. Comput. Pract. Exp., 2009

A meta-predictor framework for prefetching in object-based DSMs.
Concurr. Comput. Pract. Exp., 2009

Dynamic code footprint optimization for the IBM Cell Broadband Engine.
Proceedings of the 2009 ICSE Workshop on Multicore Software Engineering, 2009

2008
Automatic Prefetching with Binary Code Rewriting in Object-Based DSMs.
Proceedings of the Euro-Par 2008, 2008

2007
JaMP: an implementation of OpenMP for a Java DSM.
Concurr. Comput. Pract. Exp., 2007

Esodyp+: Prefetching in the Jackal Software DSM.
Proceedings of the Euro-Par 2007, 2007

Reparallelization and Migration of OpenMP Programs.
Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007

2006
A Proposal for OpenMP for Java.
Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2006

1991
Über die Schnittzahlen mehrfach balancierter blockpläne.
J. Comb. Theory A, 1991

1986
Über den p-rang von inzidenzmatrizen.
J. Comb. Theory A, 1986

1984
Über die Wurzelschranke für das Minimalgewicht von Codes.
J. Comb. Theory A, 1984

1979
A matrix of combinatorial numbers related to the symmetric groups.
Discret. Math., 1979

