Matthias S. Müller

Orcid: 0000-0003-2545-5258

Affiliations:
  • RWTH Aachen University, IT Center, Germany
  • TU Dresden, Center for Information Services and High Performance Computing, Germany
  • University of Stuttgart, High Performance Computing Center (HLRS), Germany


According to our database, Matthias S. Müller authored at least 164 papers between 1999 and 2024.

Bibliography

2024
Parallel Pattern Compiler for Automatic Global Optimizations.
Parallel Comput., 2024

AI-based Density Recognition.
CoRR, 2024

An Experimental Setup to Evaluate RAPL Energy Counters for Heterogeneous Memory.
Proceedings of the 15th ACM/SPEC International Conference on Performance Engineering, 2024

SPMD IR: Unifying SPMD and Multi-value IR Showcased for Static Verification of Collectives.
Proceedings of the Recent Advances in the Message Passing Interface, 2024

Parallel Pattern Language Code Generation.
Proceedings of the 15th International Workshop on Programming Models and Applications for Multicores and Manycores, 2024

Towards Locality-Aware Host-to-Device Offloading in OpenMP.
Proceedings of the Advancing OpenMP for Future Accelerators, 2024

RMASanitizer: Generalized Runtime Detection of Data Races in Remote Memory Access Applications.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

2023
RMARaceBench: A Microbenchmark Suite to Evaluate Race Detection Tools for RMA Programs.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Mapping High-Level Concurrency from OpenMP and MPI to ThreadSanitizer Fibers.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

RLP: Power Management Based on a Latency-Aware Roofline Model.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Power-aware Computing with Optane Persistent Memory Modules.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

A Hybrid Event Log Acquisition Technique in Distributed Systems.
Proceedings of the Future Technologies Conference, 2023

2022
MPI detach - Towards automatic asynchronous local completion.
Parallel Comput., 2022

Compiler-Aided Type Correctness of Hybrid MPI-OpenMP Applications.
IT Prof., 2022

An On-the-Fly Method to Exchange Vector Clocks in Distributed-Memory Programs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

On-the-Fly Data Race Detection for MPI RMA Programs with MUST.
Proceedings of the Sixth IEEE/ACM International Workshop on Software Correctness for HPC Applications, 2022

2021
Evaluating the Performance of OpenMP Offloading on the NEC SX-Aurora TSUBASA Vector Engine.
Supercomput. Front. Innov., 2021

PPIR: Parallel Pattern Intermediate Representation.
Proceedings of the IEEE/ACM International Workshop on Hierarchical Parallelism for Exascale Computing, 2021

DA4RDM: Data Analysis for Research Data Management Systems.
Proceedings of the 13th International Joint Conference on Knowledge Discovery, 2021

Automatic General Metadata Extraction and Mapping in an HDF5 Use-case.
Proceedings of the 13th International Joint Conference on Knowledge Discovery, 2021

Automatic Mapping of Parallel Pattern-Based Algorithms on Heterogeneous Architectures.
Proceedings of the Architecture of Computing Systems - 34th International Conference, 2021

Multi-SPMD Programming Model with YML and XcalableMP.
Proceedings of the XcalableMP PGAS Programming Language, 2021

2020
MYX: Runtime Correctness Analysis for Multi-Level Parallel Programming Paradigms.
Proceedings of the Software for Exascale Computing - SPPEXA 2016-2019, 2020

CHAMELEON: Reactive Load Balancing for Hybrid MPI+OpenMP Task-Parallel Applications.
J. Parallel Distributed Comput., 2020

Towards compiler-aided correctness checking of adjoint MPI applications.
Proceedings of the 4th IEEE/ACM International Workshop on Software Correctness for HPC Applications, 2020

MPI Detach - Asynchronous Local Completion.
Proceedings of the EuroMPI/USA '20: 27th European MPI Users' Group Meeting, 2020

Operation-Aware Power Capping.
Proceedings of the Euro-Par 2020: Parallel Processing, 2020

2019
Reactive Task Migration for Hybrid MPI+OpenMP Applications.
Proceedings of the Parallel Processing and Applied Mathematics, 2019

OpenMP Target Device Offloading for the SX-Aurora TSUBASA Vector Engine.
Proceedings of the Parallel Processing and Applied Mathematics, 2019

Dynamic Runtime and Energy Optimization for Power-Capped HPC Applications.
Proceedings of the Parallel Computing: Technology Trends, 2019

Performance Prediction for Power-Capped Applications based on Machine Learning Algorithms.
Proceedings of the 17th International Conference on High Performance Computing & Simulation, 2019

Managing Discipline-Specific Metadata Within an Integrated Research Data Management System.
Proceedings of the 21st International Conference on Enterprise Information Systems, 2019

DataRaceOnAccelerator - A Micro-benchmark Suite for Evaluating Correctness Tools Targeting Accelerators.
Proceedings of the Euro-Par 2019: Parallel Processing Workshops, 2019

2018
Applicability of the software cost model COCOMO II to HPC projects.
Int. J. Comput. Sci. Eng., 2018

Compiler-aided Type Tracking for Correctness Checking of MPI Applications.
Proceedings of the 2nd IEEE/ACM International Workshop on Software Correctness for HPC Applications, 2018

Assessing Task-to-Data Affinity in the LLVM OpenMP Runtime.
Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

Performance Prediction under Power Capping.
Proceedings of the 2018 International Conference on High Performance Computing & Simulation, 2018

Thread-local concurrency: a technique to handle data race detection at programming model abstraction.
Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, 2018

Estimating the Impact of External Interference on Application Performance.
Proceedings of the Euro-Par 2018: Parallel Processing, 2018

Modeling and Optimizing Data Transfer in GPU-Accelerated Optical Coherence Tomography.
Proceedings of the Euro-Par 2018: Parallel Processing Workshops, 2018

2017
Dynamic Application-aware Power Capping.
Proceedings of the 5th International Workshop on Energy Efficient Supercomputing, 2017

Runtime Correctness Checking for Emerging Programming Paradigms.
Proceedings of the First International Workshop on Software Correctness for HPC Applications, 2017

Evaluation of Asynchronous Offloading Capabilities of Accelerator Programming Models for Multiple Devices.
Proceedings of the Accelerator Programming Using Directives - 4th International Workshop, 2017

Assessing the Performance of OpenMP Programs on the Knights Landing Architecture.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

OpenMP Tools Interface: Synchronization Information for Data Race Detection.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

A Pattern for Overlapping Communication and Computation with OpenMP* Target Directives.
Proceedings of the Scaling OpenMP for Exascale Performance and Portability, 2017

Operational Concepts of GPU Systems in HPC Centers: TCO and Productivity.
Proceedings of the Euro-Par 2017: Parallel Processing Workshops, 2017

Data Mining-Based Analysis of HPC Center Operations.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

2016
Editorial for the special issue on energy-aware high performance computing.
Comput. Sci. Res. Dev., 2016

Editorial for the special issue on Energy-aware high performance computing.
Comput. Sci. Res. Dev., 2016

Software Cost Analysis of GPU-Accelerated Aeroacoustics Simulations in C++ with OpenACC.
Proceedings of the High Performance Computing, 2016


Development effort estimation in HPC.
Proceedings of the International Conference for High Performance Computing, 2016

Using Directed Variance to Identify Meaningful Views in Call-Path Performance Profiles.
Proceedings of the Third Workshop on Visual Performance Analysis, 2016

Correlating sub-phenomena in performance data in the frequency domain.
Proceedings of the 6th IEEE Symposium on Large Data Analysis and Visualization, 2016

Visualizing Performance Data with Respect to the Simulated Geometry.
Proceedings of the High-Performance Scientific Computing, 2016

Performance Optimization of Parallel Applications in Diverse On-Demand Development Teams.
Proceedings of the High-Performance Scientific Computing, 2016

NUMA-Aware Task Performance Analysis.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

Testing Infrastructure for OpenMP Debugging Interface Implementations.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

ARCHER: Effectively Spotting Data Races in Large OpenMP Applications.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

An OpenMP Epoch Model for Correctness Checking.
Proceedings of the 45th International Conference on Parallel Processing Workshops, 2016

The Scientific Programming Integrated Degree Program - A Pioneering Approach to join Theory and Practice.
Proceedings of the International Conference on Computational Science 2016, 2016

2015
Editorial for the fifth international conference on energy-aware high performance computing.
Comput. Sci. Res. Dev., 2015

Modeling the Productivity of HPC Systems on a Computing Center Scale.
Proceedings of the High Performance Computing - 30th International Conference, 2015

Effective communication for a system of cluster-on-a-chip processors.
Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, 2015

Evaluating OpenMP Performance on Thousands of Cores on the Numascale Architecture.
Proceedings of the Parallel Computing: On the Road to Exascale, 2015

Evaluating the Energy Consumption of OpenMP Applications on Haswell Processors.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Lessons Learned from Implementing OMPD: A Debugging Interface for OpenMP.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

Performance Analysis for Target Devices with the OpenMP Tools Interface.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Event-Action Mappings for Parallel Tools Infrastructures.
Proceedings of the Euro-Par 2015: Parallel Processing, 2015

2014
Towards an accurate simulation of the crystallisation process in injection moulded plastic components by hybrid parallelisation.
Int. J. High Perform. Comput. Appl., 2014

Editorial for the Fourth International Conference on Energy-Aware High Performance Computing.
Comput. Sci. Res. Dev., 2014

Visualization of memory access behavior on hierarchical NUMA architectures.
Proceedings of the First Workshop on Visual Performance Analysis, 2014

Towards providing low-overhead data race detection for large OpenMP applications.
Proceedings of the 2014 LLVM Compiler Infrastructure in HPC, 2014

SPEC ACCEL: A Standard Application Suite for Measuring Hardware Accelerator Performance.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

An OpenMP Extension Library for Memory Affinity.
Proceedings of the Using and Improving OpenMP for Devices, Tasks, and More, 2014

Classification of Common Errors in OpenMP Applications.
Proceedings of the Using and Improving OpenMP for Devices, Tasks, and More, 2014

MPI Runtime Error Detection with MUST: A Scalable and Crash-Safe Approach.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

A Pattern-Based Comparison of OpenACC and OpenMP for Accelerator Computing.
Proceedings of the Euro-Par 2014 Parallel Processing, 2014

Analysis of Parallel Applications on a High Performance-Low Energy Computer.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

Memory Usage Optimizations for Online Event Analysis.
Proceedings of the Solving Software Challenges for Exascale, 2014

2013
MPI runtime error detection with MUST: Advances in deadlock detection.
Sci. Program., 2013

Performance and quality of service of data and video movement over a 100 Gbps testbed.
Future Gener. Comput. Syst., 2013

Accelerators for Technical Computing: Is It Worth the Pain? A TCO Perspective.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

Distributed wait state tracking for runtime MPI deadlock detection.
Proceedings of the International Conference for High Performance Computing, 2013

Runtime MPI collective checking with tree-based overlay networks.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Suitability of Performance Tools for OpenMP Task-Parallel Programs.
Proceedings of the Tools for High Performance Computing 2013, 2013

Towards a Performance Engineering Workflow for OpenMP 4.0.
Proceedings of the Parallel Computing: Accelerating Computational Science and Engineering (CSE), 2013

Performance Characteristics of Large SMP Machines.
Proceedings of the OpenMP in the Era of Low Power Devices and Accelerators, 2013

Accelerators, quo vadis? Performance vs. productivity.
Proceedings of the International Conference on High Performance Computing & Simulation, 2013

Intralayer Communication for Tree-Based Overlay Networks.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

Assessing the Performance of OpenMP Programs on the Intel Xeon Phi.
Proceedings of the Euro-Par 2013 Parallel Processing, 2013

2012
MPI Runtime Error Detection with MUST: Advanced Error Reports.
Proceedings of the Tools for High Performance Computing 2012, 2012

SPEC OMP2012 - An Application Benchmark Suite for Parallel Systems Using OpenMP.
Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

Holistic Debugging of MPI Derived Datatypes.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

HIPS Introduction.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

GTI: A Generic Tools Infrastructure for Event-Based Tools in Parallel Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

2011
SPEC Benchmarks.
Proceedings of the Encyclopedia of Parallel Computing, 2011

Trace-based performance analysis for the petascale simulation code FLASH.
Int. J. High Perform. Comput. Appl., 2011

The International Exascale Software Project roadmap.
Int. J. High Perform. Comput. Appl., 2011

Order Preserving Event Aggregation in TBONs.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Memory Performance and SPEC OpenMP Scalability on Quad-Socket x86_64 Systems.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2011

2010
A generic attribute extension to OTF and its use for MPI replay.
Proceedings of the International Conference on Computational Science, 2010

Guest Editors' Introduction.
Int. J. Parallel Program., 2010

Quantifying power consumption variations of HPC systems using SPEC MPI benchmarks.
Comput. Sci. Res. Dev., 2010

Implementation, performance, and science results from a 30.7 TFLOPS IBM BladeCenter cluster.
Concurr. Comput. Pract. Exp., 2010

Preface.
Concurr. Comput. Pract. Exp., 2010

SPEC MPI2007 - an application benchmark suite for parallel systems using MPI.
Concurr. Comput. Pract. Exp., 2010

Highly Scalable Dynamic Load Balancing in the Atmospheric Modeling System COSMO-SPECS+FD4.
Proceedings of the Applied Parallel and Scientific Computing, 2010

Characterizing the energy consumption of data transfers and arithmetic operations on x86-64 processors.
Proceedings of the International Green Computing Conference 2010, 2010

PROPER 2010: Third Workshop on Productivity and Performance - Tools for HPC Application Development.
Proceedings of the Euro-Par 2010 Parallel Processing Workshops, 2010

2009
MPI Correctness Checking for OpenMP/MPI Applications.
Int. J. Parallel Program., 2009

Performance at Exascale.
Int. J. High Perform. Comput. Appl., 2009

Tools for scalable parallel program analysis: Vampir NG, MARMOT, and DeWiz.
Int. J. Comput. Sci. Eng., 2009

MUST: A Scalable Approach to Runtime Error Detection in MPI Programs.
Proceedings of the Tools for High Performance Computing 2009, 2009

A framework for detailed multiphase cloud modeling on HPC systems.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

An Interface for Integrated MPI Correctness Checking.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

GeneIndex: An Open Source Parallel Program for Enumerating and Locating Words in a Genome.
Proceedings of the International Joint Conferences on Bioinformatics, 2009

A graph based approach for MPI deadlock detection.
Proceedings of the 23rd international conference on Supercomputing, 2009

PROPER 2009: Workshop on Productivity and Performance - Tools for HPC Application Development.
Proceedings of the Euro-Par 2009, 2009

Pattern Matching and I/O Replay for POSIX I/O in Parallel Programs.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

Memory Performance and Cache Coherency Effects on an Intel Nehalem Multiprocessor System.
Proceedings of the PACT 2009, 2009

2008
Performance evaluation of supercomputers using HPCC and IMB Benchmarks.
J. Comput. Syst. Sci., 2008

Internal Timer Synchronization for Parallel Event Tracing.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

MPI Correctness Checking with Marmot.
Proceedings of the Tools for High Performance Computing, 2008

The Vampir Performance Analysis Tool-Set.
Proceedings of the Tools for High Performance Computing, 2008

Detection of Violations to the MPI Standard in Hybrid OpenMP/MPI Applications.
Proceedings of the OpenMP in a New Era of Parallelism, 4th International Workshop, 2008

Workshop on Productivity and Performance (PROPER 2008).
Proceedings of the Euro-Par 2008 Workshops, 2008

Trace-Based Analysis and Optimization for the Semtex CFD Application - Hidden Remote Memory Accesses and I/O Performance.
Proceedings of the Euro-Par 2008 Workshops, 2008

2007
Introduction.
Int. J. Parallel Program., 2007

Special Issue on OpenMP - Guest Editors' Introduction.
Int. J. Parallel Program., 2007

Scalability and Usability of HPC Programming Tools.
Proceedings of the Parallel Computing: Architectures, 2007

Developing Scalable Applications with Vampir, VampirServer and VampirTrace.
Proceedings of the Parallel Computing: Architectures, 2007

Analyzing Mutual Influences of High Performance Computing Programs on SGI Altix 3700 and 4700 Systems with PARbench.
Proceedings of the Parallel Computing: Architectures, 2007

Memory Allocation Tracing with VampirTrace.
Proceedings of the Computational Science - ICCS 2007, 7th International Conference, Beijing, China, May 27, 2007

Quality Assurance for Clusters: Acceptance-, Stress-, and Burn-In Tests for General Purpose Clusters.
Proceedings of the High Performance Computing and Communications, 2007

I/O Induced Scalability Limits of Bioinformatics Applications.
Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, 2007

2006
Performance evaluation of supercomputers using HPCC and IMB benchmarks.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Progress Towards Petascale Applications in Biology: Status in 2006.
Proceedings of the Euro-Par 2006 Workshops: Parallel Processing, 2006

High Throughput Image Analysis on PetaFLOPS Systems.
Proceedings of the Euro-Par 2006 Workshops: Parallel Processing, 2006

2005
The Grid.
it Inf. Technol., 2005

Network Bandwidth Measurements and Ratio Analysis with the HPC Challenge Benchmark Suite (HPCC).
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

MPI Application Development with MARMOT.
Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

SPEC OpenMP Benchmarks on Four Generations of NEC SX Parallel Vector Systems.
Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2005

2004
SPEC HPG benchmarks for high-performance systems.
Int. J. High Perform. Comput. Netw., 2004

The emerging role of biogrids.
Commun. ACM, 2004

MPI I/O Analysis and Error Detection with MARMOT.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

MPI Application Development Using the Analysis Tool MARMOT.
Proceedings of the Computational Science, 2004

A Global Grid for Analysis of Arthropod Evolution.
Proceedings of the 5th International Workshop on Grid Computing (GRID 2004), 2004

2003
An OpenMP compiler benchmark.
Sci. Program., 2003

Towards Efficient Execution of MPI Applications on the Grid: Porting and Optimization Issues.
J. Grid Comput., 2003

MARMOT: An MPI Analysis and Checking Tool.
Proceedings of the Parallel Computing: Software Technology, 2003

SPEC HPG Benchmarks for Large Systems.
Proceedings of the High Performance Computing, 5th International Symposium, 2003

Software Development in the Grid: The DAMIEN Tool-Set.
Proceedings of the Computational Science - ICCS 2003, 2003

Performance Analysis of a Parallel Application in the GRID.
Proceedings of the Computational Science - ICCS 2003, 2003

Performance Prediction in a Grid Environment.
Proceedings of the Grid Computing, 2003

Grid enabled MPI solutions for Clusters.
Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

2002
A software development environment for Grid computing.
Concurr. Comput. Pract. Exp., 2002

Experiences Using OpenMP Based on Compiler Directed Software DSM on a PC Cluster.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2002

A Shared Memory Benchmark in OpenMP.
Proceedings of the High Performance Computing, 4th International Symposium, 2002

2001
Metacomputing across intercontinental networks.
Future Gener. Comput. Syst., 2001

Some Simple OpenMP Optimization Techniques.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2001

2000
The Problems and the Solutions of the Metacomputing Experiment in SC99.
Proceedings of the High-Performance Computing and Networking, 8th International Conference, 2000

1999
Parallel / High-Performance Object-Oriented Scientific Computing.
Proceedings of the Object-Oriented Technology, ECOOP'99 Workshop Reader, 1999

