Felix Wolf

Dataset, August, 2018

Scalasca analysis report of the ASCI Sweep3D benchmark on 65,536 processes in virtual-node mode on IBM Blue Gene/P.

Dataset, April, 2018

Scalasca analysis report of the ASCI Sweep3D benchmark on 294,912 processes in virtual-node mode on IBM Blue Gene/P.

Dataset, April, 2018

Scalasca analysis report for SPEC MPI2007 benchmark 132.zeusmp2 on 512 processes in virtual-node mode on Blue Gene/P.

Dataset, April, 2018

A scalable algorithm for simulating the structural plasticity of the brain.

Sebastian Rinke

Markus Butz-Ostendorf

Mikaël Naveau

J. Parallel Distributed Comput., 2018

Understanding the Scalability of Molecular Simulation Using Empirical Performance Modeling.

Sergei Shudler

Jadran Vrabec

Proceedings of the Programming and Performance Visualization Tools, 2018

Using Deep Learning for Automated Communication Pattern Characterization: Little Steps and Big Challenges.

Philip C. Roth

Kevin A. Huck

Ganesh Gopalakrishnan

Proceedings of the Programming and Performance Visualization Tools, 2018

Unveiling Thread Communication Bottlenecks Using Hardware-Independent Metrics.

Arya Mazaheri

Proceedings of the 47th International Conference on Parallel Processing, 2018

Estimating the Impact of External Interference on Application Performance.

Aamer Shah

Matthias S. Müller

Proceedings of the Euro-Par 2018: Parallel Processing, 2018

Exploring the Performance Envelope of the LLL Algorithm.

Proceedings of the 2018 IEEE International Conference on Computational Science and Engineering, 2018

Lightweight Requirements Engineering for Exascale Co-design.

Proceedings of the IEEE International Conference on Cluster Computing, 2018

Efficient Fault Tolerance Through Dynamic Node Replacement.

Suraj Prabhakaran

Marcel Neumann

Proceedings of the 18th IEEE/ACM International Symposium on Cluster, 2018

2017

Editorial of special issue on Software Engineering for Parallel Systems.

Walter F. Tichy

J. Syst. Softw., 2017

Brief Announcement: Meeting the Challenges of Parallelizing Sequential Programs.

Rohit Atre

Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, 2017

Isoefficiency in Practice: Configuring and Understanding the Performance of Task-based Applications.

Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017

Parallelizing Audio Analysis Applications - A Case Study.

Proceedings of the 39th IEEE/ACM International Conference on Software Engineering: Software Engineering Education and Training Track, 2017

Following the Blind Seer - Creating Better Performance Models Using Less Information.

Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

Off-Road Performance Modeling - How to Deal with Segmented Data.

M. Kashif Ilyas

Alexandru Calotoiu

Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

2016

Automatic Performance Modeling of HPC Applications.

Proceedings of the Software for Exascale Computing - SPPEXA 2013-2015, 2016

Automated Performance Modeling of the UG4 Simulation Framework.

Proceedings of the Software for Exascale Computing - SPPEXA 2013-2015, 2016

Identifying the Root Causes of Wait States in Large-Scale Parallel Applications.

ACM Trans. Parallel Comput., 2016

Unveiling parallelization opportunities in sequential programs.

J. Syst. Softw., 2016

Automatic Generation of Unit Tests for Correlated Variables in Parallel Programs.

Int. J. Parallel Program., 2016

Automatic Parallel Pattern Detection in the Algorithm Structure Design Space.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Fast Multi-parameter Performance Modeling.

Proceedings of the 2016 IEEE International Conference on Cluster Computing, 2016

2015

Separating the wheat from the chaff: identifying relevant and similar performance data with visual analytics.

Proceedings of the 2nd Workshop on Visual Performance Analysis, 2015

Preventing the explosion of exascale profile data with smart thread-level aggregation.

Daniel Lorenz

Sergei Shudler

Proceedings of the 4th Workshop on Extreme Scale Programming Tools, 2015

A Batch System with Efficient Adaptive Scheduling for Malleable and Evolving Applications.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

An Efficient Data-Dependence Profiler for Sequential and Parallel Programs.

Zhen Li

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Exascaling Your Library: Will Your Implementation Meet Your Expectations?

Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Characterizing Loop-Level Communication Patterns in Shared Memory.

Proceedings of the 44th International Conference on Parallel Processing, 2015

Beyond Data Parallelism: Identifying Parallel Tasks in Sequential Programs.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

Fast Data-Dependence Profiling by Skipping Repeatedly Executed Memory Operations.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

10,000 Performance Models per Minute - Scalability of the UG4 Simulation Framework.

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

How Many Threads will be too Many? On the Scalability of OpenMP Implementations.

Proceedings of the Euro-Par 2015: Parallel Processing, 2015

Dependence-Based Code Transformation for Coarse-Grained Parallelism.

Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores, 2015

The Basic Building Blocks of Parallel Tasks.

Rohit Atre

Proceedings of the 2015 International Workshop on Code Optimisation for Multi and Many Cores, 2015

2014

Using Template Matching to Infer Parallel Design Patterns.

Zia Ul Huda

ACM Trans. Archit. Code Optim., 2014

Special issue: Euro-Par 2013.

Christian Lengauer

Luc Bougé

Concurr. Comput. Pract. Exp., 2014

Generating Classified Parallel Unit Tests.

Proceedings of the Tests and Proofs - 8th International Conference, 2014

Down to earth: how to visualize traffic on high-dimensional torus networks.

Lucas Theisen

Aamer Shah

Proceedings of the First Workshop on Visual Performance Analysis, 2014

Catching Idlers with Ease: A Lightweight Wait-State Profiler for MPI Programs.

Proceedings of the 21st European MPI Users' Group Meeting, 2014

SEPS 2014: first international workshop on software engineering for parallel systems.

Walter F. Tichy

Proceedings of the SPLASH'14, 2014

A Comparison between OPARI2 and the OpenMP Tools Interface in the Context of Score-P.

Proceedings of the Using and Improving OpenMP for Devices, Tasks, and More, 2014

A Batch System with Fair Scheduling for Evolving Applications.

Proceedings of the 43rd International Conference on Parallel Processing, 2014

Catwalk: A Quick Development Path for Performance Models.

Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

How file access patterns influence interference among cluster applications.

Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014

2013

A scalable infrastructure for the performance analysis of passive target synchronization.

Sriram Krishnamoorthy

Parallel Comput., 2013

Parallel universal access layer: A scalable I/O library for integrated tokamak modeling.

Comput. Phys. Commun., 2013

Extending the scope of the controlled logical clock.

Clust. Comput., 2013

Using automated performance modeling to find scalability bugs in complex codes.

Proceedings of the International Conference for High Performance Computing, 2013

Understanding the formation of wait states in applications with one-sided communication.

Proceedings of the 20th European MPI Users' Group Meeting, 2013

Massively parallel loading.

Bronis R. de Supinski

Proceedings of the International Conference on Supercomputing, 2013

Efficient Offloading of Parallel Kernels Using MPI_Comm_Spawn.

Sebastian Rinke

Suraj Prabhakaran

Proceedings of the 42nd International Conference on Parallel Processing, 2013

A Dynamic Resource Management System for Network-Attached Accelerator Clusters.

Proceedings of the 42nd International Conference on Parallel Processing, 2013

Discovery of Potential Parallelism in Sequential Programs.

Zhen Li

Proceedings of the 42nd International Conference on Parallel Processing, 2013

Detecting Correlation Violations and Data Races by Inferring Non-deterministic Reads.

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Predicting Parallelization of Sequential Programs Using Supervised Learning.

Proceedings of the 12th International Conference on Machine Learning and Applications, 2013

Capturing inter-application interference on clusters.

Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

2012

Scalable detection of MPI-2 remote memory access inefficiency patterns.

Int. J. High Perform. Comput. Appl., 2012

The HOPSA Workflow and Tools.

Proceedings of the Tools for High Performance Computing 2012, 2012

Generic Support for Remote Memory Access Operations in Score-P and OTF2.

Proceedings of the Tools for High Performance Computing 2012, 2012

Performance Analysis Techniques for Task-Based OpenMP Applications.

Proceedings of the OpenMP in a Heterogeneous World - 8th International Workshop on OpenMP, 2012

Dynamic Load Balancing for Unstructured Meshes on Space-Filling Curves.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Scalable Critical-Path Based Performance Analysis.

David Böhme

Bronis R. de Supinski

Martin Schulz

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Characterizing Load and Communication Imbalance in Large-Scale Parallel Applications.

David Böhme

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

A Dynamic Accelerator-Cluster Architecture.

Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Profiling of OpenMP Tasks with Score-P.

Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Pattern-Independent Detection of Manual Collectives in MPI Programs.

Alexandru Calotoiu

Christian Siebert

Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012

2011

Scalasca.

Proceedings of the Encyclopedia of Parallel Computing, 2011

Parallel Sorting with Minimal Data.

Christian Siebert

Proceedings of the Recent Advances in the Message Passing Interface, 2011

Scaling Performance Tool MPI Communicator Management.

Proceedings of the Recent Advances in the Message Passing Interface, 2011

Score-P: A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir.

Proceedings of the Tools for High Performance Computing 2011, 2011

Patterns of Inefficient Performance Behavior in GPU Applications.

Dominic Eschweiler

Proceedings of the 19th International Euromicro Conference on Parallel, 2011

Open Trace Format 2: The Next Generation of Scalable Trace Formats and Support Libraries.

Proceedings of the Applications, Tools and Techniques on the Road to Exascale Computing, Proceedings of the conference ParCo 2011, 31 August, 2011

Performance Analysis of Long-Running Applications.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Reconciling Sampling and Direct Instrumentation for Unintrusive Call-Path Profiling of MPI Programs.

Todd Gamblin

Martin Schulz

Bronis R. de Supinski

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Reducing the Overhead of Direct Application Instrumentation Using Prior Static Analysis.

Jan Mußler

Daniel Lorenz

Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

Score-P.

Christian Rössel

Proceedings of the Entwicklung und Evolution von Forschungssoftware: Tagungsband des Workshops, 2011

Scalasca.

David Böhme

Proceedings of the Entwicklung und Evolution von Forschungssoftware: Tagungsband des Workshops, 2011

2010

Large-Scale Performance Analysis of Sweep3D with the Scalasca Toolset.

Parallel Process. Lett., 2010

Performance measurement and analysis tools for extremely scalable systems.

Concurr. Comput. Pract. Exp., 2010

The Scalasca performance toolset architecture.

Concurr. Comput. Pract. Exp., 2010

Further Improving the Scalability of the Scalasca Toolset.

Proceedings of the Applied Parallel and Scientific Computing, 2010

How to Reconcile Event-Based Performance Analysis with Tasking in OpenMP.

Proceedings of the Beyond Loop Level Parallelism in OpenMP: Accelerators, 2010

Performance analysis of Sweep3D on Blue Gene/P with the Scalasca toolset.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Proceedings of the 15th international workshop on high-level parallel programming models and supportive environments.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Improvements of common open Grid standards to increase High Throughput and High Performance Computing effectiveness on large-scale Grid and e-science infrastructures.

Aleksandr Konstantinov

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Identifying the Root Causes of Wait States in Large-Scale Parallel Applications.

Proceedings of the 39th International Conference on Parallel Processing, 2010

PROPER 2010: Third Workshop on Productivity and Performance - Tools for HPC Application Development.

Proceedings of the Euro-Par 2010 Parallel Processing Workshops, 2010

Synchronizing the Timestamps of Concurrent Events in Traces of Hybrid MPI/OpenMP Applications.

Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

Score-P: A Unified Performance Measurement System for Petascale Applications.

Proceedings of the Competence in High Performance Computing 2010, 2010

Exploring the Potential of Using Multiple E-science Infrastructures with Emerging Open Standards-Based E-health Research Tools.

Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

Experiences and Requirements for Interoperability Between HTC and HPC-driven e-Science Infrastructure.

Proceedings of the Future Application and Middleware Technology on e-Science, 2010

2009

Replay-based synchronization of timestamps in event traces of massively parallel applications.

Scalable Comput. Pract. Exp., 2009

A scalable tool architecture for diagnosing wait states in massively parallel applications.

Parallel Comput., 2009

Scalable timestamp synchronization for event traces of message-passing applications.

Parallel Comput., 2009

Interoperation of world-wide production e-Science infrastructures.

Philip M. Papadopoulos

Somsak Sriprayoonsakul

Anders Rhod Gregersen

Concurr. Comput. Pract. Exp., 2009

Research advances by using interoperable e-science infrastructures.

Clust. Comput., 2009

Space-efficient time-series call-path profiling of parallel applications.

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Scalable massively parallel I/O to task-local files.

Wolfgang Frings

Ventsislav Petkov

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Recent Developments in the Scalasca Toolset.

Proceedings of the Tools for High Performance Computing 2009, 2009

Verifying Causality between Distant Performance Phenomena in Large-Scale MPI Applications.

Proceedings of the 17th Euromicro International Conference on Parallel, 2009

A Generic and Configurable Source-Code Instrumentation Component.

Proceedings of the Computational Science, 2009

Introduction.

Proceedings of the Euro-Par 2009 Parallel Processing, 2009

PROPER 2009: Workshop on Productivity and Performance - Tools for HPC Application Development.

Proceedings of the Euro-Par 2009, 2009

Performance Simulation of Non-blocking Communication in Message-Passing Applications.

Proceedings of the Euro-Par 2009, 2009

Enabling Grid Interoperability by Extending HPC-driven Job Management with an Open Standard Information Model.

Proceedings of the 8th IEEE/ACIS International Conference on Computer and Information Science, 2009

2008

Performance measurement and analysis of large-scale parallel applications on leadership computing systems.

Sci. Program., 2008

SCALASCA Parallel Performance Analyses of SPEC MPI2007 Applications.

Proceedings of the Performance Evaluation: Metrics, 2008

Usage of the SCALASCA toolset for scalable performance analysis of large-scale parallel applications.

Proceedings of the Tools for High Performance Computing, 2008

Performance Evaluation and Optimization of Parallel Grid Computing Applications.

Wolfgang Frings

Proceedings of the 16th Euromicro International Conference on Parallel, 2008

Extending the collaborative online visualization and steering framework for computational Grids with attribute-based authorization.

Proceedings of the 9th IEEE/ACM International Conference on Grid Computing (Grid 2008), Tsukuba, Japan, September 29, 2008

Scalasca Parallel Performance Analyses of PEPC.

Proceedings of the Euro-Par 2008 Workshops, 2008

Classification of Different Approaches for e-Science Applications in Next Generation Computing Infrastructures.

Proceedings of the Fourth International Conference on e-Science, 2008

Grid-Based Workflow Management.

Proceedings of the Grid and Services Evolution, 2008

Implications of non-constant clock drifts for the timestamps of concurrent events.

Rolf Rabenseifner

Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

2007

Compensation of Measurement Overhead in Parallel Performance Profiling.

Int. J. High Perform. Comput. Appl., 2007

Automatic analysis of inefficiency patterns in parallel applications.

Concurr. Comput. Pract. Exp., 2007

Timestamp Synchronization for Event Traces of Large-Scale Message-Passing Applications.

Rolf Rabenseifner

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Scalability and Usability of HPC Programming Tools.

Proceedings of the Parallel Computing: Architectures, 2007

Scalable Collation and Presentation of Call-Path Profile Data with CUBE.

Proceedings of the Parallel Computing: Architectures, 2007

Automatic Trace-Based Performance Analysis of Metacomputing Applications.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Design and evaluation of a collaborative online visualization and steering framework implementation for computational grids.

Proceedings of the 8th IEEE/ACM International Conference on Grid Computing (GRID 2007), 2007

Computational Steering and Online Visualization of Scientific Applications on Large-Scale HPC Systems within e-Science Infrastructures.

Proceedings of the Third International Conference on e-Science and Grid Computing, 2007

2006

Performance Tools for Parallel Programming.

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Scalable Parallel Trace-Based Performance Analysis.

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Integrated Runtime Measurement Summarisation and Selective Event Tracing for Scalable Parallel Execution Performance Diagnosis.

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Tools for Parallel Performance Analysis: Minisymposium Abstract.

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

A Parallel Trace-Data Interface for Scalable Performance Analysis.

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Performance Instrumentation and Compiler Optimizations for MPI/OpenMP Applications.

Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2006

A systematic multi-step methodology for performance analysis of communication traces of distributed applications based on hierarchical clustering.

Maria Gabriela Aguilera

Patricia J. Teller

Michela Taufer

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Specification of Inefficiency Patterns for MPI-2 One-Sided Communication.

Proceedings of the Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28, 2006

Large Event Traces in Parallel Performance Analysis.

Proceedings of the ARCS 2006, 2006

2005

Performance Profiling Overhead Compensation for MPI Programs.

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

A Scalable Approach to MPI Application Performance Analysis.

Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Holistic Hardware Counter Performance Analysis of Parallel Programs.

Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Performance Analysis of One-sided Communication Mechanisms.

Proceedings of the Parallel Computing: Current & Future Issues of High-End Computing, 2005

Automatic Experimental Analysis of Communication Patterns in Virtual Topologies.

Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), 2005

Trace-Based Parallel Performance Overhead Compensation.

Proceedings of the High Performance Computing and Communications, 2005

Event-Based Measurement and Analysis of One-Sided Communication.

Proceedings of the Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30, 2005

2004

An Algebra for Cross-Experiment Performance Analysis.

Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004

Efficient Pattern Search in Large Traces Through Successive Refinement.

Proceedings of the Euro-Par 2004 Parallel Processing, 2004

2003

Automatic performance analysis on parallel computers with SMP nodes.

PhD thesis, 2003

Automatic performance analysis of hybrid MPI/OpenMP applications.

J. Syst. Archit., 2003

Hardware-Counter Based Automatic Performance Analysis of Parallel Programs.

Proceedings of the Parallel Computing: Software Technology, 2003

KOJAK - A Tool Set for Automatic Performance Analysis of Parallel Programs.

Proceedings of the Euro-Par 2003. Parallel Processing, 2003

2002

Design and Prototype of a Performance Tool Interface for OpenMP.

J. Supercomput., 2002

CATCH - A Call-Graph Based Automatic Tool for Capture of Hardware Performance Metrics for MPI and OpenMP Applications.

Luiz De Rose

Proceedings of the Euro-Par 2002, 2002

2001

Specifying Performance Properties of Parallel Applications Using Compound Events.

Parallel Distributed Comput. Pract., 2001

2000

Automatic Performance Analysis of MPI Applications Based on Event Traces.

Proceedings of the Euro-Par 2000, Parallel Processing, 6th International Euro-Par Conference, Munich, Germany, August 29, 2000

1999

Performance analysis on CRAY T3E.

Proceedings of the Seventh Euromicro Workshop on Parallel and Distributed Processing. PDP'99, 1999

EARL - A Programmable and Extensible Toolkit for Analyzing Event Traces of Message Passing Programs.
