William Gropp

Orcid: 0000-0003-2905-3029

Affiliations:
  • Argonne National Laboratory, Lemont, Illinois, USA


According to our database, William Gropp authored at least 290 papers between 1981 and 2024.

Awards

ACM Fellow

ACM Fellow 2006, "For contributions to message passing protocols".

IEEE Fellow

IEEE Fellow 2010, "For contributions to high performance computing and message passing".

Bibliography

2024
Quantum-centric supercomputing for materials science: A perspective on challenges and future directions.
Future Gener. Comput. Syst., 2024

HiCCL: A Hierarchical Collective Communication Library.
CoRR, 2024

Proposal for a Flexible Benchmark for Agent Based Models.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

CommBench: Micro-Benchmarking Hierarchical Networks with Multi-GPU, Multi-NIC Nodes.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024

UniNet: Accelerating the Container Network Data Plane in IaaS Clouds.
Proceedings of the 17th IEEE International Conference on Cloud Computing, 2024

2023
Performance Analysis and Optimal Node-aware Communication for Enlarged Conjugate Gradient Methods.
ACM Trans. Parallel Comput., March, 2023

Characterizing the performance of node-aware strategies for irregular point-to-point communication on heterogeneous architectures.
Parallel Comput., 2023

Fine-grained Policy-driven I/O Sharing for Burst Buffers.
Proceedings of the International Conference for High Performance Computing, 2023

2022
EMPRESS: Accelerating Scientific Discovery through Descriptive Metadata Management.
ACM Trans. Storage, 2022

Realizing the Vision of CFD in 2030.
Comput. Sci. Eng., 2022

Our Opportunities to Collaborate: In-Person and Virtual, Local and Global, the Society Welcomes New Connections.
Computer, 2022

Succeeding Together.
Computer, 2022

Exploring Spatial Indexing for Accelerated Feature Retrieval in HPC.
Proceedings of the 22nd IEEE International Symposium on Cluster, 2022

2021
Translational research in the MPICH project.
J. Comput. Sci., 2021

Performance Portability for Advanced Architectures.
Comput. Sci. Eng., 2021

A National Discovery Cloud: Preparing the US for Global Competitiveness in the New Era of 21st Century Digital Transformation.
CoRR, 2021

Advancing Computing's Foundation of US Industry & Society.
CoRR, 2021

Modeling Data Movement Performance on Heterogeneous Architectures.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

2020
FFT, FMM, and multigrid on the road to exascale: Performance challenges and opportunities.
J. Parallel Distributed Comput., 2020

Convergence of artificial intelligence and high performance computing on NSF-supported cyberinfrastructure.
J. Big Data, 2020

Reducing communication in algebraic multigrid with multi-step node aware communication.
Int. J. High Perform. Comput. Appl., 2020

Infrastructure for Artificial Intelligence, Quantum and High Performance Computing.
CoRR, 2020

Pandemic Informatics: Preparation, Robustness, and Resilience.
CoRR, 2020

HAL: Computer System for Scalable Deep Learning.
Proceedings of the PEARC '20: Practice and Experience in Advanced Research Computing, 2020

2019
Guest editor's introduction: Special issue on best papers from EuroMPI/USA 2017.
Parallel Comput., 2019

Using node and socket information to implement MPI Cartesian topologies.
Parallel Comput., 2019

Node aware sparse matrix-vector multiplication.
J. Parallel Distributed Comput., 2019

Managing code transformations for better performance portability.
Int. J. High Perform. Comput. Appl., 2019

Exploring the feasibility of lossy compression for PDE simulations.
Int. J. High Perform. Comput. Appl., 2019

Enabling real-time multi-messenger astrophysics discoveries with deep learning.
CoRR, 2019

Node-Aware Improvements to Allreduce.
CoRR, 2019

Deep Learning for Multi-Messenger Astrophysics: A Gateway for Discovery in the Big Data Era.
CoRR, 2019

Learning with Analytical Models.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

Using performance models to understand scalable Krylov solver performance at scale for structured grid problems.
Proceedings of the ACM International Conference on Supercomputing, 2019

Locus: A System and a Language for Program Optimization.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019

2018
DAME: Runtime-compilation for data movement.
Int. J. High Perform. Comput. Appl., 2018

Big data and extreme-scale computing.
Int. J. High Perform. Comput. Appl., 2018

Using Node Information to Implement MPI Cartesian Topologies.
Proceedings of the 25th European MPI Users' Group Meeting, 2018

Improving Performance Models for Irregular Point-to-Point Communication.
Proceedings of the 25th European MPI Users' Group Meeting, 2018

2017
Eliminating contention bottlenecks in multithreaded MPI.
Parallel Comput., 2017

Rethinking key-value store for parallel I/O optimization.
Int. J. High Perform. Comput. Appl., 2017

Moya - A JIT Compiler for HPC.
Proceedings of the Programming and Performance Visualization Tools, 2017

Towards a More Complete Understanding of SDC Propagation.
Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, 2017

A DSL for Performance Orchestration.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
Reducing Parallel Communication in Algebraic Multigrid through Sparsification.
SIAM J. Sci. Comput., 2016

A hybrid format for better performance of sparse matrix-vector multiplication on a GPU.
Int. J. High Perform. Comput. Appl., 2016

Performance Modeling of Distributed Deep Neural Networks.
CoRR, 2016

TAPSpMV: Topology-Aware Parallel Sparse Matrix Vector Multiplication.
CoRR, 2016

An implementation and evaluation of the MPI 3.0 one-sided communication interface.
Concurr. Comput. Pract. Exp., 2016

Rethinking High Performance Computing System Architecture for Scientific Big Data Applications.
Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016

Scalable non-blocking preconditioned conjugate gradient methods.
Proceedings of the International Conference for High Performance Computing, 2016

Modeling MPI Communication Performance on SMP Nodes: Is it Time to Retire the Ping Pong Test.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

Towards millions of communicating threads.
Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016, 2016

Scalability Challenges in Current MPI One-Sided Implementations.
Proceedings of the 15th International Symposium on Parallel and Distributed Computing, 2016

2015
Collective Algorithms for Multiported Torus Networks.
ACM Trans. Parallel Comput., 2015

Remote Memory Access Programming in MPI-3.
ACM Trans. Parallel Comput., 2015

Towards a more fault resilient multigrid solver.
Proceedings of the Symposium on High Performance Computing, 2015

Efficient disk-to-disk sorting: a case study in the decoupled execution paradigm.
Proceedings of the 2015 International Workshop on Data-Intensive Scalable Computing Systems, 2015

DAME: A Runtime-Compiled Engine for Derived Datatypes.
Proceedings of the 22nd European MPI Users' Group Meeting, 2015

Composing Low-Overhead Scheduling Strategies for Improving Performance of Scientific Applications.
Proceedings of the OpenMP: Heterogenous Execution and Data Movements, 2015

A Multiplatform Study of I/O Behavior on Petascale Supercomputers.
Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, 2015

Runtime Support for Irregular Computation in MPI-Based Applications.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014
Toward Exascale Resilience: 2014 update.
Supercomput. Front. Innov., 2014

Special issue: SC13 - The International Conference for High Performance Computing, Networking, Storage and Analysis.
Sci. Program., 2014

Applications of the streamed storage format for sparse matrix operations.
Int. J. High Perform. Comput. Appl., 2014

Nonblocking Epochs in MPI One-Sided Communication.
Proceedings of the International Conference for High Performance Computing, 2014

Rethinking key-value store for parallel I/O optimization.
Proceedings of the 2014 International Workshop on Data Intensive Scalable Computing Systems, 2014

Algebraic Multigrid on a Dragonfly Network: First Experiences on a Cray XC30.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2014

Locality-Optimized Mixed Static/Dynamic Scheduling for Improving Load Balancing on SMPs.
Proceedings of the 21st European MPI Users' Group Meeting, 2014

Decoupled I/O for Data-Intensive High Performance Computing.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

Using MPI - Portable Parallel Programming with the Message-Passing Interface, 3rd Edition.
Scientific and engineering computation, MIT Press, ISBN: 978-0-262-52739-2, 2014

2013
Parallel Adaptive Deflated GMRES.
Proceedings of the Domain Decomposition Methods in Science and Engineering XX, 2013

Multiphysics simulations: Challenges and opportunities.
Int. J. High Perform. Comput. Appl., 2013

Programming for Exascale Computers.
Comput. Sci. Eng., 2013

MPI + MPI: a new hybrid approach to parallel programming with MPI plus shared memory.
Computing, 2013

Analysis of topology-dependent MPI performance on Gemini networks.
Proceedings of the 20th European MPI Users' Group Meeting, 2013

Performance Analysis of the Lattice Boltzmann Model Beyond Navier-Stokes.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Systematic Reduction of Data Movement in Algebraic Multigrid Solvers.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

MPI-Interoperable Generalized Active Messages.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Optimization Strategies for MPI-Interoperable Active Messages.
Proceedings of the IEEE 11th International Conference on Dependable, 2013

Runtime system design of decoupled execution paradigm for data-intensive high-end computing.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013

Toward Asynchronous and MPI-Interoperable Active Messages.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013

2012
Best algorithms + best computers = powerful match.
Commun. ACM, 2012

Abstract: Slack-Conscious Lightweight Loop Scheduling for Improving Scalability of Bulk-synchronous MPI Applications.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Performance Modeling of Algebraic Multigrid on Blue Gene/Q: Lessons Learned.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

A Case for Optimistic Coordination in HPC Storage Systems.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Adaptive Strategy for One-Sided Communication in MPICH2.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

Leveraging MPI's One-Sided Communication Interface for Shared-Memory Programming.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

Advanced MPI Including New MPI-3 Features.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

MPI 3 and Beyond: Why MPI Is Successful and What Challenges It Faces.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

Efficient Multithreaded Context ID Allocation in MPI.
Proceedings of the Recent Advances in the Message Passing Interface, 2012

Faster topology-aware collective algorithms through non-minimal communication.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012

Hybrid Static/dynamic Scheduling for Already Optimized Dense Matrix Factorization.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Modeling the Performance of an Algebraic Multigrid Cycle Using Hybrid MPI/OpenMP.
Proceedings of the 41st International Conference on Parallel Processing, 2012

A Decoupled Execution Paradigm for Data-Intensive High-End Computing.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

2011
MPI (Message Passing Interface).
Proceedings of the Encyclopedia of Parallel Computing, 2011

MPI on Millions of Cores.
Parallel Process. Lett., 2011

Optimizing Sparse Data Structures for Matrix-vector Multiply.
Int. J. High Perform. Comput. Appl., 2011

The International Exascale Software Project roadmap.
Int. J. High Perform. Comput. Appl., 2011

EcoG: A Power-Efficient GPU Cluster Architecture for Scientific Computing.
Comput. Sci. Eng., 2011

Formal analysis of MPI-based parallel programs.
Commun. ACM, 2011

Performance modeling for systematic performance tuning.
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011

Avoiding hot-spots on two-level direct networks.
Proceedings of the Conference on High Performance Computing Networking, 2011

Multi-core and Network Aware MPI Topology Functions.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Performance Expectations and Guidelines for MPI Derived Datatypes.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Scalable Memory Use in MPI: A Case Study with MPICH2.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

LACIO: A New Collective I/O Strategy for Parallel I/O Systems.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Architectural Constraints to Attain 1 Exaflop/s for Three Scientific Application Classes.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Performance modeling as the key to extreme scale computing.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Modeling the performance of an algebraic multigrid cycle on HPC platforms.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Weighted locality-sensitive scheduling for mitigating noise on multi-core clusters.
Proceedings of the 18th International Conference on High Performance Computing, 2011

2010
Self-Consistent MPI Performance Guidelines.
IEEE Trans. Parallel Distributed Syst., 2010

Formal methods applied to high-performance computing software design: a case study of MPI one-sided communication-based locking.
Softw. Pract. Exp., 2010

A Pipelined Algorithm for Large, Irregular All-Gather Problems.
Int. J. High Perform. Comput. Appl., 2010

The Importance of Non-Data-Communication Overheads in MPI.
Int. J. High Perform. Comput. Appl., 2010

Fine-Grained Multithreading Support for Hybrid Threaded MPI Programming.
Int. J. High Perform. Comput. Appl., 2010

Teaching parallel programming: a roundtable discussion.
XRDS, 2010

A Scalable MPI_Comm_split Algorithm for Exascale Computing.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Load Balancing for Regular Meshes on SMPs with MPI.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Toward Performance Models of MPI Implementations for Understanding Application Scaling Issues.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

Enabling Concurrent Multithreaded MPI Communication on Multicore Petascale Systems.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

PMI: A Scalable Parallel Process-Management Interface for Extreme-Scale Systems.
Proceedings of the Recent Advances in the Message Passing Interface, 2010

An adaptive performance modeling tool for GPU architectures.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

An introductory exascale feasibility study for FFTs and multigrid.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Keynote.
Proceedings of the 2010 International Conference on High Performance Computing & Simulation, 2010

Extreme scale computing: Challenges and opportunities.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

Minimizing MPI Resource Contention in Multithreaded Multicore Environments.
Proceedings of the 2010 IEEE International Conference on Cluster Computing, 2010

Enabling the Next Generation of Scalable Clusters.
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010

2009
Test suite for evaluating performance of multithreaded MPI communication.
Parallel Comput., 2009

On the Need for a Consortium of Capability Centers.
Int. J. High Perform. Comput. Appl., 2009

Toward Exascale Resilience.
Int. J. High Perform. Comput. Appl., 2009

Toward message passing for a million processes: characterizing MPI on a massive scale blue gene/P.
Comput. Sci. Res. Dev., 2009

Software for Petascale Computing Systems.
Comput. Sci. Eng., 2009

Hierarchical Collectives in MPICH2.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Processing MPI Datatypes Outside MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

MPI at Exascale: Challenges for Data Structures and Algorithms.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

MPI on a Million Processors.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2009

Investigating High Performance RMA Interfaces for the MPI-3 Standard.
Proceedings of the ICPP 2009, 2009

Natively Supporting True One-Sided Communication in.
Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009

2008
An efficient format for nearly constant-time access to arbitrary time intervals in large trace files.
Sci. Program., 2008

Hiding I/O latency with pre-execution prefetching for parallel applications.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Parallel I/O prefetching using MPI file caching and I/O signatures.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008

Implementing Efficient Dynamic Formal Verification Methods for MPI Programs.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

A Simple, Pipelined Algorithm for Large, Irregular All-gather Problems.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

A Formal Approach to Detect Functionally Irrelevant Barriers in MPI Programs.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

EuroPVM/MPI Full-Day Tutorial. Using MPI-2: A Problem-Based Approach.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Self-consistent MPI-IO Performance Requirements and Expectations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

MPI and Hybrid Programming Models for Petascale Computing.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Non-data-communication Overheads in MPI: Analysis on Blue Gene/P.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Toward Efficient Support for Multithreaded MPI Communication.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2008

Exploring Parallel I/O Concurrency with Speculative Prefetching.
Proceedings of the 2008 International Conference on Parallel Processing, Portland, Oregon, September 8-12, 2008

Improving the Performance of Tensor Matrix Vector Multiplication in Cumulative Reaction Probability Based Quantum Chemistry Codes.
Proceedings of the High Performance Computing, 2008

Communication Analysis of Parallel 3D FFT for Flat Cartesian Meshes on Large Blue Gene Systems.
Proceedings of the High Performance Computing, 2008

2007
Thread-safety in an MPI implementation: Requirements and analysis.
Parallel Comput., 2007

Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the Nemesis communication subsystem.
Parallel Comput., 2007

A Portable Method for Finding User Errors in the Usage of MPI Collective Operations.
Int. J. High Perform. Comput. Appl., 2007

Analyzing the impact of supporting out-of-order communication on in-order performance with iWARP.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

Self-consistent MPI Performance Requirements.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Test Suite for Evaluating Performance of MPI Implementations That Support MPI_THREAD_MULTIPLE.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Practical Model-Checking Method for Verifying Correctness of MPI Programs.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Extending the MPI-2 Generalized Request Interface.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Revealing the Performance of MPI RMA Implementations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Using MPI-2: A Problem-Based Approach.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 14th European PVM/MPI User's Group Meeting, Paris, France, September 30, 2007

Scaling Science Applications on Blue Gene.
Proceedings of the Parallel Computing: Architectures, 2007

Nonuniformly Communicating Noncontiguous Data: A Case Study with PETSc and MPI.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

Advanced Flow-control Mechanisms for the Sockets Direct Protocol over InfiniBand.
Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Open Issues in MPI Implementation.
Proceedings of the Advances in Computer Systems Architecture, 2007

MPI - eine Einführung: portable parallele Programmierung mit dem Message-Passing Interface.
Oldenbourg, 2007

2006
Multi-core issues - Multi-Core for HPC: breakthrough or breakdown?
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Awards & video - Awards session.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

M01 - Application supercomputing and multiscale simulation techniques.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

S01 - Advanced MPI: I/O and one-sided communication.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Formal Verification of Programs That Use MPI One-Sided Communication.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Issues in Developing a Thread-Safe MPI Implementation.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Where Does MPI Need to Grow?
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

An Interface to Support the Identification of Dynamic MPI 2 Processes for Scalable Parallel Debugging.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Automatic Memory Optimizations for Improving MPI Derived Datatype Performance.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Implementation and Shared-Memory Evaluation of MPICH2 over the Nemesis Communication Subsystem.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

Collective communication on architectures that support simultaneous communication over multiple links.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2006

Grid-based Image Registration.
Proceedings of the Grid-Based Problem Solving Environments, 2006

Observations on WoCo9.
Proceedings of the Grid-Based Problem Solving Environments, 2006

Data Transfers between Processes in an SMP System: Performance Study and Application to MPI.
Proceedings of the 2006 International Conference on Parallel Processing (ICPP 2006), 2006

High performance file I/O for the Blue Gene/L supercomputer.
Proceedings of the 12th International Symposium on High-Performance Computer Architecture, 2006

Design and Evaluation of Nemesis, a Scalable, Low-Latency, Message-Passing Communication Subsystem.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

Parallel Tools and Environments: A Survey.
Proceedings of the Parallel Processing for Scientific Computing, 2006

2005
Optimization of Collective Communication Operations in MPICH.
Int. J. High Perform. Comput. Appl., 2005

Optimizing the Synchronization Operations in Message Passing Interface One-Sided Communication.
Int. J. High Perform. Comput. Appl., 2005

Design and implementation of message-passing services for the Blue Gene/L supercomputer.
IBM J. Res. Dev., 2005

An Evaluation of Implementation Options for MPI One-Sided Communication.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Towards a Productive MPI Environment.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Collective Error Detection for MPI Collective Operations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Designing a Common Communication Subsystem.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005

Implementing MPI-IO atomic mode without file system support.
Proceedings of the 5th International Symposium on Cluster Computing and the Grid (CCGrid 2005), 2005

Issues in Accurate and Reliable Use of Parallel Computing in Numerical Programs.
Proceedings of the Accuracy and Reliability in Scientific Computing, 2005

2004
Evaluating structured I/O methods for parallel file systems.
Int. J. High Perform. Comput. Netw., 2004

Fault Tolerance in Message Passing Interface Programs.
Int. J. High Perform. Comput. Appl., 2004

Minimizing Synchronization Overhead in the Implementation of MPI One-Sided Communication.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Efficient Implementation of MPI-2 Passive One-Sided Communication on InfiniBand Clusters.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Providing Efficient I/O Redundancy in MPI Environments.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

MPI and High Productivity Programming.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2004

Design and Implementation of MPICH2 over InfiniBand with RDMA Support.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004


Predicting memory-access cost based on data-access patterns.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

High performance MPI-2 one-sided communication over InfiniBand.
Proceedings of the 4th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2004), 2004

2003
Parallel netCDF: A Scientific High-Performance I/O Interface
CoRR, 2003

Parallel netCDF: A High-Performance Scientific I/O Interface.
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003

Improving the Performance of Collective Operations in MPICH.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003

Implementing Fast and Reusable Datatype Processing.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003

High-Level Programming in MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003

Future Developments in MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003

MPI on BlueGene/L: Designing an Efficient General Purpose Messaging Solution for a Large Cellular System.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 10th European PVM/MPI Users' Group Meeting, Venice, Italy, September 29, 2003

Exploring the Relationship Between Parallel Application Run-Time Variability and Network Performance in Clusters.
Proceedings of the 28th Annual IEEE Conference on Local Computer Networks (LCN 2003), 2003

Toward Understanding Soft Faults in High Performance Cluster Networks.
Proceedings of the Integrated Network Management VII, Managing It All, 2003

Using MPI-2: Advanced Features of the Message Passing Interface.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

Efficient Structured Data Access in Parallel File Systems.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

Improving the Performance of MPI Derived Datatypes by Optimizing Memory-Access Cost.
Proceedings of the 2003 IEEE International Conference on Cluster Computing (CLUSTER 2003), 2003

Noncontiguous I/O Accesses Through MPI-IO.
Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

2002
Optimizing noncontiguous accesses in MPI-IO.
Parallel Comput., 2002

A Multilevel Approach to Topology-Aware Collective Operations in Computational Grids
CoRR, 2002

MPI on the Grid.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, Linz, Austria, September 29, 2002

Building Library Components that Can Use Any MPI Implementation.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, Linz, Austria, September 29, 2002

MPICH2: A New Start for MPI Implementations.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 9th European PVM/MPI Users' Group Meeting, Linz, Austria, September 29, 2002

Prototype of AM3: Active Mapper and Monitoring Module for Myrinet Environments.
Proceedings of the 27th Annual IEEE Conference on Local Computer Networks (LCN 2002), 2002

High Performance Wide Area Data Transfers over High Performance Networks.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

An Evaluation of Object-Based Data Transfers on High Performance Networks.
Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002

Goals Guiding Design: PVM and MPI.
Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002

Noncontiguous I/O through PVFS.
Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002

2001
High-performance parallel implicit CFD.
Parallel Comput., 2001

Components and interfaces of a process management system for parallel programs.
Parallel Comput., 2001

Scalable Unix Commands for Parallel Processors: A High-Performance Implementation.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2001

Challenges and Successes in Achieving the Potential of MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2001

Interfacing Parallel Jobs to Process Managers.
Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing (HPDC-10 2001), 2001

Learning from the Success of MPI.
Proceedings of the High Performance Computing - HiPC 2001, 8th International Conference, 2001

Advanced Cluster Programming with MPI.
Proceedings of the 2001 IEEE International Conference on Cluster Computing (CLUSTER 2001), 2001

2000
Globalized Newton-Krylov-Schwarz Algorithms and Software for Parallel Implicit CFD.
Int. J. High Perform. Comput. Appl., 2000

From Trace Generation to Visualization: A Performance Framework for Distributed Parallel Systems.
Proceedings of Supercomputing 2000, 2000

MPICH-GQ: Quality-of-Service for Message Passing Programs.
Proceedings of Supercomputing 2000, 2000

Performance Modeling and Tuning of an Unstructured Mesh CFD Application.
Proceedings of Supercomputing 2000, 2000

Runtime Checking of Datatype Signatures in MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2000

A Scalable Process-Management Environment for Parallel Programs.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2000

Solving CFD Problems with Open Source Parallel Libraries.
Proceedings of the Applied Parallel Computing, 2000

Exploiting Hierarchy in Parallel Computer Networks to Optimize Collective Operation Performance.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

PETSc and Overture: Lessons Learned Developing an Interface between Components.
Proceedings of the Architecture of Scientific Software, 2000

Analyzing the Parallel Scalability of an Implicit Unstructured Mesh CFD Code.
Proceedings of the High Performance Computing, 2000

1999
Toward Scalable Performance Visualization with Jumpshot.
Int. J. High Perform. Comput. Appl., 1999

Parallel computation of three-dimensional nonlinear magnetostatic problems.
Concurr. Pract. Exp., 1999

Achieving High Sustained Performance in an Unstructured Mesh CFD Application.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1999

Reproducible Measurements of MPI Performance Characteristics.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 1999

A Standard Interface for Debugger Access to Message Queue Information in MPI.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 1999

Infrastructure and Interfaces for Large-Scale Numerical Software.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1999

On Implementing MPI-IO Portably and with High Performance.
Proceedings of the Sixth Workshop on I/O in Parallel and Distributed Systems, 1999

Using MPI: portable parallel programming with the message-passing interface, 2nd Edition.
Scientific and engineering computation series, MIT Press, ISBN: 026257134X, 1999

1998
Parallel Newton-Krylov-Schwarz Algorithms for the Transonic Full Potential Equation.
SIAM J. Sci. Comput., 1998

Wide-Area Implementation of the Message Passing Interface.
Parallel Comput., 1998

I/O in Parallel Applications: the Weakest Link.
Int. J. High Perform. Comput. Appl., 1998

A Case for Using MPI's Derived Datatypes to Improve I/O Performance.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1998

1997
A High-Performance MPI Implementation on a Shared-Memory Vector Supercomputer.
Parallel Comput., 1997

Sowing MPICH: a Case Study in the Dissemination of a Portable Environment for Parallel Scientific Computing.
Int. J. High Perform. Comput. Appl., 1997

Why Are PVM and MPI So Different?
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 1997

Parallel Implicit PDE Computations.
Proceedings of the Conference on Parallel Computational Fluid Dynamics 1997, 1997

1996
The Design of Data-Structure-Neutral Libraries for the Iterative Solution of Sparse Linear Systems.
Sci. Program., 1996

A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard.
Parallel Comput., 1996

Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries.
Proceedings of the Modern Software Tools for Scientific Computing, 1996

Why we couldn't use numerical libraries for PETSc.
Proceedings of the Quality of Numerical Software, 1996

MPI-2: Extending the Message-Passing Interface.
Proceedings of the Euro-Par '96 Parallel Processing, 1996

An Experimental Evaluation of the Parallel I/O Systems of the IBM SP and Intel Paragon Using a Production Application.
Proceedings of the Parallel Computation, 1996

1995
Early Applications in the Message-Passing Interface (MPI).
Int. J. High Perform. Comput. Appl., 1995

Experiences with the IBM SP1.
IBM Syst. J., 1995

Dynamic process management in an MPI setting.
Proceedings of the Seventh IEEE Symposium on Parallel and Distributed Processing, 1995

Computational Electromagnetics and Parallel Dense Matrix Computations.
Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995

Parallel Domain Decomposition Software.
Proceedings of the Domain-Based Parallelism and Problem Decomposition Methods in Computational Science and Engineering, 1995

1994
A comparison of some domain decomposition and ILU preconditioned iterative methods for nonsymmetric elliptic problems.
Numer. Linear Algebra Appl., 1994

Using MPI - portable parallel programming with the message-passing interface.
MIT Press, ISBN: 978-0-262-57104-3, 1994

1993
Applications-driven parallel I/O.
Proceedings of Supercomputing '93, 1993

Parallel Solution of the Three-Dimensional, Time-Dependent Ginzburg-Landau Equation.
Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993

Panel - Software Tools for High-Performance Distributed Computing.
Proceedings of the Second International Symposium on High Performance Distributed Computing, 1993

1992
Domain Decomposition with Local Mesh Refinement.
SIAM J. Sci. Comput., 1992

Parallel Performance of Domain-Decomposed Preconditioned Krylov Methods for PDEs with Locally Uniform Refinement.
SIAM J. Sci. Comput., 1992

1991
Parallel Scalability of the Spectral Transform Method.
Proceedings of the Fifth SIAM Conference on Parallel Processing for Scientific Computing, 1991

1990
Krylov Methods Preconditioned with Incompletely Factored Matrices on the CM-2.
J. Parallel Distributed Comput., 1990

CLAM and CLAMShell: An Interactive Front-End for Parallel Computing and Visualization.
Proceedings of the 1990 International Conference on Parallel Processing, 1990

1989
Domain decomposition on parallel computers.
IMPACT Comput. Sci. Eng., 1989

Recursive Mesh Refinement on Hypercubes.
BIT, 1989

Balanced Divide-and-Conquer Algorithms for the Fine-Grained Parallel Direct Solution of Dense and Banded Triangular Linear Systems and their Connection Machine Implementation.
Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, 1989

Parallel Domain Decomposition with Local Mesh Refinement.
Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, 1989

1987
Solving PDEs on loosely-coupled parallel processors.
Parallel Comput., 1987

A Gray-Code Scheme for Mesh Refinement on Hypercubes.
Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing, 1987

A Parallel Version of the Fast Multipole Method-Invited Talk.
Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing, 1987

1985
A comparison of domain decomposition techniques for elliptic partial differential equations and their parallel implementation.
Proceedings of the Selected Papers from the Second Conference on Parallel Processing for Scientific Computing, 1985

1981
Numerical solution of transport equations.
PhD thesis, 1981

