Nathan R. Tallent

IEEE Trans. Parallel Distributed Syst., 2022

QuaL²M: Learning Quantitative Performance of Latency-Sensitive Code.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Bit-GraphBLAS: Bit-Level Optimizations of Matrix-Centric Graph Processing on GPU.

Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

ReWorDs 2022 Keynote: Towards Orchestrating Distributed & Data-Intensive Workflows.

Proceedings of the 18th IEEE International Conference on e-Science, 2022

MemGaze: Rapid and Effective Load-Level Memory Trace Analysis.

Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021

EXAGRAPH: Graph and combinatorial methods for enabling exascale applications.

Sivasankaran Rajamanickam

Oguz Selvitopi

Antonino Tumeo

Int. J. High Perform. Comput. Appl., 2021

Single-node partitioned-memory for huge graph analytics: cost and performance trade-offs.

Sayan Ghosh

Marco Minutoli

Ramesh Peri

Ananth Kalyanaraman

Proceedings of the International Conference for High Performance Computing, 2021

Diolkos: improving ethernet throughput through dynamic port selection.

Proceedings of the CF '21: Computing Frontiers Conference, 2021

WinnowML: Stable feature selection for maximizing prediction accuracy of time-based system modeling.

Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), 2021

2020

Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect.

IEEE Trans. Parallel Distributed Syst., 2020

Scaling Deep Learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing.

Future Gener. Comput. Syst., 2020

Rapid Memory Footprint Access Diagnostics.

Ozgur O. Kilic

Ryan D. Friese

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

Geomancy: Automated Performance Enhancement through Data Layout Optimization.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

Vertex Reordering for Real-World Graphs and Applications: An Empirical Evaluation.

Reet Barik

Marco Minutoli

Ananth Kalyanaraman

Proceedings of the IEEE International Symposium on Workload Characterization, 2020

Effectively Using Remote I/O For Work Composition in Distributed Workflows.

Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

2019

Rapidly Measuring Loop Footprints.

Ozgur O. Kilic

Ryan D. Friese

Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

TAZeR: Hiding the Cost of Remote I/O in Distributed Scientific Workflows.

Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

2018

Stochastic Programming Approach for Resource Selection Under Demand Uncertainty.

Tanveer Hossain Bhuiyan

Proceedings of the Job Scheduling Strategies for Parallel Processing, 2018

Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite.

Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018

Optimizing Distributed Data-Intensive Workflows.

Ryan D. Friese

Malachi Schram

Kevin J. Barker

Proceedings of the IEEE International Conference on Cluster Computing, 2018

Deep Learning for Enhancing Fault Tolerant Capabilities of Scientific Workflows.

Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

2017

Representative paths analysis.

Darren J. Kerbyson

Adolfy Hoisie

Proceedings of the International Conference for High Performance Computing, 2017

Evaluating On-Node GPU Interconnects for Deep Learning Workloads.

Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2017

Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Generating Performance Models for Irregular Applications.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

2016

Assessing Advanced Technology in CENATE.

Proceedings of the IEEE International Conference on Networking, 2016

Modeling the Impact of Silicon Photonics on Graph Analytics.

Daniel G. Chavarría-Miranda

Kevin J. Barker

Antonino Tumeo

Andrès Márquez

Darren J. Kerbyson

Adolfy Hoisie

Proceedings of the IEEE International Conference on Networking, 2016

Fault Modeling of Extreme Scale Applications Using Machine Learning.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Algorithm and Architecture Independent Benchmarking with SEAK.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

2015

A case for application-oblivious energy-efficient MPI runtime.

Proceedings of the International Conference for High Performance Computing, 2015

Towards efficient scheduling of data intensive high energy physics workflows.

Proceedings of the 10th Workshop on Workflows in Support of Large-Scale Science, 2015

Diagnosing the causes and severity of one-sided message contention.

Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Power and performance trade-offs for Space Time Adaptive Processing.

Proceedings of the 26th IEEE International Conference on Application-specific Systems, 2015

2014

Palm: easing the burden of analytical performance modeling.

Adolfy Hoisie

Proceedings of the 2014 International Conference on Supercomputing, 2014

2011

Using Sampling to Understand Parallel Program Performance.

Proceedings of the Tools for High Performance Computing 2011, 2011

Scalable fine-grained call path tracing.

Michael Franco

Reed Landrum

Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

2010

HPCTOOLKIT: tools for performance analysis of optimized parallel programs.

Concurr. Comput. Pract. Exp., 2010

Scalable Identification of Load Imbalance in Parallel Executions Using Call Path Profiles.

Proceedings of the Conference on High Performance Computing Networking, 2010

Analyzing lock contention in multithreaded applications.

Allan Porterfield

Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Effectively Presenting Call Path Profiles of Application Performance.

Proceedings of the 39th International Conference on Parallel Processing, 2010

2009

Identifying Performance Bottlenecks in Work-Stealing Computations.

Computer, 2009

Diagnosing performance bottlenecks in emerging petascale applications.

Michael W. Fagan

Mark Krentel

Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Effective performance measurement and analysis of multithreaded applications.

Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009

Binary analysis for measurement and attribution of program performance.

Michael W. Fagan

Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, 2009

2008

OpenAD/F: A Modular Open-Source Tool for Automatic Differentiation of Fortran Codes.

Michelle Mills Strout

Patrick Heimbach

Chris Hill

Carl Wunsch

ACM Trans. Math. Softw., 2008

2002

HPCVIEW: A Tool for Top-down Analysis of Node Performance.

Robert J. Fowler

Gabriel Marin