Sudharshan S. Vazhkudai

IEEE Access, 2021

Exploiting user activeness for data retention in HPC systems.

Proceedings of the International Conference for High Performance Computing, 2021

Interpreting Write Performance of Supercomputer I/O Systems with Regression Models.

Feiyi Wang

Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

An Analysis of System Balance and Architectural Trends Based on Top500 Supercomputers.

Proceedings of the HPC Asia 2021: The International Conference on High Performance Computing in Asia-Pacific Region, 2021

2020

An Integrated Indexing and Search Service for Distributed File Systems.

IEEE Trans. Parallel Distributed Syst., 2020

Persistent Memory Object Storage and Indexing for Scientific Computing.

Jin-Suk Ma

Myeong-Hoon Oh

Proceedings of the IEEE/ACM Workshop on Memory Centric High Performance Computing, 2020

Understanding the Interplay between Hardware Errors and User Job Characteristics on the Titan Supercomputer.

Ross G. Miller

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

MARBLE: A Multi-GPU Aware Job Scheduler for Deep Learning on HPC Systems.

Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020

2019

An Analysis Workflow-Aware Storage System for Multi-Core Active Flash Arrays.

Geoffroy Vallée

IEEE Trans. Parallel Distributed Syst., 2019

A programmable shared-memory system for an array of processing-in-memory devices.

Sangkeun Lee

Clust. Comput., 2019

Applying Machine Learning to Understand Write Performance of Large-scale Parallel Filesystems.

Feiyi Wang

Proceedings of the IEEE/ACM Fourth International Parallel Data Systems Workshop, 2019

End-to-end I/O portfolio for the summit supercomputing ecosystem.

Sarp Oral

Verónica G. Vergara Larrea

Proceedings of the International Conference for High Performance Computing, 2019

Profiling the Usage of an Extreme-Scale Archival Storage System.

Proceedings of the 27th IEEE International Symposium on Modeling, 2019

Data Jockey: Automatic Data Management for HPC Multi-tiered Storage Systems.

Woong Shin

Christopher Brumgard

Bing Xie

Devarshi Ghoshal

Sarp Oral

Lavanya Ramakrishnan

Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

AHEAD: A Tool for Projecting Next-Generation Hardware Enhancements on GPU-Accelerated Systems.

Hazem A. Abdelhafez

Christopher Zimmer

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

Evaluating Burst Buffer Placement in HPC Systems.

Misbah Mubarak

Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

2018

GPU age-aware scheduling to improve the reliability of leadership jobs on Titan.

Christopher Zimmer

Don Maxwell

Stephen Taylor McNally

Scott Atchley

Proceedings of the International Conference for High Performance Computing, 2018

The design, deployment, and evaluation of the CORAL pre-exascale systems.

Proceedings of the International Conference for High Performance Computing, 2018

Exploring the Optimal Platform Configuration for Power-Constrained HPC Workflows.

Kun Tang

Xubin He

Proceedings of the 27th International Conference on Computer Communication and Networks, 2018

2017

GUIDE: a scalable information directory service to collect, federate, and analyze logs for operational insights into a leadership HPC facility.

Proceedings of the International Conference for High Performance Computing, 2017

Tagit: an integrated indexing and search service for file systems.

Geoffroy R. Vallée

Proceedings of the International Conference for High Performance Computing, 2017

Scientific user behavior and data-sharing trends in a petascale file system.

Proceedings of the International Conference for High Performance Computing, 2017

Understanding object-level memory access patterns across the spectrum.

Wei Xue

Daniel Sánchez

Proceedings of the International Conference for High Performance Computing, 2017

Toward Managing HPC Burst Buffers Effectively: Draining Strategy to Regulate Bursty I/O Behavior.

Proceedings of the 25th IEEE International Symposium on Modeling, 2017

Applying Graph Analytics to Understand Compute Core Usage and Publication Trends in a Petascale Supercomputing Facility.

Sangkeun Lee

Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

Effective Running of End-to-End HPC Workflows on Emerging Heterogeneous Architectures.

Kun Tang

Xubin He

Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

AnalyzeThat: A Programmable Shared-Memory System for an Array of Processing-In-Memory Devices.

Sangkeun Lee

Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016

A multi-faceted approach to job placement for improved performance on extreme-scale systems.

Christopher Zimmer

Scott Atchley

Carl Albing

Proceedings of the International Conference for High Performance Computing, 2016

Server-side log data analytics for I/O workload characterization and coordination on large shared storage systems.

Yang Liu

Proceedings of the International Conference for High Performance Computing, 2016

Using Balanced Data Placement to Address I/O Contention in Production Environments.

Proceedings of the 28th International Symposium on Computer Architecture and High Performance Computing, 2016

Constellation: A science graph network for scalable data and knowledge discovery in extreme-scale scientific collaborations.

Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015

Realizing Accelerated Cost-Effective Distributed RAID.

Aleksandr Khasymski

M. Mustafa Rafique

Dimitrios S. Nikolopoulos

Proceedings of the Handbook on Data Centers, 2015

A practical approach to reconciling availability, performance, and capacity in provisioning extreme-scale storage systems.

Qing Cao

Proceedings of the International Conference for High Performance Computing, 2015

AnalyzeThis: an analysis workflow-aware storage system.

Proceedings of the International Conference for High Performance Computing, 2015

Understanding GPU errors on large-scale HPC systems and the implications for system design and operation.

Philippe Olivier Alexandre Navaux

Daniel Oliveira

Dave Londo

Nathan DeBardeleben

Luigi Carro

Arthur S. Bland

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

2014

Best Practices and Lessons Learned from Deploying and Operating Large-Scale Data-Centric Parallel File Systems.

Proceedings of the International Conference for High Performance Computing, 2014

Improving large-scale storage system performance via topology-aware and balanced data placement.

Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Automatic identification of application I/O signatures from noisy server-side traces.

Yang Liu

Proceedings of the 12th USENIX conference on File and Storage Technologies, 2014

Lazy Checkpointing: Exploiting Temporal Locality in Failures to Mitigate Checkpointing Overheads on Extreme-Scale Systems.

Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014

2013

On Timely Staging of HPC Job Input Data.

IEEE Trans. Parallel Distributed Syst., 2013

Active flash: towards energy-efficient, in-situ data analytics on extreme-scale machines.

Simona Boboila

Proceedings of the 11th USENIX conference on File and Storage Technologies, 2013

2012

Reducing Data Movement Costs Using Energy-Efficient, Active Computation on SSD.

Proceedings of the 2012 Workshop on Power-Aware Computing Systems, HotPower'12, 2012

Active Flash: Out-of-core data analytics on flash storage.

Simona Boboila

Peter Desnoyers

Galen M. Shipman

Proceedings of the IEEE 28th Symposium on Mass Storage Systems and Technologies, 2012

On the Use of GPUs in Realizing Cost-Effective Distributed RAID.

Aleksandr Khasymski

M. Mustafa Rafique

Dimitrios S. Nikolopoulos

Proceedings of the 20th IEEE International Symposium on Modeling, 2012

NVMalloc: Exposing an Aggregate SSD Store as a Memory Partition in Extreme-Scale Machines.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

2011

Timely Result-Data Offloading for Improved HPC Center Scratch Provisioning and Serviceability.

IEEE Trans. Parallel Distributed Syst., 2011

CATCH: A Cloud-Based Adaptive Data Transfer Service for HPC.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Provisioning a Multi-tiered Data Staging Area for Extreme-Scale Machines.

Ramya Prabhakar

Proceedings of the 2011 International Conference on Distributed Computing Systems, 2011

2010

Functional Partitioning to Optimize End-to-End Performance on Many-core Architectures.

Min Li

Proceedings of the Conference on High Performance Computing Networking, 2010

Reconciling scratch space consumption, exposure, and volatility to achieve timely staging of job input data.

Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

2009

Improving the availability of supercomputer job input data using temporal replication.

Frank Mueller

Comput. Sci. Res. Dev., 2009

Improving Data Availability for Better Access Performance: A Study on Caching Scientific Data on Distributed Desktop Workstations.

J. Grid Comput., 2009

Beyond Music Sharing: An Evaluation of Peer-to-Peer Data Dissemination Techniques in Large Scientific Collaborations.

J. Grid Comput., 2009

/scratch as a cache: rethinking HPC center scratch storage.

Proceedings of the 23rd international conference on Supercomputing, 2009

2008

Virtual Organizations [Guest Editors' Introduction].

Munindar P. Singh

IEEE Internet Comput., 2008

Timely offloading of result-data in HPC centers.

Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

On-the-Fly Recovery of Job Input Data in Supercomputers.

Frank Mueller

Proceedings of the 2008 International Conference on Parallel Processing, 2008

stdchk: A Checkpoint Storage System for Desktop Grid Computing.

Samer Al-Kiswany

Abdullah Gharaibeh

Proceedings of the 28th IEEE International Conference on Distributed Computing Systems (ICDCS 2008), 2008

2007

Recovering transient data: automated on-demand data reconstruction and offloading for supercomputers.

ACM SIGOPS Oper. Syst. Rev., 2007

A Checkpoint Storage System for Desktop Grid Computing.

Samer Al-Kiswany

CoRR, 2007

The Neutron Science TeraGrid Gateway: a TeraGrid science gateway to support the Spallation Neutron Source.

Nithya N. Vijayakumar

Concurr. Comput. Pract. Exp., 2007

Optimizing center performance through coordinated data staging, scheduling and recovery.

Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

A result-data offloading service for HPC centers.

Proceedings of the 2nd International Petascale Data Storage Workshop (PDSW '07), 2007

A Java-based science portal for neutron scattering experiments.

James Arthur Kohl

Jens Schwidder

Proceedings of the 5th International Symposium on Principles and Practice of Programming in Java, 2007

Are P2P Data-Dissemination Techniques Viable in Today's Data-Intensive Scientific Collaborations?

Proceedings of the Euro-Par 2007, 2007

2006

Constructing collaborative desktop storage caches for large scientific datasets.

Jonathan W. Strickland

Nandan Tammineedi

Tyler A. Simon

Stephen L. Scott

ACM Trans. Storage, 2006

Coupling prefix caching and collective downloads for remote dataset access.

Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Positioning Dynamic Storage Caches for Transient Data.

Douglas Thain

Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

2005

FreeLoader: Scavenging Desktop Storage Resources for Scientific Data.

Jonathan W. Strickland

Nandan Tammineedi

Stephen L. Scott

Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Governor: Autonomic Throttling for Aggressive Idle Resource Scavenging.

Jonathan W. Strickland

Proceedings of the Second International Conference on Autonomic Computing (ICAC 2005), 2005

2004

Distributed Downloads of Bulk, Replicated Grid Data.

J. Grid Comput., 2004

On-demand Grid Storage Using Scavenging.

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2004

2003

Using Regression Techniques to Predict Large Data Transfers.

Int. J. High Perform. Comput. Appl., 2003

Enabling the Co-Allocation of Grid Data Transfers.

Proceedings of the 4th International Workshop on Grid Computing (GRID 2003), 2003

2002

PODOS -- The design and implementation of a performance oriented Linux cluster.

Jeelani Syed

P. Tobin Maginnis

Future Gener. Comput. Syst., 2002

Predicting the Performance of Wide Area Data Transfers.

Ian T. Foster

Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Predicting Sporadic Grid Data Transfers.

Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002

Using Disk Throughput Data in Predictions of End-to-End Grid Data Transfers.

Proceedings of the Grid Computing, 2002

2001

A Greedy Grid: The Grid Economic Engine Directive.

Gregor von Laszewski

Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

Replica Selection in the Globus Data Grid.

Steven Tuecke

Ian T. Foster

Proceedings of the First IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2001), 2001

Compute Power Market: Towards a Market-Oriented Grid.

Rajkumar Buyya

Proceedings of the First IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2001), 2001

2000

The PODOS File System - Exploiting the High-Speed Communication Subsystem.

P. Tobin Maginnis

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2000

A Reusable Software Framework for Distributed Decision-Making Protocols.

H. Conrad Cunningham

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2000

1999

A High Performance Communication Subsystem for PODOS.
