Garth A. Gibson

Orcid: 0000-0002-6656-7080

Affiliations:
  • Carnegie Mellon University, Pittsburgh, USA


According to our database, Garth A. Gibson authored at least 127 papers between 1986 and 2021.

Awards

ACM Fellow

ACM Fellow 2012, "For contributions to the performance and reliability of storage systems."

IEEE Fellow

IEEE Fellow 2014, "For contributions to the performance and reliability of transformative storage systems".

Bibliography

2021
DeltaFS: a scalable no-ground-truth filesystem for massively-parallel computing.
Proceedings of the International Conference for High Performance Computing, 2021

2020
Streaming Data Reorganization at Scale with DeltaFS Indexed Massive Directories.
ACM Trans. Storage, 2020

Mochi: Composing Data Services for High-Performance Computing Environments.
J. Comput. Sci. Technol., 2020

2019
SysML: The New Frontier of Machine Learning Systems.
CoRR, 2019

STRADS-AP: Simplifying Distributed Machine Learning Programming without Introducing a New Programming Model.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Priority-based Parameter Propagation for Distributed DNN Training.
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

Automating Dependence-Aware Parallelization of Machine Learning Training on Distributed Shared Memory.
Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, March 25-28, 2019, 2019

Compact Filters for Fast Online Data Partitioning.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

2018
Litz: Elastic Framework for High-Performance Distributed Machine Learning.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

On the diversity of cluster workloads and its impact on research results.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

Scaling embedded in-situ indexing with deltaFS.
Proceedings of the International Conference for High Performance Computing, 2018

A Case for Packing and Indexing in Cloud File Systems.
Proceedings of the 10th USENIX Workshop on Hot Topics in Cloud Computing, 2018

2017
Evolving Ext4 for Shingled Disks.
login Usenix Mag., 2017

SlimDB: A Space-Efficient Key-Value Storage Engine For Semi-Sorted Data.
Proc. VLDB Endow., 2017

Software-defined storage for fast trajectory queries using a deltaFS indexed massive directory.
Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems, 2017

2016
Stateless model checking with data-race preemption points.
Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, 2016

STRADS: a distributed framework for scheduled model parallel machine learning.
Proceedings of the Eleventh European Conference on Computer Systems, 2016

Addressing the straggler problem for iterative convergent parallel ML.
Proceedings of the Seventh ACM Symposium on Cloud Computing, 2016

2015
DeltaFS: exascale file systems scale better without dedicated servers.
Proceedings of the 10th Parallel Data Storage Workshop, 2015

Caveat-Scriptor: Write Anywhere Shingled Disks.
Proceedings of the 7th USENIX Workshop on Hot Topics in Storage and File Systems, 2015

ShardFS vs. IndexFS: replication vs. caching strategies for distributed metadata management in cloud storage systems.
Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

Managed communication and consistency for fast data-parallel iterative analytics.
Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

High-Performance Distributed ML at Scale through Parameter Server Consistency Models.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Primitives for Dynamic Big Model Parallelism.
CoRR, 2014

Exploiting Bounded Staleness to Speed Up Big Data Analytics.
Proceedings of the 2014 USENIX Annual Technical Conference, 2014

BatchFS: scaling the file system control plane with client-funded metadata servers.
Proceedings of the 9th Parallel Data Storage Workshop, 2014

IndexFS: Scaling File System Metadata Performance with Stateless Caching and Bulk Insertion.
Proceedings of the International Conference for High Performance Computing, 2014

On Model Parallelization and Scheduling Strategies for Distributed Machine Learning.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Will They Blend?: Exploring Big Data Computation Atop Traditional HPC NAS Storage.
Proceedings of the IEEE 34th International Conference on Distributed Computing Systems, 2014

Exploiting iterative-ness for parallel ML computations.
Proceedings of the ACM Symposium on Cloud Computing, 2014

2013
PRObE: A Thousand-Node Experimental Cluster for Computer Systems Research.
login Usenix Mag., 2013

Shingled Magnetic Recording: Areal Density Increase Requires New Data Management.
login Usenix Mag., 2013

Structure-Aware Dynamic Scheduler for Parallel Machine Learning.
CoRR, 2013

TABLEFS: Enhancing Metadata Efficiency in the Local File System.
Proceedings of the 2013 USENIX Annual Technical Conference, 2013

Parrot: a practical runtime for deterministic, stable, and reliable threads.
Proceedings of the ACM SIGOPS 24th Symposium on Operating Systems Principles, 2013

Structuring PLFS for extensibility.
Proceedings of the 8th Parallel Data Storage Workshop, 2013

More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

I/O acceleration with pattern detection.
Proceedings of the 22nd International Symposium on High-Performance Parallel and Distributed Computing, 2013

Solving the Straggler Problem with Bounded Staleness.
Proceedings of the 14th Workshop on Hot Topics in Operating Systems, 2013

2012
File system virtual appliances: Portable file system implementations.
ACM Trans. Storage, 2012

Poster: Hadoop's Adolescence; A Comparative Workloads Analysis from Three Research Clusters.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Hadoop's Adolescence; A Comparative Workloads Analysis from Three Research Clusters.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

A Case for Scaling HPC Metadata Performance through De-specialization.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Discovering Structure in Unstructured I/O.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Poster: PLFS/HDFS: HPC Applications on Cloud Storage.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Scalable Dynamic Partial Order Reduction.
Proceedings of the Runtime Verification, Third International Conference, 2012

The Power and Challenges of Transformative I/O.
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012

Fast Approximate Matching of Astronomical Objects.
Proceedings of the 2012 IEEE International Conference on Cluster Computing Workshops, 2012

2011
Recipes for Baking Black Forest Databases - Building and Querying Black Hole Merger Trees from Cosmological Simulations.
Proceedings of the Scientific and Statistical Database Management, 2011

dBug: Systematic Testing of Unmodified Distributed and Multi-threaded Systems.
Proceedings of the Model Checking Software, 2011

On the duality of data-intensive file system design: reconciling HDFS and PVFS.
Proceedings of the Conference on High Performance Computing Networking, 2011

Six degrees of scientific data: reading patterns for extreme scale science IO.
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011

Scale and Concurrency of GIGA+: File System Directories with Millions of Files.
Proceedings of the 9th USENIX Conference on File and Storage Technologies, 2011

YCSB++: benchmarking and performance debugging advanced features in scalable table stores.
Proceedings of the ACM Symposium on Cloud Computing in conjunction with SOSP 2011, 2011

2010
A Large-Scale Study of Failures in High-Performance Computing Systems.
IEEE Trans. Dependable Secur. Comput., 2010

dBug: Systematic Evaluation of Distributed Systems.
Proceedings of the 5th International Workshop on Systems Software Verification, 2010

DiscFinder: a data-intensive scalable cluster finder for astrophysics.
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010

2009
Safe and effective fine-grained TCP retransmissions for datacenter communication.
Proceedings of the ACM SIGCOMM 2009 Conference on Applications, 2009

PLFS: a checkpoint filesystem for parallel applications.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

In Search of an API for Scalable File Systems: Under the Table or Above It?
Proceedings of the Workshop on Hot Topics in Cloud Computing, 2009

Parallel Data Storage and Access.
Proceedings of the Scientific Data Management - Challenges, Technology, and Deployment., 2009

2008
Scalable Performance of the Panasas Parallel File System.
Proceedings of the 6th USENIX Conference on File and Storage Technologies, 2008

Measurement and Analysis of TCP Throughput Collapse in Cluster-based Storage Systems.
Proceedings of the 6th USENIX Conference on File and Storage Technologies, 2008

2007
Understanding disk failure rates: What does an MTTF of 1,000,000 hours mean to you?
ACM Trans. Storage, 2007

GIGA+: scalable directories for shared file systems.
Proceedings of the 2nd International Petascale Data Storage Workshop (PDSW '07), 2007

On application-level approaches to avoiding TCP throughput collapse in cluster-based storage systems.
Proceedings of the 2nd International Petascale Data Storage Workshop (PDSW '07), 2007

Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?
Proceedings of the 5th USENIX Conference on File and Storage Technologies, 2007

2006
Poster reception - The Computer Failure Data Repository (CFDR): collecting, sharing and analyzing failure data.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

High performance NFS - High performance NFS: facts and fictions.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Exotic technologies II - HPC storage systems of 2020.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Storage solutions I - Advances in RAID and HPC storage reliability.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Petascale data storage - Petascale data storage.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

PaScal - a new parallel and scalable server IO networking infrastructure for supporting global storage/file systems in large-size Linux clusters.
Proceedings of the 25th IEEE International Performance Computing and Communications Conference, 2006

2005
Scheduling speculative tasks in a compute farm.
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

2004
Managing Scalability in Object Storage Systems for HPC Linux Clusters.
Proceedings of the 21st IEEE Conference on Mass Storage Systems and Technologies / 12th NASA Goddard Conference on Mass Storage Systems and Technologies, 2004

Cluster scheduling for explicitly-speculative tasks.
Proceedings of the 18th Annual International Conference on Supercomputing, 2004

Scaling File Service Up and Out.
Proceedings of the FAST '04 Conference on File and Storage Technologies, March 31, 2004

2001
Active Disks for Large-Scale Data Processing.
Computer, 2001

Hinting for Goodness' Sake.
Proceedings of HotOS-VIII: 8th Workshop on Hot Topics in Operating Systems, 2001

2000
A mobile agent's effects on file service.
IEEE Concurr., 2000

Network attached storage architecture.
Commun. ACM, 2000

Dynamic Function Placement for Data-Intensive Cluster Computing.
Proceedings of the General Track: 2000 USENIX Annual Technical Conference, 2000

Easing the management of data-parallel systems via adaptation.
Proceedings of the 9th ACM SIGOPS European Workshop, 2000

Highly Concurrent Shared Storage.
Proceedings of the 20th International Conference on Distributed Computing Systems, 2000

1999
Implementing Lottery Scheduling: Matching the Specializations in Traditional Schedulers.
Proceedings of the 1999 USENIX Annual Technical Conference, 1999

Informed Prefetching of Collective Input/Output Requests.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1999

Automatic I/O Hint Generation Through Speculative Execution.
Proceedings of the Third USENIX Symposium on Operating Systems Design and Implementation (OSDI), 1999

The Effects of a Mobile Agent on File Service.
Proceedings of the 1st International Symposium on Agent Systems and Applications / 3rd International Symposium on Mobile Agents (ASA/MA '99), 1999

Integrity and Performance in Network Attached Storage.
Proceedings of the High Performance Computing, Second International Symposium, 1999

1998
Active Storage for Large-Scale Data Mining and Multimedia.
Proceedings of the VLDB'98, 1998

A Cost-Effective, High-Bandwidth Storage Architecture.
Proceedings of the ASPLOS-VIII Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems, 1998

1997
Prefetching Over a Network: Early Experience With CTIP.
SIGMETRICS Perform. Evaluation Rev., 1997

Informed Multi-Process Prefetching and Caching.
Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 1997

File Server Scaling with Network-Attached Secure Disks.
Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 1997

Task Force on Network Storage Architecture: Abstracting the storage interface.
Proceedings of the 30th Annual Hawaii International Conference on System Sciences (HICSS-30), 1997

1996
Self-Managing Network-Attached Storage.
ACM Comput. Surv., 1996

Strategic Directions in Storage I/O Issues in Large-Scale Computing.
ACM Comput. Surv., 1996

RAIDframe: Rapid Prototyping for Disk Arrays.
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, 1996

A Trace-Driven Comparison of Algorithms for Parallel Prefetching and Caching.
Proceedings of the Second USENIX Symposium on Operating Systems Design and Implementation (OSDI), 1996

1995
Informed Prefetching and Caching.
Proceedings of the Fifteenth ACM Symposium on Operating System Principles, 1995

Storage Technology: RAID and Beyond.
Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, 1995

The Scotch Parallel Storage Systems.
Proceedings of the COMPCON '95: Technologies for the Information Superhighway, 1995

1994
Parity-Logging Disk Arrays.
ACM Trans. Comput. Syst., 1994

Architectures and Algorithms for On-Line Failure Recovery in Redundant Disk Arrays.
Distributed Parallel Databases, 1994

RAID: High-Performance, Reliable Secondary Storage.
ACM Comput. Surv., 1994

Coding Techniques for Handling Failures in Large Disk Arrays.
Algorithmica, 1994

Exposing I/O Concurrency with Informed Prefetching.
Proceedings of the Third International Conference on Parallel and Distributed Information Systems (PDIS 94), 1994

RAID-II: A High-Bandwidth Network File Server.
Proceedings of the 21st Annual International Symposium on Computer Architecture. Chicago, 1994

Backward Error Recovery in Redundant Disk Arrays.
Proceedings of the 20th International Computer Measurement Group Conference, 1994

1993
A Status Report on Research in Transparent Informed Prefetching.
ACM SIGOPS Oper. Syst. Rev., 1993

Designing Disk Arrays for High Data Reliability.
J. Parallel Distributed Comput., 1993

Performance and Reliability in Disk Arrays - Tutorial.
Proceedings of the 2nd International Conference on Parallel and Distributed Information Systems (PDIS 1993), 1993

Parity Logging Overcoming the Small Write Problem in Redundant Disk Arrays.
Proceedings of the 20th Annual International Symposium on Computer Architecture, 1993

Fast, On-Line Failure Recovery in Redundant Disk Arrays.
Proceedings of the Digest of Papers: FTCS-23, 1993

1992
Parity Declustering for Continuous Operation in Redundant Disk Arrays.
Proceedings of the ASPLOS-V Proceedings, 1992

Redundant disk arrays - reliable, parallel secondary storage.
ACM distinguished dissertations, MIT Press, ISBN: 978-0-262-07142-0, 1992

1991
Are Disk Arrays Useful for Database Systems? (Panel).
Proceedings of the First International Conference on Parallel and Distributed Information Systems (PDIS 1991), 1991

1990
Verifying a Multiprocessor Cache Controller Using Random Test Generation.
IEEE Des. Test Comput., 1990

An Evaluation of Redundant Arrays of Disks Using an Amdahl 5890.
Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems, 1990

1989
A VLSI chip set for a multiprocessor workstation. II. A memory management unit and cache controller.
IEEE J. Solid State Circuits, December, 1989

Disk system architectures for high performance computing.
Proc. IEEE, 1989

How reliable is a RAID?
Proceedings of the Thirty-Fourth IEEE Computer Society International Conference: Intellectual Leverage, 1989

Introduction to redundant arrays of inexpensive disks (RAID).
Proceedings of the Thirty-Fourth IEEE Computer Society International Conference: Intellectual Leverage, 1989

Performance and Reliability in Redundant Arrays of Inexpensive Disks.
Proceedings of the 15th International Computer Measurement Group Conference, 1989

Failure Correction Techniques for Large Disk Arrays.
Proceedings of the ASPLOS-III Proceedings, 1989

1988
A Case for Redundant Arrays of Inexpensive Disks (RAID).
Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data, 1988

1986
An In-Cache Address Translation Mechanism.
Proceedings of the 13th Annual Symposium on Computer Architecture, Tokyo, Japan, June 1986, 1986

