Saurabh Gupta

Affiliations:

Intel Labs
Oak Ridge National Laboratory, USA

According to our database, Saurabh Gupta authored at least 23 papers between 2013 and 2021.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2021

Study of interconnect errors, network congestion, and applications characteristics for throttle prediction on a large scale HPC system.

J. Parallel Distributed Comput., 2021

2018

Exploring the Optimal Platform Configuration for Power-Constrained HPC Workflows.

Kun Tang

Xubin He

Saurabh Gupta

Sudharshan S. Vazhkudai

Devesh Tiwari

Proceedings of the 27th International Conference on Computer Communication and Networks, 2018

Machine Learning Models for GPU Error Prediction in a Large Scale HPC System.

Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2018

Understanding and Analyzing Interconnect Errors and Network Congestion on a Large Scale HPC System.

Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2018

2017

Failures in large scale systems: long-term measurement, analysis, and implications.

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2017

Characterizing Temperature, Power, and Soft-Error Behaviors in Data Center Systems: Insights, Challenges, and Opportunities.

Proceedings of the 25th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, 2017

Effective Running of End-to-End HPC Workflows on Emerging Heterogeneous Architectures.

Kun Tang

Devesh Tiwari

Saurabh Gupta

Sudharshan S. Vazhkudai

Xubin He

Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

2016

A multi-faceted approach to job placement for improved performance on extreme-scale systems.

Christopher Zimmer

Saurabh Gupta

Scott Atchley

Sudharshan S. Vazhkudai

Carl Albing

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2016

Reducing Waste in Extreme Scale Systems through Introspective Analysis.

Leonardo Arturo Bautista-Gomez

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Adaptive Power Profiling for Many-Core HPC Architectures.

Proceedings of the 2016 IEEE International Conference on Autonomic Computing, 2016

A large-scale study of soft-errors on GPUs in the field.

Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Power-Capping Aware Checkpointing: On the Interplay Among Power-Capping, Temperature, Reliability, Performance, and Energy.

Proceedings of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2016

A model-driven approach to warp/thread-block level GPU cache bypassing.

Proceedings of the 53rd Annual Design Automation Conference, 2016

2015

Reliability lessons learned from GPU experience with the Titan supercomputer at Oak Ridge leadership computing facility.

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2015

Spatial Locality-Aware Cache Partitioning for Effective Cache Sharing.

Saurabh Gupta

Huiyang Zhou

Proceedings of the 44th International Conference on Parallel Processing, 2015

Understanding GPU errors on large-scale HPC systems and the implications for system design and operation.

Sudharshan S. Vazhkudai

Daniel Oliveira

Dave Londo

Nathan DeBardeleben

Philippe Olivier Alexandre Navaux

Luigi Carro

Arthur S. Bland

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Understanding and Exploiting Spatial Properties of System Failures on Extreme-Scale HPC Systems.

Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015

2014

Best Practices and Lessons Learned from Deploying and Operating Large-Scale Data-Centric Parallel File Systems.

Sudharshan S. Vazhkudai

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2014

Improving large-scale storage system performance via topology-aware and balanced data placement.

Sudharshan S. Vazhkudai

Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Lazy Checkpointing: Exploiting Temporal Locality in Failures to Mitigate Checkpointing Overheads on Extreme-Scale Systems.

Devesh Tiwari

Saurabh Gupta

Sudharshan S. Vazhkudai

Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014

2013

Locality principle revisited: A probability-based quantitative approach.

J. Parallel Distributed Comput., 2013

Analyzing locality of memory references in GPU architectures.

Saurabh Gupta

Ping Xiang

Huiyang Zhou

Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness, 2013

Adaptive Cache Bypassing for Inclusive Last Level Caches.

Saurabh Gupta

Hongliang Gao

Huiyang Zhou

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
