Sudharshan S. Vazhkudai

According to our database, Sudharshan S. Vazhkudai authored at least 85 papers between 1999 and 2022.

Bibliography

2022
Exploiting CXL-based Memory for Distributed Deep Learning.
Proceedings of the 51st International Conference on Parallel Processing, 2022

2021
MOSIQS: Persistent Memory Object Storage With Metadata Indexing and Querying for Scientific Computing.
IEEE Access, 2021

Exploiting user activeness for data retention in HPC systems.
Proceedings of the International Conference for High Performance Computing, 2021

Interpreting Write Performance of Supercomputer I/O Systems with Regression Models.
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021

An Analysis of System Balance and Architectural Trends Based on Top500 Supercomputers.
Proceedings of the HPC Asia 2021: The International Conference on High Performance Computing in Asia-Pacific Region, 2021

2020
An Integrated Indexing and Search Service for Distributed File Systems.
IEEE Trans. Parallel Distributed Syst., 2020

Persistent Memory Object Storage and Indexing for Scientific Computing.
Proceedings of the IEEE/ACM Workshop on Memory Centric High Performance Computing, 2020

Understanding the Interplay between Hardware Errors and User Job Characteristics on the Titan Supercomputer.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

MARBLE: A Multi-GPU Aware Job Scheduler for Deep Learning on HPC Systems.
Proceedings of the 20th IEEE/ACM International Symposium on Cluster, 2020

2019
An Analysis Workflow-Aware Storage System for Multi-Core Active Flash Arrays.
IEEE Trans. Parallel Distributed Syst., 2019

A programmable shared-memory system for an array of processing-in-memory devices.
Clust. Comput., 2019

Applying Machine Learning to Understand Write Performance of Large-scale Parallel Filesystems.
Proceedings of the IEEE/ACM Fourth International Parallel Data Systems Workshop, 2019

End-to-end I/O portfolio for the summit supercomputing ecosystem.
Proceedings of the International Conference for High Performance Computing, 2019

Profiling the Usage of an Extreme-Scale Archival Storage System.
Proceedings of the 27th IEEE International Symposium on Modeling, 2019

Data Jockey: Automatic Data Management for HPC Multi-tiered Storage Systems.
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019

AHEAD: A Tool for Projecting Next-Generation Hardware Enhancements on GPU-Accelerated Systems.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

Evaluating Burst Buffer Placement in HPC Systems.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

2018
GPU age-aware scheduling to improve the reliability of leadership jobs on Titan.
Proceedings of the International Conference for High Performance Computing, 2018


Exploring the Optimal Platform Configuration for Power-Constrained HPC Workflows.
Proceedings of the 27th International Conference on Computer Communication and Networks, 2018

2017
GUIDE: a scalable information directory service to collect, federate, and analyze logs for operational insights into a leadership HPC facility.
Proceedings of the International Conference for High Performance Computing, 2017

Tagit: an integrated indexing and search service for file systems.
Proceedings of the International Conference for High Performance Computing, 2017

Scientific user behavior and data-sharing trends in a petascale file system.
Proceedings of the International Conference for High Performance Computing, 2017

Understanding object-level memory access patterns across the spectrum.
Proceedings of the International Conference for High Performance Computing, 2017

Toward Managing HPC Burst Buffers Effectively: Draining Strategy to Regulate Bursty I/O Behavior.
Proceedings of the 25th IEEE International Symposium on Modeling, 2017

Applying Graph Analytics to Understand Compute Core Usage and Publication Trends in a Petascale Supercomputing Facility.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

Effective Running of End-to-End HPC Workflows on Emerging Heterogeneous Architectures.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

AnalyzeThat: A Programmable Shared-Memory System for an Array of Processing-In-Memory Devices.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016
A multi-faceted approach to job placement for improved performance on extreme-scale systems.
Proceedings of the International Conference for High Performance Computing, 2016

Server-side log data analytics for I/O workload characterization and coordination on large shared storage systems.
Proceedings of the International Conference for High Performance Computing, 2016

Using Balanced Data Placement to Address I/O Contention in Production Environments.
Proceedings of the 28th International Symposium on Computer Architecture and High Performance Computing, 2016

Constellation: A science graph network for scalable data and knowledge discovery in extreme-scale scientific collaborations.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015
Realizing Accelerated Cost-Effective Distributed RAID.
Proceedings of the Handbook on Data Centers, 2015

A practical approach to reconciling availability, performance, and capacity in provisioning extreme-scale storage systems.
Proceedings of the International Conference for High Performance Computing, 2015

AnalyzeThis: an analysis workflow-aware storage system.
Proceedings of the International Conference for High Performance Computing, 2015

Understanding GPU errors on large-scale HPC systems and the implications for system design and operation.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

2014
Best Practices and Lessons Learned from Deploying and Operating Large-Scale Data-Centric Parallel File Systems.
Proceedings of the International Conference for High Performance Computing, 2014

Improving large-scale storage system performance via topology-aware and balanced data placement.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Automatic identification of application I/O signatures from noisy server-side traces.
Proceedings of the 12th USENIX conference on File and Storage Technologies, 2014

Lazy Checkpointing: Exploiting Temporal Locality in Failures to Mitigate Checkpointing Overheads on Extreme-Scale Systems.
Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014

2013
On Timely Staging of HPC Job Input Data.
IEEE Trans. Parallel Distributed Syst., 2013

Active flash: towards energy-efficient, in-situ data analytics on extreme-scale machines.
Proceedings of the 11th USENIX conference on File and Storage Technologies, 2013

2012
Reducing Data Movement Costs Using Energy-Efficient, Active Computation on SSD.
Proceedings of the 2012 Workshop on Power-Aware Computing Systems, HotPower'12, 2012

Active Flash: Out-of-core data analytics on flash storage.
Proceedings of the IEEE 28th Symposium on Mass Storage Systems and Technologies, 2012

On the Use of GPUs in Realizing Cost-Effective Distributed RAID.
Proceedings of the 20th IEEE International Symposium on Modeling, 2012

NVMalloc: Exposing an Aggregate SSD Store as a Memory Partition in Extreme-Scale Machines.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

2011
Timely Result-Data Offloading for Improved HPC Center Scratch Provisioning and Serviceability.
IEEE Trans. Parallel Distributed Syst., 2011

CATCH: A Cloud-Based Adaptive Data Transfer Service for HPC.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Provisioning a Multi-tiered Data Staging Area for Extreme-Scale Machines.
Proceedings of the 2011 International Conference on Distributed Computing Systems, 2011

2010
Functional Partitioning to Optimize End-to-End Performance on Many-core Architectures.
Proceedings of the Conference on High Performance Computing Networking, 2010

Reconciling scratch space consumption, exposure, and volatility to achieve timely staging of job input data.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

2009
Improving the availability of supercomputer job input data using temporal replication.
Comput. Sci. Res. Dev., 2009

Improving Data Availability for Better Access Performance: A Study on Caching Scientific Data on Distributed Desktop Workstations.
J. Grid Comput., 2009

Beyond Music Sharing: An Evaluation of Peer-to-Peer Data Dissemination Techniques in Large Scientific Collaborations.
J. Grid Comput., 2009

/scratch as a cache: rethinking HPC center scratch storage.
Proceedings of the 23rd international conference on Supercomputing, 2009

2008
Virtual Organizations [Guest Editors' Introduction].
IEEE Internet Comput., 2008

Timely offloading of result-data in HPC centers.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

On-the-Fly Recovery of Job Input Data in Supercomputers.
Proceedings of the 2008 International Conference on Parallel Processing, 2008

stdchk: A Checkpoint Storage System for Desktop Grid Computing.
Proceedings of the 28th IEEE International Conference on Distributed Computing Systems (ICDCS 2008), 2008

2007
Recovering transient data: automated on-demand data reconstruction and offloading for supercomputers.
ACM SIGOPS Oper. Syst. Rev., 2007

A Checkpoint Storage System for Desktop Grid Computing.
CoRR, 2007

The Neutron Science TeraGrid Gateway: a TeraGrid science gateway to support the Spallation Neutron Source.
Concurr. Comput. Pract. Exp., 2007

Optimizing center performance through coordinated data staging, scheduling and recovery.
Proceedings of the ACM/IEEE Conference on High Performance Networking and Computing, 2007

A result-data offloading service for HPC centers.
Proceedings of the 2nd International Petascale Data Storage Workshop (PDSW '07), 2007

A Java-based science portal for neutron scattering experiments.
Proceedings of the 5th International Symposium on Principles and Practice of Programming in Java, 2007

Are P2P Data-Dissemination Techniques Viable in Today's Data-Intensive Scientific Collaborations?
Proceedings of the Euro-Par 2007, 2007

2006
Constructing collaborative desktop storage caches for large scientific datasets.
ACM Trans. Storage, 2006

Coupling prefix caching and collective downloads for remote dataset access.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Positioning Dynamic Storage Caches for Transient Data.
Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

2005
FreeLoader: Scavenging Desktop Storage Resources for Scientific Data.
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Governor: Autonomic Throttling for Aggressive Idle Resource Scavenging.
Proceedings of the Second International Conference on Autonomic Computing (ICAC 2005), 2005

2004
Distributed Downloads of Bulk, Replicated Grid Data.
J. Grid Comput., 2004

On-demand Grid Storage Using Scavenging.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2004

2003
Using Regression Techniques to Predict Large Data Transfers.
Int. J. High Perform. Comput. Appl., 2003

Enabling the Co-Allocation of Grid Data Transfers.
Proceedings of the 4th International Workshop on Grid Computing (GRID 2003), 2003

2002
PODOS -- The design and implementation of a performance oriented Linux cluster.
Future Gener. Comput. Syst., 2002

Predicting the Performance of Wide Area Data Transfers.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

Predicting Sporadic Grid Data Transfers.
Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002

Using Disk Throughput Data in Predictions of End-to-End Grid Data Transfers.
Proceedings of the Grid Computing, 2002

2001
A Greedy Grid: The Grid Economic Engine Directive.
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

Replica Selection in the Globus Data Grid.
Proceedings of the First IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2001), 2001

Compute Power Market: Towards a Market-Oriented Grid.
Proceedings of the First IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2001), 2001

2000
The PODOS File System - Exploiting the High-Speed Communication Subsystem.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2000

A Reusable Software Framework for Distributed Decision-Making Protocols.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 2000

1999
A High Performance Communication Subsystem for PODOS.
Proceedings of the International Workshop on Cluster Computing (IWCC '99), 1999

