Devesh Tiwari
Orcid: 0000-0002-7253-2458
According to our database, Devesh Tiwari authored at least 129 papers between 2009 and 2024.
Bibliography
2024
StarShip: Mitigating I/O Bottlenecks in Serverless Computing for Scientific Workflows.
Proc. ACM Meas. Anal. Comput. Syst., 2024
The globus compute dataset: An open function-as-a-service dataset from the edge to the cloud.
Future Gener. Comput. Syst., 2024
OrganiQ: Mitigating Classical Resource Bottlenecks of Quantum Generative Adversarial Networks on NISQ-Era Machines.
CoRR, 2024
Qompose: A Technique to Select Optimal Algorithm-Specific Layout for Neutral Atom Quantum Architectures.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference.
CoRR, 2024
Interpretable Analysis of Production GPU Clusters Monitoring Data via Association Rule Mining.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 2024 ACM Symposium on Cloud Computing, 2024
RainbowCake: Mitigating Cold-starts in Serverless with Layer-wise Container Caching and Sharing.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
CodeCrunch: Improving Serverless Performance via Function Compression and Cost-Aware Warmup Location Optimization.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
Toward Privacy in Quantum Program Execution On Untrusted Quantum Cloud Computing Machines for Business-sensitive Quantum Needs.
CoRR, 2023
Sustainable HPC: Modeling, Characterization, and Implications of Carbon Footprint in Modern HPC Systems.
CoRR, 2023
Green Carbon Footprint for Model Inference Serving via Exploiting Mixed-Quality Models and GPU Partitioning.
CoRR, 2023
Experimental Evaluation of Xanadu X8 Photonic Quantum Computer: Error Measurement, Characterization and Implications.
Proceedings of the International Conference for High Performance Computing, 2023
GRAPHINE: Enhanced Neutral Atom Quantum Computing using Application-Specific Rydberg Atom Arrangement.
Proceedings of the International Conference for High Performance Computing, 2023
Proceedings of the International Conference for High Performance Computing, 2023
Toward Sustainable HPC: Carbon Footprint Estimation and Environmental Implications of HPC Systems.
Proceedings of the International Conference for High Performance Computing, 2023
SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
MosaiQ: Quantum Generative Adversarial Networks for Image Generation on NISQ Computers.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2023
Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 2023
Kairos: Building Cost-Efficient Machine Learning Inference Systems with Heterogeneous Cloud Resources.
Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 2023
Proceedings of the 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2023
Invited: Building Robust Quantum System Software for Technology-Specific Characteristics.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations.
Proceedings of the IEEE International Conference on Cluster Computing, 2023
Proceedings of the 2023 ACM Symposium on Cloud Computing, SoCC 2023, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Characterizing and Exploiting Soft Error Vulnerability Phase Behavior in GPU Applications.
IEEE Trans. Dependable Secur. Comput., 2022
CoRR, 2022
DayDream: Executing Dynamic Scientific Workflows on Serverless Platforms with Hot Starts.
Proceedings of the SC22: International Conference for High Performance Computing, 2022
Charter: Identifying the Most-Critical Gate Operations in Quantum Circuits via Amplified Gate Reversibility.
Proceedings of the SC22: International Conference for High Performance Computing, 2022
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
DASH: Scheduling Deep Learning Workloads on Multi-Generational GPU-Accelerated Clusters.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2022
Proceedings of the IEEE High Performance Extreme Computing Conference, 2022
Proceedings of the HiPS@HPDC 2022: Proceedings of the 2nd Workshop on High Performance Serverless Computing, 2022
AI-Enabling Workloads on Large-Scale GPU-Accelerated System: Characterization, Opportunities, and Implications.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022
Proceedings of the IEEE International Conference on Cluster Computing, 2022
Proceedings of the 13th Symposium on Cloud Computing, SoCC 2022, 2022
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022
QUILT: Effective Multi-Class Classification on Quantum Computers Using an Ensemble of Diverse Quantum Classifiers.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
2021
Study of interconnect errors, network congestion, and applications characteristics for throttle prediction on a large scale HPC system.
J. Parallel Distributed Comput., 2021
RIBBON: cost-effective and QoS-aware deep learning model inference using a diverse pool of cloud computing instances.
Proceedings of the International Conference for High Performance Computing, 2021
Systematically inferring I/O performance variability by examining repetitive job behavior.
Proceedings of the International Conference for High Performance Computing, 2021
Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models.
Proceedings of the PLDI '21: 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 2021
SATORI: Efficient and Fair Resource Partitioning by Sacrificing Short-Term Benefits for Long-Term Gains.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Characterizing and Mitigating the I/O Scalability Challenges for Serverless Applications.
Proceedings of the IEEE International Symposium on Workload Characterization, 2021
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021
Operating Liquid-Cooled Large-Scale Systems: Long-Term Monitoring, Reliability Analysis, and Efficiency Measures.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021
Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2021
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
2020
Soc. Netw. Anal. Min., 2020
Comparing Performances of Five Distinct Automatic Classifiers for Fin Whale Vocalizations in Beamformed Spectrograms of Coherent Hydrophone Array.
Remote. Sens., 2020
UREQA: Leveraging Operation-Aware Error Rates for Effective Quantum Circuit Mapping on NISQ-Era Quantum Computers.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020
Exploring the Potential of using Power as a First Class Parameter for Resource Allocation in Apache Mesos Managed Clouds.
Proceedings of the 13th IEEE/ACM International Conference on Utility and Cloud Computing, 2020
Veritas: accurately estimating the correct output on noisy intermediate-scale quantum computers.
Proceedings of the International Conference for High Performance Computing, 2020
Experimental evaluation of NISQ quantum computers: error measurement, characterization, and implications.
Proceedings of the International Conference for High Performance Computing, 2020
Job characteristics on large-scale systems: long-term analysis, quantification, and implications.
Proceedings of the International Conference for High Performance Computing, 2020
What does Power Consumption Behavior of HPC Jobs Reveal?: Demystifying, Quantifying, and Predicting Power Consumption Characteristics.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020
Proceedings of the IEEE International Symposium on Workload Characterization, 2020
DisQ: A Novel Quantum Output State Classification Method on IBM Quantum Computers using OpenPulse.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020
CLITE: Efficient and QoS-Aware Co-Location of Multiple Latency-Critical Jobs for Warehouse Scale Computers.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020
Uncovering Access, Reuse, and Sharing Characteristics of I/O-Intensive Files on Large-Scale Production HPC Systems.
Proceedings of the 18th USENIX Conference on File and Storage Technologies, 2020
GIFT: A Coupon Based Throttle-and-Reward Mechanism for Fair and Efficient I/O Bandwidth Management on Parallel Storage Systems.
Proceedings of the 18th USENIX Conference on File and Storage Technologies, 2020
Proceedings of the 18th USENIX Conference on File and Storage Technologies, 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
Revisiting I/O behavior in large-scale storage systems: the expected and the unexpected.
Proceedings of the International Conference for High Performance Computing, 2019
Characterizing Disk Health Degradation and Proactively Protecting Against Disk Failures for Reliable Storage Systems.
Proceedings of the 2019 IEEE International Conference on Autonomic Computing, 2019
PERQ: Fair and Efficient Power Management of Power-Constrained Large-Scale Computing Systems.
Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing, 2019
PCFI: Program Counter Guided Fault Injection for Accelerating GPU Reliability Assessment.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Towards Enabling Dynamic Resource Estimation and Correction for Improving Utilization in an Apache Mesos Cloud Environment.
Proceedings of the 19th IEEE/ACM International Symposium on Cluster, 2019
Exploring Potential for Non-Disruptive Vertical Auto Scaling and Resource Estimation in Kubernetes.
Proceedings of the 12th IEEE International Conference on Cloud Computing, 2019
2018
Proceedings of the 27th International Conference on Computer Communication and Networks, 2018
Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2018
Understanding and Analyzing Interconnect Errors and Network Congestion on a Large Scale HPC System.
Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2018
Shiraz: Exploiting System Reliability and Application Resilience Characteristics to Improve Large Scale System Throughput.
Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2018
Reliability Characterization of Solid State Drives in a Scalable Production Datacenter.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018
Proceedings of the IEEE/ACM 2018 International Conference on Advances in Social Networks Analysis and Mining, 2018
2017
ACM Trans. Model. Perform. Evaluation Comput. Syst., 2017
Compiler-Directed Soft Error Detection and Recovery to Avoid DUE and SDC via Tail-DMR.
ACM Trans. Embed. Comput. Syst., 2017
GUIDE: a scalable information directory service to collect, federate, and analyze logs for operational insights into a leadership HPC facility.
Proceedings of the International Conference for High Performance Computing, 2017
Proceedings of the International Conference for High Performance Computing, 2017
Combining architectural fault-injection and neutron beam testing approaches toward better understanding of GPU soft-error resilience.
Proceedings of the IEEE 60th International Midwest Symposium on Circuits and Systems, 2017
Toward Managing HPC Burst Buffers Effectively: Draining Strategy to Regulate Bursty I/O Behavior.
Proceedings of the 25th IEEE International Symposium on Modeling, 2017
Characterizing Temperature, Power, and Soft-Error Behaviors in Data Center Systems: Insights, Challenges, and Opportunities.
Proceedings of the 25th IEEE International Symposium on Modeling, 2017
Effective Running of End-to-End HPC Workflows on Emerging Heterogeneous Architectures.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017
2016
Application configuration selection for energy-efficient execution on multicore systems.
J. Parallel Distributed Comput., 2016
Compiler-directed lightweight checkpointing for fine-grained guaranteed soft error recovery.
Proceedings of the International Conference for High Performance Computing, 2016
Proceedings of the International Conference for High Performance Computing, 2016
Low-cost soft error resilience with unified data verification and fine-grained recovery for acoustic sensor based detection.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016
Proceedings of the 2016 IEEE International Conference on Autonomic Computing, 2016
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016
Power-Capping Aware Checkpointing: On the Interplay Among Power-Capping, Temperature, Reliability, Performance, and Energy.
Proceedings of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2016
2015
A practical approach to reconciling availability, performance, and capacity in provisioning extreme-scale storage systems.
Proceedings of the International Conference for High Performance Computing, 2015
Reliability lessons learned from GPU experience with the Titan supercomputer at Oak Ridge leadership computing facility.
Proceedings of the International Conference for High Performance Computing, 2015
Proceedings of the International Conference for High Performance Computing, 2015
Proceedings of the 16th ACM SIGPLAN/SIGBED Conference on Languages, 2015
Proceedings of the 2015 IEEE International Conference on Autonomic Computing, 2015
Understanding GPU errors on large-scale HPC systems and the implications for system design and operation.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Understanding and Exploiting Spatial Properties of System Failures on Extreme-Scale HPC Systems.
Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2015
Proceedings of the IEEE International Conference on Data Science and Data Intensive Systems, 2015
2014
Best Practices and Lessons Learned from Deploying and Operating Large-Scale Data-Centric Parallel File Systems.
Proceedings of the International Conference for High Performance Computing, 2014
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Improving large-scale storage system performance via topology-aware and balanced data placement.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014
Lazy Checkpointing: Exploiting Temporal Locality in Failures to Mitigate Checkpointing Overheads on Extreme-Scale Systems.
Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014
2013
Active flash: towards energy-efficient, in-situ data analytics on extreme-scale machines.
Proceedings of the 11th USENIX conference on File and Storage Technologies, 2013
2012
Proceedings of the 2012 Workshop on Power-Aware Computing Systems, HotPower'12, 2012
Architectural characterization and similarity analysis of SunSpider and Google's V8 JavaScript benchmarks.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2012
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012
2011
HAQu: Hardware-accelerated queueing for fine-grained threading on a chip multiprocessor.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011
2010
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
2009
Proceedings of the 10th workshop on MEmory performance, 2009