Saurabh Jha

Orcid: 0000-0003-0926-0776

According to our database, Saurabh Jha authored at least 52 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
One Queue Is All You Need: Resolving Head-of-Line Blocking in Large Language Model Serving.
CoRR, 2024

Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction.
CoRR, 2024

Blue Waters system and component reliability.
Concurr. Comput. Pract. Exp., 2024

Power-aware Deep Learning Model Serving with μ-Serve.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024

When Green Computing Meets Performance and Resilience SLOs.
Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024

Fault Localization Using Interventional Causal Learning for Cloud-Native Applications.
Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024

iPrism: Characterize and Mitigate Risk by Quantifying Change in Escape Routes.
Proceedings of the 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2024

Optimizing IT FinOps and Sustainability through Unsupervised Workload Characterization.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

SAM: Subseries Augmentation-Based Meta-Learning for Generalizing AIOps Models in Multi-Cloud Migration.
Proceedings of the 17th IEEE International Conference on Cloud Computing, 2024

2023
Meta-learning Generalized AIOps Models for Multi-cloud Computer using Digital Twins.
Proceedings of the 33rd Annual International Conference on Computer Science and Software Engineering, 2023

Fault Injection Based Interventional Causal Learning for Distributed Applications.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Data-Driven Application-Oriented Reliability Model of a High-Performance Computing System.
IEEE Trans. Reliab., 2022

Watch Out for the Safety-Threatening Actors: Proactively Mitigating Safety Hazards.
CoRR, 2022

An evolutionary algorithm based feature selection and fuzzy rule reduction technique for the prediction of skin cancer.
Concurr. Comput. Pract. Exp., 2022

A fault injection platform for learning AIOps models.
Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, 2022

Evaluating Hardware Memory Disaggregation under Delay and Contention.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Exploiting Temporal Data Diversity for Detecting Safety-critical Faults in AV Compute Systems.
Proceedings of the 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2022

Localizing and Explaining Faults in Microservices Using Distributed Tracing.
Proceedings of the IEEE 15th International Conference on Cloud Computing, 2022

2021
Watch out for the risky actors: Assessing risk in dynamic environments for safe driving.
CoRR, 2021

Is Function-as-a-Service a Good Fit for Latency-Critical Services?
WoSC '21: Proceedings of the Seventh International Workshop on Serverless Computing (WoSC7), 2021

Decoding Radiology: A Brief History.
Proceedings of the Medical Imaging 2021: Computer-Aided Diagnosis, 2021

Delay sensitivity-driven congestion mitigation for HPC systems.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Computer-Aided Segmentation of Polyps Using Mask R-CNN and Approach to Reduce False Positives.
Proceedings of the Intelligent Data Engineering and Analytics, 2021

BayesPerf: minimizing performance monitoring errors using Bayesian statistics.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

2020
A fuzzy logic based approach for prediction of basal cell carcinoma and squamous cell carcinoma among the data of skin cancer.
EAI Endorsed Trans. Pervasive Health Technol., 2020

Application-aware Congestion Mitigation for High-Performance Computing Systems.
CoRR, 2020

ML-driven Malware that Targets AV Safety.
CoRR, 2020

Live forensics for HPC systems: a case study on distributed storage systems.
Proceedings of the International Conference for High Performance Computing, 2020

FIRM: An Intelligent Fine-grained Resource Management Framework for SLO-Oriented Microservices.
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

Measuring Congestion in High-Performance Datacenter Interconnects.
Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation, 2020

AV-FUZZER: Finding Safety Violations in Autonomous Driving Systems.
Proceedings of the 31st IEEE International Symposium on Software Reliability Engineering, 2020

Inductive-bias-driven Reinforcement Learning For Efficient Schedules in Heterogeneous Clusters.
Proceedings of the 37th International Conference on Machine Learning, 2020

ML-Driven Malware that Targets AV Safety.
Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2020

2019
Inductive Bias-driven Reinforcement Learning For Efficient Schedules in Heterogeneous Clusters.
CoRR, 2019

Live Forensics for Distributed Storage Systems.
CoRR, 2019

Kayotee: A Fault Injection-based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors.
CoRR, 2019

Understanding Fault Scenarios and Impacts through Fault Injection Experiments in Cielo.
CoRR, 2019

A Study of Network Congestion in Two Supercomputing High-Speed Interconnects.
Proceedings of the 2019 IEEE Symposium on High-Performance Interconnects, 2019

ML-Based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection.
Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2019

Towards a Bayesian Approach for Assessing Fault Tolerance of Deep Neural Networks.
Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2019

2018
Resiliency of HPC Interconnects: A Case Study of Interconnect Failures and Recovery in Blue Waters.
IEEE Trans. Dependable Secur. Comput., 2018

AVFI: Fault Injection for Autonomous Vehicles.
Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2018

Hands Off the Wheel in Autonomous Vehicles?: A Systems Perspective on over a Million Miles of Field Data.
Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2018

Characterizing Supercomputer Traffic Networks Through Link-Level Analysis.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
Holistic Measurement-Driven System Assessment.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

2015
Improving Main Memory Hash Joins on Intel Xeon Phi Processors: An Experimental Approach.
Proc. VLDB Endow., 2015

LogDiver: A Tool for Measuring Resilience of Extreme-Scale Systems and Applications.
Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale, 2015

2014
BbmTTP: beat-based parallel simulated annealing algorithm on GPGPUs for the mirrored traveling tournament problem.
Proceedings of the 2014 Spring Simulation Multiconference, 2014

2013
A Parallel Simulated Annealing Approach for the Mirrored Traveling Tournament Problem.
CoRR, 2013

P-HGRMS: A Parallel Hypergraph Based Root Mean Square Algorithm for Image Denoising.
CoRR, 2013

Exploiting data parallelism in the yConvex hypergraph algorithm for image representation using GPGPUs.
Proceedings of the International Conference on Supercomputing, 2013

An automated video surveillance system using Viewpoint Feature Histogram and CUDA-enabled GPUs.
Proceedings of the International Conference on Advances in Computing, 2013

