Michael B. Sullivan

ORCID: 0000-0001-6537-2065

Affiliations:
  • NVIDIA, Santa Clara, CA, USA


According to our database, Michael B. Sullivan authored at least 44 papers between 2011 and 2023.

Bibliography

2023
Unity ECC: Unified Memory Protection Against Bit and Chip Errors.
Proceedings of the International Conference for High Performance Computing, 2023

Implicit Memory Tagging: No-Overhead Memory Safety Using Alias-Free Tagged ECC.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

2022
Making Convolutions Resilient Via Algorithm-Based Error Detection Techniques.
IEEE Trans. Dependable Secur. Comput., 2022

Reduced Precision DWC: An Efficient Hardening Strategy for Mixed-Precision Architectures.
IEEE Trans. Computers, 2022

Characterizing and Mitigating Soft Errors in GPU DRAM.
IEEE Micro, 2022

SEC-BADAEC: An Efficient ECC With No Vacancy for Strong Memory Protection.
IEEE Access, 2022

Saving PAM4 Bus Energy with SMOREs: Sparse Multi-level Opportunistic Restricted Encodings.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

Exploiting Temporal Data Diversity for Detecting Safety-critical Faults in AV Compute Systems.
Proceedings of the 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2022

Zhuyi: perception processing rate estimation for safety in autonomous vehicles.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Suraksha: A Framework to Analyze the Safety Implications of Perception Design Choices in AVs.
Proceedings of the 32nd IEEE International Symposium on Software Reliability Engineering, 2021

Optimizing Selective Protection for CNN Resilience.
Proceedings of the 32nd IEEE International Symposium on Software Reliability Engineering, 2021

Suraksha: A Quantitative AV Safety Evaluation Framework to Analyze Safety Implications of Perception Design Choices.
Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2021

NVBitFI: Dynamic Fault Injection for GPUs.
Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2021

2020
Estimating Silent Data Corruption Rates Using a Two-Level Model.
CoRR, 2020

HarDNN: Feature Map Vulnerability Evaluation in CNNs.
CoRR, 2020

GPU-trident: efficient modeling of error propagation in GPU programs.
Proceedings of the International Conference for High Performance Computing, 2020

AV-FUZZER: Finding Safety Violations in Autonomous Driving Systems.
Proceedings of the 31st IEEE International Symposium on Software Reliability Engineering, 2020

Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

2019
Kayotee: A Fault Injection-based System to Assess the Safety and Reliability of Autonomous Vehicles to Faults and Errors.
CoRR, 2019

GPU snapshot: checkpoint offloading for GPU-dense systems.
Proceedings of the ACM International Conference on Supercomputing, 2019

On the Trend of Resilience for GPU-Dense Systems.
Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2019

ML-Based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection.
Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2019

2018
Optimizing software-directed instruction replication for GPU error detection.
Proceedings of the International Conference for High Performance Computing, 2018

Evaluating and accelerating high-fidelity error injection for HPC.
Proceedings of the International Conference for High Performance Computing, 2018

SwapCodes: Error Codes for Hardware-Software Cooperative GPU Pipeline Error Detection.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

DUO: Exposing On-Chip Redundancy to Rank-Level ECC for High Reliability.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

Modeling Soft-Error Propagation in Programs.
Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2018

Hamartia: A Fast and Accurate Error Injection Framework.
Proceedings of the 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2018

CRUM: Checkpoint-Restart Support for CUDA's Unified Memory.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
Understanding error propagation in deep learning neural network (DNN) accelerators and applications.
Proceedings of the International Conference for High Performance Computing, 2017

2016
All-Inclusive ECC: Thorough End-to-End Protection for Reliable Computer Memory.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Bit-Plane Compression: Transforming Data for Better Compression in Many-Core Architectures.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

2015
Frugal ECC: efficient and versatile memory error protection through fine-grained compression.
Proceedings of the International Conference for High Performance Computing, 2015

Bamboo ECC: Strong, safe, and flexible codes for reliable computer memory.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Low-Cost Duplicate Multiplication.
Proceedings of the 22nd IEEE Symposium on Computer Arithmetic, 2015

2013
Containment domains: A scalable, efficient and flexible resilience scheme for exascale systems.
Sci. Program., 2013

A locality-aware memory hierarchy for energy-efficient GPU architectures.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

Truncated Logarithmic Approximation.
Proceedings of the 21st IEEE Symposium on Computer Arithmetic, 2013

On separable error detection for addition.
Proceedings of the 2013 Asilomar Conference on Signals, 2013

2012
The dynamic granularity memory system.
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012

Balancing DRAM locality and parallelism in shared memory CMP systems.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

Long Residue Checking for Adders.
Proceedings of the 23rd IEEE International Conference on Application-Specific Systems, 2012

Truncated error correction for flexible approximate multiplication.
Proceedings of the Conference Record of the Forty Sixth Asilomar Conference on Signals, 2012

2011
Hybrid residue generators for increased efficiency.
Proceedings of the Conference Record of the Forty Fifth Asilomar Conference on Signals, 2011

