Understanding and Mitigating Hardware Failures in Deep Learning Training Systems.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Understanding Permanent Hardware Failures in Deep Learning Training Accelerator Systems.
Proceedings of the IEEE European Test Symposium, 2023
Special Session: On the Reliability of Conventional and Quantum Neural Network Hardware.
Proceedings of the 40th IEEE VLSI Test Symposium, 2022
Achieving Automotive Safety Requirements through Functional In-Field Self-Test for Deep Learning Accelerators.
Proceedings of the IEEE International Test Conference, 2022
Efficient Functional In-Field Self-Test for Deep Learning Accelerators.
Proceedings of the IEEE International Test Conference, 2021
FIdelity: Efficient Resilience Analysis Framework for Deep Learning Accelerators.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
Time-Slicing Soft Error Resilience in Microprocessors for Reliable and Energy-Efficient Execution.
Proceedings of the IEEE International Test Conference, 2019