Bo Fang

ORCID: 0000-0001-9721-3982

Affiliations:
  • Pacific Northwest National Laboratory, Richland, WA, USA
  • University of British Columbia, Vancouver, Canada (PhD)


According to our database, Bo Fang authored at least 38 papers between 2012 and 2024.

Bibliography

2024
Final Report for CHESS: Cloud, High-Performance Computing, and Edge for Science and Security.
CoRR, 2024

Overcoming Memory Constraints in Quantum Circuit Simulation with a High-Fidelity Compression Framework.
CoRR, 2024

Light-Weight Fault Tolerant Attention for Large Language Model Training.
CoRR, 2024

Understanding Mixed Precision GEMM with MPGemmFI: Insights into Fault Resilience.
Proceedings of the IEEE International Conference on Cluster Computing, 2024

Discovery of Floating-Point Differences Between NVIDIA and AMD GPUs.
Proceedings of the 24th IEEE International Symposium on Cluster, 2024

FTTN: Feature-Targeted Testing for Numerical Properties of NVIDIA & AMD Matrix Accelerators.
Proceedings of the 24th IEEE International Symposium on Cluster, 2024

Red-QAOA: Efficient Variational Optimization through Circuit Reduction.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Fault Injection for TensorFlow Applications.
IEEE Trans. Dependable Secur. Comput., 2023

MPGemmFI: A Fault Injection Technique for Mixed Precision GEMM in ML Applications.
CoRR, 2023

MEMQSim: Highly Memory-Efficient and Modularized Quantum State-Vector Simulation.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

AMRIC: A Novel In Situ Lossy Compression Framework for Efficient I/O in Adaptive Mesh Refinement Applications.
Proceedings of the International Conference for High Performance Computing, 2023

Design and Evaluation of GPU-FPX: A Low-Overhead tool for Floating-Point Exception Detection in NVIDIA GPUs.
Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 2023

2022
Improving the Accuracy of IR-Level Fault Injection.
IEEE Trans. Dependable Secur. Comput., 2022

Pinpointing the System Reliability Degradation in NISQ Machines.
Proceedings of the IEEE International Conference on Quantum Computing and Engineering, 2022

MARS: Malleable Actor-Critic Reinforcement Learning Scheduler.
Proceedings of the IEEE International Performance, 2022

ASAP: automatic synthesis of area-efficient and precision-aware CGRAs.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Towards Precision-Aware Fault Tolerance Approaches for Mixed-Precision Applications.
Proceedings of the 12th IEEE/ACM Workshop on Fault Tolerance for HPC at eXtreme Scale, 2022

Efficient Hierarchical State Vector Simulation of Quantum Circuits via Acyclic Graph Partitioning.
Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021
SV-sim: scalable PGAS-based state vector simulation of quantum circuits.
Proceedings of the International Conference for High Performance Computing, 2021

QuGAN: A Quantum State Fidelity based Generative Adversarial Network.
Proceedings of the IEEE International Conference on Quantum Computing and Engineering, 2021

A Hybrid System for Learning Classical Data in Quantum States.
Proceedings of the IEEE International Performance, 2021

TQEA: Temporal Quantum Error Analysis.
Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2021

Characterizing Impacts of Storage Faults on HPC Applications: A Methodology and Insights.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

2020
TensorFI: A Flexible Fault Injection Framework for TensorFlow Applications.
Proceedings of the 31st IEEE International Symposium on Software Reliability Engineering, 2020

CQNN: a CGRA-based QNN Framework.
Proceedings of the 2020 IEEE High Performance Extreme Computing Conference, 2020

Chaser: An Enhanced Fault Injection Tool for Tracing Soft Errors in MPI Applications.
Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2020

2019
A Tale of Two Injectors: End-to-End Comparison of IR-Level and Assembly-Level Fault Injection.
Proceedings of the 30th IEEE International Symposium on Software Reliability Engineering, 2019

BonVoision: leveraging spatial data smoothness for recovery from memory soft errors.
Proceedings of the ACM International Conference on Supercomputing, 2019

Towards Predicting the Impact of Roll-Forward Failure Recovery for HPC Applications.
Proceedings of the 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2019

2017
LetGo: A Lightweight Continuous Framework for HPC Applications Under Failures.
Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, 2017

2016
A Systematic Methodology for Evaluating the Error Resilience of GPGPU Applications.
IEEE Trans. Parallel Distributed Syst., 2016

SDC is in the Eye of the Beholder: A Survey and Preliminary Study.
Proceedings of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2016

ePVF: An Enhanced Program Vulnerability Factor Methodology for Cross-Layer Resilience Analysis.
Proceedings of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2016

2014
GPU-Qin: A methodology for evaluating the error resilience of GPGPU applications.
Proceedings of the 2014 IEEE International Symposium on Performance Analysis of Systems and Software, 2014

Evaluating the Error Resilience of Parallel Programs.
Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014

GPGPUs: How to combine high computational power with high reliability.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

2012
Poster: Evaluating Error Resiliency of GPGPU Applications.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Evaluating Error Resiliency of GPGPU Applications.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

