Nathan DeBardeleben
ORCID: 0000-0002-5593-9205
Affiliations:
- Los Alamos National Laboratory
- Clemson University, Electrical and Computer Engineering department
According to our database, Nathan DeBardeleben authored at least 74 papers between 2000 and 2023.
Bibliography
2023
IEEE Trans. Dependable Secur. Comput., 2023
Incorporating Staggered Planned Maintenance Reservations to Improve Performance in Computational Clusters.
Proceedings of the IEEE International Conference on Cluster Computing, 2023
2022
Int. J. High Perform. Comput. Appl., 2022
Proceedings of the 8th IEEE/ACM International Workshop on Data Analysis and Reduction for Big Scientific Data, 2022
Online Detection and Classification of State Transitions of Multivariate Shock and Vibration Data.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2022
2021
J. Supercomput., 2021
Quantifying Server Memory Frequency Margin and Using It to Improve Performance in HPC Systems.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021
Statistical Framework for Two-Party Acceptance Testing of HPC Systems for Reliability.
Proceedings of the 11th IEEE/ACM Workshop on Fault Tolerance for HPC at eXtreme Scale, 2021
Proceedings of the IEEE International Conference on Cluster Computing, 2021
2020
Proceedings of the 31st IEEE International Symposium on Software Reliability Engineering, 2020
Thermal Neutrons: a Possible Threat for Supercomputers and Safety Critical Applications.
Proceedings of the IEEE European Test Symposium, 2020
An Overview of the Risk Posed by Thermal Neutrons to the Reliability of Computing Devices.
Proceedings of the 50th Annual IEEE-IFIP International Conference on Dependable Systems and Networks, 2020
Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2020
Chaser: An Enhanced Fault Injection Tool for Tracing Soft Errors in MPI Applications.
Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2020
2019
CoRR, 2019
CoRR, 2019
BinFI: an efficient fault injector for safety-critical machine learning systems.
Proceedings of the International Conference for High Performance Computing, 2019
Quantifying Memory Underutilization in HPC Systems and Using it to Improve Performance via Architecture Support.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
Do Solar Proton Events Reduce the Number of Faults in Supercomputers?: A Comparative Analysis of Faults During and without Solar Proton Events.
Proceedings of the IEEE International Reliability Physics Symposium, 2019
Proceedings of the ACM International Conference on Supercomputing, 2019
2018
Using virtualization to quantify power conservation via near-threshold voltage reduction for inherently resilient applications.
Parallel Comput., 2018
Characterization and Comparison of Application Resilience for Serial and Parallel Executions.
CoRR, 2018
Proceedings of the 2018 USENIX Annual Technical Conference, 2018
Improving Application Resilience by Extending Error Correction with Contextual Information.
Proceedings of the IEEE/ACM 8th Workshop on Fault Tolerance for HPC at eXtreme Scale, 2018
Proceedings of the International Conference for High Performance Computing, 2018
Proceedings of the IEEE/ACM 8th Workshop on Fault Tolerance for HPC at eXtreme Scale, 2018
Proceedings of the 2018 IEEE International Symposium on Software Reliability Engineering Workshops, 2018
Proceedings of the 2018 IEEE International Symposium on Software Reliability Engineering Workshops, 2018
Proceedings of the 47th International Conference on Parallel Processing, 2018
Proceedings of the 2018 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 2018
2017
Addressing statistical significance of fault injection: empirical studies of the soft error susceptibility.
Int. J. High Perform. Comput. Netw., 2017
Proceedings of the International Conference for High Performance Computing, 2017
Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2017
Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, 2017
Proceedings of the 13th European Dependable Computing Conference, 2017
RSVP: Soft Error Resilient Power Savings at Near-Threshold Voltage Using Register Vulnerability.
Proceedings of the 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2017
Proceedings of the 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2017
Proceedings of the IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, 2017
2016
Design, Use and Evaluation of P-FSEFI: A Parallel Soft Error Fault Injection Framework for Emulating Soft Errors in Parallel Applications.
Proceedings of the 9th EAI International Conference on Simulation Tools and Techniques, 2016
Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing, 2016
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016
Proceedings of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2016
Proceedings of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2016
2015
Field, experimental, and analytical data on large-scale HPC systems and evaluation of the implications for exascale system design.
Proceedings of the 33rd IEEE VLSI Test Symposium, 2015
Proceedings of the 21st IEEE Pacific Rim International Symposium on Dependable Computing, 2015
Empirical Studies of the Soft Error Susceptibility of Sorting Algorithms to Statistical Fault Injection.
Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale, 2015
Understanding GPU errors on large-scale HPC systems and the implications for system design and operation.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Proceedings of the 7th USENIX Workshop on Hot Topics in Storage and File Systems, 2015
Towards Building Resilient Scientific Applications: Resilience Analysis on the Impact of Soft Error and Transient Error Tolerance with the CLAMR Hydrodynamics Mini-App.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015
Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, 2015
2014
An investigation of the effects of hard and soft errors on graphics processing unit-accelerated molecular dynamics simulations.
Concurr. Comput. Pract. Exp., 2014
Proceedings of the 25th IEEE International Symposium on Software Reliability Engineering Workshops, 2014
F-SEFI: A Fine-Grained Soft Error Fault Injection Tool for Profiling Application Vulnerability.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Harnessing Unreliable Cores in Heterogeneous Architecture: The PyDac Programming Model and Runtime.
Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2014
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014
2013
Proceedings of the International Conference for High Performance Computing, 2013
Proceedings of the IEEE 19th Pacific Rim International Symposium on Dependable Computing, 2013
Exploring Time and Frequency Domains for Accurate and Automated Anomaly Detection in Cloud Computing Systems.
Proceedings of the IEEE 19th Pacific Rim International Symposium on Dependable Computing, 2013
PyDac: A Resilient Run-Time Framework for Divide-and-Conquer Applications on a Heterogeneous Many-Core Architecture.
Proceedings of the Euro-Par 2013: Parallel Processing Workshops, 2013
Proceedings of the Euro-Par 2013: Parallel Processing Workshops, 2013
2012
Proceedings of the 50th Annual Southeast Regional Conference, 2012
2011
Experimental Framework for Injecting Logic Errors in a Virtual Machine to Profile Applications for Soft Error Resilience.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011
2010
Impact of sub-optimal checkpoint intervals on application efficiency in computational clusters.
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, 2010
Proceedings of the 2010 IEEE/IFIP International Conference on Dependable Systems and Networks, 2010
2009
J. Syst. Softw., 2009
2008
Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008
2006
Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006
2004
Proceedings of the 9th International Workshop on High-Level Programming Models and Supportive Environments (HIPS 2004), 2004
2002
Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11 2002), 2002
2000
Parallelization Techniques for Spatial-Temporal Occupancy Maps from Multiple Video Streams.
Proceedings of the Parallel and Distributed Processing, 2000