Jim M. Brandt
ORCID: 0000-0002-8605-5795
Affiliations:
- Sandia National Laboratories
According to our database, Jim M. Brandt authored at least 54 papers between 2005 and 2024.
Bibliography
2024
Runtime Performance Anomaly Diagnosis in Production HPC Systems Using Active Learning.
IEEE Trans. Parallel Distributed Syst., April, 2024
Toward Sustainable HPC: In-Production Deployment of Incentive-Based Power Efficiency Mechanism on the Fugaku Supercomputer.
Proceedings of the International Conference for High Performance Computing, 2024
Proceedings of the 18th ACM International Conference on Distributed and Event-based Systems, 2024
Proceedings of the IEEE International Conference on Cluster Computing, 2024
2023
Driving HPC Operations With Holistic Monitoring and Operational Data Analytics (Dagstuhl Seminar 23171).
Dagstuhl Reports, 2023
Proceedings of the International Conference for High Performance Computing, 2023
Proceedings of the 17th ACM International Conference on Distributed and Event-based Systems, 2023
Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations.
Proceedings of the IEEE International Conference on Cluster Computing, 2023
2022
Proceedings of the 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2022
Proceedings of the IEEE International Conference on Cluster Computing, 2022
2021
Proctor: A Semi-Supervised Performance Anomaly Diagnosis Framework for Production HPC Systems.
Proceedings of the High Performance Computing - 36th International Conference, 2021
Systematically inferring I/O performance variability by examining repetitive job behavior.
Proceedings of the International Conference for High Performance Computing, 2021
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021
Proceedings of the Euro-Par 2021: Parallel Processing, 2021
Proceedings of the IEEE International Conference on Cluster Computing, 2021
2020
CoRR, 2020
Proceedings of the Driving Scientific and Engineering Discoveries Through the Convergence of HPC, Big Data and AI, 2020
Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation, 2020
HPC System Data Pipeline to Enable Meaningful Insights through Analysis-Driven Visualizations.
Proceedings of the IEEE International Conference on Cluster Computing, 2020
Proceedings of the IEEE International Conference on Cluster Computing, 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
ACM Trans. Model. Perform. Evaluation Comput. Syst., 2019
Understanding Fault Scenarios and Impacts through Fault Injection Experiments in Cielo.
CoRR, 2019
Proceedings of the 48th International Conference on Parallel Processing, 2019
Proceedings of the 2019 IEEE Symposium on High-Performance Interconnects, 2019
2018
Proceedings of the 37th IEEE International Performance Computing and Communications Conference, 2018
Proceedings of the 47th International Conference on Parallel Processing, 2018
Proceedings of the Euro-Par 2018: Parallel Processing, 2018
Proceedings of the IEEE International Conference on Cluster Computing, 2018
Proceedings of the IEEE International Conference on Cluster Computing, 2018
2017
Proceedings of the High Performance Computing - 32nd International Conference, 2017
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017
2016
Continuous whole-system monitoring toward rapid understanding of production HPC applications and systems.
Parallel Comput., 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
2015
Proceedings of the First Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization, 2015
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015
2014
The Lightweight Distributed Metric Service: A Scalable Infrastructure for Continuous Monitoring of Large Scale Computing Systems and Applications.
Proceedings of the International Conference for High Performance Computing, 2014
Demonstrating improved application performance using dynamic monitoring and task mapping.
Proceedings of the 2014 IEEE International Conference on Cluster Computing, 2014
2012
Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks, 2012
2011
Comput. Sci. Res. Dev., 2011
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011
2010
Combining virtualization, resource characterization, and resource management to enable efficient high performance compute platforms through intelligent dynamic resource allocation.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
Quantifying effectiveness of failure prediction and response in HPC systems: Methodology and example.
Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W 2010), Chicago, Illinois, USA, June 28, 2010
Using Cloud Constructs and Predictive Analysis to Enable Pre-Failure Process Migration in HPC Systems.
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010
2009
Resource monitoring and management with OVIS to enable HPC in cloud computing environments.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009
2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008), 2008
2006
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
2005
Proceedings of the 2005 IEEE International Conference on Cluster Computing (CLUSTER 2005), September 26, 2005