Douglas Doerfler
ORCID: 0000-0001-5016-8854
According to our database, Douglas Doerfler authored at least 27 papers between 2004 and 2022.
Bibliography
2022
Concurr. Comput. Pract. Exp., 2022
SU3_Bench on a Programmable Integrated Unified Memory Architecture (PIUMA) and How that Differs from Standard NUMA CPUs.
Proceedings of the High Performance Computing - 37th International Conference, 2022
2021
Achieving performance portability in Gaussian basis set density functional theory on accelerator based architectures in NWChemEx.
Parallel Comput., 2021
Performance Optimization of SU3_Bench on Xeon and Programmable Integrated Unified Memory Architecture.
CoRR, 2021
Case Study of Using Kokkos and SYCL as Performance-Portable Frameworks for Milc-Dslash Benchmark on NVIDIA, AMD and Intel GPUs.
Proceedings of the International Workshop on Performance, 2021
Experiences Porting the SU3_Bench Microbenchmark to the Intel Arria 10 and Xilinx Alveo U280 FPGAs.
Proceedings of the IWOCL'21: International Workshop on OpenCL, Munich Germany, April, 2021, 2021
2020
Proceedings of the 2020 IEEE/ACM Performance Modeling, 2020
2018
Evaluating the networking characteristics of the Cray XC-40 Intel Knights Landing-based Cori supercomputer at NERSC.
Concurr. Comput. Pract. Exp., 2018
A Metric for Evaluating Supercomputer Performance in the Era of Extreme Heterogeneity.
Proceedings of the 2018 IEEE/ACM Performance Modeling, 2018
2017
Proceedings of the High Performance Computing, 2017
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2017
2016
Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor.
Proceedings of the High Performance Computing, 2016
Proceedings of the 7th International Workshop on Performance Modeling, 2016
2015
Assessing the role of mini-applications in predicting key performance characteristics of scientific and engineering applications.
J. Parallel Distributed Comput., 2015
2014
2013
Analysis of Cray XC30 Performance Using Trinity-NERSC-8 Benchmarks and Comparison with Cray XE6 and IBM BG/Q.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, 2013
2012
Application-driven analysis of two generations of capability computing: the transition to multicore processors.
Concurr. Comput. Pract. Exp., 2012
Unprecedented Scalability and Performance of the New NNSA Tri-Lab Linux Capacity Cluster 2.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
2011
Proceedings of the Recent Advances in the Message Passing Interface, 2011
Investigating the Impact of the Cielo Cray XE6 Architecture on Scientific Application Codes.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011
2010
Int. J. Distributed Syst. Technol., 2010
2006
Measuring MPI Send and Receive Overhead and Application Availability in High Performance Network Interfaces.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
2004
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004