Douglas Doerfler

Orcid: 0000-0001-5016-8854

According to our database, Douglas Doerfler authored at least 27 papers between 2004 and 2022.

Bibliography

2022
FPGA-based HPC accelerators: An evaluation on performance and energy efficiency.
Concurr. Comput. Pract. Exp., 2022

SU3_Bench on a Programmable Integrated Unified Memory Architecture (PIUMA) and How that Differs from Standard NUMA CPUs.
Proceedings of the High Performance Computing - 37th International Conference, 2022

2021
Achieving performance portability in Gaussian basis set density functional theory on accelerator based architectures in NWChemEx.
Parallel Comput., 2021

Performance Portability for Advanced Architectures.
Comput. Sci. Eng., 2021

Performance Optimization of SU3_Bench on Xeon and Programmable Integrated Unified Memory Architecture.
CoRR, 2021

Case Study of Using Kokkos and SYCL as Performance-Portable Frameworks for Milc-Dslash Benchmark on NVIDIA, AMD and Intel GPUs.
Proceedings of the International Workshop on Performance, 2021

Experiences Porting the SU3_Bench Microbenchmark to the Intel Arria 10 and Xilinx Alveo U280 FPGAs.
Proceedings of the IWOCL'21: International Workshop on OpenCL, Munich, Germany, April 2021, 2021

2020
The Performance and Energy Efficiency Potential of FPGAs in Scientific Computing.
Proceedings of the 2020 IEEE/ACM Performance Modeling, 2020

2018
Evaluating the networking characteristics of the Cray XC-40 Intel Knights Landing-based Cori supercomputer at NERSC.
Concurr. Comput. Pract. Exp., 2018

A Metric for Evaluating Supercomputer Performance in the Era of Extreme Heterogeneity.
Proceedings of the 2018 IEEE/ACM Performance Modeling, 2018

2017
Analyzing Performance of Selected NESAP Applications on the Cori HPC System.
Proceedings of the High Performance Computing, 2017

Performance and Energy Usage of Workloads on KNL and Haswell Architectures.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2017

2016
Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor.
Proceedings of the High Performance Computing, 2016


2015
Assessing the role of mini-applications in predicting key performance characteristics of scientific and engineering applications.
J. Parallel Distributed Comput., 2015

2014
Exascale design space exploration and co-design.
Future Gener. Comput. Syst., 2014

2013
Analysis of Cray XC30 Performance Using Trinity-NERSC-8 Benchmarks and Comparison with Cray XE6 and IBM BG/Q.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, 2013

2012
Application-driven analysis of two generations of capability computing: the transition to multicore processors.
Concurr. Comput. Pract. Exp., 2012

Unprecedented Scalability and Performance of the New NNSA Tri-Lab Linux Capacity Cluster 2.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Navigating an Evolutionary Fast Path to Exascale.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Poster: Assessing the Predictive Capabilities of Mini-applications.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

2011
The Impact of Injection Bandwidth Performance on Application Scalability.
Proceedings of the Recent Advances in the Message Passing Interface, 2011

Investigating the Impact of the Cielo Cray XE6 Architecture on Scientific Application Codes.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

2010
Application Performance on the Tri-Lab Linux Capacity Cluster - TLCC.
Int. J. Distributed Syst. Technol., 2010

2006
Measuring MPI Send and Receive Overhead and Application Availability in High Performance Network Interfaces.
Proceedings of the Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2006

A preliminary analysis of the InfiniPath and XD1 network interfaces.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

2004
A comparison of 4X InfiniBand and Quadrics Elan-4 technologies.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

