Nan Ding

ORCID: 0000-0001-9624-9449

  • Lawrence Berkeley National Laboratory, Computational Research Division, Berkeley, CA, USA
  • Tsinghua University, Department of Computer Science and Technology, Beijing, China (PhD 2018)

According to our database, Nan Ding authored at least 19 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.








Evaluating the potential of disaggregated memory systems for HPC applications.
Concurr. Comput. Pract. Exp., August, 2024

Performance Modeling and Analysis of a de Bruijn Graph Based Local Assembly Kernel on Multiple Vendor GPUs.
Proceedings of the SC24-W: Workshops of the International Conference for High Performance Computing, 2024

A Workflow Roofline Model for End-to-End Workflow Performance Analysis.
Proceedings of the International Conference for High Performance Computing, 2024

Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters.
Proceedings of the International Conference for High Performance Computing, 2023

Evaluating the Performance of One-sided Communication on CPUs and GPUs.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Instruction Roofline: An insightful visual performance model for GPUs.
Concurr. Comput. Pract. Exp., 2022

A Methodology for Evaluating Tightly-integrated and Disaggregated Accelerated Architectures.
Proceedings of the IEEE/ACM International Workshop on Performance Modeling, 2022

Accelerating large scale de novo metagenome assembly using GPUs.
Proceedings of the International Conference for High Performance Computing, 2021

Evaluating Performance and Portability of a core bioinformatics kernel on multiple vendor GPUs.
Proceedings of the International Workshop on Performance, 2021

A Message-Driven, Multi-GPU Parallel Sparse Triangular Solver.
Proceedings of the 2021 SIAM Conference on Applied and Computational Discrete Algorithms, 2021

APMT: an automatic hardware counter-based performance modeling tool for HPC applications.
CCF Trans. High Perform. Comput., 2020

Leveraging One-Sided Communication for Sparse Triangular Solvers.
Proceedings of the 2020 SIAM Conference on Parallel Processing for Scientific Computing, 2020

LOGAN: High-Performance GPU-Based X-Drop Long-Read Alignment.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

GPU accelerated partial order multiple sequence alignment for long reads self-correction.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

An automatic performance model-based scheduling tool for coupled climate system models.
J. Parallel Distributed Comput., 2019

An Instruction Roofline Model for GPUs.
Proceedings of the 2019 IEEE/ACM Performance Modeling, 2019

Redesigning CAM-SE for peta-scale climate modeling performance and ultra-high resolution on Sunway TaihuLight.
Proceedings of the International Conference for High Performance Computing, 2017

Refactoring and optimizing the community atmosphere model (CAM) on the sunway taihulight supercomputer.
Proceedings of the International Conference for High Performance Computing, 2016

CESMTuner: An Auto-tuning Framework for the Community Earth System Model.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014
