Nathan R. Tallent

ORCID: 0000-0003-4297-3057

According to our database, Nathan R. Tallent authored at least 59 papers between 2002 and 2024.

Bibliography

2024
Final Report for CHESS: Cloud, High-Performance Computing, and Edge for Science and Security.
CoRR, 2024

Workflows Community Summit 2024: Future Trends and Challenges in Scientific Workflows.
CoRR, 2024

Overcoming Memory Constraints in Quantum Circuit Simulation with a High-Fidelity Compression Framework.
CoRR, 2024

OPDR: Order-Preserving Dimension Reduction for Semantic Embedding of Multimodal Scientific Data.
CoRR, 2024

SAM-I-Am: Semantic Boosting for Zero-shot Atomic-Scale Electron Micrograph Segmentation.
CoRR, 2024

MemFriend: Understanding Memory Performance with Spatial-Temporal Affinity.
Proceedings of the International Symposium on Memory Systems, 2024

Automatic Extraction of Network Configurations for Realistic Simulation and Validation.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

Performance Analysis of Data Processing in Distributed File Systems with Near Data Processing.
Proceedings of the International Symposium on Networks, Computers and Communications, 2024

Graph Analytics on Jellyfish topology.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics.
Proceedings of the IEEE International Conference on Cluster Computing, 2024

MassiveGNN: Efficient Training via Prefetching for Massively Connected Distributed Graphs.
Proceedings of the IEEE International Conference on Cluster Computing, 2024

2023
Accelerating matrix-centric graph processing on GPUs through bit-level optimizations.
J. Parallel Distributed Comput., July, 2023

Data Flow Lifecycles for Optimizing Workflow Coordination.
Proceedings of the International Conference for High Performance Computing, 2023

2022
Characterizing Performance of Graph Neighborhood Communication Patterns.
IEEE Trans. Parallel Distributed Syst., 2022

QuaL²M: Learning Quantitative Performance of Latency-Sensitive Code.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Bit-GraphBLAS: Bit-Level Optimizations of Matrix-Centric Graph Processing on GPU.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

ReWorDs 2022 Keynote: Towards Orchestrating Distributed & Data-Intensive Workflows.
Proceedings of the 18th IEEE International Conference on e-Science, 2022

MemGaze: Rapid and Effective Load-Level Memory Trace Analysis.
Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021
EXAGRAPH: Graph and combinatorial methods for enabling exascale applications.
Int. J. High Perform. Comput. Appl., 2021

Single-node partitioned-memory for huge graph analytics: cost and performance trade-offs.
Proceedings of the International Conference for High Performance Computing, 2021

Diolkos: improving ethernet throughput through dynamic port selection.
Proceedings of the CF '21: Computing Frontiers Conference, 2021

WinnowML: Stable feature selection for maximizing prediction accuracy of time-based system modeling.
Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), 2021

2020
Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect.
IEEE Trans. Parallel Distributed Syst., 2020

Scaling Deep Learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing.
Future Gener. Comput. Syst., 2020

Rapid Memory Footprint Access Diagnostics.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

Geomancy: Automated Performance Enhancement through Data Layout Optimization.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

Vertex Reordering for Real-World Graphs and Applications: An Empirical Evaluation.
Proceedings of the IEEE International Symposium on Workload Characterization, 2020

Effectively Using Remote I/O For Work Composition in Distributed Workflows.
Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

2019
Rapidly Measuring Loop Footprints.
Proceedings of the 2019 IEEE International Conference on Cluster Computing, 2019

TAZeR: Hiding the Cost of Remote I/O in Distributed Scientific Workflows.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

2018
Stochastic Programming Approach for Resource Selection Under Demand Uncertainty.
Proceedings of the Job Scheduling Strategies for Parallel Processing, 2018

Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite.
Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018

Optimizing Distributed Data-Intensive Workflows.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

Deep Learning for Enhancing Fault Tolerant Capabilities of Scientific Workflows.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

2017
Representative paths analysis.
Proceedings of the International Conference for High Performance Computing, 2017

Evaluating On-Node GPU Interconnects for Deep Learning Workloads.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation, 2017

Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Generating Performance Models for Irregular Applications.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

2016
Assessing Advanced Technology in CENATE.
Proceedings of the IEEE International Conference on Networking, 2016

Modeling the Impact of Silicon Photonics on Graph Analytics.
Proceedings of the IEEE International Conference on Networking, 2016

Fault Modeling of Extreme Scale Applications Using Machine Learning.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Algorithm and Architecture Independent Benchmarking with SEAK.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

2015
A case for application-oblivious energy-efficient MPI runtime.
Proceedings of the International Conference for High Performance Computing, 2015

Towards efficient scheduling of data intensive high energy physics workflows.
Proceedings of the 10th Workshop on Workflows in Support of Large-Scale Science, 2015

Diagnosing the causes and severity of one-sided message contention.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

Power and performance trade-offs for Space Time Adaptive Processing.
Proceedings of the 26th IEEE International Conference on Application-specific Systems, 2015

2014
Palm: easing the burden of analytical performance modeling.
Proceedings of the 2014 International Conference on Supercomputing, 2014

2011
Using Sampling to Understand Parallel Program Performance.
Proceedings of the Tools for High Performance Computing 2011, 2011

Scalable fine-grained call path tracing.
Proceedings of the 25th International Conference on Supercomputing, Tucson, AZ, USA, May 31, 2011

2010
HPCTOOLKIT: tools for performance analysis of optimized parallel programs.
Concurr. Comput. Pract. Exp., 2010

Scalable Identification of Load Imbalance in Parallel Executions Using Call Path Profiles.
Proceedings of the Conference on High Performance Computing Networking, 2010

Analyzing lock contention in multithreaded applications.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

Effectively Presenting Call Path Profiles of Application Performance.
Proceedings of the 39th International Conference on Parallel Processing, 2010

2009
Identifying Performance Bottlenecks in Work-Stealing Computations.
Computer, 2009

Diagnosing performance bottlenecks in emerging petascale applications.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Effective performance measurement and analysis of multithreaded applications.
Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2009

Binary analysis for measurement and attribution of program performance.
Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, 2009

2008
OpenAD/F: A Modular Open-Source Tool for Automatic Differentiation of Fortran Codes.
ACM Trans. Math. Softw., 2008

2002
HPCVIEW: A Tool for Top-down Analysis of Node Performance.
J. Supercomput., 2002

