Didem Unat

Orcid: 0000-0002-2351-0770

Affiliations:
  • Koç University, Istanbul, Turkey


According to our database, Didem Unat authored at least 53 papers between 2009 and 2024.

Bibliography

2024
The Landscape of GPU-Centric Communication.
CoRR, 2024

A Sparse Tensor Generator with Efficient Feature Extraction.
CoRR, 2024

Snoopie: A Multi-GPU Communication Profiler and Visualizer.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024

2023
Precise Event Sampling on AMD Versus Intel: Quantitative and Qualitative Comparison.
IEEE Trans. Parallel Distributed Syst., May, 2023

Precise event sampling-based data locality tools for AMD multicore architectures.
Concurr. Comput. Pract. Exp., 2023

Bringing Order to Sparsity: A Sparse Matrix Reordering Study on Multicore CPUs.
Proceedings of the International Conference for High Performance Computing, 2023

Multi-GPU Communication Schemes for Iterative Solvers: When CPUs are Not in Charge.
Proceedings of the 37th International Conference on Supercomputing, 2023

2022
ReuseTracker: Fast Yet Accurate Multicore Reuse Distance Analyzer.
ACM Trans. Archit. Code Optim., 2022

Mixed and Multi-Precision SpMV for GPUs with Row-wise Precision Selection.
Proceedings of the 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2022

2021
A Split Execution Model for SpTRSV.
IEEE Trans. Parallel Distributed Syst., 2021

A computational-graph partitioning method for training memory-constrained DNNs.
Parallel Comput., 2021

Structured Adaptive Mesh Refinement Adaptations to Retain Performance Portability With Increasing Heterogeneity.
Comput. Sci. Eng., 2021

Monitoring Collective Communication Among GPUs.
Proceedings of the Euro-Par 2021: Parallel Processing Workshops, 2021

Low-Overhead Reuse Distance Profiling Tool for Multicore.
Proceedings of the Euro-Par 2021: Parallel Processing Workshops, 2021

2020
TIGER: Topology-aware Assignment using Ising machines Application to Classical Algorithm Tasks and Quantum Circuit Gates.
CoRR, 2020

Adaptive Level Binning: A New Algorithm for Solving Sparse Triangular Systems.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020

Tiling-Based Programming Model for Structured Grids on GPU Clusters.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020

A Prediction Framework for Fast Sparse Triangular Solves.
Proceedings of the Euro-Par 2020: Parallel Processing, 2020

ComScribe: Identifying Intra-node GPU Communication.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2020

2019
Communication analysis and optimization of 3D front tracking method for multiphase flow simulations.
Int. J. High Perform. Comput. Appl., 2019

Asynchronous AMR on Multi-GPUs.
Proceedings of the High Performance Computing, 2019

ComDetective: a lightweight communication detection tool for threads.
Proceedings of the International Conference for High Performance Computing, 2019

Program analysis for process migration.
Proceedings of the 8th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis, 2019

2018
Load Balancing for Parallel Multiphase Flow Simulation.
Sci. Program., 2018

Output nondeterminism detection for programming models combining dataflow with shared memory.
Parallel Comput., 2018

Special issue on High performance computing conference (BASARIM-2017).
Concurr. Comput. Pract. Exp., 2018

BindMe: A thread binding library with advanced mapping algorithms.
Concurr. Comput. Pract. Exp., 2018

Fast multidimensional reduction and broadcast operations on GPU for machine learning.
Concurr. Comput. Pract. Exp., 2018

Phase asynchronous AMR execution for productive and performant astrophysical flows.
Proceedings of the International Conference for High Performance Computing, 2018

Phase-Based Data Placement Scheme for Heterogeneous Memory Systems.
Proceedings of the 30th International Symposium on Computer Architecture and High Performance Computing, 2018

Runtime Determinacy Race Detection for OpenMP Tasks.
Proceedings of the Euro-Par 2018: Parallel Processing, 2018

2017
Trends in Data Locality Abstractions for HPC Systems.
IEEE Trans. Parallel Distributed Syst., 2017

Access pattern-aware data placement for hybrid DRAM/NVM.
Turkish J. Electr. Eng. Comput. Sci., 2017

Object Placement for High Bandwidth Memory Augmented with High Capacity Memory.
Proceedings of the 29th International Symposium on Computer Architecture and High Performance Computing, 2017

EmbedSanitizer: Runtime Race Detection Tool for 32-bit Embedded ARM.
Proceedings of the Runtime Verification - 17th International Conference, 2017

Overlapping Data Transfers with Computation on GPU with Tiles.
Proceedings of the 46th International Conference on Parallel Processing, 2017

Nonintrusive AMR Asynchrony for Communication Optimization.
Proceedings of the Euro-Par 2017: Parallel Processing - 23rd International Conference on Parallel and Distributed Computing, Santiago de Compostela, Spain, August 28, 2017

2016
BoxLib with Tiling: An Adaptive Mesh Refinement Software Framework.
SIAM J. Sci. Comput., 2016

BoxLib with Tiling: An AMR Software Framework.
CoRR, 2016

TiDA: High-Level Programming Abstractions for Data Locality Management.
Proceedings of the High Performance Computing - 31st International Conference, 2016

Perilla: metadata-based optimizations of an asynchronous runtime for adaptive mesh refinement.
Proceedings of the International Conference for High Performance Computing, 2016

2015
ExaSAT: An exascale co-design tool for performance modeling.
Int. J. High Perform. Comput. Appl., 2015

2014
Abstract machine models and proxy architectures for exascale computing.
Proceedings of the 1st International Workshop on Hardware-Software Co-Design for High Performance Computing, 2014

2013
A new approach to interactive viewpoint selection for volume data sets.
Inf. Vis., 2013

Modeling and predicting performance of high performance computing applications on hardware accelerators.
Int. J. High Perform. Comput. Appl., 2013

Software Design Space Exploration for Exascale Combustion Co-design.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013

2012
Domain-specific translator and optimizer for massive on-chip parallelism.
PhD thesis, 2012

Hands-on Performance Tuning of 3D Finite Difference Earthquake Simulation on GPU Fermi Chipset.
Proceedings of the International Conference on Computational Science, 2012

Accelerating a 3D Finite-Difference Earthquake Simulation with a C-to-CUDA Translator.
Comput. Sci. Eng., 2012

Interactive data-centric viewpoint selection.
Proceedings of the Visualization and Data Analysis 2012, 2012

2011
Modeling and predicting application performance on hardware accelerators.
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

Mint: realizing CUDA performance in 3D stencil methods with annotated C.
Proceedings of the 25th International Conference on Supercomputing, Tucson, AZ, USA, May 31, 2011

2009
An Adaptive Sub-sampling Method for In-memory Compression of Scientific Data.
Proceedings of the 2009 Data Compression Conference (DCC 2009), 2009

