Abdullah Kayi

Orcid: 0000-0001-5909-9891

According to our database, Abdullah Kayi authored at least 18 papers between 2007 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
FlowTracer: A Tool for Uncovering Network Path Usage Imbalance in AI Training Clusters.
CoRR, 2024

2023
Data Movement Accelerator Engines on a Prototype Power10 Processor.
IEEE Micro, 2023

2021
Asynchronous Decentralized Distributed Training of Acoustic Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

2020
Improving Efficiency in Large-Scale Decentralized Distributed Training.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
A Highly Efficient Distributed Deep Learning System for Automatic Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2016
Enabling PGAS Productivity with Hardware Support for Shared Address Mapping: A UPC Case Study.
ACM Trans. Archit. Code Optim., 2016

Comparing Runtime Systems with Exascale Ambitions Using the Parallel Research Kernels.
Proceedings of the High Performance Computing - 31st International Conference, 2016

2015
Adaptive Cache Coherence Mechanisms with Producer-Consumer Sharing Optimization for Chip Multiprocessors.
IEEE Trans. Computers, 2015

2014
Bandwidth Adaptive Cache Coherence Optimizations for Chip Multiprocessors.
Int. J. Parallel Program., 2014

Hardware support for address mapping in PGAS languages: a UPC case study.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

2012
Bandwidth Adaptive Write-update Optimizations for Chip Multiprocessors.
Proceedings of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, 2012

2011
Address Translation Optimization for Unified Parallel C Multi-dimensional Arrays.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

2010
An adaptive cache coherence protocol for chip multiprocessors.
Proceedings of the Second International Forum on Next-Generation Multicore/Manycore Technologies, 2010

2009
Performance issues in emerging homogeneous multi-core architectures.
Simul. Model. Pract. Theory, 2009

Performance analysis and tuning for clusters with ccNUMA nodes for scientific computing - a case study.
Comput. Syst. Sci. Eng., 2009

2008
Performance Evaluation of Clusters with ccNUMA Nodes - A Case Study.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, 2008

Application Performance Tuning for Clusters with ccNUMA Nodes.
Proceedings of the 11th IEEE International Conference on Computational Science and Engineering, 2008

2007
Experimental Evaluation of Emerging Multi-core Architectures.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007
