Onur Kayiran

According to our database, Onur Kayiran authored at least 31 papers between 2013 and 2023.

Bibliography

2023

2021
Efficient Cache Utilization via Model-aware Data Placement for Recommendation Models.
Proceedings of the MEMSYS 2021: The International Symposium on Memory Systems, Washington, USA, September 27, 2021

Analyzing and Leveraging Decoupled L1 Caches in GPUs.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

2020
Analyzing and Leveraging Shared L1 Caches in GPUs.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019
Optimizing GPU Cache Policies for MI Workloads.
CoRR, 2019

Opportunistic computing in GPU architectures.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

Optimizing GPU Cache Policies for MI Workloads.
Proceedings of the IEEE International Symposium on Workload Characterization, 2019

Analyzing and Leveraging Remote-Core Bandwidth for Enhanced Performance in GPUs.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
CODA: Enabling Co-location of Computation and Data for Multiple GPU Systems.
ACM Trans. Archit. Code Optim., 2018

Quantifying Data Locality in Dynamic Parallelism in GPUs.
Proc. ACM Meas. Anal. Comput. Syst., 2018

Holistic Management of the GPGPU Memory Hierarchy to Manage Warp-level Latency Tolerance.
CoRR, 2018

Architectural Support for Efficient Large-Scale Automata Processing.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Modular Routing Design for Chiplet-Based Systems.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

Efficient and Fair Multi-programming in GPUs via Effective Bandwidth Management.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017
CODA: Enabling Co-location of Computation and Data for Near-Data Processing.
CoRR, 2017

There and Back Again: Optimizing the Interconnect in Networks of Memory Cubes.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017


Controlled Kernel Launch for Dynamic Parallelism in GPUs.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

2016
Exploiting Core Criticality for Enhanced GPU Performance.
Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science, 2016

OSCAR: Orchestrating STT-RAM cache traffic for heterogeneous CPU-GPU architectures.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

Prefetching Techniques for Near-memory Throughput Processors.
Proceedings of the 2016 International Conference on Supercomputing, 2016

Efficient synthetic traffic models for large, complex SoCs.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

μC-States: Fine-grained GPU Datapath Power Management.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
Anatomy of GPU Memory System for Multi-Application Execution.
Proceedings of the 2015 International Symposium on Memory Systems, 2015

Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015

2014
Managing GPU Concurrency in Heterogeneous Architectures.
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

2013
Orchestrated scheduling and prefetching for GPGPUs.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

OWL: cooperative thread array aware scheduling techniques for improving GPGPU performance.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

Neither more nor less: Optimizing thread-level parallelism for GPGPUs.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

