Yuriy Kuleshov

Remote. Sens., 2022

PROMPT: Learning Dynamic Resource Allocation Policies for Edge-Network Applications.

CoRR, 2022

Chapter Four - Routerless networks-on-chip.

Fawaz Alazemi

Bella Bose

Adv. Comput., 2022

2020

AI for Computer Architecture: Principles, Practice, and Prospects.

Daniel Rodrigues Carvalho

Drew Penney

Daniel Jiménez

Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01770-4, 2020

UVMBench: A Comprehensive Benchmark Suite for Researching Unified Virtual Memory in GPUs.

CoRR, 2020

The gem5 Simulator: Version 20.0+.

Amin Farmahini Farahani

Hamidreza Khaleghzadeh

CoRR, 2020

Accelerated Reply Injection for Removing NoC Bottleneck in GPGPUs.

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

A Deep Reinforcement Learning Framework for Architectural Exploration: A Routerless NoC Case Study.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

EquiNox: Equivalent NoC Injection Routers for Silicon Interposer-Based Throughput Processors.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

2019

A Survey of Machine Learning Applied to Computer Architecture Design.

Drew D. Penney

CoRR, 2019

Optimizing Routerless Network-on-Chip Designs: An Innovative Learning-Based Framework.

CoRR, 2019

Design Space Exploration of Memory Controller Placement in Throughput Processors with Deep Learning.

IEEE Comput. Archit. Lett., 2019

On Trade-off Between Static and Dynamic Power Consumption in NoC Power Gating.

Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, 2019

Dynamically linked MSHRs for adaptive miss handling in GPUs.

Yongbin Gu

Proceedings of the ACM International Conference on Supercomputing, 2019

Express Link Placement for NoC-Based Many-Core Platforms.

Proceedings of the 48th International Conference on Parallel Processing, 2019

Characterizing On-Chip Traffic Patterns in General-Purpose GPUs: A Deep Learning Approach.

Proceedings of the 37th IEEE International Conference on Computer Design, 2019

Shortcut Mining: Exploiting Cross-Layer Shortcut Reuse in DCNN Accelerators.

Arash AziziMazreah

Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

2018

Tolerating Soft Errors in Deep Learning Accelerators with Reliable On-Chip Memory Designs.

Proceedings of the 2018 IEEE International Conference on Networking, 2018

CART: Cache Access Reordering Tree for Efficient Cache and Memory Accesses in GPUs.

Yongbin Gu

Proceedings of the 36th IEEE International Conference on Computer Design, 2018

Routerless Network-on-Chip.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017

CALM: Contention-Aware Latency-Minimal Application Mapping for Flattened Butterfly On-Chip Networks.

ACM Trans. Design Autom. Electr. Syst., 2017

XPro: A Cross-End Processing Architecture for Data Analytics in Wearables.

Aosen Wang

Wenyao Xu

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

2016

Providing Balanced Mapping for Multiple Applications in Many-Core Chip Multiprocessors.

Siyu Yue

IEEE Trans. Computers, 2016

A Filtering Mechanism to Reduce Network Bandwidth Utilization of Transaction Execution.

ACM Trans. Archit. Code Optim., 2016

Simulation of NoC power-gating: Requirements, optimizations, and the Agate simulator.

J. Parallel Distributed Comput., 2016

Maximizing the performance of NoC-based MPSoCs under total power and power density constraints.

Proceedings of the 17th International Symposium on Quality Electronic Design, 2016

2015

Power punch: Towards non-blocking power-gating of NoC routers.

Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

TAPP: temperature-aware application mapping for NoC-based many-core processors.

Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

2014

Futility Scaling: High-Associativity Cache Partitioning.

Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

Smart butterfly: reducing static power dissipation of network-on-chip with core-state-awareness.

Siyu Yue

Proceedings of the International Symposium on Low Power Electronics and Design, 2014

Balancing On-Chip Network Latency in Multi-application Mapping for Chip-Multiprocessors.

Siyu Yue

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Mitigating the Mismatch between the Coherence Protocol and Conflict Detection in Hardware Transactional Memory.

Lihang Zhao

Jeffrey T. Draper

Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

MP3: Minimizing performance penalty for power-gating of Clos network-on-chip.

Lihang Zhao

Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

Application mapping for express channel-based networks-on-chip.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

2013

An Analytical Performance Model for Partitioning Off-Chip Memory Bandwidth.

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

RAIR: Interference Reduction in Regionalized Networks-on-Chip.

Kai Hwang

Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Bubble coloring: avoiding routing- and protocol-induced deadlocks with minimal virtual channel requirement.

Proceedings of the International Conference on Supercomputing, 2013

In-network traffic regulation for Transactional Memory.

Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

Worm-Bubble Flow Control.

Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

2012

Efficient implementation of globally-aware network flow control.

J. Parallel Distributed Comput., 2012

NoRD: Node-Router Decoupling for Effective Power-gating of On-Chip Routers.

Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012

2011

Critical Bubble Scheme: An Efficient Implementation of Globally Aware Network Flow Control.
