Nan Jiang

Commun. ACM, 2021

Need for Speed: Experiences Building a Trustworthy System-Level GPU Simulator.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

2020

A 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm.

Brian Zimmer

IEEE J. Solid State Circuits, 2020

An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives.

Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

2019

A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm.

Brian Zimmer

Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019

Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture.

Yakun Sophia Shao

Jason Clemons

Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology.

Proceedings of the 2019 IEEE Hot Chips 31 Symposium (HCS), 2019

2018

Exploiting idle resources in a high-radix switch for supplemental storage.

Matthias A. Blumrich

Larry R. Dennison

Proceedings of the International Conference for High Performance Computing, 2018

2015

Network endpoint congestion control for fine-grained communication.

Larry R. Dennison

Proceedings of the International Conference for High Performance Computing, 2015

2013

Channel reservation protocol for over-subscribed channels and destinations.

Daniel Becker

Proceedings of the International Conference for High Performance Computing, 2013

A detailed and flexible cycle-accurate Network-on-Chip simulator.

Daniel U. Becker

Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

2012

Adaptive Backpressure: Efficient buffer management for on-chip networks.

Daniel U. Becker

Proceedings of the 30th International IEEE Conference on Computer Design, 2012

Network congestion avoidance through Speculative Reservation.

Daniel U. Becker

Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011

Packet Chaining: Efficient Single-Cycle Allocation for On-Chip Networks.

Daniel Becker

IEEE Comput. Archit. Lett., 2011

2009

Indirect adaptive routing on large scale interconnection networks.

John Kim

Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

2008

A MIPS R2000 implementation.

Proceedings of the 45th Design Automation Conference, 2008

2007

Parallelized radix-2 scalable Montgomery multiplier.
