Nan Jiang

Affiliations:
  • NVIDIA Corporation, St. Louis, USA
  • Stanford University, CA, USA


According to our database, Nan Jiang authored at least 17 papers between 2007 and 2021.

Bibliography

2021
Simba: scaling deep-learning inference with chiplet-based architecture.
Commun. ACM, 2021

Need for Speed: Experiences Building a Trustworthy System-Level GPU Simulator.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

2020
A 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm.
IEEE J. Solid State Circuits, 2020

An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

2019
A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm.
Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019, 2019

Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology.
Proceedings of the 2019 IEEE Hot Chips 31 Symposium (HCS), 2019

2018
Exploiting idle resources in a high-radix switch for supplemental storage.
Proceedings of the International Conference for High Performance Computing, 2018

2015
Network endpoint congestion control for fine-grained communication.
Proceedings of the International Conference for High Performance Computing, 2015

2013
Channel reservation protocol for over-subscribed channels and destinations.
Proceedings of the International Conference for High Performance Computing, 2013

A detailed and flexible cycle-accurate Network-on-Chip simulator.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

2012
Adaptive Backpressure: Efficient buffer management for on-chip networks.
Proceedings of the 30th International IEEE Conference on Computer Design, 2012

Network congestion avoidance through Speculative Reservation.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011
Packet Chaining: Efficient Single-Cycle Allocation for On-Chip Networks.
IEEE Comput. Archit. Lett., 2011

2009
Indirect adaptive routing on large scale interconnection networks.
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

2008
A MIPS R2000 implementation.
Proceedings of the 45th Design Automation Conference, 2008

2007
Parallelized radix-2 scalable Montgomery multiplier.
Proceedings of the IFIP VLSI-SoC 2007, 2007

