Kailash Gopalakrishnan

Wei Wang

Moriyoshi Ohara

Proceedings of the IEEE Symposium in Low-Power and High-Speed Chips, 2019

DLFloat: A 16-b Floating Point Format Designed for Deep Learning Training and Inference.

Proceedings of the 26th IEEE Symposium on Computer Arithmetic, 2019

2018

Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN).

Pierce I-Jen Chuang

Zhuo Wang

CoRR, 2018

PACT: Parameterized Clipping Activation for Quantized Neural Networks.

Zhuo Wang

Pierce I-Jen Chuang

CoRR, 2018

A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference.

Proceedings of the 2018 IEEE Symposium on VLSI Circuits, 2018

Training Deep Neural Networks with 8-bit Floating Point Numbers.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Taming the beast: Programming Peta-FLOP class Deep Learning Systems.

Leland Chang

Proceedings of the International Symposium on Low Power Electronics and Design, 2018

Across the Stack Opportunities for Deep Learning Acceleration.

Proceedings of the International Symposium on Low Power Electronics and Design, 2018

True Gradient-Based Training of Deep Binary Activated Neural Networks Via Continuous Binarization.

Charbel Sakr

Zhuo Wang

Naresh R. Shanbhag

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Exploiting approximate computing for deep learning acceleration.

Chia-Yu Chen

Viji Srinivasan

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Accelerator Design for Deep Learning Training: Extended Abstract: Invited.

Ankur Agrawal

Chia-Yu Chen

Jinwook Oh

Sunil Shukla

Viji Srinivasan

Wei Zhang

Proceedings of the 54th Annual Design Automation Conference, 2017

POSTER: Design Space Exploration for Performance Optimization of Deep Neural Networks on Shared Memory Accelerators.

Leland Chang

Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016

Energy-Efficient Simultaneous Localization and Mapping via Compounded Approximate Computing.

Jinwook Oh

Guilherme C. Januario

Proceedings of the 2016 IEEE International Workshop on Signal Processing Systems, 2016

Approximate computing: Challenges and opportunities.

Ankur Agrawal

Zehra Sura

Proceedings of the IEEE International Conference on Rebooting Computing, 2016

2015

Deep Learning with Limited Numerical Precision.

Suyog Gupta

Ankur Agrawal

Pritish Narayanan

Proceedings of the 32nd International Conference on Machine Learning, 2015

2014

Learning Machines Implemented on Non-Deterministic Hardware.

Suyog Gupta

Vikas Sindhwani

CoRR, 2014

2013

Nanoscale electronic synapses using phase change devices.

Bryan L. Jackson

Bipin Rajendran

Gregory S. Corrado

Matthew J. Breitwisch

Geoffrey W. Burr

Roger Cheek

ACM J. Emerg. Technol. Comput. Syst., 2013

2008

Overview of candidate device technologies for storage-class memory.
