Matthew Mattina

Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2021

Strong data processing inequality in neural networks with noisy neurons and its implications.

Proceedings of the IEEE International Symposium on Information Theory, 2021

Debiasing Model Updates for Improving Personalized Federated Training.

Proceedings of the 38th International Conference on Machine Learning, 2021

Federated Learning Based on Dynamic Regularization.

Proceedings of the 9th International Conference on Learning Representations, 2021

Towards Efficient Point Cloud Graph Neural Networks Through Architectural Simplification.

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

2020

Sparse Systolic Tensor Array for Efficient CNN Hardware Acceleration.

CoRR, 2020

High Throughput Matrix-Matrix Multiplication between Asymmetric Bit-Width Operands.

Dibakar Gope

Jesse G. Beu

CoRR, 2020

Compressing Language Models using Doped Kronecker Products.

CoRR, 2020

Noisy Machines: Understanding Noisy Neural Networks and Enhancing Robustness to Analog Hardware Errors Using Distillation.

CoRR, 2020

Systolic Tensor Array: An Efficient Structured-Sparse GEMM Accelerator for Mobile CNN Inference.

IEEE Comput. Archit. Lett., 2020

Searching for Winograd-aware Quantized Networks.

Javier Fernández-Marqués

Andrew Mundy

Proceedings of the Third Conference on Machine Learning and Systems, 2020

A Systematic Methodology for Characterizing Scalability of DNN Accelerators using SCALE-Sim.

Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ISP4ML: The Role of Image Signal Processing in Efficient Deep Learning Vision Systems.

Proceedings of the 25th International Conference on Pattern Recognition, 2020

Rank and run-time aware compression of NLP Applications.

Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing, 2020

Efficient Residue Number System Based Winograd Convolution.

Proceedings of the Computer Vision - ECCV 2020, 2020

Ternary MobileNets via Per-Layer Hybrid Filter Banks.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

ISP4ML: Understanding the Role of Image Signal Processing in Efficient Deep Learning Vision Systems.

CoRR, 2019

Compressing RNNs for IoT devices by 15-38x using Kronecker Products.

CoRR, 2019

Measuring scheduling efficiency of RNNs for NLP applications.

CoRR, 2019

FixyNN: Efficient Hardware for Mobile Computer Vision via Transfer Learning.

Shreyas K. Venkataramanaiah

Chuteng Zhou

Patrick Hansen

Jae-sun Seo

CoRR, 2019

Pushing the limits of RNN Compression.

Proceedings of the Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing, 2019

SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

FixyNN: Energy-Efficient Real-Time Mobile Computer Vision Hardware Acceleration via Transfer Learning.

Shreyas K. Venkataramanaiah

Chuteng Zhou

Patrick Hansen

Jae-sun Seo

Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

Ternary Hybrid Neural-Tree Networks for Highly Constrained IoT Applications.

Dibakar Gope

Ganesh Dasika

Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

Learning Low-precision Neural Networks without Straight-Through Estimator (STE).

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Run-Time Efficient RNN Compression for Inference on Edge Devices.

Proceedings of the 2nd Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications, 2019

Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs.

Proceedings of the 2nd Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications, 2019

2018

Efficient and Robust Machine Learning for Real-World Systems.

Sebastian Tschiatschek

Robert Peharz

Zoubin Ghahramani

CoRR, 2018

Energy Efficient Hardware for On-Device CNN Inference via Transfer Learning.

CoRR, 2018

SCALE-Sim: Systolic CNN Accelerator.

CoRR, 2018

Mobile Machine Learning Hardware at ARM: A Systems-on-Chip (SoC) Perspective.

Yuhao Zhu

CoRR, 2018

Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision.

Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018

2008

TILE64 - Processor: A 64-Core SoC with Mesh Interconnect.

Proceedings of the 2008 IEEE International Solid-State Circuits Conference, 2008

2007

On-Chip Interconnection Architecture of the Tile Processor.

IEEE Micro, 2007

2006

Last level cache (LLC) performance of data mining workloads on a CMP - a case study of parallel bioinformatics workloads.

Aamer Jaleel