Song Han

CoRR, 2022

On-chip QNN: Towards Efficient On-Chip Training of Quantum Neural Networks.

CoRR, 2022

Variational Quantum Pulse Learning.

Proceedings of the IEEE International Conference on Quantum Computing and Engineering, 2022

On-Device Training Under 256KB Memory.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

TorchSparse: Efficient Point Cloud Inference Engine.

Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

RobustAnalog: Fast Variation-Aware Analog Circuit Design Via Multi-task RL.

Proceedings of the 2022 ACM/IEEE Workshop on Machine Learning for CAD, 2022

VISTA 2.0: An Open, Data-driven Simulator for Multimodal Sensing and Policy Learning for Autonomous Vehicles.

Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Network Augmentation for Tiny Deep Learning.

Proceedings of the Tenth International Conference on Learning Representations, 2022

TorchQuantum Case Study for Robust Quantum Circuits.

Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

QOC: quantum on-chip training with parameter shift and gradient pruning.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

QuantumNAT: quantum noise-aware training with noise injection, quantization and normalization.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DeepVS: a deep learning approach for RF-based vital signs sensing.

Proceedings of the BCB '22: 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Northbrook, Illinois, USA, August 7, 2022

2021

MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning.

CoRR, 2021

RoQNN: Noise-Aware Training for Robust Quantum Neural Networks.

CoRR, 2021

TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device.

CoRR, 2021

PatchNet - Short-range Template Matching for Efficient Video Processing.

CoRR, 2021

Delayed Gradient Averaging: Tolerate the Communication Latency for Federated Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Memory-efficient Patch-based Inference for Tiny Deep Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

IOS: Inter-Operator Scheduler for CNN Acceleration.

Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

PointAcc: Efficient Point Cloud Accelerator.

Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

SemAlign: Annotation-Free Camera-LiDAR Calibration with Semantic Alignment Loss.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Efficient and Robust LiDAR-Based End-to-End Navigation.

Proceedings of the IEEE International Conference on Robotics and Automation, 2021

LocTex: Learning Data-Efficient Visual Representations from Localized Textual Supervision.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning.

Hanrui Wang

Zhekai Zhang

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

NAAS: Neural Accelerator Architecture Search.

Yujun Lin

Mengtian Yang

Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

Anycost GANs for Interactive Image Synthesis and Editing.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

The 2020 Low-Power Computer Vision Challenge.

Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

2020

Deep Leakage from Gradients.

Ligeng Zhu

Proceedings of the Federated Learning - Privacy and Incentive, 2020

Long Live TIME: Improving Lifetime and Security for NVM-Based Training-in-Memory Systems.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Energy Efficient On-Demand Dynamic Branch Prediction Models.

IEEE Trans. Computers, 2020

Scanning the Issue.

H.-S. Philip Wong

Kerem Akarvardar

Dimitri A. Antoniadis

Proc. IEEE, 2020

Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey.

Proc. IEEE, 2020

AutoML for Architecting Efficient and Specialized Neural Networks.

IEEE Micro, 2020

Hardware-Centric AutoML for Mixed-Precision Quantization.

Int. J. Comput. Vis., 2020

Tiny Transfer Learning: Towards Memory-Efficient On-Device Learning.

CoRR, 2020

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy.

CoRR, 2020

Domain-specific hardware accelerators.

William J. Dally

Yatish Turakhia

Commun. ACM, 2020

Differentiable Augmentation for Data-Efficient GAN Training.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

MCUNet: Tiny Deep Learning on IoT Devices.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

TinyTL: Reduce Memory, Not Parameters for Efficient On-Device Learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Lite Transformer with Long-Short Range Attention.

Proceedings of the 8th International Conference on Learning Representations, 2020

Once-for-All: Train One Network and Specialize it for Efficient Deployment.

Proceedings of the 8th International Conference on Learning Representations, 2020

SpArch: Efficient Architecture for Sparse Matrix Multiplication.

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution.

Proceedings of the Computer Vision - ECCV 2020, 2020

DataMix: Efficient Privacy-Preserving Edge-Cloud Inference.

Proceedings of the Computer Vision - ECCV 2020, 2020

GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

HAT: Hardware-Aware Transformers for Efficient Natural Language Processing.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

Modeling and Optimization for Self-powered Non-volatile IoT Edge Devices with Ultra-low Harvesting Power.

ACM Trans. Cyber Phys. Syst., 2019

Training Kinetics in 15 Minutes: Large-scale Distributed Training on Videos.

CoRR, 2019

Once for All: Train One Network and Specialize it for Efficient Deployment.

Han Cai

Dimitris S. Papailiopoulos

CoRR, 2019

Design Automation for Efficient Deep Learning Computing.

CoRR, 2019

SysML: The New Frontier of Machine Learning Systems.

Alexandros G. Dimakis

Anastasios Kyrillidis

Shivaram Venkataraman

CoRR, 2019

Deep Leakage from Gradients.

Ligeng Zhu

Zhijian Liu

Shaileshh Bojja Venkatakrishnan

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

MicroNet for Efficient Language Modeling.

Proceedings of the NeurIPS 2019 Competition and Demonstration Track, 2019

Park: An Open Platform for Learning-Augmented Computer Systems.

Mehrdad Khani Shirkoohi

Songtao He

Vikram Nathan

Frank Cangialosi

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Point-Voxel CNN for Efficient 3D Deep Learning.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Defensive Quantization: When Efficiency Meets Robustness.

Proceedings of the 7th International Conference on Learning Representations, 2019

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware.

Han Cai

Ligeng Zhu

Proceedings of the 7th International Conference on Learning Representations, 2019

On-Device Image Classification with Proxyless Neural Architecture Search and Quantization-Aware Fine-Tuning.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

TSM: Temporal Shift Module for Efficient Video Understanding.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Fine-Grained Sparse Accelerator for Multi-Precision DNN.

Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

Fast Inference of Deep Neural Networks for Real-time Particle Physics Applications.

Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM.

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

HAQ: Hardware-Aware Automated Quantization With Mixed Precision.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Learning to Design Circuits.

CoRR, 2018

HAQ: Hardware-Aware Automated Quantization.

CoRR, 2018

Temporal Shift Module for Efficient Video Understanding.

CoRR, 2018

Fast inference of deep neural networks in FPGAs for particle physics.

CoRR, 2018

Path-Level Network Transformation for Efficient Architecture Search.

Proceedings of the 35th International Conference on Machine Learning, 2018

Efficient Sparse-Winograd Convolutional Neural Networks.

Proceedings of the 6th International Conference on Learning Representations, 2018

Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training.

Proceedings of the 6th International Conference on Learning Representations, 2018

AMC: AutoML for Model Compression and Acceleration on Mobile Devices.

Proceedings of the Computer Vision - ECCV 2018, 2018

Bandwidth-efficient deep learning.

William J. Dally

Proceedings of the 55th Annual Design Automation Conference, 2018

Long live TIME: improving lifetime for training-in-memory engines by structured gradient sparsification.

Proceedings of the 55th Annual Design Automation Conference, 2018

2017

Software-Hardware Codesign for Efficient Neural Network Acceleration.

IEEE Micro, 2017

Deep Generative Adversarial Networks for Compressed Sensing Automates MRI.

CoRR, 2017

Exploring the Regularity of Sparse Structure in Convolutional Neural Networks.

CoRR, 2017

Trained Ternary Quantization.

Proceedings of the 5th International Conference on Learning Representations, 2017

Efficient Sparse-Winograd Convolutional Neural Networks.

Proceedings of the 5th International Conference on Learning Representations, 2017

DSD: Dense-Sparse-Dense Training for Deep Neural Networks.

Proceedings of the 5th International Conference on Learning Representations, 2017

ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA.

William (Bill) J. Dally

Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

An FPGA Design Framework for CNN Sparsification and Acceleration.

Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

Exploring the Granularity of Sparsity in Convolutional Neural Networks.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

2016

Research for Practice: Cryptocurrencies, Blockchains, and Smart Contracts; Hardware for Deep Learning.

ACM Queue, 2016

Generate Image Descriptions based on Deep RNN and Memory Cells for Images Features.

Shijian Tang

CoRR, 2016

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size.

CoRR, 2016

DSD: Regularizing Deep Neural Networks with Dense-Sparse-Dense Training Flow.

CoRR, 2016

Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding.
