2024

EQ-ViT: Algorithm-Hardware Co-Design for End-to-End Acceleration of Real-Time Vision Transformer Inference on Versal ACAP Architecture.

Peiyan Dong

Jinming Zhuang

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2024

EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge.

CoRR, 2024

Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers.

Proceedings of the 38th ACM International Conference on Supercomputing, 2024

SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs.

Proceedings of the 34th International Conference on Field-Programmable Logic and Applications, 2024

SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

A Life-Cycle Energy and Inventory Analysis of Adiabatic Quantum-Flux-Parametron Circuits.

CoRR, 2023

PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

HotBEV: Hardware-oriented Transformer-based Multi-View 3D Detector for BEV Perception.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Data Level Lottery Ticket Hypothesis for Vision Transformers.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

SpeedDETR: Speed-aware Transformers for End-to-end Object Detection.

Proceedings of the International Conference on Machine Learning, 2023

Fast and Fair Medical AI on the Edge Through Neural Architecture Search for Hybrid Vision Models.

Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

Late Breaking Results: Fast Fair Medical Applications? Hybrid Vision Models Achieve the Fairness on the Edge.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Invited: Algorithm-Software-Hardware Co-Design for Deep Learning Acceleration.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Floating Gate Transistor-Based Accurate Digital In-Memory Computing for Deep Neural Networks.

Adv. Intell. Syst., December, 2022

Mobile or FPGA? A Comprehensive Evaluation on Energy Efficiency and a Unified Optimization Framework.

ACM Trans. Embed. Comput. Syst., September, 2022

GRIM: A General, Real-Time Deep Learning Inference Framework for Mobile Devices Based on Fine-Grained Structured Weight Sparsity.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

The Lottery Ticket Hypothesis for Vision Transformers.

CoRR, 2022

Quantum Neural Network Compression.

Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding.

Proceedings of the Computer Vision - ECCV 2022, 2022

SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning.

Proceedings of the Computer Vision - ECCV 2022, 2022

TAAS: a timing-aware analytical strategy for AQFP-capable placement automation.

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021

NS-FDN: Near-Sensor Processing Architecture of Feature-Configurable Distributed Network for Beyond-Real-Time Always-on Keyword Spotting.

IEEE Trans. Circuits Syst. I Regul. Pap., 2021

SPViT: Enabling Faster Vision Transformers via Soft Token Pruning.

CoRR, 2021

Work in Progress: Mobile or FPGA? A Comprehensive Evaluation on Energy Efficiency and a Unified Optimization Framework.

Proceedings of the 27th IEEE Real-Time and Embedded Technology and Applications Symposium, 2021

Puncturing the memory wall: Joint optimization of network compression with approximate memory for ASR application.

Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

2020

RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition.

CoRR, 2020

CSB-RNN: a faster-than-realtime RNN acceleration framework with compressed structured blocks.

Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020

RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks.

CoRR, 2019