2024
EQ-ViT: Algorithm-Hardware Co-Design for End-to-End Acceleration of Real-Time Vision Transformer Inference on Versal ACAP Architecture.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2024
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge.
CoRR, 2024
Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024
SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs.
Proceedings of the 34th International Conference on Field-Programmable Logic and Applications, 2024
SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024
Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
A Life-Cycle Energy and Inventory Analysis of Adiabatic Quantum-Flux-Parametron Circuits.
CoRR, 2023
PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
HotBEV: Hardware-oriented Transformer-based Multi-View 3D Detector for BEV Perception.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Data Level Lottery Ticket Hypothesis for Vision Transformers.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
SpeedDETR: Speed-aware Transformers for End-to-end Object Detection.
Proceedings of the International Conference on Machine Learning, 2023
Fast and Fair Medical AI on the Edge Through Neural Architecture Search for Hybrid Vision Models.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Late Breaking Results: Fast Fair Medical Applications? Hybrid Vision Models Achieve the Fairness on the Edge.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Invited: Algorithm-Software-Hardware Co-Design for Deep Learning Acceleration.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Floating Gate Transistor-Based Accurate Digital In-Memory Computing for Deep Neural Networks.
Adv. Intell. Syst., December, 2022
Mobile or FPGA? A Comprehensive Evaluation on Energy Efficiency and a Unified Optimization Framework.
ACM Trans. Embed. Comput. Syst., September, 2022
GRIM: A General, Real-Time Deep Learning Inference Framework for Mobile Devices Based on Fine-Grained Structured Weight Sparsity.
IEEE Trans. Pattern Anal. Mach. Intell., 2022
The Lottery Ticket Hypothesis for Vision Transformers.
CoRR, 2022
Quantum Neural Network Compression.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022
You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding.
Proceedings of the Computer Vision - ECCV 2022, 2022
SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning.
Proceedings of the Computer Vision - ECCV 2022, 2022
TAAS: a timing-aware analytical strategy for AQFP-capable placement automation.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022
2021
NS-FDN: Near-Sensor Processing Architecture of Feature-Configurable Distributed Network for Beyond-Real-Time Always-on Keyword Spotting.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021
SPViT: Enabling Faster Vision Transformers via Soft Token Pruning.
CoRR, 2021
Work in Progress: Mobile or FPGA? A Comprehensive Evaluation on Energy Efficiency and a Unified Optimization Framework.
Proceedings of the 27th IEEE Real-Time and Embedded Technology and Applications Symposium, 2021
Puncturing the memory wall: Joint optimization of network compression with approximate memory for ASR application.
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021
2020
RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition.
CoRR, 2020
CSB-RNN: a faster-than-realtime RNN acceleration framework with compressed structured blocks.
Proceedings of the ICS '20: 2020 International Conference on Supercomputing, 2020
RTMobile: Beyond Real-Time Mobile Acceleration of RNNs for Speech Recognition.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2019
DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks.
CoRR, 2019