Zhenglun Kong

Orcid: 0000-0002-8120-4456

According to our database, Zhenglun Kong authored at least 35 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2024
Exploring Token Pruning in Vision State Space Models.
CoRR, 2024

Search for Efficient Large Language Models.
CoRR, 2024

Efficient Pruning of Large Language Model with Adaptive Estimation Fusion.
CoRR, 2024

EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge.
CoRR, 2024

FasterVD: On Acceleration of Video Diffusion Models.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024

Pruning Foundation Models for High Accuracy without Retraining.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Rethinking Token Reduction for State Space Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
GPU Accelerated Color Correction and Frame Warping for Real-time Video Stitching.
CoRR, 2023

HotBEV: Hardware-oriented Transformer-based Multi-View 3D Detector for BEV Perception.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Data Level Lottery Ticket Hypothesis for Vision Transformers.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

SpeedDETR: Speed-aware Transformers for End-to-end Object Detection.
Proceedings of the International Conference on Machine Learning, 2023

Fast and Fair Medical AI on the Edge Through Neural Architecture Search for Hybrid Vision Models.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

Late Breaking Results: Fast Fair Medical Applications? Hybrid Vision Models Achieve the Fairness on the Edge.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Condense: A Framework for Device and Frequency Adaptive Neural Network Models on the Edge.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
The Lottery Ticket Hypothesis for Vision Transformers.
CoRR, 2022

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding.
Proceedings of the Computer Vision - ECCV 2022, 2022

SPViT: Enabling Faster Vision Transformers via Latency-Aware Soft Token Pruning.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
SPViT: Enabling Faster Vision Transformers via Soft Token Pruning.
CoRR, 2021

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI.
Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021

A Compression-Compilation Framework for On-mobile Real-time BERT Applications.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

NPAS: A Compiler-Aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
6.7ms on Mobile with over 78% ImageNet Accuracy: Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration.
CoRR, 2020

Achieving Real-Time Execution of Transformer-based Large-scale Models on Mobile with Compiler-aware Neural Architecture Optimization.
CoRR, 2020

SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency.
CoRR, 2020

Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

2017
High stability and robustness of a developed novel laser acupuncture theranostic device.
Microelectron. Reliab., 2017
