Zhenglun Kong
Orcid: 0000-0002-8120-4456
According to our database, Zhenglun Kong authored at least 35 papers between 2017 and 2024.
Collaborative distances:
Bibliography
2024
CoRR, 2024
EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge.
CoRR, 2024
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers.
Proceedings of the 38th ACM International Conference on Supercomputing, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
CoRR, 2023
HotBEV: Hardware-oriented Transformer-based Multi-View 3D Detector for BEV Perception.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
Proceedings of the International Conference on Machine Learning, 2023
Fast and Fair Medical AI on the Edge Through Neural Architecture Search for Hybrid Vision Models.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Late Breaking Results: Fast Fair Medical Applications? Hybrid Vision Models Achieve the Fairness on the Edge.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Condense: A Framework for Device and Frequency Adaptive Neural Network Models on the Edge.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding.
Proceedings of the Computer Vision - ECCV 2022, 2022
Proceedings of the Computer Vision - ECCV 2022, 2022
2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI.
Proceedings of the 22nd International Symposium on Quality Electronic Design, 2021
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2021
HMC-TRAN: A Tensor-core Inspired Hierarchical Model Compression for Transformer-based DNNs on GPU.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021
NPAS: A Compiler-Aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
2020
6.7ms on Mobile with over 78% ImageNet Accuracy: Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration.
CoRR, 2020
Achieving Real-Time Execution of Transformer-based Large-scale Models on Mobile with Compiler-aware Neural Architecture Optimization.
CoRR, 2020
SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency.
CoRR, 2020
Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020
2017
High stability and robustness of a developed novel laser acupuncture theranostic device.
Microelectron. Reliab., 2017