Jiaming Tang

GetMobile Mob. Comput. Commun., December, 2024

Efficient Deep Learning Computing: From TinyML to LargeLM

[DOI]

PhD thesis, 2024

Tiny Machine Learning: Progress and Futures.

[DOI]

CoRR, 2024

AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration.

[DOI]

Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024

VILA: On Pre-training for Visual Language Models.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

VILA: On Pre-training for Visual Language Models.

[DOI]

CoRR, 2023

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration.

[DOI]

CoRR, 2023

Offsite-Tuning: Transfer Learning without Full Model.

[DOI]

Guangxuan Xiao

CoRR, 2023

PockEngine: Sparse and Efficient Fine-tuning in a Pocket.

[DOI]

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models.

[DOI]

Proceedings of the International Conference on Machine Learning, 2023

2022

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications.

[DOI]

ACM Trans. Design Autom. Electr. Syst., 2022

TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Devices.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2022

GAN Compression: Efficient Architectures for Interactive Conditional GANs.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2022

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models.

[DOI]

CoRR, 2022

On-Device Training Under 256KB Memory.

[DOI]

Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Network Augmentation for Tiny Deep Learning.

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning.

[DOI]

CoRR, 2021

TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device.

[DOI]

CoRR, 2021

Memory-efficient Patch-based Inference for Tiny Deep Learning.

[DOI]

Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Anycost GANs for Interactive Image Synthesis and Editing.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

AutoML for Architecting Efficient and Specialized Neural Networks.

[DOI]

IEEE Micro, 2020

Hardware-Centric AutoML for Mixed-Precision Quantization.

[DOI]

Int. J. Comput. Vis., 2020

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy.

[DOI]

CoRR, 2020

Differentiable Augmentation for Data-Efficient GAN Training.

[DOI]

Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

MCUNet: Tiny Deep Learning on IoT Devices.

[DOI]

Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Lite Transformer with Long-Short Range Attention.

[DOI]

Proceedings of the 8th International Conference on Learning Representations, 2020

Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution.

[DOI]

Computer Vision - ECCV 2020, 2020

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Runtime Network Routing for Efficient Image Classification.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2019

Training Kinetics in 15 Minutes: Large-scale Distributed Training on Videos.

[DOI]

CoRR, 2019

Design Automation for Efficient Deep Learning Computing.

[DOI]

CoRR, 2019

Defensive Quantization: When Efficiency Meets Robustness.

[DOI]

Proceedings of the 7th International Conference on Learning Representations, 2019

On-Device Image Classification with Proxyless Neural Architecture Search and Quantization-Aware Fine-Tuning.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

TSM: Temporal Shift Module for Efficient Video Understanding.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Joint Monocular 3D Vehicle Detection and Tracking.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

HAQ: Hardware-Aware Automated Quantization With Mixed Precision.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

HAQ: Hardware-Aware Automated Quantization.

[DOI]

CoRR, 2018

Temporal Shift Module for Efficient Video Understanding.

[DOI]