Raghuraman Krishnamoorthi

Tijmen Blankevoort

CoRR, February, 2025

EdgeTAM: On-Device Track Anything Model.

CoRR, January, 2025

2024

Efficient Track Anything.

Balakrishnan Varadarajan

Ramya Akula

Forrest N. Iandola

Bilge Soran

CoRR, 2024

Llama Guard 3-1B-INT4: Compact and Efficient Safeguard for Human-AI Conversations.

CoRR, 2024

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding.

Balakrishnan Varadarajan

Mohamed Elhoseiny

CoRR, 2024

Agent-as-a-Judge: Evaluate Agents with Agents.

CoRR, 2024

SpinQuant: LLM quantization with learned rotations.

Yuandong Tian

Tijmen Blankevoort

CoRR, 2024

Communication Efficient Distributed Training with Distributed Lion.

CoRR, 2024

Data Efficient Reflow for Few Step Audio Generation.

Wei-Ning Hsu

Yangyang Shi

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Communication Efficient Distributed Training with Distributed Lion.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases.

Liangzhen Lai

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once.

Shiyu Chang

Zhangyang Wang

Proceedings of the Forty-first International Conference on Machine Learning, 2024

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-Device ASR Models.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

LLM-QAT: Data-Free Quantization Aware Training for Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts.

Muhammad Abdul-Mageed

Laks V. S. Lakshmanan

Proceedings of the Findings of the Association for Computational Linguistics, 2024

PathFusion: Path-Consistent Lidar-Camera Deep Feature Fusion.

Proceedings of the International Conference on 3D Vision, 2024

2023

SqueezeSAM: User friendly mobile interactive segmentation.

Balakrishnan Varadarajan

CoRR, 2023

Gen2Det: Generate to Detect.

Chenchen Zhu

Abhinav Shrivastava

CoRR, 2023

Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images.

Zhuoran Yu

Chenchen Zhu

Sean Chang Culatana

Fanyi Xiao

Yong Jae Lee

CoRR, 2023

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning.

Yunyang Xiong

Mohamed Elhoseiny

CoRR, 2023

Fast Point Cloud Generation with Straight Flows.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Binary and Ternary Natural Language Generation.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

LiCo-Net: Linearized Convolution Network for Hardware-efficient Keyword Spotting.

CoRR, 2022

Learning a Dual-Mode Speech Recognition Model via Self-Pruning.

Ozlem Kalinli

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Check-N-Run: a Checkpointing System for Training Deep Learning Recommendation Models.

Krishnakumar Nair

Misha Smelyanskiy

Murali Annavaram

Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

BiT: Robustly Binarized Multi-distilled Transformer.

Yashar Mehdad

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks.

Yingyan Lin

Proceedings of the International Conference on Machine Learning, 2022

2021

Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale.

IEEE Micro, 2021

2020

Check-N-Run: A Checkpointing System for Training Recommendation Models.

Murali Annavaram

Krishnakumar Nair

Misha Smelyanskiy

CoRR, 2020

2019

Deep Learning Recommendation Model for Personalization and Recommendation Systems.

CoRR, 2019

2018

Quantizing deep convolutional networks for efficient inference: A whitepaper.

CoRR, 2018

2007

FLO Physical Layer: An Overview.

Murali R. Chari

Fuyun Ling

Ashok Mantravadi