Scalable LLM Math Reasoning Acceleration with Low-rank Distillation.
CoRR, 2025
Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025
Towards Low-bit Communication for Tensor Parallel LLM Inference.
CoRR, 2024
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference.
CoRR, 2024
Prompt-prompted Mixture of Experts for Efficient LLM Generation.
CoRR, 2024
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
A Lightweight Transformer for Faster and Robust EBSD Data Collection.
CoRR, 2023
Deep Unfolded Tensor Robust PCA With Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
Fast and Provable Tensor Robust Principal Component Analysis via Scaled Gradient Descent.
CoRR, 2022