2025

Hardware-Efficient Attention for Fast Decoding.

CoRR, May 2025

2024

Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

High Probability Bounds for Stochastic Continuous Submodular Maximization.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023