2025
Hardware-Efficient Attention for Fast Decoding.
CoRR, May 2025

2024
Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
High Probability Bounds for Stochastic Continuous Submodular Maximization.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023