2024
Token Alignment via Character Matching for Subword Completion.
CoRR, 2024

Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars.
CoRR, 2024

Bifurcated Attention for Single-Context Large-Batch Sampling.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

CodeFort: Robust Training for Code Generation Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

BASS: Batched Attention-optimized Speculative Sampling.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Token Alignment via Character Matching for Subword Completion.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Multi-lingual Evaluation of Code Generation Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Multi-lingual Evaluation of Code Generation Models.
CoRR, 2022

2020
Combining Word Embeddings and N-grams for Unsupervised Document Summarization.
CoRR, 2020

The 2019 BBN Cross-lingual Information Retrieval System.
Proceedings of the Workshop on Cross-Language Search and Summarization of Text and Speech, 2020

2018
Speech Recognition: Keyword Spotting Through Image Recognition.
CoRR, 2018