2025
Audio-Visual Representation Learning For Lip-Sync Estimation Through Ranking Augmented Contrastive Training.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024
Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

2023
SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
RefTextLAS: Reference Text Biased Listen, Attend, and Spell Model For Accurate Reading Evaluation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Domain Prompts: Towards memory and compute efficient domain adaptation of ASR systems.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Efficient domain adaptation of language models in ASR systems using Prompt-tuning.
CoRR, 2021

Towards Continual Entity Learning in Language Models for Conversational Agents.
CoRR, 2021

2020
Neural Composition: Learning to Generate from Multiple Models.
CoRR, 2020

2019
Jasper: An End-to-End Convolutional Neural Acoustic Model.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation.
CoRR, 2018