Vitaly Lavrukhin

According to our database, Vitaly Lavrukhin authored at least 23 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
EMMeTT: Efficient Multimodal Machine Translation Training.
CoRR, 2024

Chain-of-Thought Prompting for Speech Translation.
CoRR, 2024

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data.
CoRR, 2024

Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter.
CoRR, 2024

Label-Looping: Highly Efficient Decoding for Transducers.
CoRR, 2024

A Chat about Boring Problems: Studying GPT-Based Text Normalization.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
NeMo Forced Aligner and its application to word alignment for subtitle generation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Confidence-based Ensembles of End-to-End Speech Recognition Models.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Conformer-Based Target-Speaker Automatic Speech Recognition For Single-Channel Audio.
Proceedings of the IEEE International Conference on Acoustics, 2023

LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of End-to-End ASR Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

2021
NeMo Toolbox for Speech Dataset Construction.
CoRR, 2021

A Toolbox for Construction and Analysis of Speech Datasets.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

SPGISpeech: 5,000 Hours of Transcribed Financial Audio for Fully Formatted End-to-End Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Hi-Fi Multi-Speaker English TTS Dataset.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

2020
QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
NeMo: a toolkit for building AI applications using Neural Modules.
CoRR, 2019

Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks.
CoRR, 2019

Jasper: An End-to-End Convolutional Neural Acoustic Model.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018
Training Neural Speech Recognition Systems with Synthetic Speech Augmentation.
CoRR, 2018

OpenSeq2Seq: extensible toolkit for distributed and mixed precision training of sequence-to-sequence models.
CoRR, 2018

