Alexei Baevski

According to our database, Alexei Baevski authored at least 44 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Scaling Speech Technology to 1,000+ Languages.
J. Mach. Learn. Res., 2024

2023
OVRL-V2: A simple state-of-art baseline for ImageNav and ObjectNav.
CoRR, 2023

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language.
Proceedings of the International Conference on Machine Learning, 2023

Measuring the Impact of Domain Factors in Self-Supervised Pre-Training.
Proceedings of the IEEE International Conference on Acoustics, 2023

Toward Joint Language Modeling for Speech Units and Text.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

AV-data2vec: Self-Supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Introducing Semantics into Speech Encoders.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Introducing Semantics into Speech Encoders.
CoRR, 2022

Offline Visual Representation Learning for Embodied Navigation.
CoRR, 2022

Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training.
CoRR, 2022

Towards End-to-End Unsupervised Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Masked Autoencoders that Listen.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Simple and Effective Zero-shot Cross-lingual Phoneme Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

On-demand compute reduction with stochastic wav2vec 2.0.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Wav2Vec-Aug: Improved self-supervised training with limited data.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Simple and Effective Unsupervised Speech Synthesis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language.
Proceedings of the International Conference on Machine Learning, 2022

Improved Language Identification Through Cross-Lingual Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022

Unified Speech-Text Pre-training for Speech Translation and Recognition.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Improved Language Identification Through Cross-Lingual Self-Supervised Learning.
CoRR, 2021

Generative Spoken Language Modeling from Raw Audio.
CoRR, 2021

Unsupervised Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Large-Scale Self- and Semi-Supervised Learning for Speech Translation.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Unsupervised Cross-Lingual Representation Learning for Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Comparison of Discrete Latent Variable Models for Speech Representation Learning.
Proceedings of the IEEE International Conference on Acoustics, 2021

Self-Training and Pre-Training are Complementary for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Reservoir Transformers.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Multilingual Speech Translation from Efficient Finetuning of Pretrained Models.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Reservoir Transformer.
CoRR, 2020

The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling.
CoRR, 2020

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations.
CoRR, 2020

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations.
Proceedings of the 8th International Conference on Learning Representations, 2020

Effectiveness of Self-Supervised Pre-Training for ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Effectiveness of self-supervised pre-training for speech recognition.
CoRR, 2019

Facebook FAIR's WMT19 News Translation Task Submission.
Proceedings of the Fourth Conference on Machine Translation, 2019

fairseq: A Fast, Extensible Toolkit for Sequence Modeling.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Pre-trained language model representations for language generation.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

wav2vec: Unsupervised Pre-Training for Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Pay Less Attention with Lightweight and Dynamic Convolutions.
Proceedings of the 7th International Conference on Learning Representations, 2019

Adaptive Input Representations for Neural Language Modeling.
Proceedings of the 7th International Conference on Learning Representations, 2019

Cloze-driven Pretraining of Self-attention Networks.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

