Daan van Esch

According to our database, Daan van Esch authored at least 24 papers between 2016 and 2024.

Bibliography

2024
Multimodal Modeling for Spoken Language Identification.
Proceedings of the IEEE International Conference on Acoustics, 2024

LinguaMeta: Unified Metadata for Thousands of Languages.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Connecting Language Technologies with Rich, Diverse Data Sources Covering Thousands of Languages.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023
Multimodal Modeling For Spoken Language Identification.
CoRR, 2023

2022
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets.
Trans. Assoc. Comput. Linguistics, 2022

Large vocabulary speech recognition for languages of Africa: multilingual modeling and self-supervised learning.
CoRR, 2022

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data.
CoRR, 2022

Building Machine Translation Systems for the Next Thousand Languages.
CoRR, 2022

Handling Compounding in Mobile Keyboard Input.
CoRR, 2022

Writing System and Speaker Metadata for 2,800+ Language Varieties.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

XTREME-S: Evaluating Cross-lingual Speech Representations.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Mining Large-Scale Low-Resource Pronunciation Data From Wikipedia.
CoRR, 2021

2020
Data-Driven Parametric Text Normalization: Rapidly Scaling Finite-State Transduction Verbalizers to New Languages.
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages and Collaboration and Computing for Under-Resourced Languages, 2020

Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

2019
Writing Across the World's Languages: Deep Internationalization for Gboard, the Google Keyboard.
CoRR, 2019

Automatic Keyboard Layout Design for Low-Resource Latin-Script Languages.
CoRR, 2019

Unified Verbalization for Speech Recognition & Synthesis Across Languages.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Building Large-Vocabulary ASR Systems for Languages Without Any Audio Training Data.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Developing Pronunciation Models in New Languages Faster by Exploiting Common Grapheme-to-Phoneme Correspondences Across Languages.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018
Mining Training Data for Language Modeling Across the World's Languages.
Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

Building Speech Recognition Systems for Language Documentation: The CoEDL Endangered Language Pipeline and Inference System (ELPIS).
Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

Text Normalization Infrastructure that Scales to Hundreds of Language Varieties.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

2017
An Expanded Taxonomy of Semiotic Classes for Text Normalization.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016
Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

