Holger Schwenk

Affiliations:
  • University of Le Mans


According to our database, Holger Schwenk authored at least 128 papers between 1994 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Aligning Speech Segments Beyond Pure Semantics.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Seamless: Multilingual Expressive and Streaming Speech Translation.
CoRR, 2023

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation.
CoRR, 2023

SONAR: Sentence-Level Multimodal and Language-Agnostic Representations.
CoRR, 2023

Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

DiffEdit: Diffusion-based semantic image editing with mask guidance.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Multilingual Representation Distillation with Contrastive Learning.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Speech-to-Speech Translation for a Real-world Unwritten Language.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

xSIM++: An Improved Proxy to Bitext Mining Performance for Low-Resource Languages.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
No Language Left Behind: Scaling Human-Centered Machine Translation.
CoRR, 2022

Findings of the WMT'22 Shared Task on Large-Scale Machine Translation Evaluation for African Languages.
Proceedings of the Seventh Conference on Machine Translation, 2022

Textless Speech-to-Speech Translation on Real Data.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

stopes - Modular Machine Translation Pipelines.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

FlexIT: Towards Flexible Semantic Image Translation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Embedding Arithmetic of Multimodal Queries for Image Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021
Beyond English-Centric Multilingual Machine Translation.
J. Mach. Learn. Res., 2021

Textless Speech-to-Speech Translation on Real Data.
CoRR, 2021

Embedding Arithmetic for Text-driven Image Transformation.
CoRR, 2021

Multimodal and Multilingual Embeddings for Large-Scale Speech Mining.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task.
Proceedings of the 18th International Conference on Spoken Language Translation, 2021

WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

CCMatrix: Mining Billions of High-Quality Parallel Sentences on the Web.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Beyond English-Centric Multilingual Machine Translation.
CoRR, 2020

Searching the Web for Cross-lingual Parallel Data.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

A General Framework to Weight Heterogeneous Parallel Data for Model Adaptation in Statistical MT.
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers, 2020

MLQA: Evaluating Cross-lingual Extractive Question Answering.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond.
Trans. Assoc. Comput. Linguistics, 2019

CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB.
CoRR, 2019

Low-Resource Corpus Filtering Using Multilingual Sentence Embeddings.
Proceedings of the Fourth Conference on Machine Translation, 2019

Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Analysis of Joint Multilingual Sentence Representations and Semantic K-Nearest Neighbor Graphs.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
A Corpus for Multilingual Document Classification in Eight Languages.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

XNLI: Evaluating Cross-lingual Sentence Representations.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Filtering and Mining Parallel Data in a Joint Multilingual Space.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Introduction to the special issue on deep learning approaches for machine translation.
Comput. Speech Lang., 2017

Parallel fragments: Measuring their impact on translation performance.
Comput. Speech Lang., 2017

Learning Joint Multilingual Sentence Representations with Neural Machine Translation.
CoRR, 2017

Learning Joint Multilingual Sentence Representations with Neural Machine Translation.
Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Very Deep Convolutional Networks for Text Classification.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

2016
Empirical Use of Information Retrieval to Build Synthetic Data for SMT Domain Adaptation.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Building and using multimodal comparable corpora for machine translation.
Nat. Lang. Eng., 2016

OCR Error Correction Using Statistical Machine Translation.
Int. J. Comput. Linguistics Appl., 2016

Very Deep Convolutional Networks for Natural Language Processing.
CoRR, 2016

2015
On Using Monolingual Corpora in Neural Machine Translation.
CoRR, 2015

Continuous Adaptation to User Feedback for Statistical Machine Translation.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Improving continuous space language models using auxiliary features.
Proceedings of the 12th International Workshop on Spoken Language Translation: Papers, 2015

Incremental Adaptation Strategies for Neural Network Language Models.
Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality, 2015

2014
Translation project adaptation for MT-enhanced computer assisted translation.
Mach. Transl., 2014

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.
CoRR, 2014

LIUM English-to-French spoken language translation system and the Vecsys/LIUM automatic speech recognition system for Italian language for IWSLT 2014.
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2014, 2014

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014


2013
Issues in incremental adaptation of statistical MT from human post-edits.
Proceedings of the 2nd Workshop on Post-editing Technology and Practice, 2013

CSLM - a modular open-source continuous space language modeling toolkit.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Multimodal Comparable Corpora as Resources for Extracting Parallel Data: Parallel Phrases Extraction.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

A Multi-Domain Translation Model Framework for Statistical Machine Translation.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2012
LIUM's SMT Machine Translation Systems for WMT 2012.
Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

Traduction automatique à partir de corpus comparables: extraction de phrases parallèles à partir de données comparables multimodales (Automatic Translation from Comparable Corpora: extracting parallel sentences from multimodal comparable corpora) [in French].
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

Parallel Texts Extraction from Multimodal Comparable Corpora.
Proceedings of the Advances in Natural Language Processing, 2012

Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation.
Proceedings of the Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, 2012

Automatic Translation of Scientific Documents in the HAL Archive.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Incremental adaptation using translation information and post-editing analysis.
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012

Semi-supervised transliteration mining from parallel and comparable corpora.
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012

Collaborative Machine Translation Service for Scientific texts.
Proceedings of the EACL 2012, 2012

Continuous Space Translation Models for Phrase-Based Statistical Machine Translation.
Proceedings of the COLING 2012, 2012

2011
Optimising Multiple Metrics with MERT.
Prague Bull. Math. Linguistics, 2011

Parallel sentence generation from comparable corpora for improved SMT.
Mach. Transl., 2011

LIUM's SMT Machine Translation Systems for WMT 2011.
Proceedings of the Sixth Workshop on Statistical Machine Translation, 2011

Investigations on Translation Model Adaptation Using Monolingual Data.
Proceedings of the Sixth Workshop on Statistical Machine Translation, 2011

LIUM's Statistical Machine Translation System for the NTCIR Chinese/English PatentMT.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

Qualitative Analysis of Post-Editing for High Quality Machine Translation.
Proceedings of Machine Translation Summit XIII: Papers, 2011

LIUM's systems for the IWSLT 2011 speech translation tasks.
Proceedings of the 2011 International Workshop on Spoken Language Translation, 2011

Parametric Weighting of Parallel Data for Statistical Machine Translation.
Proceedings of the Fifth International Joint Conference on Natural Language Processing, 2011

Exploiting Comparable Corpora with TER and TERp.
Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora, 2011

2010
Continuous-Space Language Models for Statistical Machine Translation.
Prague Bull. Math. Linguistics, 2010

Translation Model Adaptation by Resampling.
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, 2010

LIUM SMT Machine Translation System for WMT 2010.
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, 2010

Adaptation d'un Système de Traduction Automatique Statistique avec des Ressources monolingues.
Proceedings of the Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2010

N-gram-based machine translation enhanced with neural networks for the French-English BTEC-IWSLT'10 task.
Proceedings of the 2010 International Workshop on Spoken Language Translation, 2010

2009
SMT and SPE Machine Translation Systems for WMT'09.
Proceedings of the Fourth Workshop on Statistical Machine Translation, 2009

Translation Model Adaptation for an Arabic/French News Translation System by Lightly-Supervised Training.
Proceedings of Machine Translation Summit XII: Posters, 2009

LIUM's statistical machine translation system for IWSLT 2009.
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2009, 2009

LIUM's statistical machine translation systems for IWSLT 2009.
Proceedings of the 2009 International Workshop on Spoken Language Translation, 2009

On the Use of Comparable Corpora to Improve SMT performance.
Proceedings of the EACL 2009, 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, Athens, Greece, March 30, 2009

Trends and challenges in language modeling for speech recognition and machine translation.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
System Combination for Machine Translation of Spoken and Written Language.
IEEE Trans. Speech Audio Process., 2008

First Steps towards a General Purpose French/English Statistical Machine Translation System.
Proceedings of the Third Workshop on Statistical Machine Translation, 2008

The LIUM Arabic/English statistical machine translation system for IWSLT 2008.
Proceedings of the 2008 International Workshop on Spoken Language Translation, 2008

Investigations on large-scale lightly-supervised training for statistical machine translation.
Proceedings of the 2008 International Workshop on Spoken Language Translation, 2008

Data selection and smoothing in an open-source system for the 2008 NIST machine translation evaluation.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Large and Diverse Language Models for Statistical Machine Translation.
Proceedings of the Third International Joint Conference on Natural Language Processing, 2008

2007
Continuous space language models.
Comput. Speech Lang., 2007

Building a Statistical Machine Translation System for French Using the Europarl Corpus.
Proceedings of the Second Workshop on Statistical Machine Translation, 2007

Modèles statistiques enrichis par la syntaxe pour la traduction automatique.
Proceedings of the Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Posters, 2007

Combining Morphosyntactic Enriched Representation with n-best Reranking in Statistical Translation.
Proceedings of the NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation, 2007

A state-of-the-art statistical machine translation system based on Moses.
Proceedings of Machine Translation Summit XI: Papers, 2007

The TALP n-gram-based SMT system for IWSLT 2007.
Proceedings of the 2007 International Workshop on Spoken Language Translation, 2007

Improved machine translation of speech-to-text outputs.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

The LIMSI 2006 TC-STAR EPPS Transcription Systems.
Proceedings of the IEEE International Conference on Acoustics, 2007

Smooth Bilingual N-Gram Translation.
Proceedings of the EMNLP-CoNLL 2007, 2007

2006
Advances in transcription of broadcast news and conversational telephone speech within the combined EARS BBN/LIMSI system.
IEEE Trans. Speech Audio Process., 2006

The LIMSI RT06s Lecture Transcription System.
Proceedings of the Machine Learning for Multimodal Interaction, 2006

Continuous space language models for the IWSLT 2006 task.
Proceedings of the 2006 International Workshop on Spoken Language Translation, 2006

Continuous Space Language Models for Statistical Machine Translation.
Proceedings of the ACL 2006, 2006

2005
Training Neural Network Language Models on Very Large Corpora.
Proceedings of the HLT/EMNLP 2005, 2005

Building continuous space language models for transcribing european languages.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

The 2004 BBN/LIMSI 20xRT English conversational telephone speech recognition system.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Where are we in transcribing French broadcast news?
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004
Neural network language models for conversational speech recognition.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Language recognition using phone lattices.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Speech recognition in multiple languages and domains: the 2003 BBN/LIMSI EARS system.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Speech transcription in multiple languages.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Conversational telephone speech recognition.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Connectionist language modeling for large vocabulary continuous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

2000
Boosting Neural Networks.
Neural Comput., 2000

Combining multiple speech recognizers using voting and language model information.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999
Using boosting to improve a hybrid HMM/neural network speech recognizer.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
The Diabolo Classifier.
Neural Comput., 1998

1997
Training Methods for Adaptive Boosting of Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

AdaBoosting Neural Networks: Application to on-line Character Recognition.
Proceedings of the Artificial Neural Networks, 1997

1996
Constraint tangent distance for on-line character recognition.
Proceedings of the 13th International Conference on Pattern Recognition, 1996

A new distance measure for online character recognition.
Proceedings of International Conference on Neural Networks (ICNN'96), 1996

1994
Transformation Invariant Autoassociation with Application to Handwritten Character Recognition.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

