Thomas Lavergne

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Re-train or Train from Scratch? Comparing Pre-training Strategies of BERT in the Medical Domain.

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Specializing Static and Contextual Embeddings in the Medical Domain Using Knowledge Graphs: Let's Keep It Simple.

Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

2021

DiaBLa: a corpus of bilingual spontaneous written dialogues for machine translation.

Lang. Resour. Evaluation, 2021

Differential Evaluation: a Qualitative Analysis of Natural Language Processing System Behavior Based Upon Data Resistance to Processing.

Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems, 2021

2020

Handling Entity Normalization with no Annotated Corpus: Weakly Supervised Methods Based on Distributional Representation and Ontological Information.

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Automatic Removal of Identifying Information in Official EU Languages for Public Administrations: The MAPA Project.

Proceedings of the Legal Knowledge and Information Systems, 2020

CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters.

Proceedings of the 28th International Conference on Computational Linguistics, 2020

Experiments from LIMSI at the French Named Entity Recognition Coarse-grained Task.

Sahar Ghannay

Proceedings of the Working Notes of CLEF 2020, 2020

2019

Embedding Strategies for Specialized Domains: Application to Clinical Entity Recognition.

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018

Detecting context-dependent sentences in parallel corpora.

Rachel Bawden

Sophie Rosset

Proceedings of the Actes de la Conférence TALN. CORIA-TALN-RJC 2018 - Volume 1, 2018

Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard.

Marianne Vergez-Couret

Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

2017

Détection de concepts et granularité de l'annotation (Concept detection and annotation granularity).

Proceedings of the Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Orléans, France, June 26-30, 2017, Volume 2, 2017

Traitement automatique de la langue biomédicale au LIMSI (Biomedical language processing at LIMSI).

Christopher R. Norman

Proceedings of the Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Orléans, France, June 26-30, 2017 - Volume 3, 2017

Learning the Structure of Variable-Order CRFs: a finite-state perspective.

Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Multiple Methods for Multi-class, Multi-label ICD-10 Coding of Multi-granularity, Multilingual Death Certificates.

Proceedings of the Working Notes of CLEF 2017, 2017

CLEF eHealth 2017 Multilingual Information Extraction task Overview: ICD10 Coding of Death Certificates in English and French.

Aude Robert

Robert Anderson

Kevin Bretonnel Cohen

Proceedings of the Working Notes of CLEF 2017, 2017

2016

The QT21/HimL Combined Machine Translation System.

Proceedings of the First Conference on Machine Translation, 2016

LIMSI@WMT'16: Machine Translation of News.

Proceedings of the First Conference on Machine Translation, 2016

Une catégorisation de fins de lignes non-supervisée (End-of-line classification with no supervision).

Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 2 : TALN (Posters), 2016

LIMSI@IWSLT'16: MT Track.

Proceedings of the 13th International Conference on Spoken Language Translation, 2016

Two-Step MT: Predicting Target Morphology.

Proceedings of the 13th International Conference on Spoken Language Translation, 2016

Supervised classification of end-of-lines in clinical text with no manual annotation.

Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining, 2016

A Dataset for ICD-10 Coding of Death Certificates: Creation and Usage.

Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining, 2016

LIMSI ICD10 coding Experiments on CépiDC Death Certificate Statements.

Proceedings of the Working Notes of CLEF 2016, 2016

Clinical Information Extraction at the CLEF eHealth Evaluation lab 2016.

Kevin Bretonnel Cohen

Proceedings of the Working Notes of CLEF 2016, 2016

Hybrid methods for ICD-10 coding of death certificates.

Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis, 2016

2015

The contribution of co-reference resolution to supervised relation detection between bacteria and biotopes entities.

BMC Bioinform., December, 2015

LIMSI@WMT'15 : Translation Task.

Proceedings of the Tenth Workshop on Statistical Machine Translation, 2015

Etiquetage morpho-syntaxique en domaine de spécialité: le domaine médical.

Christelle Rabary

Leonardo Campillos Llanos

Proceedings of the Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2015

Oublier ce qu'on sait, pour mieux apprendre ce qu'on ne sait pas : une étude sur les contraintes de type dans les modèles CRF.

Proceedings of the Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2015

LIMSI @ CLEF eHealth 2015 - Task 1b.

Eva D'hondt

François Morlane-Hondère

Dhouha Bouamor

Swen Ribeiro

Proceedings of the Working Notes of CLEF 2015, 2015

2014

Natural language processing of radiology reports for the detection of thromboembolic diseases and clinically relevant incidental findings.

BMC Bioinform., 2014

LIMSI @ WMT'14 Medical Translation Task.

Proceedings of the Ninth Workshop on Statistical Machine Translation, 2014

Automatic language identity tagging on word and sentence-level in multilingual text sources: a case-study on Luxembourgish.

Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Optimizing annotation efforts to build reliable annotated corpora for training statistical models.

Proceedings of the 8th Linguistic Annotation Workshop, 2014

2013

LIMSI @ WMT13.

Proceedings of the Eighth Workshop on Statistical Machine Translation, 2013

A fully discriminative training framework for Statistical Machine Translation (Un cadre d'apprentissage intégralement discriminant pour la traduction statistique) [in French].

Alexandre Allauzen

Proceedings of the Traitement Automatique des Langues Naturelles, 2013

Structure learning in hidden conditional random fields for grapheme-to-phoneme conversion.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Discriminative training of a phoneme confusion model for a dynamic lexicon in ASR.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A Supervised Abbreviation Resolution System for Medical Text.

Proceedings of the Working Notes for CLEF 2013 Conference, 2013

A Supervised Named-Entity Extraction System for Medical Text.

Proceedings of the Working Notes for CLEF 2013 Conference, 2013

LIMSI's participation to the 2013 shared task on Native Language Identification.

Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, 2013

Automatic Named Entity Pre-annotation for Out-of-domain Human Annotation.

Sophie Rosset

Mohamed Ameur Ben Jannet

Jérémy Leixa

Olivier Galibert

Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, 2013

2012

Use of C-Band Scatterometer for Sea Ice Edge Identification.

Lars-Anders Breivik

Steinar Eastwood

IEEE Trans. Geosci. Remote. Sens., 2012

LIMSI @ WMT12.

Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

Joint WMT 2012 Submission of the QUAERO Project.

Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

Repérage des entités nommées pour l'arabe : adaptation non-supervisée et combinaison de systèmes (Named Entity Recognition for Arabic : Unsupervised adaptation and Systems combination) [in French].

Souhir Gahbiche-Braham

Hélène Bonneau-Maynard

Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

Joint Segmentation and POS Tagging for Arabic Using a CRF-based Classifier.

Souhir Gahbiche-Braham

Hélène Bonneau-Maynard

Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

2011

Filtering artificial texts with statistical machine learning techniques.

Tanguy Urvoy

Lang. Resour. Evaluation, 2011

Designing an Improved Discriminative Word Aligner.

Int. J. Comput. Linguistics Appl., 2011

From n-gram-based to CRF-based Translation Models.

Proceedings of the Sixth Workshop on Statistical Machine Translation, 2011

LIMSI @ WMT11.

Alexandre Allauzen

Hélène Bonneau-Maynard

Proceedings of the Sixth Workshop on Statistical Machine Translation, 2011

A First LVCSR System for Luxembourgish, a Low-Resourced European Language.

Proceedings of the Human Language Technology Challenges for Computer Science and Linguistics, 2011

LIMSI's experiments in domain adaptation for IWSLT11.

Proceedings of the 2011 International Workshop on Spoken Language Translation, 2011

Advances on spoken language translation in the Quaero program.

Proceedings of the 2011 International Workshop on Spoken Language Translation, 2011

2010

Consolidating the Two-Stream Inversion Package (JRC-TIP) to Retrieve Land Surface Parameters From Albedo Products.

IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2010

Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labeling.

IEEE J. Sel. Top. Signal Process., 2010

Practical Very Large Scale CRFs.

Olivier Cappé

Proceedings of the ACL 2010, 2010

2009

Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labelling

CoRR, 2009

Introduction of a new paraphrase generation tool based on Monte-Carlo sampling.

Proceedings of the ACL 2009, 2009

2008

Tracking Web spam with HTML style similarities.

ACM Trans. Web, 2008

Detecting Fake Content with Relative Entropy Scoring.

Tanguy Urvoy

Proceedings of the ECAI'08 Workshop on Uncovering Plagiarism, 2008

2007

Validation of the operational MERIS FAPAR.

Proceedings of the IEEE International Geoscience & Remote Sensing Symposium, 2007

2006

Unnatural language detection.

Proceedings of the COnférence en Recherche d'Informations et Applications, 2006

Tracking Web Spam with Hidden Style Similarity.

Tanguy Urvoy