Thomas Lavergne

ORCID: 0000-0002-9498-4551

According to our database, Thomas Lavergne authored at least 72 papers between 2003 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
A Dataset for Pharmacovigilance in German, French, and Japanese: Annotating Adverse Drug Reactions across Languages.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Vector Spaces for Quantifying Disparity of Multiword Expressions in Annotated Text.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024

2023
Tri-apprentissage génératif : génération de données pour de la reconnaissance d'entitées nommées semi-supervisé.
Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023 - Volume 1 : travaux de recherche originaux, 2023

2022
Simulated Geophysical Noise in Sea Ice Concentration Estimates of Open Water and Snow-Covered Sea Ice.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2022

Decorate the Examples: A Simple Method of Prompt Design for Biomedical Relation Extraction.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Re-train or Train from Scratch? Comparing Pre-training Strategies of BERT in the Medical Domain.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Specializing Static and Contextual Embeddings in the Medical Domain Using Knowledge Graphs: Let's Keep It Simple.
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

2021
DiaBLa: a corpus of bilingual spontaneous written dialogues for machine translation.
Lang. Resour. Evaluation, 2021

Differential Evaluation: a Qualitative Analysis of Natural Language Processing System Behavior Based Upon Data Resistance to Processing.
Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems, 2021

2020
Handling Entity Normalization with no Annotated Corpus: Weakly Supervised Methods Based on Distributional Representation and Ontological Information.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Experiments from LIMSI at the French Named Entity Recognition Coarse-grained Task.
Proceedings of the Working Notes of CLEF 2020, 2020

2019
Embedding Strategies for Specialized Domains: Application to Clinical Entity Recognition.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Detecting context-dependent sentences in parallel corpora.
Proceedings of the Actes de la Conférence TALN. CORIA-TALN-RJC 2018 - Volume 1, 2018

Corpora with Part-of-Speech Annotations for Three Regional Languages of France: Alsatian, Occitan and Picard.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

2017
Détection de concepts et granularité de l'annotation (Concept detection and annotation granularity).
Proceedings of the Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Orléans, France, June 26-30, 2017, Volume 2, 2017

Traitement automatique de la langue biomédicale au LIMSI (Biomedical language processing at LIMSI).
Proceedings of the Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles. Orléans, France, June 26-30, 2017 - Volume 3, 2017

Learning the Structure of Variable-Order CRFs: a finite-state perspective.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Multiple Methods for Multi-class, Multi-label ICD-10 Coding of Multi-granularity, Multilingual Death Certificates.
Proceedings of the Working Notes of CLEF 2017, 2017

CLEF eHealth 2017 Multilingual Information Extraction task Overview: ICD10 Coding of Death Certificates in English and French.
Proceedings of the Working Notes of CLEF 2017, 2017

2016
LIMSI@WMT'16: Machine Translation of News.
Proceedings of the First Conference on Machine Translation, 2016

Une catégorisation de fins de lignes non-supervisée (End-of-line classification with no supervision).
Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 2 : TALN (Posters), 2016

LIMSI@IWSLT'16: MT Track.
Proceedings of the 13th International Conference on Spoken Language Translation, 2016

Two-Step MT: Predicting Target Morphology.
Proceedings of the 13th International Conference on Spoken Language Translation, 2016

Supervised classification of end-of-lines in clinical text with no manual annotation.
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining, 2016

A Dataset for ICD-10 Coding of Death Certificates: Creation and Usage.
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining, 2016

LIMSI ICD10 coding Experiments on CépiDC Death Certificate Statements.
Proceedings of the Working Notes of CLEF 2016, 2016

Clinical Information Extraction at the CLEF eHealth Evaluation lab 2016.
Proceedings of the Working Notes of CLEF 2016, 2016

Hybrid methods for ICD-10 coding of death certificates.
Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis, 2016

2015
The contribution of co-reference resolution to supervised relation detection between bacteria and biotopes entities.
BMC Bioinform., December, 2015

LIMSI@WMT'15: Translation Task.
Proceedings of the Tenth Workshop on Statistical Machine Translation, 2015

Etiquetage morpho-syntaxique en domaine de spécialité: le domaine médical.
Proceedings of the Actes de la 22e conference sur le Traitement Automatique des Langues Naturelles. Articles courts, 2015

Oublier ce qu'on sait, pour mieux apprendre ce qu'on ne sait pas : une étude sur les contraintes de type dans les modèles CRF.
Proceedings of the Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2015

LIMSI @ CLEF eHealth 2015 - Task 1b.
Proceedings of the Working Notes of CLEF 2015, 2015

2014
Natural language processing of radiology reports for the detection of thromboembolic diseases and clinically relevant incidental findings.
BMC Bioinform., 2014

LIMSI @ WMT'14 Medical Translation Task.
Proceedings of the Ninth Workshop on Statistical Machine Translation, 2014

Automatic language identity tagging on word and sentence-level in multilingual text sources: a case-study on Luxembourgish.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Optimizing annotation efforts to build reliable annotated corpora for training statistical models.
Proceedings of the 8th Linguistic Annotation Workshop, 2014

2013
LIMSI @ WMT13.
Proceedings of the Eighth Workshop on Statistical Machine Translation, 2013

A fully discriminative training framework for Statistical Machine Translation (Un cadre d'apprentissage intégralement discriminant pour la traduction statistique) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2013

Structure learning in hidden conditional random fields for grapheme-to-phoneme conversion.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Discriminative training of a phoneme confusion model for a dynamic lexicon in ASR.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A Supervised Abbreviation Resolution System for Medical Text.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013

A Supervised Named-Entity Extraction System for Medical Text.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013

LIMSI's participation to the 2013 shared task on Native Language Identification.
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, 2013

Automatic Named Entity Pre-annotation for Out-of-domain Human Annotation.
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, 2013

2012
Use of C-Band Scatterometer for Sea Ice Edge Identification.
IEEE Trans. Geosci. Remote. Sens., 2012

LIMSI @ WMT12.
Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

Joint WMT 2012 Submission of the QUAERO Project.
Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

Repérage des entités nommées pour l'arabe : adaptation non-supervisée et combinaison de systèmes (Named Entity Recognition for Arabic : Unsupervised adaptation and Systems combination) [in French].
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

Joint Segmentation and POS Tagging for Arabic Using a CRF-based Classifier.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

2011
Filtering artificial texts with statistical machine learning techniques.
Lang. Resour. Evaluation, 2011

Designing an Improved Discriminative Word Aligner.
Int. J. Comput. Linguistics Appl., 2011

From n-gram-based to CRF-based Translation Models.
Proceedings of the Sixth Workshop on Statistical Machine Translation, 2011

A First LVCSR System for Luxembourgish, a Low-Resourced European Language.
Proceedings of the Human Language Technology Challenges for Computer Science and Linguistics, 2011

LIMSI's experiments in domain adaptation for IWSLT11.
Proceedings of the 2011 International Workshop on Spoken Language Translation, 2011

Advances on spoken language translation in the Quaero program.
Proceedings of the 2011 International Workshop on Spoken Language Translation, 2011

2010
Consolidating the Two-Stream Inversion Package (JRC-TIP) to Retrieve Land Surface Parameters From Albedo Products.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2010

Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labeling.
IEEE J. Sel. Top. Signal Process., 2010

Practical Very Large Scale CRFs.
Proceedings of the ACL 2010, 2010

2009
Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labelling.
CoRR, 2009

Introduction of a new paraphrase generation tool based on Monte-Carlo sampling.
Proceedings of the ACL 2009, 2009

2008
Tracking Web spam with HTML style similarities.
ACM Trans. Web, 2008

Detecting Fake Content with Relative Entropy Scoring.
Proceedings of the ECAI'08 Workshop on Uncovering Plagiarism, 2008

2007
Validation of the operational MERIS FAPAR.
Proceedings of the IEEE International Geoscience & Remote Sensing Symposium, 2007

2006
Unnatural language detection.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2006

Tracking Web Spam with Hidden Style Similarity.
Proceedings of the AIRWeb 2006, 2006

2005
Using 1-D Models to Interpret the Reflectance Anisotropy of 3-D Canopy Targets: Issues and Caveats.
IEEE Trans. Geosci. Remote. Sens., 2005

2003
Mediterranean sea wind and wave characteristics from satellite, buoy and numerical model data.
Proceedings of the 2003 IEEE International Geoscience and Remote Sensing Symposium, 2003

