Béatrice Daille

  • University of Nantes, LINA

According to our database, Béatrice Daille authored at least 98 papers between 1994 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.







ACL-rlg: A Dataset for Reading List Generation.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Self-Compositional Data Augmentation for Scientific Keyphrase Generation.
CoRR, 2024

Adaptation of Biomedical and Clinical Pretrained Models to French Long Documents: A Comparative Study.
CoRR, 2024

DrBenchmark: A Large Language Understanding Evaluation Benchmark for French Biomedical Domain.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

How Important Is Tokenization in French Medical Masked Language Models?
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

DrBERT: Un modèle robuste pré-entraîné en français pour les domaines biomédical et clinique.
Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023, 2023

Classification de relation pour la génération de mots-clés absents.
Proceedings of the Actes de CORIA-TALN 2023. Actes de l'atelier "Analyse et Recherche de Textes Scientifiques", 2023

Projet NaviTerm : navigation terminologique pour une montée en compétence rapide et personnalisée sur un domaine de recherche.
Proceedings of the Actes de CORIA-TALN 2023. Actes de l'atelier "Analyse et Recherche de Textes Scientifiques", 2023

DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Cross-lingual and Cross-domain Transfer Learning for Automatic Term Extraction from Low Resource Data.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain.
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

A Large-Scale Dataset for Biomedical Keyphrase Generation.
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

Caractérisation des relations sémantiques entre termes multi-mots fondée sur l'analogie (Semantic relations recognition between multi-word terms by means of analogy).
Proceedings of the Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale, 2021

Keyword extraction: Issues and methods.
Nat. Lang. Eng., 2020

Books of Hours. The First Liturgical Data Set for Text Segmentation.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Large-Scale Evaluation of Keyphrase Extraction Models.
JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, 2020

Hierarchical Text Segmentation for Medieval Manuscripts.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

The DELICES Project: Indexing Scientific Literature Through Semantic Expansion.
Proceedings of the First Joint Conference of the Information Retrieval Communities in Europe (CIRCLE 2020), 2020

Identification of Fertile Translations in Comparable Corpora: A Morpho-Compositional Approach.
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers, 2020

Automatic segmentation of texts into units of meaning for reading assistance.
CoRR, 2019

Terminology systematization for Cybersecurity domain in Italian Language.
Proceedings of the Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Terminologie et Intelligence Artificielle (atelier TALN-RECITAL & IC), 2019

Réutilisation de Textes dans les Manuscrits Anciens (Text Reuse in Ancient Manuscripts).
Proceedings of the Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume II : Articles courts, 2019

KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents.
Proceedings of the 12th International Conference on Natural Language Generation, 2019

Towards Automatic Variant Analysis of Ancient Devotional Texts.
Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change, 2019

Towards a Diagnosis of Textual Difficulties for Children with Dyslexia.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Word Embedding Approach for Synonym Extraction of Multi-Word Terms.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Segmentation automatique d'un texte en rhèses (Automatic segmentation of a text into rhesis).
Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 2 : TALN (Posters), 2016

Extraction de lexiques bilingues à partir de corpus comparables spécialisés à travers une langue pivot (Bilingual lexicon extraction from specialized comparable corpora using a pivot language).
Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 2 : TALN (Articles longs), 2016

Extraction d'expressions-cibles de l'opinion : de l'anglais au français (Opinion Target Expression extraction : from English to French).
Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 2 : TALN (Posters), 2016

Modélisation unifiée du document et de son domaine pour une indexation par termes-clés libre et contrôlée (Unified document and domain-specific model for keyphrase extraction and assignment).
Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 2 : TALN (Articles longs), 2016

Evaluating Lexical Similarity to build Sentiment Similarity.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Bilingual Lexicon Extraction at the Morpheme Level Using Distributional Analysis.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Ambiguity Diagnosis for Terms in Digital Humanities.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

TermITH-Eval: a French Standard-Based Resource for Keyphrase Extraction Evaluation.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Keyphrase Annotation with Graph Co-Ranking.
Proceedings of the COLING 2016, 2016

Terminology Extraction with Term Variant Detection.
Proceedings of ACL-2016 System Demonstrations, Berlin, Germany, August 7-12, 2016, 2016

Méthode semi-compositionnelle pour l'extraction de synonymes des termes complexes.
Trait. Autom. des Langues, 2015

Vers un diagnostic d'ambiguïté des termes candidats d'un texte.
Proceedings of the Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2015

Extraction de Contextes Riches en Connaissances en corpus spécialisés.
Proceedings of the Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2015

Attempting to Bypass Alignment from Comparable Corpora via Pivot Language.
Proceedings of the Eighth Workshop on Building and Using Comparable Corpora, 2015

Trait. Autom. des Langues, 2014

Tools for Terminology Processing.
CoRR, 2014

The impact of domains for Keyphrase extraction (Influence des domaines de spécialité dans l'extraction de termes-clés) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2014

Semi-compositional Method for Synonym Extraction of Multi-Word Terms.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Compound Terms and Their Multi-word Variants: Case of German and Russian Languages.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2014

Trait. Autom. des Langues, 2013

Knowledge-poor and Knowledge-rich Approaches for Multilingual Terminology Extraction.
Res. Comput. Sci., 2013

Apopsis Demonstrator for Tweet Analysis (Démonstrateur Apopsis pour l'analyse des tweets) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2013

Identification, Alignment, and Translation of Relational Adjectives from Comparable Corpora (Identification, alignement, et traductions des adjectifs relationnels en corpus comparables) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2013

TTC TermSuite - Terminological Alignment from Comparable Corpora (TTC TermSuite alignement terminologique à partir de corpus comparables) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2013

Multilingual Compound Splitting (Segmentation Multilingue des Mots Composés) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2013

TTC: Terminology Extraction, Translation Tools and Comparable Corpora Cross-lingual Knowledge Extraction (XLike).
Proceedings of Machine Translation Summit XIV: European projects, 2013

Ranking Translation Candidates Acquired from Comparable Corpora.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

TopicRank: Graph-Based Topic Ranking for Keyphrase Extraction.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

Bilingual Terminology Mining from Language for Special Purposes Comparable Corpora.
Proceedings of the Building and Using Comparable Corpora, 2013

Identification of Fertile Translations in Medical Comparable Corpora: a Morpho-Compositional Approach.
CoRR, 2012

Compositionnalité et contextes issus de corpus comparables pour la traduction terminologique (Compositionality and Context for Bilingual Lexicon Extraction from Comparable Corpora) [in French].
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

Revising the Compositional Method for Terminology Acquisition from Comparable Corpora.
Proceedings of the COLING 2012, 2012

Extraction of Domain-Specific Bilingual Lexicon from Comparable Corpora: Compositional Translation and Ranking.
Proceedings of the COLING 2012, 2012

Clustering Short Text and Its Evaluation.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2012

Neoclassical Compound Alignments from Comparable Corpora.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2012

Detecting Derivatives using Specific and Invariant Descriptors.
Polibits, 2011

Annotating opinion - evaluation of blogs: the Blogoscopy corpus.
Lang. Resour. Evaluation, 2011

Identifier la cible d'un passage d'opinion dans un corpus multithématique (Identifying the target of an opinion transition in a thematic corpus).
Proceedings of the Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2011

TTC TermSuite : une chaîne de traitement pour la fouille terminologique multilingue (TTC TermSuite: a processing chain for multilingual terminology mining).
Proceedings of the Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Démonstrations, 2011

Reduction of Search Space to Annotate Monolingual Corpora.
Proceedings of the Fifth International Joint Conference on Natural Language Processing, 2011

TTC TermSuite - A UIMA Application for Multilingual Terminology Extraction from Comparable Corpora.
Proceedings of the IJCNLP 2011 System Demonstrations, 2011

Compilation of Specialized Comparable Corpora in French and Japanese.
Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora, 2011

Brains, not brawn: The use of "smart" comparable corpora in bilingual terminology mining.
ACM Trans. Speech Lang. Process., 2010

Compositionality and lexical alignment of multi-word terms.
Lang. Resour. Evaluation, 2010

Evaluation de descripteurs statistiques et linguistiques pour la détection de dérivation de texte.
Document Numérique, 2010

UNPMC: Naive Approach to Extract Keyphrases from Scientific Articles.
Proceedings of the 5th International Workshop on Semantic Evaluation, 2010

Learning Subjectivity Phrases missing from Resources through a Large Set of Semantic Tests.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Catégorisation des évaluations dans un corpus de blogs multi-domaine.
Proceedings of the Fouille de Données d'Opinions, 2009

Joint signal and transcription analysis for named speaker identification.
Trait. Autom. des Langues, 2009

Pattern Based Term Extraction Using ACABIT System.
CoRR, 2009

Catégorisation sémantico-discursive des évaluations exprimées dans la blogosphère.
Proceedings of the Actes de la 16ème conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2009

Reconnaissance de critères de comparabilité dans un corpus multilingue spécialisé.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2009

Characterization of Scientific and Popular Science Discourse in French, Japanese and Russian.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

A Multi-Word Term Extraction Program for Arabic Language.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

Multi-word term indexing for Arabic document retrieval.
Proceedings of the 13th IEEE Symposium on Computers and Communications (ISCC 2008), 2008

An Effective Compositional Model for Lexical Alignment.
Proceedings of the Third International Joint Conference on Natural Language Processing, 2008

Caractérisation des discours scientifiques et vulgarisés en français, japonais et russe.
Proceedings of the Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Posters, 2007

A Service Oriented Architecture for Adaptable Terminology Acquisition.
Proceedings of the Natural Language Processing and Information Systems, 2007

Bilingual Terminology Mining - Using Brain, not brawn comparable corpora.
Proceedings of the ACL 2007, 2007

Comparabilité de corpus et fouille terminologique multilingue.
Trait. Autom. des Langues, 2006

Une architecture de services pour mieux spécialiser les processus d'acquisition terminologique.
Trait. Autom. des Langues, 2006

French-English Terminology Extraction from Comparable Corpora.
Proceedings of the Natural Language Processing, 2005

Extraction de terminologies bilingues à partir de corpus comparables.
Proceedings of the Actes de la 11ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2004

French-English Multi-word Term Alignment Based on Lexical Context Analysis.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

In vitro evaluation of a program for machine-aided indexing.
Inf. Process. Manag., 2002

Terminology Mining.
Proceedings of the Information Extraction in the Web Era: Natural Language Communication for Knowledge Acquisition and Intelligent Information Agents, 2002

Extracting French-Japanese Word Pairs from Bilingual Corpora based on Transliteration Rules.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Incremental Recognition and Referential Categorization of French Proper Names.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Morphological Rule Induction for Terminology Acquisition.
Proceedings of the COLING 2000, 18th International Conference on Computational Linguistics, Proceedings of the Conference, 2 Volumes, July 31, 2000

Lexical database and information access: a fruitful association?
Proceedings of the First International Conference on Language Resources and Evaluation, 1998

Bricks and Skeletons: Some Ideas for the Near Future of MAHT.
Mach. Transl., 1997

Towards Automatic Extraction of Monolingual and Bilingual Terminology.
Proceedings of the 15th International Conference on Computational Linguistics, 1994
