Antoine Doucet

Orcid: 0000-0001-6160-3356

According to our database, Antoine Doucet authored at least 200 papers between 2002 and 2024.

Bibliography

2024
Digitizing History: Transitioning Historical Paper Documents to Digital Content for Information Retrieval and Mining - A Comprehensive Survey.
IEEE Trans. Comput. Soc. Syst., October, 2024

Can cross-domain term extraction benefit from cross-lingual transfer and nested term labeling?
Mach. Learn., July, 2024

A Review of Deep Learning Models for Twitter Sentiment Analysis: Challenges and Opportunities.
IEEE Trans. Comput. Soc. Syst., June, 2024

Report on the 19th French Conference on Information Retrieval and Applications (CORIA 2024).
SIGIR Forum, June, 2024

Can we please everyone? Group recommendations in signed social networks.
Multim. Tools Appl., May, 2024

Named Entity Recognition and Classification in Historical Documents: A Survey.
ACM Comput. Surv., February, 2024

L3iTC at the FinLLM Challenge Task: Quantization for Financial Text Classification & Summarization.
CoRR, 2024

Is Prompting What Term Extraction Needs?
Proceedings of the Text, Speech, and Dialogue - 27th International Conference, 2024

CoastTerm: A Corpus for Multidisciplinary Term Extraction in Coastal Scientific Literature.
Proceedings of the Text, Speech, and Dialogue - 27th International Conference, 2024

L3i++ at SemEval-2024 Task 8: Can Fine-tuned Large Language Model Detect Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text?
Proceedings of the 18th International Workshop on Semantic Evaluation, 2024

Global-SEG: Text Semantic Segmentation Based on Global Semantic Pair Relations.
Proceedings of the Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30, 2024

LIAS: Layout Information-Based Article Separation in Historical Newspapers.
Proceedings of the Linking Theory and Practice of Digital Libraries, 2024

LIT: Label-Informed Transformers on Token-Based Classification.
Proceedings of the Linking Theory and Practice of Digital Libraries, 2024

Leveraging Open Large Language Models for Historical Named Entity Recognition.
Proceedings of the Linking Theory and Practice of Digital Libraries, 2024

Leveraging Transfer Learning for Article Segmentation in Historical Newspapers.
Proceedings of the Linking Theory and Practice of Digital Libraries, 2024

Extraction d'information induite par sous-graphes (SETI) appliquée aux documents administratifs.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2024

STRAS : Une approche à base de règles sémantiques et d'indices textuels pour la séparation des articles dans les journaux historiques.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2024

Benchmarking du jeu de données NAS pour la séparation d'articles dans la presse ancienne.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2024

2023
A survey on bipartite graphs embedding.
Soc. Netw. Anal. Min., December, 2023

PARESv2 : PArish REgistry Survey - Historical Census Table Dataset (19th, 20th centuries) - France.
Dataset, December, 2023

Multi-view multi-objective clustering-based framework for scientific document summarization using citation context.
Appl. Intell., July, 2023

In-depth analysis of the impact of OCR errors on named entity recognition and linking.
Nat. Lang. Eng., March, 2023

An OER on digital historical research on European historical newspapers with the NewsEye platform.
Educ. Inf., 2023

A Comprehensive Survey of Document-level Relation Extraction (2016-2023).
CoRR, 2023

Archive TimeLine Summarization (ATLS): Conceptual Framework for Timeline Generation over Historical Document Collections.
CoRR, 2023

Contextualizing Emerging Trends in Financial News Articles.
CoRR, 2023

The Recent Advances in Automatic Term Extraction: A survey.
CoRR, 2023

Jeu de données de tickets de caisse pour la détection de fraude documentaire.
Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023, 2023

Détection de faux tickets de caisse à l'aide d'entités et de relations basées sur une ontologie de domaine.
Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023, 2023

Oui mais... ChatGPT peut-il identifier des entités dans des documents historiques ?
Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023, 2023

Injection de connaissances temporelles dans la reconnaissance d'entités nommées historiques.
Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023, 2023

A Quantitative Analysis of Noise Impact on Document Ranking.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2023

L3I++ at SemEval-2023 Task 2: Prompting for Multilingual Complex Named Entity Recognition.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023

Yes but.. Can ChatGPT Identify Entities in Historical Documents?
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2023

Vision Transformer for Pneumonia Classification in X-ray Images.
Proceedings of the 2023 8th International Conference on Intelligent Information Technology, 2023

Receipt Dataset for Document Forgery Detection.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Detecting Forged Receipts with Domain-Specific Ontology-Based Entities & Relations.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

DocILE Benchmark for Document Information Localization and Extraction.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Analyzing the Impact of Tokenization on Multilingual Epidemic Surveillance in Low-Resource Languages.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Subgraph-Induced Extraction Technique for Information (SETI) from Administrative Documents.
Proceedings of the Document Analysis and Recognition - ICDAR 2023 Workshops, 2023

STRAS: A Semantic Textual-Cues Leveraged Rule-Based Approach for Article Separation in Historical Newspapers.
Proceedings of the Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration, 2023

Benchmarking NAS for Article Separation in Historical Newspapers.
Proceedings of the Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration, 2023

Text Line Detection in Historical Index Tables: Evaluations on a New French PArish REcord Survey Dataset (PARES).
Proceedings of the Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration, 2023

An Explorative Guide on How to Detect Forged Car Insurance Claims with Language Models.
Proceedings of the 15th International Joint Conference on Knowledge Discovery, 2023

Injecting Temporal-Aware Knowledge in Historical Named Entity Recognition.
Proceedings of the Advances in Information Retrieval, 2023

Extended Overview of DocILE 2023: Document Information Localization and Extraction.
Proceedings of the Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), 2023

Overview of DocILE 2023: Document Information Localization and Extraction.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2023

2022
Computational Approaches to Digitised Historical Newspapers (Dagstuhl Seminar 22292).
Dagstuhl Reports, July, 2022

HIPE-2022 Shared Task Named Entity Datasets.
Dataset, May, 2022

HIPE-2022 Shared Task Named Entity Datasets.
Dataset, March, 2022

HIPE-2022 Shared Task Named Entity Datasets.
Dataset, March, 2022

HIPE-2022 Shared Task Named Entity Datasets.
Dataset, February, 2022

Correction to: MELHISSA: a multilingual entity linking architecture for historical press articles.
Int. J. Digit. Libr., 2022

MELHISSA: a multilingual entity linking architecture for historical press articles.
Int. J. Digit. Libr., 2022

Assessing the impact of OCR noise on multilingual event detection over digitised documents.
Int. J. Digit. Libr., 2022

Integrated interdisciplinary workflows for research on historical newspapers: Perspectives from humanities scholars, computer scientists, and librarians.
J. Assoc. Inf. Sci. Technol., 2022

Survey of Post-OCR Processing Approaches.
ACM Comput. Surv., 2022

Using contextual sentence analysis models to recognize ESG concepts.
CoRR, 2022

Diachronic Analysis of Time References in News Articles.
Proceedings of the Companion of The Web Conference 2022, Virtual Event / Lyon, France, April 25, 2022

Fine-tuning de modèles de langues pour la veille épidémiologique multilingue avec peu de ressources (Fine-tuning Language Models for Low-resource Multilingual Epidemic Surveillance).
Proceedings of the Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale, 2022

L'importance des entités pour la tâche de détection d'événements en tant que système de question-réponse (Exploring Entities in Event Detection as Question Answering).
Proceedings of the Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale, 2022

Utilizing Keywords Evolution in Context for Emerging Trend Detection in Scientific Publications.
Proceedings of the 11th International Symposium on Information and Communication Technology, 2022

Weighting Sliding Tiles For Writer Identification in Handwritten Musical Scores.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2022

L3i at SemEval-2022 Task 11: Straightforward Additional Context for Multilingual Named Entity Recognition.
Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022

Ensembling Transformers for Cross-domain Automatic Term Extraction.
Proceedings of the From Born-Physical to Born-Virtual: Augmenting Intelligence in Digital Libraries, 2022

Experimenting with Unsupervised Multilingual Event Detection in Historical Newspapers.
Proceedings of the From Born-Physical to Born-Virtual: Augmenting Intelligence in Digital Libraries, 2022

Adapting Transformers for Detecting Emergency Events on Social Media.
Proceedings of the 14th International Joint Conference on Knowledge Discovery, 2022

Introducing the HIPE 2022 Shared Task: Named Entity Recognition and Linking in Multilingual Historical Documents.
Proceedings of the Advances in Information Retrieval, 2022

Robust and Multilingual Analysis of Historical Documents.
Proceedings of Text2Story, 2022

Exploring Entities in Event Detection as Question Answering.
Proceedings of the Advances in Information Retrieval, 2022

Can Cross-Domain Term Extraction Benefit from Cross-lingual Transfer?
Proceedings of the Discovery Science - 25th International Conference, 2022

ReadOCR: A Novel Dataset and Readability Assessment of OCRed Texts.
Proceedings of the Document Analysis Systems - 15th IAPR International Workshop, 2022

Overview of HIPE-2022: Named Entity Recognition and Linking in Multilingual Historical Documents.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2022

Extended Overview of HIPE-2022: Named Entity Recognition and Linking in Multilingual Historical Documents.
Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 5th - to, 2022

Knowledge-based Contexts for Historical Named Entity Recognition & Linking.
Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 5th - to, 2022

Tracking News Stories in Short Messages in the Era of Infodemic.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2022

2021
Event representation on Wikidata and Wikipedia with, and without the analysis of vernacular languages.
Dataset, April, 2021

Deep multimodal learning for cross-modal retrieval: One model for all tasks.
Pattern Recognit. Lett., 2021

Event Detection as Question Answering with Entity Information.
CoRR, 2021

Transformer-based Methods for Recognizing Ultra Fine-grained Entities (RUFES).
CoRR, 2021

L3i_LBPAM at the FinSim-2 task: Learning Financial Semantic Similarities with Siamese Transformers.
Proceedings of the Companion of The Web Conference 2021, 2021

Simple Ways to Improve NER in Every Language using Markup.
Proceedings of the 2nd International Workshop on Cross-lingual Event-centric Open Analytics co-located with the 30th The Web Conference (WWW 2021), 2021

Elastic Embedded Background Linking for News Articles with Keywords, Entities and Events.
Proceedings of the Thirtieth Text REtrieval Conference, 2021

Transformer-based Methods with #Entities for Detecting Emergency Events on Social Media.
Proceedings of the Thirtieth Text REtrieval Conference, 2021

A Multilingual Dataset for Named Entity Recognition, Entity Linking and Stance Detection in Historical Newspapers.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Event Related Document Retrieval with Multilingual Read World Event Representation.
Proceedings of the ISWC 2021 Posters, 2021

HistoInformatics2021: The 6th International Workshop on Computational History.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2021

Information Extraction from Invoices.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

Evaluating the Robustness of Embedding-Based Topic Models to OCR Noise.
Proceedings of the Towards Open and Trustworthy Digital Societies, 2021

Multilingual Epidemic Event Extraction.
Proceedings of the Towards Open and Trustworthy Digital Societies, 2021

Named Entity Recognition Architecture Combining Contextual and Global Features.
Proceedings of the Towards Open and Trustworthy Digital Societies, 2021

Token-Level Multilingual Epidemic Dataset for Event Extraction.
Proceedings of the Linking Theory and Practice of Digital Libraries, 2021

A Comprehensive Extraction of Relevant Real-World-Event Qualifiers for Semantic Search Engines.
Proceedings of the Linking Theory and Practice of Digital Libraries, 2021

Event Detection with Entity Markers.
Proceedings of the Advances in Information Retrieval, 2021


Étude comparative de méthodes de classification multilingue appliquées à l'épidémiologie.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2021

État de l'art du changement sémantique à partir de plongements contextualisés.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2021

Atténuer les erreurs de numérisation dans la reconnaissance d'entités nommées pour les documents historiques.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2021

Multi-TimeLine Summarization (MTLS): Improving Timeline Summarization by Generating Multiple Summaries.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Data for "A Dataset for Multi-lingual Epidemiological Event Extraction".
Dataset, March, 2020

Data for "Dataset for Temporal Analysis of English-French Cognates".
Dataset, March, 2020

Meta-analysis of computational methods for breast cancer classification.
Int. J. Intell. Inf. Database Syst., 2020

Improving Skin-Disease Classification Based on Customized Loss Function Combined With Balanced Mini-Batch Logic and Real-Time Image Augmentation.
IEEE Access, 2020

Determining image age with rank-consistent ordinal classification and object-centered ensemble.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

Improving binary skin cancer classification based on best model selection method combined with optimizing full connected layers of Deep CNN.
Proceedings of the International Conference on Multimedia Analysis and Pattern Recognition, 2020

A Dataset for Multi-lingual Epidemiological Event Extraction.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Dataset for Temporal Analysis of English-French Cognates.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Linking Named Entities across Languages using Multilingual Word Embeddings.
Proceedings of the JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, 2020

Neural Machine Translation with BERT for Post-OCR Error Detection and Correction.
Proceedings of the JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, 2020

Accessing and Investigating Large Collections of Historical Newspapers with the NewsEye Platform.
Proceedings of the JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020, 2020

An Extended Evaluation of the Impact of Different Modules in ST-VQA Systems.
Proceedings of the Pattern Recognition and Artificial Intelligence, 2020

Multi-Attribute Learning With Highly Imbalanced Data.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Entity Linking for Historical Documents: Challenges and Solutions.
Proceedings of the Digital Libraries at Times of Massive Societal Transition, 2020

When to Use OCR Post-correction for Named Entity Recognition?
Proceedings of the Digital Libraries at Times of Massive Societal Transition, 2020

Assessing and Minimizing the Impact of OCR Quality on Named Entity Recognition.
Proceedings of the Digital Libraries for Open Knowledge, 2020

NewsEye: A digital investigator for historical newspapers.
Proceedings of the 15th Annual International Conference of the Alliance of Digital Humanities Organizations, 2020

Alleviating Digitization Errors in Named Entity Recognition for Historical Documents.
Proceedings of the 24th Conference on Computational Natural Language Learning, 2020

Multilingual Epidemiological Text Classification: A Comparative Study.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Robust Named Entity Recognition and Linking on Historical Multilingual Documents.
Proceedings of the Working Notes of CLEF 2020, 2020

Impact Analysis of Document Digitization on Event Extraction.
Proceedings of the 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020) co-located with the 19th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2020), 2020

2019
A lightweight and multilingual framework for crisis information extraction from Twitter data.
Soc. Netw. Anal. Min., 2019

Large Scale Analysis of Semantic and Temporal Aspects in Cultural Heritage Collection's Search.
Proceedings of the 19th ACM/IEEE Joint Conference on Digital Libraries, 2019

Deep Statistical Analysis of OCR Errors for Effective Post-OCR Processing.
Proceedings of the 19th ACM/IEEE Joint Conference on Digital Libraries, 2019

An Analysis of the Performance of Named Entity Recognition over OCRed Documents.
Proceedings of the 19th ACM/IEEE Joint Conference on Digital Libraries, 2019

ICDAR 2019 Competition on Post-OCR Text Correction.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

Post-OCR Error Detection by Generating Plausible Candidates.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

Semantic Text Recognition via Visual Question Answering.
Proceedings of the Second International Workshop on Machine Learning, 2019

Impact of OCR Quality on Named Entity Linking.
Proceedings of the Digital Libraries at the Crossroads of Digital Information for the Future, 2019

Document in Context of its Time (DICT): Providing Temporal Context to Support Analysis of Past Documents.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

Knowledge-Based Techniques for Document Fraud Detection: A Comprehensive Study.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2019

TLR at BSNLP2019: A Multilingual Named Entity Recognition System.
Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing, 2019

2018
Detecting prominent microblog users over crisis events phases.
Inf. Syst., 2018

Find it! Fraud Detection Contest Report.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Adaptive Edit-Distance and Regression Approach for Post-OCR Text Correction.
Proceedings of the Maturity and Innovation in Digital Libraries, 2018

Evaluating the Impact of OCR Errors on Topic Modeling.
Proceedings of the Maturity and Innovation in Digital Libraries, 2018

Feature Selection for Document Flow Segmentation.
Proceedings of the 13th IAPR International Workshop on Document Analysis Systems, 2018

Every Word has its History: Interactive Exploration and Visualization of Word Sense Evolution.
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018

Automatic Matching and Expansion of Abbreviated Phrases Without Context.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2018

Unsupervised Crisis Information Extraction from Twitter Data.
Proceedings of the IEEE/ACM 2018 International Conference on Advances in Social Networks Analysis and Mining, 2018

2017
Exploiting Social Annotations to Generate Resource Descriptions in a Distributed Environment: Cooperative Multi-Agent Simulation on Query-Based Sampling.
Rev. Socionetwork Strateg., 2017

The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions.
Proceedings of the 13th Workshop on Multiword Expressions, 2017

Neural Networks for Multi-Word Expression Detection.
Proceedings of the 13th Workshop on Multiword Expressions, 2017

Impact of OCR Errors on the Use of Digital Libraries: Towards a Better Access to Information.
Proceedings of the 2017 ACM/IEEE Joint Conference on Digital Libraries, 2017

Enhancing Table of Contents Extraction by System Aggregation.
Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

ICDAR2017 Competition on Post-OCR Text Correction.
Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

2016
Report on the Eighth Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR '15).
SIGIR Forum, 2016

Computational generation and dissection of lexical replacement humor.
Nat. Lang. Eng., 2016

Language-independent multi-document text summarization with document-specific word associations.
Proceedings of the 31st Annual ACM Symposium on Applied Computing, 2016

DataTourism: Designing an Architecture to Process Tourism Data.
Proceedings of the Information and Communication Technologies in Tourism 2016, 2016

Nouveau modèle pour la datation automatique de photographies à partir de caractéristiques visuelles.
Proceedings of the CORIA 2016 - Conférence en Recherche d'Informations et Applications, 2016

2015
Multilingual event extraction for epidemic detection.
Artif. Intell. Medicine, 2015

SentiML++: An Extension of the SentiML Sentiment Annotation Scheme.
Proceedings of the Semantic Web: ESWC 2015 Satellite Events, Portorož, Slovenia, May 31, 2015

Applying Semantic Web Technologies for Improving the Visibility of Tourism Data.
Proceedings of the Eighth Workshop on Exploiting Semantic Annotations in Information Retrieval, 2015

Identification of Microblogs Prominent Users during Events by Learning Temporal Sequences of Features.
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

Eighth Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR'15).
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

Temporal Reconciliation for Dating Photographs Using Entity Information.
Proceedings of the Eighth Workshop on Exploiting Semantic Annotations in Information Retrieval, 2015

2014
Building engagement for MOOC students: introducing support for time management on online learning platforms.
Proceedings of the 23rd International World Wide Web Conference, 2014

Document summarization based on word associations.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

Dating Color Images with Ordinal Classification.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Timeline Localization.
Proceedings of the Human-Computer Interaction. Theories, Methods, and Tools, 2014

Novel Query Suggestions: Initial Work Report.
Proceedings of the 5th International Workshop on Web-scale Knowledge Representation Retrieval & Reasoning, 2014

2013
Report on INEX 2013.
SIGIR Forum, 2013

Named Entity Filtering Based on Concept Association Graphs.
Res. Comput. Sci., 2013

DAnIEL, parsimonious yet high-coverage multilingual epidemic surveillance (DAnIEL : Veille épidémiologique multilingue parcimonieuse) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2013

Any Language Early Detection of Epidemic Diseases from Web News Streams.
Proceedings of the IEEE International Conference on Healthcare Informatics, 2013

ICDAR 2013 Competition on Book Structure Extraction.
Proceedings of the 12th International Conference on Document Analysis and Recognition, 2013

Overview of the INEX 2013 Social Book Search Track.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013


Added-Value of Automatic Multilingual Text Analysis for Epidemic Surveillance.
Proceedings of the Artificial Intelligence in Medicine, 2013

"Let Everything Turn Well in Your Wife": Generation of Adult Humor Using Lexical Constraints.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2012
Report on INEX 2011.
SIGIR Forum, 2012

Report on INEX 2012.
SIGIR Forum, 2012

Team Association Analysis for Named Entity Filtering.
Proceedings of The Twenty-First Text REtrieval Conference, 2012

DAnIEL: Language Independent Character-Based News Surveillance.
Proceedings of the Advances in Natural Language Processing, 2012

Overview of the INEX 2012 Social Book Search Track.
Proceedings of the CLEF 2012 Evaluation Labs and Workshop, 2012

Extraction, Exploitation and Evaluation of Document-based Knowledge.
, 2012

2011
Report on INEX 2010.
SIGIR Forum, 2011

Setting up a competition framework for the evaluation of structure extraction from OCR-ed books.
Int. J. Document Anal. Recognit., 2011

Overview of the INEX 2011 Books and Social Search Track.
Proceedings of the Focused Retrieval of Content and Structure, 2011

ICDAR 2011 Book Structure Extraction Competition.
Proceedings of the 2011 International Conference on Document Analysis and Recognition, 2011

2010
Report on INEX 2009.
SIGIR Forum, 2010

Automatic discovery of word semantic relations using paraphrase alignment and distributional lexical semantics analysis.
Nat. Lang. Eng., 2010

An efficient any language approach for the integration of phrases in document retrieval.
Lang. Resour. Evaluation, 2010

Overview of the INEX 2010 Book Track: Scaling Up the Evaluation Using Crowdsourcing.
Proceedings of the Comparative Evaluation of Focused Retrieval, 2010

Statistical Methods for the Evaluation of Indexing Phrases.
Proceedings of the KDIR 2010, 2010

2009
Report on INEX 2008.
SIGIR Forum, 2009

A Proposal for a Multilingual Epidemic Surveillance System.
Proceedings of the User Centric Media - First International Conference, 2009

Overview of the INEX 2009 Book Track.
Proceedings of the Focused Retrieval and Evaluation, 2009

ICDAR 2009 Book Structure Extraction Competition.
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

2008
Overview of the INEX 2007 Book Search track: BookSearch '07.
SIGIR Forum, 2008

XML-aided phrase indexing for hypertext documents.
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008

Enhancing Keyword Search with a Keyphrase Index.
Proceedings of the Advances in Focused Retrieval, 2008

Overview of the INEX 2008 Book Track.
Proceedings of the Advances in Focused Retrieval, 2008

New Tasks on Collections of Digitized Books.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2008

2007
Phrase Detection in the Wikipedia.
Proceedings of the Focused Access to XML Documents, 2007

2006
Advanced document description, a sequential approach.
SIGIR Forum, 2006

EXTIRP: Baseline Retrieval from Wikipedia.
Proceedings of the Comparative Evaluation of XML Information Retrieval Systems, 2006

Unsupervised Classification of Text-Centric XML Document Collections.
Proceedings of the Comparative Evaluation of XML Information Retrieval Systems, 2006

2003
Accurate Retrieval of XML Document Fragments using EXTIRP.
Proceedings of the INEX 2003 Workshop Proceedings, 2003

2002
Naïve Clustering of a large XML Document Collection.
Proceedings of the First Workshop of the INitiative for the Evaluation of XML Retrieval (INEX), 2002

