2024
Diachronic Document Dataset for Semantic Layout Analysis.
CoRR, 2024
CamemBERT 2.0: A Smarter French Language Model Aged to Perfection.
CoRR, 2024
Molyé: A Corpus-based Approach to Language Contact in Colonial France.
CoRR, 2024
In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation.
CoRR, 2024
Towards Zero-Shot Multimodal Machine Translation.
CoRR, 2024
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus.
CoRR, 2024
Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck.
CoRR, 2024
SpiRit-LM: Interleaved Spoken and Written Language Model.
CoRR, 2024
PatentEval: Understanding Errors in Patent Generation.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
Headless Language Models: Learning without Predicting with Contrastive Weight Tying.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Tree of Problems: Improving structured problem solving with compositionality.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Anisotropy Is Inherent to Self-Attention in Transformers.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024
Mieux comprendre les modèles de langue et les textes qu'ils produisent.
Proceedings of the COnférence en Recherche d'Informations et Applications, 2024
Making Sentence Embeddings Robust to User-Generated Content.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
On the Scaling Laws of Geographical Representation in Language Models.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
When Your Cousin Has the Right Connections: Unsupervised Bilingual Lexicon Induction for Related Data-Imbalanced Languages.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
From Text to Source: Results in Detecting Large Language Model-Generated Content.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
2023
Generative Spoken Dialogue Language Modeling.
Trans. Assoc. Comput. Linguistics, 2023
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations.
CoRR, 2023
Is Anisotropy Inherent to Transformers?
CoRR, 2023
A Simple Method for Unsupervised Bilingual Lexicon Induction for Data-Imbalanced, Closely Related Language Pairs.
CoRR, 2023
RoCS-MT: Robustness Challenge Set for Machine Translation.
Proceedings of the Eighth Conference on Machine Translation, 2023
Exploring Data-Centric Strategies for French Patent Classification: A Baseline and Comparisons.
Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023 - Volume 1 : travaux de recherche originaux, 2023
Cross-lingual Strategies for Low-resource Language Modeling: A Study on Five Indic Dialects.
Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023 - Volume 1 : travaux de recherche originaux, 2023
Towards a Robust Detection of Language Model-Generated Text: Is ChatGPT that easy to detect?
Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023 - Volume 1 : travaux de recherche originaux, 2023
Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Neural Agents Struggle to Take Turns in Bidirectional Emergent Communication.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
XLS-R fine-tuning on noisy word boundaries for unsupervised speech segmentation into words.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Generative Spoken Language Model based on continuous word-sized audio tokens.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Data-Efficient French Language Modeling with CamemBERTa.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets.
Trans. Assoc. Comput. Linguistics, 2022
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon.
Trans. Assoc. Comput. Linguistics, 2022
Are Discrete Units Necessary for Spoken Language Modeling?
IEEE J. Sel. Top. Signal Process., 2022
MANTa: Efficient Gradient-Based Tokenization for Robust End-to-End Language Modeling.
CoRR, 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
CoRR, 2022
MaskEval: Weighted MLM-Based Evaluation for Text Summarization and Simplification.
CoRR, 2022
Inria-ALMAnaCH at WMT 2022: Does Transcription Help Cross-Script Machine Translation?
Proceedings of the Seventh Conference on Machine Translation, 2022
Quand être absent de mBERT n'est que le commencement : Gérer de nouvelles langues à l'aide de modèles de langues multilingues (When Being Unseen from mBERT is just the Beginning : Handling New Languages With Multilingual Language Models).
Proceedings of the Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale, 2022
Le projet FREEM : ressources, outils et enjeux pour l'étude du français d'Ancien Régime (The FREEM project: Resources, tools and challenges for the study of Ancien Régime French).
Proceedings of the Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale, 2022
MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
BERTrade: Using Contextual Embeddings to Parse Old French.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
Complex Labelling and Similarity Prediction in Legal Texts: Automatic Analysis of France's Court of Cassation Rulings.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
Automatic Normalisation of Early Modern French.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
Towards a Cleaner Document-Oriented Multilingual Crawled Corpus.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
MANTa: Efficient Gradient-Based Tokenization for End-to-End Robust Language Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Probing Multilingual Cognate Prediction Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022
2021
Mapping Urban Air Quality from Mobile Sensors Using Spatio-Temporal Geostatistics.
Sensors, 2021
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP.
CoRR, 2021
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
CoRR, 2021
Rethinking Automatic Evaluation in Sentence Simplification.
CoRR, 2021
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021
Can Character-based Language Models Improve Downstream Task Performances In Low-Resource And Noisy Language Scenarios?
Proceedings of the Seventh Workshop on Noisy User-generated Text, 2021
Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task?
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021
2020
Multilingual Unsupervised Sentence Simplification.
CoRR, 2020
Can Multilingual Language Models Transfer to an Unseen Dialect? A Case Study on North African Arabizi.
CoRR, 2020
Les modèles de langue contextuels Camembert pour le français : impact de la taille et de l'hétérogénéité des données d'entrainement (CamemBERT Contextual Language Models for French: Impact of Training Data Size and Heterogeneity).
Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020
Establishing a New State-of-the-Art for French Named Entity Recognition.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020
Controllable Sentence Simplification.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020
OFrLex: A Computational Morphological and Syntactic Lexicon for Old French.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020
Methodological Aspects of Developing and Managing an Etymological Lexical Resource: Introducing EtymDB-2.0.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020
Evaluating the Reliability of Acoustic Speech Embeddings.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
Building a User-Generated Content North-African Arabizi Treebank: Tackling Hell.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
CamemBERT: a Tasty French Language Model.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
2019
Modeling German Verb Argument Structures: LSTMs vs. Humans.
CoRR, 2019
Reference-less Quality Estimation of Text Simplification Systems.
CoRR, 2019
Développement d'un lexique morphologique et syntaxique de l'ancien français (Development of a morphological and syntactic lexicon of Old French).
Proceedings of the Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume II : Articles courts, 2019
Enhancing BERT for Lexical Normalization.
Proceedings of the 5th Workshop on Noisy User-generated Text, 2019
What Does BERT Learn about the Structure of Language?
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019
2018
Cheating a Parser to Death: Data-driven Cross-Treebank Annotation Transfer.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018
A multilingual collection of CoNLL-U-compatible morphological lexicons.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018
CoNLL-UL: Universal Morphological Lattices for Universal Dependency Parsing.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018
ELMoLex: Connecting ELMo and Lexicon Features for Dependency Parsing.
Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, Brussels, Belgium, October 31, 2018
Informatiser le lexique - Modélisation, développement et exploitation de lexiques morphologiques, syntaxiques et sémantiques. (Computerising the lexicon - Modelling, development and use of morphological, syntactic and semantic lexicons).
, 2018
2017
Inferring Inflection Classes with Description Length.
J. Lang. Model., 2017
Construction automatique d'une base de données étymologiques à partir du wiktionary (Automatic construction of an etymological database using Wiktionary).
Proceedings of the Actes des 24ème Conférence sur le Traitement Automatique des Langues Naturelles, 2017
Speeding up corpus development for linguistic research: language documentation and acquisition in Romansh Tuatschin.
Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, 2017
Improving neural tagging with lexical information.
Proceedings of the 15th International Conference on Parsing Technologies, 2017
The ParisNLP entry at the ConLL UD Shared Task 2017: A Tale of a #ParsingTragedy.
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2017
Annotating omission in statement pairs.
Proceedings of the 11th Linguistic Annotation Workshop, 2017
2016
External Lexical Information for Multilingual Part-of-Speech Tagging.
CoRR, 2016
Étiquetage multilingue en parties du discours avec MElt (Multilingual part-of-speech tagging with MElt).
Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 2 : TALN (Posters), 2016
From Noisy Questions to Minecraft Texts: Annotation Challenges in Extreme Syntax Scenario.
Proceedings of the 2nd Workshop on Noisy User-generated Text, 2016
2015
Constructing a poor man's wordnet in a resource-rich world.
Lang. Resour. Evaluation, 2015
2014
Data-driven synset induction and disambiguation for wordnet development.
Lang. Resour. Evaluation, 2014
The CoMeRe corpus for French: structuring and annotating heterogeneous CMC genres.
J. Lang. Technol. Comput. Linguistics, 2014
Named Entity Recognition and Correction in OCRized Corpora (Détection et correction automatique d'entités nommées dans des corpus OCRisés) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2014
Sub-categorization in 'pour' and lexical syntax (Sous-catégorisation en pour et syntaxe lexicale) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2014
Analogy-based Text Normalization : the case of unknowns words (Normalisation de textes par analogie: le cas des mots inconnus) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2014
A language-independent and fully unsupervised approach to lexicon induction and part-of-speech tagging for closely related languages.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
DeLex, a freely-avaible, large-scale and linguistically grounded morphological lexicon for German.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
Developing a French FrameNet: Methodology and First results.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
A Language-independent Approach to Extracting Derivational Relations from an Inflectional Lexicon.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
Automated Error Detection in Digitized Cultural Heritage Documents.
Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, 2014
2013
Dynamic extension of a French morphological lexicon based a text stream (Extension dynamique de lexiques morphologiques pour le français à partir d'un flux textuel) [in French].
Proceedings of the Traitement Automatique des Langues Naturelles, 2013
Implementing a Formal Model of Inflectional Morphology.
Proceedings of the Systems and Frameworks for Computational Morphology, 2013
Enforcing Subcategorization Constraints in a Parser Using Sub-parses Recombining.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013
Can MDL Improve Unsupervised Chinese Word Segmentation?
Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing, 2013
2012
Coupling an annotated corpus and a lexicon for state-of-the-art POS tagging.
Lang. Resour. Evaluation, 2012
Annotation référentielle du Corpus Arboré de Paris 7 en entités nommées (Referential named entity annotation of the Paris 7 French TreeBank) [in French].
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012
TCOF-POS : un corpus libre de français parlé annoté en morphosyntaxe (TCOF-POS : A Freely Available POS-Tagged Corpus of Spoken French) [in French].
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012
Population of a Knowledge Base for News Metadata from Unstructured Text and Web Data.
Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction, 2012
Evaluating and improving syntactic lexica by plugging them within a parser.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
Aleda, a free large-scale entity database for French.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
Wordnet extension made simple: A multilingual lexicon-based approach using wiki resources.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
Boosting the Coverage of a Semantic Lexicon by Automatically Extracted Event Nominalizations.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
Applying cross-lingual WSD to wordnet development.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
The French Social Media Bank: a Treebank of Noisy User Generated Content.
Proceedings of the COLING 2012, 2012
Unsupervized Word Segmentation: the Case for Mandarin Chinese.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012
Statistical Parsing of Spanish and Data Driven Lemmatization.
Proceedings of the Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages, 2012
2011
Modeling and implementing non canonical morphological phenomena.
Trait. Autom. des Langues, 2011
Évaluation de lexiques syntaxiques par leur intégration dans l'analyseur syntaxique FRMG.
CoRR, 2011
Construction d'un lexique des adjectifs dénominaux (Construction of a lexicon of denominal adjectives).
Proceedings of the Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2011
Développement de ressources pour le persan : PerLex 2, nouveau lexique morphologique et MEltfa, étiqueteur morphosyntaxique (Development of resources for Persian: PerLex 2, a new morphological lexicon and MEltfa, a morphosyntactic tagger).
Proceedings of the Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2011
Un turc mécanique pour les ressources linguistiques : critique de la myriadisation du travail parcellisé (Mechanical Turk for linguistic resources: review of the crowdsourcing of parceled work).
Proceedings of the Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2011
Segmentation et induction de lexique non-supervisées du mandarin (Unsupervised segmentation and induction of mandarin lexicon).
Proceedings of the Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2011
Coopération de méthodes statistiques et symboliques pour l'adaptation non-supervisée d'un système d'étiquetage en entités nommées (Statistical and symbolic methods cooperation for the unsupervised adaptation of a named entity recognition system).
Proceedings of the Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2011
Non-canonical Inflection: Data, Formalisation and Complexity Measures.
Proceedings of the Systems and Frameworks for Computational Morphology, 2011
Classification-Based Extension of Wordnets from Heterogeneous Resources.
Proceedings of the Human Language Technology Challenges for Computer Science and Linguistics, 2011
Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use.
Proceedings of the Human Language Technology Challenges for Computer Science and Linguistics, 2011
Data Driven Lemmatization and Parsing of Italian.
Proceedings of the Evaluation of Natural Language and Speech Tools for Italian, 2011
2010
Détection et résolution d'entités nommées dans des dépêches d'agence.
Proceedings of the Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2010
Développement de ressources pour le persan: lexique morphologique et chaîne de traitements de surface.
Proceedings of the Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2010
Exploitation d'une ressource lexicale pour la construction d'un étiqueteur morpho-syntaxique état-de-l'art du français.
Proceedings of the Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2010
Ponctuations fortes abusives.
Proceedings of the Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2010
Traitement des inconnus : une approche systématique de l'incomplétude lexicale.
Proceedings of the Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2010
Control Verb, Argument Cluster Coordination and Multi Component TAG.
Proceedings of the 10th International Workshop on Tree Adjoining Grammar and Related Frameworks, 2010
A Morphological Lexicon for the Persian Language.
Proceedings of the International Conference on Language Resources and Evaluation, 2010
A Lexicon of French Quotation Verbs for Automatic Quotation Extraction.
Proceedings of the International Conference on Language Resources and Evaluation, 2010
The Lefff, a Freely Available and Large-coverage Morphological and Syntactic Lexicon for French.
Proceedings of the International Conference on Language Resources and Evaluation, 2010
Influence of Pre-Annotation on POS-Tagged Corpus Development.
Proceedings of the Fourth Linguistic Annotation Workshop, 2010
Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two.
Proceedings of the ACL 2010, 2010
Are Very Large Context-Free Grammars Tractable?
Proceedings of the Trends in Parsing Technology, 2010
2009
Producción eficiente de recursos lingüísticos: proyecto Victoria.
Proces. del Leng. Natural, 2009
Construcción y extensión de un léxico morfológico y sintáctico para el español: el Leffe.
Proces. del Leng. Natural, 2009
Intégrer les tables du Lexique-Grammaire à un analyseur syntaxique robuste à grande échelle.
Proceedings of the Actes de la 16ème conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2009
Trouver et confondre les coupables : un processus sophistiqué de correction de lexique.
Proceedings of the Actes de la 16ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2009
Towards Efficient Production of Linguistic Resources: the Victoria Project.
Proceedings of the Recent Advances in Natural Language Processing, 2009
A Morphological and Syntactic Wide-coverage Lexicon for Spanish: The Leffe.
Proceedings of the Recent Advances in Natural Language Processing, 2009
Coupling an Annotated Corpus and a Morphosyntactic Lexicon for State-of-the-Art POS Tagging with Less Human Effort.
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, 2009
Building a morphological and syntactic lexicon by merging various linguistic resources.
Proceedings of the 17th Nordic Conference of Computational Linguistics, 2009
MICA: A Probabilistic Dependency Parser Based on Tree Insertion Grammars (Application Note).
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009
Using Lexicon-Grammar Tables for French Verbs in a Large-Coverage Parser.
Proceedings of the Human Language Technology. Challenges for Computer Science and Linguistics, 2009
Extracting and Visualizing Quotations from News Wires.
Proceedings of the Human Language Technology. Challenges for Computer Science and Linguistics, 2009
Parsing Directed Acyclic Graphs with Range Concatenation Grammars.
Proceedings of the 11th International Workshop on Parsing Technologies (IWPT-2009), 2009
Constructing parse forests that include exactly the n-best PCFG trees.
Proceedings of the 11th International Workshop on Parsing Technologies (IWPT-2009), 2009
Multi-Component Tree Insertion Grammars.
Proceedings of the Formal Grammar - 14th International Conference, 2009
2008
Error Mining on Syntactic Parser Output.
Trait. Autom. des Langues, 2008
SxPipe 2: an architecture for surface preprocessing of raw corpora.
Trait. Autom. des Langues, 2008
Extensión y corrección semi-automática de léxicos morfo-sintácticos.
Proces. del Leng. Natural, 2008
Combining Multiple Resources to Build Reliable Wordnets.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008
Construction d'un wordnet libre du français à partir de ressources multilingues.
Proceedings of the Actes de la 15ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2008
Computer Aided Correction and Extension of a Syntactic Wide-Coverage Lexicon.
Proceedings of the COLING 2008, 2008
2007
Comparaison du Lexique-Grammaire des verbes pleins et de DICOVALENCE : vers une intégration dans le Lefff.
Proceedings of the Actes de la 14ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2007
Building a Morphosyntactic Lexicon and a Pre-syntactic Processing Chain for Polish.
Proceedings of the Human Language Technology. Challenges of the Information Society, 2007
Mining Parsing Results for Lexical Correction: Toward a Complete Correction Process of Wide-Coverage Lexicons.
Proceedings of the Human Language Technology. Challenges of the Information Society, 2007
Are Very Large Context-Free Grammars Tractable?
Proceedings of the Tenth International Conference on Parsing Technologies, 2007
2006
Modélisation et analyse des coordinations elliptiques par l'exploitation dynamique des forêts de dérivation.
Proceedings of the Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Posters, 2006
Trouver le coupable : Fouille d'erreurs sur des sorties d'analyseurs syntaxiques.
Proceedings of the Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2006
Modeling and Analysis of Elliptic Coordination by Dynamic Exploitation of Derivation Forests in LTAG Parsing.
Proceedings of the Eighth International Workshop on Tree Adjoining Grammar and Related Formalisms, 2006
The Lefff 2 syntactic lexicon for French: architecture, acquisition, use.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006
Deep non-probabilistic parsing of large corpora.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006
Error Mining in Parsing Results.
Proceedings of the ACL 2006, 2006
2005
Automatic Acquisition of a Slovak Lexicon from a Raw Corpus.
Proceedings of the Text, Speech and Dialogue, 8th International Conference, 2005
Les Méta-RCG: description et mise en oeuvre.
Proceedings of the Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2005
Un analyseur LFG efficace pour le français : SXLFG.
Proceedings of the Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2005
Chaînes de traitement syntaxique.
Proceedings of the Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2005
Linguistic Facts as Predicates over Ranges of the Sentence.
Proceedings of the Logical Aspects of Computational Linguistics, 2005
Efficient and Robust LFG Parsing: SxLFG.
Proceedings of the Ninth International Workshop on Parsing Technology, 2005
2004
Coupling Grammar and Knowledge Base: Range Concatenation Grammars and Description Logics.
Proceedings of the Text, Speech and Dialogue, 7th International Conference, 2004
Les Grammaires à Concaténation d'Intervalles (RCG) comme formalisme grammatical pour la linguistique.
Proceedings of the Actes de la 11ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2004
Morphology Based Automatic Acquisition of Large-coverage Lexica.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004