Marta Villegas

ORCID: 0000-0003-0711-0029

According to our database, Marta Villegas authored at least 52 papers between 2000 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
DeepR3: Reducing, Reusing and Recycling Large Models for Developing Responsible and Green Language Technologies.
Proceedings of the Seminar of the Spanish Society for Natural Language Processing: Projects and System Demonstrations (SEPLN-CEDI-PD 2024) co-located with the 7th Spanish Conference on Informatics (CEDI 2024), 2024

A CURATEd CATalog: Rethinking the Extraction of Pretraining Corpora for Mid-Resourced Languages.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Building a Data Infrastructure for a Mid-Resource Language: The Case of Catalan.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

FLOR: On the Effectiveness of Language Adaptation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Becoming a High-Resource Language in Speech: The Catalan Case in the Common Voice Corpus.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Mass-Editing Memory with Attention in Transformers: A cross-lingual exploration of knowledge.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Entailment-based Task Transfer for Catalan Text Classification in Small Data Regimes.
Proces. del Leng. Natural, 2023

Anticipating the Debate: Predicting Controversy in News with Transformer-based NLP.
Proces. del Leng. Natural, 2023

MarIA: Spanish Language Models.
Proceedings of the 15th International Conference on Agents and Artificial Intelligence, 2023

A weakly supervised textual entailment approach to zero-shot text classification.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

2022
Language Report Spanish.
Proceedings of the European Language Equality, 2022

Language Report Catalan.
Proceedings of the European Language Equality, 2022

MarIA: Spanish Language Models.
Proces. del Leng. Natural, 2022

Pretrained Biomedical Language Models for Clinical NLP in Spanish.
Proceedings of the 21st Workshop on Biomedical Language Processing, 2022

Assessing the Limits of Straightforward Models for Nested Named Entity Recognition in Spanish Clinical Narratives.
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

2021
The Catalan Language CLUB.
CoRR, 2021

Spanish Legalese Language Model and Corpora.
CoRR, 2021

Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models.
CoRR, 2021

Biomedical and Clinical Language Models for Spanish: On the Benefits of Domain-Specific Pretraining in a Mid-Resource Scenario.
CoRR, 2021

Spanish Language Models.
CoRR, 2021

Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set.
CoRR, 2021

Spanish Biomedical and Clinical Language Embeddings.
CoRR, 2021

Determining Structural Properties of Artificial Neural Networks Using Algebraic Topology.
CoRR, 2021

Are Multilingual Models the Best Choice for Moderately Under-resourced Languages? A Comprehensive Assessment for Catalan.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
A Vulnerability Study on Academic Collaboration Networks Based on Network Dynamics.
CoRR, 2020

BioASQ at CLEF2020: Large-Scale Biomedical Semantic Indexing and Question Answering.
Proceedings of the Advances in Information Retrieval, 2020

Overview of MESINESP, a Spanish Medical Semantic Indexing Task within BioASQ 2020.
Proceedings of the Working Notes of CLEF 2020, 2020

Overview of BioASQ 2020: The Eighth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2020

2019
Automatic De-identification of Medical Texts in Spanish: the MEDDOCAN Track, Corpus, Guidelines, Methods and Evaluation of Results.
Proceedings of the Iberian Languages Evaluation Forum co-located with 35th Conference of the Spanish Society for Natural Language Processing, 2019

PharmaCoNER: Pharmacological Substances, Compounds and proteins Named Entity Recognition track.
Proceedings of The 5th Workshop on BioNLP Open Shared Tasks, 2019

2018
The apertium bilingual dictionaries on the web of data.
Semantic Web, 2018

Finding Mentions of Abbreviations and Their Definitions in Spanish Clinical Cases: The BARR2 Shared Task Evaluation Results.
Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018) co-located with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018), 2018

2017
Esfuerzos para fomentar la minería de textos en biomedicina más allá del inglés: el plan estratégico nacional español para las tecnologías del lenguaje.
Proces. del Leng. Natural, 2017

The Biomedical Abbreviation Recognition and Resolution (BARR) Track: Benchmarking, Evaluation and Importance of Abbreviation Recognition Systems Applied to Spanish Biomedical Abstracts.
Proceedings of the Second Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2017) co-located with 33th Conference of the Spanish Society for Natural Language Processing (SEPLN 2017), 2017

2016
Leveraging RDF Graphs for Crossing Multiple Bilingual Dictionaries.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

2015
PAROLE/SIMPLE 'lemon' ontology and lexicons.
Semantic Web, 2015

Codes of Ethics, Ethical Behavior, and Organizational Culture from the Managerial Approach: A Case Study in the Colombian Banking Industry.
Int. J. Strateg. Inf. Technol. Appl., 2015

One Ontology to Bind Them All: The META-SHARE OWL Ontology for the Interoperability of Linguistic Datasets on the Web.
Proceedings of the Semantic Web: ESWC 2015 Satellite Events - ESWC 2015 Satellite Events Portorož, Slovenia, May 31, 2015

2014
Metadata as Linked Open Data: mapping disparate XML metadata registries into one RDF/OWL registry.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

2012
Using Language Resources in Humanities research.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

The IULA Treebank.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

2010
The Harvesting Day: una iniciativa para mejorar la visibilidad de los recursos lingüísticos.
Proces. del Leng. Natural, 2010

A Case Study on Interoperability for Language Resources and Applications.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

2009
Integrating Full-Text Search and Linguistic Analyses on Disperse Data for Humanities and Social Sciences Research Projects.
Proceedings of the Fifth International Conference on e-Science, 2009

2008
COLDIC, a Lexicographic Platform for LMF compliant lexica.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

2004
Cost-effective Cross-lingual Document Classification.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

2003
Cross-Lingual Text Categorization.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2003

2002
From DTD to relational dB. An automatic generation of a lexicographical station out off ISLE guidelines.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

From Resources to Applications. Designing the Multilingual ISLE Lexical Entry.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

2001
The ISLE in the ocean. Transatlantic standards for multilingual lexicons (with an eye to machine translation).
Proceedings of Machine Translation Summit VIII, 2001

2000
Multilingual Linguistic Resources: From Monolingual Lexicons to Bilingual Interrelated Lexicons.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

SIMPLE: A General Framework for the Development of Multilingual Lexicons.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

