Marcos García

Orcid: 0000-0002-6557-0210

  • University of A Coruña, LyS Group, Departamento de Letras, Galiza, Spain
  • University of Santiago de Compostela, Center for Research in Information Technologies (CITIUS), Spain

According to our database, Marcos García authored at least 68 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.



Training and evaluation of vector models for Galician.
Lang. Resour. Evaluation, December, 2024

Towards accurate dependency parsing for Galician with limited resources.
Proces. del Leng. Natural, 2024

Multi-label Discourse Function Classification of Lexical Bundles in Basque and Spanish via transformer-based models.
Proces. del Leng. Natural, 2024

Open Generative Large Language Models for Galician.
Proces. del Leng. Natural, 2024

DeepR3: Reducing, Reusing and Recycling Large Models for Developing Responsible and Green Language Technologies.
Proceedings of the Seminar of the Spanish Society for Natural Language Processing: Projects and System Demonstrations (SEPLN-CEDI-PD 2024) co-located with the 7th Spanish Conference on Informatics (CEDI 2024), 2024

CorpusNÓS: A massive Galician corpus for training large language models.
Proceedings of the 16th International Conference on Computational Processing of Portuguese, 2024

Increasing manually annotated resources for Galician: the Parallel Universal Dependencies Treebank.
Proceedings of the 16th International Conference on Computational Processing of Portuguese, 2024

Compositionality and Ambiguity in Multiword Expressions: A Dataset for the Evaluation of Language Models in Galician.
Proceedings of the Progress in Artificial Intelligence, 2024

WordNet Expansion with Bilingual Word Embeddings and Neural Machine Translation.
Proceedings of the Progress in Artificial Intelligence, 2024

Annotation of lexical bundles with discourse functions in a Spanish academic corpus.
Proceedings of the 19th Workshop on Multiword Expressions, 2023

Dependency resolution at the syntax-semantics interface: psycholinguistic and computational insights on control dependencies.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

A computational psycholinguistic evaluation of the syntactic abilities of Galician BERT models at the interface of dependency resolution and training time.
Proces. del Leng. Natural, 2022

Evaluating Contextualized Vectors from both Large Language Models and Compositional Strategies.
Proces. del Leng. Natural, 2022

Proxecto Nós: Artificial intelligence at the service of the Galician language.
Proceedings of the Annual Conference of the Spanish Association for Natural Language Processing: Projects and Demonstrations (SEPLN-PD 2022) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2022), 2022

An exploration of the semantic knowledge in vector models: polysemy, synonymy and idiomaticity.
Proceedings of the Annual Conference of the Spanish Association for Natural Language Processing: Projects and Demonstrations (SEPLN-PD 2022) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2022), 2022

SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding.
Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022

A Targeted Assessment of the Syntactic Abilities of Transformer Models for Galician-Portuguese.
Proceedings of the Computational Processing of the Portuguese Language, 2022

Bertinho: Galician BERT Representations.
Proces. del Leng. Natural, 2021

Editor's Note.
Int. J. Interact. Multim. Artif. Intell., 2021

Embeddings in Natural Language Processing: Theory and Advances in Vector Representations of Meaning.
Comput. Linguistics, 2021

Comparing Dependency-based Compositional Models with Contextualized Word Embeddings.
Proceedings of the 13th International Conference on Agents and Artificial Intelligence, 2021

Probing for idiomaticity in vector space models.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Assessing the Representations of Idiomaticity in Vector Models with a Noun Compound Dataset Labeled at Type and Token Levels.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Exploring the Representation of Word Meanings in Context: A Case Study on Homonymy and Synonymy.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Uma utilidade para o reconhecimento de topónimos em documentos medievais.
Linguamática, 2019

Editorial for the Special Issue on "Natural Language Processing and Text Mining".
Inf., 2019

NER and Open Information Extraction for Portuguese: Notebook for IberLEF 2019 Portuguese Named Entity Recognition and Relation Extraction Tasks.
Proceedings of the Iberian Languages Evaluation Forum co-located with 35th Conference of the Spanish Society for Natural Language Processing, 2019

A comparison of statistical association measures for identifying dependency-based collocations in various languages.
Proceedings of the Joint Workshop on Multiword Expressions and WordNet, 2019

Unsupervised Compositional Translation of Multiword Expressions.
Proceedings of the Joint Workshop on Multiword Expressions and WordNet, 2019

Exploring cross-lingual word embeddings for the inference of bilingual dictionaries.
Proceedings of TIAD-2019 Shared Task, 2019

Weighted Compositional Vectors for Translating Collocations Using Monolingual Corpora.
Proceedings of the Computational and Corpus-Based Phraseology, 2019

Identifying Lexical Bundles for an Academic Writing Assistant in Spanish.
Proceedings of the Computational and Corpus-Based Phraseology, 2019

Pay Attention when you Pay the Bills. A Multilingual Corpus with Dependency-based and Semantic Annotation of Collocations.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

A Method to Automatically Identify Diachronic Variation in Collocations.
Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change, 2019

New treebank or repurposed? On the feasibility of cross-lingual parsing of Romance languages with Universal Dependencies.
Nat. Lang. Eng., 2018

Dependency parsing with finite state transducers and compression rules.
Inf. Process. Manag., 2018

Distributional semantics for diachronic search.
Comput. Electr. Eng., 2018

LinguaKit: A Big Data-Based Multilingual Tool for Linguistic Analysis and Information Extraction.
Proceedings of the Fifth International Conference on Social Networks Analysis, 2018

Task-Oriented Evaluation of Dependency Parsing with Open Information Extraction.
Proceedings of the Computational Processing of the Portuguese Language, 2018

A Lexical Tool for Academic Writing in Spanish based on Expert and Novice Corpora.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

LinguaKit: uma ferramenta multilingue para a análise linguística e a extração de informação.
Linguamática, 2017

Towards Syntactic Iberian Polarity Classification.
Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, 2017

Using bilingual word-embeddings for multilingual collocation extraction.
Proceedings of the 13th Workshop on Multiword Expressions, 2017

A Web Interface for Diachronic Semantic Search in Spanish.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

A rule-based system for cross-lingual parsing of Romance languages with Universal Dependencies.
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2017

Creación de un treebank de dependencias universales mediante recursos existentes para lenguas próximas: el caso del gallego.
Proces. del Leng. Natural, 2016

Semantic Relation Extraction. Resources, Tools and Strategies.
Proceedings of the Computational Processing of the Portuguese Language, 2016

Entity Linking with Distributional Semantics.
Proceedings of the Computational Processing of the Portuguese Language, 2016

Incorporating Lexico-semantic Heuristics into Coreference Resolution Sieves for Named Entity Recognition at Document-level.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Exploring the effectiveness of linguistic knowledge for biographical relation extraction.
Nat. Lang. Eng., 2015

Yet Another Suite of Multilingual NLP Tools.
Proceedings of the Languages, Applications and Technologies - 4th International Symposium, 2015

Multilingual Open Information Extraction.
Proceedings of the Progress in Artificial Intelligence, 2015

PoS-tagging the Web in Portuguese. National varieties, text typologies and spelling systems.
Proces. del Leng. Natural, 2014

Entity-Centric Coreference Resolution of Person Entities for Open Information Extraction.
Proces. del Leng. Natural, 2014

Análisis morfosintáctico y clasificación de entidades nombradas en un entorno Big Data.
Proces. del Leng. Natural, 2014

Comparing Ranking-based and Naive Bayes Approaches to Language Detection on Tweets.
Proceedings of the Tweet Language Identification Workshop co-located with 30th Conference of the Spanish Society for Natural Language Processing, 2014

Citius: A Naive-Bayes Strategy for Sentiment Analysis on English Tweets.
Proceedings of the 8th International Workshop on Semantic Evaluation, 2014

Multilingual corpora with coreferential annotation of person entities.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

An Entity-Centric Coreference Resolution System for Person Entities with Rich Linguistic Information.
Proceedings of the COLING 2014, 2014

Perldoop: Efficient execution of Perl scripts on Hadoop clusters.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

A Method to Lexical Normalisation of Tweets.
Proceedings of the Tweet Normalization Workshop co-located with 29th Conference of the Spanish Society for Natural Language Processing (SEPLN 2013), 2013

Automatic Phonetic Transcription by Phonological Derivation.
Proceedings of the Computational Processing of the Portuguese Language, 2012

Extraction of Bilingual Cognates from Wikipedia.
Proceedings of the Computational Processing of the Portuguese Language, 2012

Conversión Fonética Automática con Información Fonológica para el Gallego.
Proces. del Leng. Natural, 2011

Resolución de Correferencia de Nombres de Persona para Extracción de Información Biográfica.
Proces. del Leng. Natural, 2011

Evaluating Various Linguistic Features on Semantic Relation Extraction.
Proceedings of the Recent Advances in Natural Language Processing, 2011

A Resource-Based Method for Named Entity Extraction and Classification.
Proceedings of the Progress in Artificial Intelligence, 2011

Análise Morfossintáctica para Português Europeu e Galego: Problemas, Soluções e Avaliação.
Linguamática, 2010
