Filip Ginter

Orcid: 0000-0002-5484-6103

  • University of Turku, Dept of Future Technologies, Finland

According to our database, Filip Ginter authored at least 126 papers between 2004 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.






Towards diverse and contextually anchored paraphrase modeling: A dataset and baselines for Finnish.
Nat. Lang. Eng., 2024

Extracting Social Connections from Finnish Karelian Refugee Interviews Using LLMs.
Proceedings of the Computational Humanities Research Conference 2024, 2024

Automatic Short Answer Grading for Finnish with ChatGPT.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Toxicity Detection in Finnish Using Machine Translation.
Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

Multi-CrossRE: A Multi-Lingual Multi-Domain Dataset for Relation Extraction.
Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

Using ECCO-BERT and the Historical Thesaurus of English to Explore Concepts and Agency in Historical Writing: Interpreting the Eighteenth-century Luxury Debate.
Proceedings of the Annual International Conference of the Alliance of Digital Humanities Organizations, 2023

Silver Syntax Pre-training for Cross-Domain Relation Extraction.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Neural Network and Random Forest Models in Protein Function Prediction.
IEEE ACM Trans. Comput. Biol. Bioinform., 2022

Identifying gender bias in blockbuster movies through the lens of machine learning.
CoRR, 2022

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code.
CoRR, 2022

Out-of-Domain Evaluation of Finnish Dependency Parsing.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Deep Learning, Film History: Model Explanation Techniques in the Analysis of Temporality in Finnish Fiction Film Metadata.
Proceedings of the 6th Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), 2022

Detecting Sequential Genre Change in Eighteenth-Century Texts.
Proceedings of the Computational Humanities Research Conference 2022, 2022

Explaining Classes through Stable Word Attributions.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Explainable Publication Year Prediction of Eighteenth Century Texts with the BERT Model.
Proceedings of the 3rd Workshop on Computational Approaches to Historical Language Change, 2022

Universal Lemmatizer: A sequence-to-sequence model for lemmatizing Universal Dependencies treebanks.
Nat. Lang. Eng., 2021

Semantic Search as Extractive Paraphrase Span Detection.
CoRR, 2021

Explaining Classes through Word Attribution.
CoRR, 2021

Annotation Guidelines for the Turku Paraphrase Corpus.
CoRR, 2021

Quantitative Evaluation of Alternative Translations in a Corpus of Highly Dissimilar Finnish Paraphrases.
CoRR, 2021

Deep learning for sentence clustering in essay grading support.
CoRR, 2021

WikiBERT Models: Deep Transfer Learning for Many Languages.
Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021

Fine-grained Named Entity Annotation for Finnish.
Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021

Finnish Paraphrase Corpus.
Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021

Deep learning for sentence clustering in essay grading support.
Proceedings of the 14th International Conference on Educational Data Mining, 2021

Supporting the use of standardized nursing terminologies with automatic subject heading prediction: a comparison of sentence-level text classification methods.
J. Am. Medical Informatics Assoc., 2020

Classifying online corporate reputation with machine learning: a study in the banking domain.
Internet Res., 2020

Towards Fully Bilingual Deep Language Modeling.
CoRR, 2020

Dependency parsing of biomedical text with BERT.
BMC Bioinform., 2020

Assisting nurses in care documentation: from automated sentence classification to coherent document structures with subject headings.
J. Biomed. Semant., 2020

The FISKMÖ Project: Resources and Tools for Finnish-Swedish Machine Translation and Cross-Linguistic Research.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Turku Enhanced Parser Pipeline: From Raw Text to Enhanced Graphs in the IWPT 2020 Shared Task.
Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies, 2020

Entity-Pair Embeddings for Improving Relation Extraction in the Biomedical Domain.
Proceedings of the 28th European Symposium on Artificial Neural Networks, 2020

Multilingual is not enough: BERT for Finnish.
CoRR, 2019

Morphological Tagging and Lemmatization of Albanian: A Manually Annotated Corpus and Neural Models.
CoRR, 2019

Is Multilingual BERT Fluent in Language Generation?
CoRR, 2019

Leveraging Text Repetitions and Denoising Autoencoders in OCR Post-correction.
CoRR, 2019

Template-free Data-to-Text Generation of Finnish Sports News.
Proceedings of the 22nd Nordic Conference on Computational Linguistics, NoDaLiDa 2019, Turku, Finland, September 30, 2019

The Long-Term Reuse of Text in the Finnish Press, 1771-1920.
Proceedings of the Digital Humanities in the Nordic Countries 4th Conference, 2019

Neural Dependency Parsing of Biomedical Text: TurkuNLP entry in the CRAFT Structural Annotation Task.
Proceedings of The 5th Workshop on BioNLP Open Shared Tasks, 2019

Data and systems for medication-related text classification and concept normalization from Twitter: insights from the Social Media Mining for Health (SMM4H)-2017 shared task.
J. Am. Medical Informatics Assoc., 2018

Potent pairing: ensemble of long short-term memory networks and support vector machine for chemical-protein relation extraction.
Database J. Biol. Databases Curation, 2018

Wide-scope biomedical named entity recognition and normalization with CRFs, fuzzy matching and character level modeling.
Database J. Biol. Databases Curation, 2018

Improving Layman Readability of Clinical Narratives with Unsupervised Synonym Replacement.
Proceedings of the Building Continents of Knowledge in Oceans of Data: The Future of Co-Created eHealth, 2018

Parse Me if You Can: Artificial Treebanks for Parsing Experiments on Elliptical Constructions.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies.
Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, Brussels, Belgium, October 31, 2018

Turku Neural Parser Pipeline: An End-to-End System for the CoNLL 2018 Shared Task.
Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, Brussels, Belgium, October 31, 2018

Enhancing Universal Dependency Treebanks: A Case Study.
Proceedings of the Second Workshop on Universal Dependencies, 2018

Mind the Gap: Data Enrichment in Dependency Parsing of Elliptical Constructions.
Proceedings of the Second Workshop on Universal Dependencies, 2018

Evaluation of a Prototype System that Automatically Assigns Subject Headings to Nursing Narratives Using Recurrent Neural Network.
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis, 2018

A System for Identifying and Exploring Text Repetition in Large Historical Document Corpora.
Proceedings of the 21st Nordic Conference on Computational Linguistics, 2017

Dep_search: Efficient Search Tool for Large Dependency Parsebanks.
Proceedings of the 21st Nordic Conference on Computational Linguistics, 2017

Creating register sub-corpora for the Finnish Internet Parsebank.
Proceedings of the 21st Nordic Conference on Computational Linguistics, 2017

Applying BLAST to Text Reuse Detection in Finnish Newspapers and Journals, 1771-1910.
Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language, 2017

Cross-Lingual Pronoun Prediction with Deep Recurrent Neural Networks v2.0.
Proceedings of the Third Workshop on Discourse in Machine Translation, 2017

Assessing the Annotation Consistency of the Universal Dependencies Corpora.
Proceedings of the Fourth International Conference on Dependency Linguistics, 2017

Fully Delexicalized Contexts for Syntax-Based Word Embeddings.
Proceedings of the Fourth International Conference on Dependency Linguistics, 2017

TurkuNLP: Delexicalized Pre-training of Word Embeddings for Dependency Parsing.
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2017

Detecting mentions of pain and acute confusion in Finnish clinical text.
Proceedings of the BioNLP 2017, Vancouver, Canada, August 4, 2017, 2017

End-to-End System for Bacteria Habitat Extraction.
Proceedings of the BioNLP 2017, Vancouver, Canada, August 4, 2017, 2017

Ensemble of Convolutional Neural Networks for Medicine Intake Recognition in Twitter.
Proceedings of the 2nd Social Media Mining for Health Research and Applications Workshop co-located with the American Medical Informatics Association Annual Symposium (AMIA 2017), 2017

Filtering large-scale event collections using a combination of supervised and unsupervised learning for event trigger classification.
J. Biomed. Semant., 2016

Cell line name recognition in support of the identification of synthetic lethality in cancer from text.
Bioinform., 2016

Phrase-Based SMT for Finnish with More Data, Better Models and Alternative Alignment and Translation Tools.
Proceedings of the First Conference on Machine Translation, 2016

Cross-Lingual Pronoun Prediction with Deep Recurrent Neural Networks.
Proceedings of the First Conference on Machine Translation, 2016

Universal Dependencies for Persian.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Universal Dependencies v1: A Multilingual Treebank Collection.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Deep Learning with Minimal Training Data: TurkuNLP Entry in the BioNLP Shared Task 2016.
Proceedings of the 4th BioNLP Shared Task Workshop, BioNLP 2016, 2016

Syntactic analyses and named entity recognition for PubMed and PubMed Central - up-to-the-minute.
Proceedings of the 15th Workshop on Biomedical Natural Language Processing, 2016

Application of the EVEX resource to event extraction and network construction: Shared Task entry and result analysis.
BMC Bioinform., December, 2015

Care episode retrieval: distributional semantic models for information retrieval in the clinical domain.
BMC Medical Informatics Decis. Mak., 2015

The Finnish Proposition Bank.
Lang. Resour. Evaluation, 2015

Morphological Segmentation and OPUS for Finnish-English Machine Translation.
Proceedings of the Tenth Workshop on Statistical Machine Translation, 2015

Turku: Semantic Dependency Parsing as a Sequence Classification.
Proceedings of the 9th International Workshop on Semantic Evaluation, 2015

Universal Dependencies for Finnish.
Proceedings of the 20th Nordic Conference of Computational Linguistics, 2015

Sentence Compression For Automatic Subtitling.
Proceedings of the 20th Nordic Conference of Computational Linguistics, 2015

Towards the Classification of the Finnish Internet Parsebank: Detecting Translations and Informality.
Proceedings of the 20th Nordic Conference of Computational Linguistics, 2015

SETS: Scalable and Efficient Tree Search in Dependency Graphs.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Towards Universal Web Parsebanks.
Proceedings of the Third International Conference on Dependency Linguistics, 2015

Sharing annotations better: RESTful Open Annotation.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

Building the essential resources for Finnish: the Turku Dependency Treebank.
Lang. Resour. Evaluation, 2014

Statistical parsing of varieties of clinical Finnish.
Artif. Intell. Medicine, 2014

Turku: Broad-Coverage Semantic Parsing with Rich Features.
Proceedings of the 8th International Workshop on Semantic Evaluation, 2014

UTU: Disease Mention Recognition and Normalization with CRFs and Vector Space Representations.
Proceedings of the 8th International Workshop on Semantic Evaluation, 2014

Universal Stanford dependencies: A cross-linguistic typology.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Syntactic N-gram Collection from a Large-Scale Corpus of Internet Finnish.
Proceedings of the Human Language Technologies - The Baltic Perspective, 2014

Care Episode Retrieval.
Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis, 2014

Post-hoc Manipulations of Vector Space Models with Application to Semantic Role Labeling.
Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality, 2014

Joint Morphological and Syntactic Analysis for Richly Inflected Languages.
Trans. Assoc. Comput. Linguistics, 2013

Towards a Dependency-based PropBank of General Finnish.
Proceedings of the 19th Nordic Conference of Computational Linguistics, 2013

Building a Large Automatically Parsed Corpus of Finnish.
Proceedings of the 19th Nordic Conference of Computational Linguistics, 2013

Predicting Conjunct Propagation and Other Extended Stanford Dependencies.
Proceedings of the Second International Conference on Dependency Linguistics, 2013

Evaluating Large-scale Text Mining Applications Beyond the Traditional Numeric Performance Measures.
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing, 2013

EVEX in ST'13: Application of a large-scale text mining resource to event extraction and network construction.
Proceedings of the BioNLP Shared Task 2013 Workshop, Sofia, 2013

University of Turku in the BioNLP'11 Shared Task.
BMC Bioinform., 2012

Exploring Biomolecular Literature with EVEX: Connecting Genes through Events, Homology, and Indirect Associations.
Adv. Bioinformatics, 2012

PubMed-Scale Event Extraction for Post-Translational Modifications, Epigenetics and Protein Structural Relations.
Proceedings of the 2012 Workshop on Biomedical Natural Language Processing, 2012

Extracting Contextualized Complex Biological Events with Rich Graph-Based Feature Sets.
Comput. Intell., 2011

U-Compare bio-event meta-service: compatible BioNLP event extraction services.
BMC Bioinform., 2011

A Dependency-based Analysis of Treebank Annotation Errors.
Proceedings of the Computational Dependency Theory [papers from the International Conference on Dependency Linguistics], 2011

EVEX: A PubMed-Scale Resource for Homology-Based Generalization of Text Mining Predictions.
Proceedings of the 2011 Workshop on Biomedical Natural Language Processing, 2011

Event extraction on PubMed scale.
BMC Bioinform., 2010

Complex event extraction at PubMed scale.
Bioinform., 2010

Scaling up Biomedical Event Extraction to the Entire PubMed.
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing, 2010

Dependency-Based PropBanking of Clinical Finnish.
Proceedings of the Fourth Linguistic Annotation Workshop, 2010

Towards automated processing of clinical Finnish: Sublanguage analysis and a rule-based parser.
Int. J. Medical Informatics, 2009

Combining hidden Markov models and latent semantic analysis for topic segmentation and labeling: Method and clinical application.
Int. J. Medical Informatics, 2009

Parsing Clinical Finnish: Experiments with Rule-Based and Statistical Dependency Parsers.
Proceedings of the 17th Nordic Conference of Computational Linguistics, 2009

Learning to Extract Biological Event and Relation Graphs.
Proceedings of the 17th Nordic Conference of Computational Linguistics, 2009

Extracting Complex Biological Events with Rich Graph-Based Feature Sets.
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task, BioNLP@HLT-NAACL 2009, 2009

Comparative analysis of five protein-protein interaction corpora.
BMC Bioinform., 2008

All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning.
BMC Bioinform., 2008

A Graph Kernel for Protein-Protein Interaction Extraction.
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, 2008

BioInfer: a corpus for information extraction in the biomedical domain.
BMC Bioinform., 2007

On the unification of syntactic annotations under the Stanford dependency scheme: A case study on BioInfer and GENIA.
Proceedings of the Biological, translational, and clinical language processing, 2007

Evaluation of two dependency parsers on biomedical corpus targeted at protein-protein interactions.
Int. J. Medical Informatics, 2006

Regular Approximation of Link Grammar.
Proceedings of the Advances in Natural Language Processing, 2006

Contextual weighting for Support Vector Machines in literature mining: an application to gene versus protein name disambiguation.
BMC Bioinform., 2005

Kernels Incorporating Word Positional Information in Natural Language Disambiguation Tasks.
Proceedings of the Eighteenth International Florida Artificial Intelligence Research Society Conference, 2005

New Techniques for Disambiguation in Natural Language and Their Application to Biological Text.
J. Mach. Learn. Res., 2004

Ontology-Based Feature Transformations: A Data-Driven Approach.
Proceedings of the Advances in Natural Language Processing, 4th International Conference, 2004

Extracting Protein-Protein Interaction Sentences by Applying Rough Set Data Analysis.
Proceedings of the Rough Sets and Current Trends in Computing, 2004

Analysis of Link Grammar on Biomedical Dependency Corpus Targeted at Protein-Protein Interactions.
Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications, 2004
