Pavel Rychlý

CoRR, 2024

Better Low-Resource Machine Translation with Smaller Vocabularies.

Proceedings of the Text, Speech, and Dialogue - 27th International Conference, 2024

Bilingual Lexicon Induction From Comparable and Parallel Data: A Comparative Analysis.

Michaela Denisová

Proceedings of the Text, Speech, and Dialogue - 27th International Conference, 2024

2023

Evaluation of Automatically Constructed Word Meaning Explanations.

Marie Stará

Aleš Horák

CoRR, 2023

MUNI-NLP Systems for Low-resource Indic Machine Translation.

Proceedings of the Eighth Conference on Machine Translation, 2023

MUNI-NLP Submission for Czech-Ukrainian Translation Task at WMT23.

Yuliia Teslia

Proceedings of the Eighth Conference on Machine Translation, 2023

2022

MUNI-NLP Systems for Lower Sorbian-German and Lower Sorbian-Upper Sorbian Machine Translation @ WMT22.

Proceedings of the Seventh Conference on Machine Translation, 2022

Utok: The Fast Rule-based Tokenizer.

Samuel Spalek

Proceedings of the 16th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2022

HFT: High Frequency Tokens for Low-Resource NMT.

Proceedings of the Fifth Workshop on Technologies for Machine Translation of Low-Resource Languages, 2022

2021

DMoG: A Data-Based Morphological Guesser.

Vojtěch Kovář

Proceedings of the 15th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2021

Development of HAMOD: a High Agreement Multi-lingual Outlier Detection dataset.

Proceedings of the 15th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2021

When Word Pairs Matter.

Michaela Denisová

Proceedings of the 15th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2021

2020

Current Challenges in Web Corpus Building.

Proceedings of the 12th Web as Corpus Workshop, 2020

2019

Word Sense Induction Using Word Sketches.

Proceedings of the Statistical Language and Speech Processing, 2019

Evaluation of Czech Distributional Thesauri.

Proceedings of the 13th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2019

A Distributional Multi-word Thesaurus in Sketch Engine.

Proceedings of the 13th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2019

SiLi Index: Data Structure for Fast Vector Space Searching.

Ondřej Herman

Proceedings of the 13th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2019

2018

An Update of the Manually Annotated Amharic Corpus.

Gezahegn Tsegaye Lemma

Proceedings of the 12th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2018

2017

KernelTagger - a PoS Tagger for Very Small Amount of Training Data.

Proceedings of the 11th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2017

2016

DSL Shared Task 2016: Perfect Is The Enemy of Good Language Discrimination Through Expectation-Maximization and Chunk-based Language Model.

Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, 2016

Annotated Amharic Corpora.

Vít Suchomel

Proceedings of the Text, Speech, and Dialogue - 19th International Conference, 2016

Evaluation of the Sketch Engine Thesaurus on Analogy Queries.

Proceedings of the 10th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2016

Finding Definitions in Large Corpora with Sketch Engine.

Vojtěch Kovář

Monika Mociariková

Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Adam Kilgarriff's Legacy to Computational Linguistics and Beyond.

Proceedings of the Computational Linguistics and Intelligent Text Processing, 2016

2015

Concurrent Processing of Text Corpus Queries.

Radoslav Rábara

Proceedings of the 9th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2015

Software and Data for Corpus Pattern Analysis.

Proceedings of the 9th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2015

2014

Finding the Best Name for a Set of Words Automatically.

Proceedings of the 8th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2014

Low Inter-Annotator Agreement = An Ill-Defined Problem?

Vojtěch Kovář

Proceedings of the 8th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2014

Optimization of Regular Expression Evaluation within the Manatee Corpus Management System.

Proceedings of the 8th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2014

Extrinsic Corpus Evaluation with a Collocation Dictionary Task.

Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

HindEnCorp - Hindi-English and Hindi-only Corpus for Machine Translation.

Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Finding Terms in Corpora for Many Languages with the Sketch Engine.

Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 2014

2013

Fast Construction of a Word-Number Index for Large Data.

Pavel Šmerk

Proceedings of the 7th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2013

2012

CzAccent - Simple Tool for Restoring Accents in Czech Texts.

Proceedings of the 6th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2012

Building a 70 billion word corpus of English from ClueWeb.

Jan Pomikálek

Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Legal electronic dictionary for Czech.

František Cvrček

Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

2011

Words' Burstiness in Language Models.

Proceedings of the 5th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2011

2010

Can Corpus Pattern Analysis Be Used in NLP?

Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010

Frequency of Low-Frequency Words in Text Corpora.

Proceedings of the 4th Workshop on Recent Advances in Slavonic Natural Languages Processing, 2010

Fast Syntactic Searching in Very Large Corpora for Many Languages.

Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation, 2010

Automatic Identification of Legal Terms in Czech Law Texts.

Pavel Šmerk

Proceedings of the Semantic Processing of Legal Texts: Where the Language of Law Meets the Law of Language, 2010

A Case Study in Word Sketches - Czech Verb vidět 'see'.

Proceedings of the A Way with Words, 2010

Semi-Automatic Dictionary Drafting.

Adam Kilgarriff

Proceedings of the A Way with Words, 2010

2009

Discovering Grammatical Relations in Czech Sentences.

Aleš Horák

Proceedings of the 3rd Workshop on Recent Advances in Slavonic Natural Languages Processing, 2009

2008

A Lexicographer-Friendly Association Score.

Proceedings of the 2nd Workshop on Recent Advances in Slavonic Natural Languages Processing, 2008

Detecting Co-Derivative Documents in Large Text Collections.

Jan Pomikálek

Proceedings of the International Conference on Language Resources and Evaluation, 2008

2007

Manatee/Bonito - A Modular Corpus Manager.

Proceedings of the 1st Workshop on Recent Advances in Slavonic Natural Languages Processing, 2007

Morphological Analysis of Law Texts.

Pavel Šmerk

Proceedings of the 1st Workshop on Recent Advances in Slavonic Natural Languages Processing, 2007

An efficient algorithm for building a distributional thesaurus (and other Sketch Engine developments).

Adam Kilgarriff

Proceedings of the ACL 2007, 2007

2006

WebBootCaT. Instant Domain-Specific Corpora to Support Human Translators.

Proceedings of the 11th Annual conference of the European Association for Machine Translation, 2006

2005

Chinese Sketch Engine and the Extraction of Grammatical Collocations.

Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing, 2005

2003

Text Corpus with Errors.

Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

2001

Finding Semantically Related Words in Large Corpora.

Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

1999

Dispersion of Words in a Language Corpus.

Jaroslava Hlaváčová

Proceedings of the Text, Speech and Dialogue - Second International Workshop, 1999

1998

Corpus Annotation in Inflectional Languages: Czech.

Proceedings of the Ninth International Workshop on Database and Expert Systems Applications, 1998

1997

DESAM - Annotated Corpus for Czech.
