2017
Accurate and efficient general-purpose boilerplate detection for crawled web corpora.
Lang. Resour. Evaluation, 2017

2016
CommonCOW: Massively Huge Web Corpora from CommonCrawl Data and a Method to Distribute them Freely under Restrictive EU Copyright Laws.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Automatic Classification by Topic Domain for Meta Data Generation, Web Corpus Evaluation, and Corpus Comparison.
Proceedings of the 10th Web as Corpus Workshop, 2016

On Bias-free Crawling and Representative Web Corpora.
Proceedings of the 10th Web as Corpus Workshop, 2016

2015
A High-Order Discontinuous Galerkin Discretization with Multiwavelet-Based Grid Adaptation for Compressible Flows.
J. Sci. Comput., 2015

2014
Adaptive multiresolution discontinuous Galerkin schemes for conservation laws.
Math. Comput., 2014

Focused Web Corpus Crawling.
Proceedings of the 9th Web as Corpus Workshop, 2014

2013
Web Corpus Construction.
Synthesis Lectures on Human Language Technologies, Morgan & Claypool Publishers, ISBN: 978-3-031-02152-7, 2013

Scalable Construction of High-Quality Web Corpora.
J. Lang. Technol. Comput. Linguistics, 2013

2012
Building Large Corpora from the Web Using a New Efficient Tool Chain.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

2009
Adaptive Gain Modulation in V1 Explains Contextual Modifications during Bisection Learning.
PLoS Comput. Biol., 2009

2007
Perceptual Learning via Modification of Cortical Top-Down Signals.
PLoS Comput. Biol., 2007