Benno Stein

ORCID: 0000-0001-9033-2217

Affiliations:
  • Bauhaus University, Weimar, Germany


According to our database, Benno Stein authored at least 544 papers between 1991 and 2024.

Bibliography

2024
Impact and development of an Open Web Index for open web search.
J. Assoc. Inf. Sci. Technol., May, 2024

Task-Oriented Paraphrase Analytics.
Dataset, May, 2024

Who Determines What Is Relevant? Humans or AI? Why Not Both?
Commun. ACM, April, 2024

Touché24-Image-Retrieval-and-Generation-for-Arguments.
Dataset, April, 2024

Interactive Abstract Interpretation with Demanded Summarization.
ACM Trans. Program. Lang. Syst., March, 2024

webis-de/WWW-24: Release 0.1.0.
Dataset, March, 2024

Webis Generated Native Ads 2024.
Dataset, March, 2024

PAN24 Multi-Author Writing Style Analysis.
Dataset, February, 2024

Webis-Follow-Up-Questions-24.
Dataset, February, 2024

Touché24-Image-Retrieval-and-Generation-for-Arguments.
Dataset, February, 2024

Wikipedia CRISPR Innovation Tracing Data 2023.
Dataset, January, 2024


A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking.
CoRR, 2024

If there's a Trigger Warning, then where's the Trigger? Investigating Trigger Warnings at the Passage Level.
CoRR, 2024

Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders.
CoRR, 2024

Argumentation in Waltz's "Emerging Structure of International Politics".
CoRR, 2024

Detecting Generated Native Ads in Conversational Search.
Proceedings of the Companion Proceedings of the ACM on Web Conference 2024, 2024

A Mastodon Corpus to Evaluate Federated Microblog Search.
Proceedings of the first International Workshop on Open Web Search co-located with the 46th European Conference on Information Retrieval ECIR 2024, 2024

Integrating Query Interpretation Components into the Information Retrieval Experiment Platform.
Proceedings of the first International Workshop on Open Web Search co-located with the 46th European Conference on Information Retrieval ECIR 2024, 2024

Evaluating Generative Ad Hoc Information Retrieval.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Resources for Combining Teaching and Research in Information Retrieval Coursework.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Are Large Language Models Reliable Argument Quality Annotators?
Proceedings of the Robust Argumentation Machines - First International Conference, 2024

Classification of Shared Tasks Used in Teaching.
Proceedings of the 2024 on Innovation and Technology in Computer Science Education V. 1, 2024

Speaking with Objects: Conversational Agents' Embodiment in Virtual Museums.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, 2024

The Information Retrieval Experiment Platform (Extended Abstract).
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Improving Argument Effectiveness Across Ideologies using Instruction-tuned Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Simulating Follow-Up Questions in Conversational Search.
Proceedings of the Advances in Information Retrieval, 2024



The Open Web Index - Crawling and Indexing the Web for Public Use.
Proceedings of the Advances in Information Retrieval, 2024

Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search Engines.
Proceedings of the Advances in Information Retrieval, 2024

Overview of PAN 2024: Multi-author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification - Extended Abstract.
Proceedings of the Advances in Information Retrieval, 2024

Futuring Machines: An Interactive Framework for Participative Futuring Through Human-AI Collaborative Speculative Fiction Writing.
Proceedings of the ACM Conversational User Interfaces 2024, 2024

The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Task-Oriented Paraphrase Analytics.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Reference-guided Style-Consistent Content Transfer.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Overview of the Multi-Author Writing Style Analysis Task at PAN 2024.
Proceedings of the Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), 2024

De-noising Document Classification Benchmarks via Prompt-Based Rank Pruning: A Case Study.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2024

Who Will Evaluate the Evaluators? Exploring the Gen-IR User Simulation Space.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2024


Overview of the "Voight-Kampff" Generative AI Authorship Verification Task at PAN and ELOQUENT 2024.
Proceedings of the Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), 2024

Overview of PAN 2024: Multi-author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification Condensed Lab Overview.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2024

Assisted Knowledge Graph Authoring: Human-Supervised Knowledge Graph Construction from Natural Language.
Proceedings of the 2024 ACM SIGIR Conference on Human Information Interaction and Retrieval, 2024

Product Spam on YouTube: A Case Study.
Proceedings of the 2024 ACM SIGIR Conference on Human Information Interaction and Retrieval, 2024

2023
EMNLP-23-Bootstrapping-a-Violence-Detector-for-Fan-Fiction.
Dataset, October, 2023

Task-Oriented Paraphrase Analytics.
Dataset, October, 2023

TexBiG Dataset for Analysing Complex Document Layouts in the Digital Humanities.
Dataset, September, 2023

Touché23-Image-Retrieval-for-Arguments.
Dataset, September, 2023



ChatNoir Resiliparse.
Dataset, August, 2023

Report on the Dagstuhl Seminar on Frontiers of Information Access Experimentation for Research and Education.
SIGIR Forum, June, 2023

A diachronic perspective on citation latency in Wikipedia articles on CRISPR/Cas-9: an exploratory case study.
Scientometrics, June, 2023

Webis Wikipedia Innovation History 2023.
Dataset, June, 2023



Touché23-Image-Retrieval-for-Arguments.
Dataset, February, 2023

Webis Wikipedia-IPC.
Dataset, February, 2023



Webis-Nudged-Questions-23.
Dataset, January, 2023

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023

The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

The Information Retrieval Experiment Platform.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

On Stance Detection in Image Retrieval for Argumentation.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

An Empirical Comparison of Web Content Extraction Algorithms.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

A New Dataset for Causality Identification in Argumentative Texts.
Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue, 2023

SemEval-2023 Task 4: ValueEval: Identification of Human Values Behind Arguments.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023

SemEval-2023 Task 5: Clickbait Spoiling.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023

The Information Retrieval Experiment Platform.
Proceedings of the Lernen, 2023

Mining the History Sections of Wikipedia Articles on Science and Technology.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2023

SMAuC - The Scientific Multi-Authorship Corpus.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2023

Perspectives on Large Language Models for Relevance Judgment.
Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, 2023

Trigger Warnings: Bootstrapping a Violence Detector for Fan Fiction.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Unveiling the Power of Argument Arrangement in Online Persuasive Discussions.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Dynamic Exploratory Search for the Information Retrieval Anthology.
Proceedings of the Advances in Information Retrieval, 2023

Continuous Integration for Reproducible Shared Tasks with TIRA.io.
Proceedings of the Advances in Information Retrieval, 2023

Overview of Touché 2023: Argument and Causal Retrieval - Extended Abstract.
Proceedings of the Advances in Information Retrieval, 2023

Overview of PAN 2023: Authorship Verification, Multi-author Writing Style Analysis, Profiling Cryptocurrency Influencers, and Trigger Detection - Extended Abstract.
Proceedings of the Advances in Information Retrieval, 2023

Paraphrase Acquisition from Image Captions.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

Topic Ontologies for Arguments.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

Marco Polo's Travels Revisited: From Motion Event Detection to Optimal Path Computation in 3D Maps.
Proceedings of the Annual International Conference of the Alliance of Digital Humanities Organizations, 2023

Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object Detection with Repeated Labels.
Proceedings of the Pattern Recognition - 45th DAGM German Conference, 2023

Overview of the Multi-Author Writing Style Analysis Task at PAN 2023.
Proceedings of the Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), 2023

Overview of the Trigger Detection Task at PAN 2023.
Proceedings of the Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), 2023

Overview of the Authorship Verification Task at PAN 2023.
Proceedings of the Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), 2023

Overview of PAN 2023: Authorship Verification, Multi-Author Writing Style Analysis, Profiling Cryptocurrency Influencers, and Trigger Detection - Condensed Lab Overview.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2023

Overview of Touché 2023: Argument and Causal Retrieval.
Proceedings of the Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), 2023

Guiding Oral Conversations: How to Nudge Users Towards Asking Questions?
Proceedings of the 2023 Conference on Human Information Interaction and Retrieval, 2023

The Infinite Index: Information Retrieval on Generative Text-To-Image Models.
Proceedings of the 2023 Conference on Human Information Interaction and Retrieval, 2023

Exploring Hyperparameter Usage and Tuning in Machine Learning Research.
Proceedings of the 2nd IEEE/ACM International Conference on AI Engineering, 2023

Webis @ ImageArg 2023: Embedding-based Stance and Persuasiveness Classification.
Proceedings of the 10th Workshop on Argument Mining, 2023

Trigger Warning Assignment as a Multi-Label Document Classification Problem.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Shared Tasks as Tutorials: A Methodical Approach.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Touché23-Image-Retrieval-for-Arguments.
Dataset, November, 2022



TexBiG Dataset for Analysing Complex Document Layouts in the Digital Humanities.
Dataset, September, 2022

Webis Health CauseNet 2022.
Dataset, September, 2022

Webis-Persuasive-Debaters-on-Reddit-CMV-2022.
Dataset, August, 2022


Touché23-Human-Value-Detection.
Dataset, July, 2022


Touché22-Image-Retrieval-for-Arguments.
Dataset, June, 2022

PAN22 Authorship Analysis: Style Change Detection.
Dataset, March, 2022




Framing in Communication: From Theories to Computation (Dagstuhl Seminar 22131).
Dagstuhl Reports, 2022

Trigger Warnings: Bootstrapping a Violence Detector for FanFiction.
CoRR, 2022

Webis at TREC 2022: Deep Learning and Health Misinformation.
Proceedings of the Thirty-First Text REtrieval Conference, 2022

Axiomatic Retrieval Experimentation with ir_axioms.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Identifying Argumentative Questions in Web Search Logs.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Differential Bias: On the Perceptibility of Stance Imbalance in Argumentation.
Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022

Visual Web Archive Quality Assessment.
Proceedings of the Linking Theory and Practice of Digital Libraries, 2022

Overview of Touché 2022: Argument Retrieval - Extended Abstract.
Proceedings of the Advances in Information Retrieval, 2022

Overview of PAN 2022: Authorship Verification, Profiling Irony and Stereotype Spreaders, Style Change Detection, and Trigger Detection - Extended Abstract.
Proceedings of the Advances in Information Retrieval, 2022

A Dataset for Analysing Complex Document Layouts in the Digital Humanities and Its Evaluation with Krippendorff's Alpha.
Proceedings of the Pattern Recognition, 2022

Analyzing Persuasion Strategies of Debaters on Social Media.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Mining Health-related Cause-Effect Statements with High Precision at Large Scale.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

CausalQA: A Benchmark for Causal Question Answering.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Overview of the Style Change Detection Task at PAN 2022.
Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 2022

Overview of the Authorship Verification Task at PAN 2022.
Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 2022

Overview of Touché 2022: Argument Retrieval.
Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 2022

Overview of PAN 2022: Authorship Verification, Profiling Irony and Stereotype Spreaders, and Style Change Detection.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2022

What is That? Crowdsourcing Questions to a Virtual Exhibition.
Proceedings of the CHIIR '22: ACM SIGIR Conference on Human Information Interaction and Retrieval, Regensburg, Germany, March 14, 2022

Identifying the Human Values behind Arguments.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Data for PAN at SemEval 2019 Task 4: Hyperpartisan News Detection.
Dataset, December, 2021

Touché22-Argument-Retrieval-for-Controversial-Questions.
Dataset, November, 2021

Touché21-Argument-Retrieval-for-Controversial-Questions.
Dataset, November, 2021

Webis-Exhibition-Questions-21.
Dataset, October, 2021


Webis-ArgImages-21.
Dataset, August, 2021

Webis-Conversational-Query-Reformulations-21.
Dataset, August, 2021

Conversational-Query-Reformulations-21.
Dataset, June, 2021

Webis-Dataset-Reviews-21.
Dataset, February, 2021

Webis-WebSeg-20-Algorithm-Segmentations.
Dataset, January, 2021

Visual Analysis of Argumentation in Essays.
IEEE Trans. Vis. Comput. Graph., 2021

Meta-Information in Conversational Search.
ACM Trans. Inf. Syst., 2021

The information retrieval anthology 2021: inaugural status report and challenges ahead.
SIGIR Forum, 2021

Predicting essay quality from search and writing behavior.
J. Assoc. Inf. Sci. Technol., 2021

Argumentation technology.
it Inf. Technol., 2021

Erratum zu: Editorial.
Datenbank-Spektrum, 2021

STEREO: Scientific Text Reuse in Open Access Publications.
CoRR, 2021

FastWARC: Optimizing Large-Scale Web Archive Analytics.
CoRR, 2021

The Impact of Main Content Extraction on Near-Duplicate Detection.
CoRR, 2021

Webis at TREC 2021: Deep Learning, Health Misinformation, and Podcasts Tracks.
Proceedings of the Thirtieth Text REtrieval Conference, 2021

The Information Retrieval Anthology.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

CopyCat: Near-Duplicates Within and Between the ClueWeb and the Common Crawl.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Identifying Queries in Instant Search Logs.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021

Towards Axiomatic Explanations for Neural Ranking Models.
Proceedings of the ICTIR '21: The 2021 ACM SIGIR International Conference on the Theory of Information Retrieval, 2021

Controlled Neural Sentence-Level Reframing of News Articles.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

An Empirical Comparison of Web Page Segmentation Algorithms.
Proceedings of the Advances in Information Retrieval, 2021

Overview of Touché 2021: Argument Retrieval - Extended Abstract.
Proceedings of the Advances in Information Retrieval, 2021

Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection - Extended Abstract.
Proceedings of the Advances in Information Retrieval, 2021

Toward Conversational Query Reformulation.
Proceedings of the Second International Conference on Design of Experimental Search & Information REtrieval Systems, 2021

The Meant, the Said, and the Understood: Conversational Argument Search and Cognitive Biases.
Proceedings of the CUI 2021, 2021

Overview of the Style Change Detection Task at PAN 2021.
Proceedings of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 2021

Overview of the Cross-Domain Authorship Verification Task at PAN 2021.
Proceedings of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 2021

Overview of Touché 2021: Argument Retrieval.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2021

Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2021

Image Retrieval for Arguments Using Stance-Aware Query Expansion.
Proceedings of the 8th Workshop on Argument Mining, 2021

Beyond Metadata: What Paper Authors Say About Corpora They Use.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Employing Argumentation Knowledge Graphs for Neural Argument Generation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Same Side Stance Classification Challenge.
Dataset, December, 2020

Webis SCSmeta 2021.
Dataset, October, 2020

Webis-WebSeg-20-Algorithm-Segmentations.
Dataset, October, 2020

Webis-News-Bias-20.
Dataset, October, 2020


Touché20-Argument-Retrieval-for-Controversial-Questions.
Dataset, September, 2020



Webis Argument Quality Corpus 2020 (Webis-ArgQuality-20).
Dataset, May, 2020

Webis ChangeMyView Corpus 2020 (Webis-CMV-20).
Dataset, April, 2020


Disaster Tweet Corpus 2020.
Dataset, March, 2020

PAN20 Authorship Analysis: Celebrity Profiling.
Dataset, February, 2020

Webis Abstractive Snippet Corpus 2020.
Dataset, February, 2020


Webis-Voice-based-and-Conversational-Argument-Search-20.
Dataset, January, 2020

The dilemma of the direct answer.
SIGIR Forum, 2020

Dagstuhl seminar 19461 on conversational search: seminar goals and working group outcomes.
SIGIR Forum, 2020

On divergence-based author obfuscation: An attack on the state of the art in statistical authorship verification.
it Inf. Technol., 2020

Editorial.
Datenbank-Spektrum, 2020

Analyzing Political Bias and Unfairness in News Articles at Different Levels of Granularity.
CoRR, 2020

The Importance of Suppressing Domain Style in Authorship Analysis.
CoRR, 2020

Conversational Search - A Report from Dagstuhl Seminar 19461.
CoRR, 2020

Abstractive Snippet Generation.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020

Comparative Web Search Questions.
Proceedings of the WSDM '20: The Thirteenth ACM International Conference on Web Search and Data Mining, 2020

Webis at TREC 2020: Health Misinformation Track Extended Abstract.
Proceedings of the Twenty-Ninth Text REtrieval Conference, 2020

Towards Predicting the Subscription Status of Twitch.tv Users - ECML-PKDD ChAT Discovery Challenge 2020.
Proceedings of ECML-PKDD 2020 ChAT Discovery Challenge on Chat Analytics for Twitch co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2020 (ECML-PKDD 2020), 2020

Analysis of Detection Models for Disaster-Related Tweets.
Proceedings of the 17th International Conference on Information Systems for Crisis Response and Management, 2020

Task Proposal: Abstractive Snippet Generation for Web Pages.
Proceedings of the 13th International Conference on Natural Language Generation, 2020

Web Archive Analytics.
Proceedings of the 50. Jahrestagung der Gesellschaft für Informatik, INFORMATIK 2020 - Back to the Future, Karlsruhe, Germany, 28. September, 2020

Detecting Media Bias in News Articles using Gaussian Bias Distributions.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Touché: First Shared Task on Argument Retrieval.
Proceedings of the Advances in Information Retrieval, 2020


News Editorials: Towards Summarizing Long Argumentative Texts.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Overview of the Style Change Detection Task at PAN 2020.
Proceedings of the Working Notes of CLEF 2020, 2020

Overview of the Celebrity Profiling Task at PAN 2020.
Proceedings of the Working Notes of CLEF 2020, 2020

Overview of the Cross-Domain Authorship Verification Task at PAN 2020.
Proceedings of the Working Notes of CLEF 2020, 2020

Overview of Touché 2020: Argument Retrieval.
Proceedings of the Working Notes of CLEF 2020, 2020

Overview of Touché 2020: Argument Retrieval - Extended Abstract.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2020

Overview of PAN 2020: Authorship Verification, Celebrity Profiling, Profiling Fake News Spreaders on Twitter, and Style Change Detection.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2020

Web Page Segmentation Revisited: Evaluation Framework and Dataset.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

Estimating Topic Difficulty Using Normalized Discounted Cumulated Gain.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

Investigating Expectations for Voice-based and Conversational Argument Search on the Web.
Proceedings of the CHIIR '20: Conference on Human Information Interaction and Retrieval, 2020

Exploiting Personal Characteristics of Debaters for Predicting Persuasiveness.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Efficient Pairwise Annotation of Argument Quality.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Crawling and Preprocessing Mailing Lists At Scale for Dialog Analysis.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Analyzing the Persuasive Effect of Style in News Editorial Argumentation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

End-to-End Argumentation Knowledge Graph Construction.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
PAN19 Authorship Analysis: Cross-Domain Authorship Attribution.
Dataset, November, 2019

Same Side Stance Classification Challenge.
Dataset, August, 2019

Webis-Argument-Framing-19.
Dataset, August, 2019


Webis Query-Task-Mapping Corpus 2019 (Webis-QTM-19).
Dataset, May, 2019

Webis-Web-Errors-19.
Dataset, April, 2019

Webis-Web-Archive-17 Content Error Annotations.
Dataset, March, 2019

PAN19 Authorship Analysis: Celebrity Profiling.
Dataset, January, 2019

Webis-Web-Archive-17 Content Error Annotations.
Dataset, January, 2019

Modeling the usefulness of search results as measured by information use.
Inf. Process. Manag., 2019

Conversational Search (Dagstuhl Seminar 19461).
Dagstuhl Reports, 2019

Webis at TREC 2019: Decision Track.
Proceedings of the Twenty-Eighth Text REtrieval Conference, 2019

Query-Task Mapping.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Argument Search: Assessing Argument Relevance.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

SemEval-2019 Task 4: Hyperpartisan News Detection.
Proceedings of the 13th International Workshop on Semantic Evaluation, 2019

Generalizing Unmasking for Short Texts.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Data Acquisition for Argument Search: The args.me Corpus.
Proceedings of the KI 2019: Advances in Artificial Intelligence, 2019

A Dataset for Content Error Detection in Web Archives.
Proceedings of the 19th ACM/IEEE Joint Conference on Digital Libraries, 2019

Towards Summarization for Social Media - Results of the TL;DR Challenge.
Proceedings of the 12th International Conference on Natural Language Generation, 2019

Computational Argumentation Synthesis as a Language Modeling Task.
Proceedings of the 12th International Conference on Natural Language Generation, 2019

Exploratory Search Pipes with Scoped Facets.
Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, 2019

Modeling Frames in Argumentation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

A Decade of Shared Tasks in Digital Text Forensics at PAN.
Proceedings of the Advances in Information Retrieval, 2019

Wikipedia Text Reuse: Within and Without.
Proceedings of the Advances in Information Retrieval, 2019

Overview of the Style Change Detection Task at PAN 2019.
Proceedings of the Working Notes of CLEF 2019, 2019

Overview of the Celebrity Profiling Task at PAN 2019.
Proceedings of the Working Notes of CLEF 2019, 2019

Overview of the Cross-domain Authorship Attribution Task at PAN 2019.
Proceedings of the Working Notes of CLEF 2019, 2019

Overview of PAN 2019: Bots and Gender Profiling, Celebrity Profiling, Cross-Domain Authorship Attribution and Style Change Detection.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2019

Clarifying False Memories in Voice-based Search.
Proceedings of the 2019 Conference on Human Information Interaction and Retrieval, 2019

Same Side Stance Classification.
Proceedings of the Same Side Stance Classification Shared Task organized as a part of the 6th Workshop on Argument Mining (ArgMining 2019) and co-located with the 57th Annual Meeting of the Association for Computational Linguistics (ACL19), 2019

Continuous Quality Control and Advanced Text Segment Annotation with WAT-SL 2.0.
Proceedings of the 13th Linguistic Annotation Workshop, 2019

Celebrity Profiling.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Heuristic Authorship Obfuscation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Bias Analysis and Mitigation in the Evaluation of Authorship Verification.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Model-Based Diagnosis for Cyber-Physical Production Systems Based on Machine Learning and Residual-Based Diagnosis Models.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Evolution of the PAN Lab on Digital Text Forensics.
Proceedings of the Information Retrieval Evaluation in a Changing World, 2019

TIRA Integrated Research Architecture.
Proceedings of the Information Retrieval Evaluation in a Changing World, 2019

2018
Data for PAN at SemEval 2019 Task 4: Hyperpartisan News Detection.
Dataset, November, 2018

Webis-Bias-Flipper-18.
Dataset, November, 2018

Challenge or Empower: Revisiting Argumentation Quality in a News Editorial Corpus.
Dataset, October, 2018

PAN18 Multi-Author Analysis: Style-Change-Detection.
Dataset, September, 2018

PAN18 Author Identification: Attribution.
Dataset, September, 2018

Arg-Microtexts Synthesis Benchmark.
Dataset, August, 2018

ArguAna Counterargs.
Dataset, July, 2018

Webis Wikipedia Text Reuse Corpus 2018 (Webis-Wikipedia-Text-Reuse-18).
Dataset, July, 2018

BuzzFeed-Webis Fake News Corpus 2016.
Dataset, February, 2018

Retrieval Models.
Proceedings of the Encyclopedia of Social Network Analysis and Mining, 2nd Edition, 2018

Weblog Analysis.
Proceedings of the Encyclopedia of Social Network Analysis and Mining, 2nd Edition, 2018

Reproducible Web Corpora: Interactive Archiving with Automatic Quality Assessment.
ACM J. Data Inf. Qual., 2018

The Clickbait Challenge 2017: Towards a Regression Model for Clickbait Strength.
CoRR, 2018

Heuristic Feature Selection for Clickbait Detection.
CoRR, 2018

Webis at TREC 2018: Common Core Track.
Proceedings of the Twenty-Seventh Text REtrieval Conference, 2018

Toward Voice Query Clarification.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

A User Study on Snippet Generation: Text Reuse vs. Paraphrases.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

SemEval-2018 Task 12: The Argument Reasoning Comprehension Task.
Proceedings of The 12th International Workshop on Semantic Evaluation, 2018

The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Before Name-Calling: Dynamics and Triggers of Ad Hominem Fallacies in Web Argumentation.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Task Proposal: The TL;DR Challenge.
Proceedings of the 11th International Conference on Natural Language Generation, 2018

Learning to Flip the Bias of News Headlines.
Proceedings of the 11th International Conference on Natural Language Generation, 2018

Pseudo Descriptions for Meta-Data Retrieval.
Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval, 2018

Integrating OWL Ontologies for Smart Services into AutomationML and OPC UA.
Proceedings of the 23rd IEEE International Conference on Emerging Technologies and Factory Automation, 2018

Predicting Retrieval Success Based on Information Use for Writing Tasks.
Proceedings of the Digital Libraries for Open Knowledge, 2018

Visualization of the Topic Space of Argument Search Results in args.me.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018

Cross-Reading News.
Proceedings of the Second International Workshop on Recent Trends in News Information Retrieval co-located with 40th European Conference on Information Retrieval (ECIR 2018), 2018

A Plan for Ancillary Copyright: Original Snippets.
Proceedings of the Second International Workshop on Recent Trends in News Information Retrieval co-located with 40th European Conference on Information Retrieval (ECIR 2018), 2018

Shaping the Information Nutrition Label.
Proceedings of the Second International Workshop on Recent Trends in News Information Retrieval co-located with 40th European Conference on Information Retrieval (ECIR 2018), 2018

Elastic ChatNoir: Search Engine for the ClueWeb and the Common Crawl.
Proceedings of the Advances in Information Retrieval, 2018

WASP: Web Archiving and Search Personalized.
Proceedings of the First Biennial Conference on Design of Experimental Search & Information Retrieval Systems, 2018

Challenge or Empower: Revisiting Argumentation Quality in a News Editorial Corpus.
Proceedings of the 22nd Conference on Computational Natural Language Learning, 2018

Argumentation Synthesis following Rhetorical Strategies.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

Crowdsourcing a Large Corpus of Clickbait on Twitter.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

Overview of PAN 2018 - Author Identification, Author Profiling, and Author Obfuscation.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2018

Overview of the Author Obfuscation Task at PAN 2018: A New Approach to Measuring Safety.
Proceedings of the Working Notes of CLEF 2018, 2018

Overview of the 6th Author Profiling Task at PAN 2018: Multimodal Gender Identification in Twitter.
Proceedings of the Working Notes of CLEF 2018, 2018

Overview of the Author Identification Task at PAN-2018: Cross-domain Authorship Attribution and Style Change Detection.
Proceedings of the Working Notes of CLEF 2018, 2018

Retrieval of the Best Counterargument without Prior Topic Knowledge.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Modeling Deliberative Argumentation Strategies on Wikipedia.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

A Stylometric Inquiry into Hyperpartisan and Fake News.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Webis-Web-Archive-17.
Dataset, October, 2017

PAN17 Author Identification: Clustering.
Dataset, September, 2017

Webis Query Spelling Corpus 2017 (Webis-QSpell-17).
Dataset, August, 2017

Webis-ArgRank-17.
Dataset, April, 2017

Webis-Mnemonics-17.
Dataset, March, 2017

A Universal Model for Discourse-Level Argumentation Analysis.
ACM Trans. Internet Techn., 2017

An Information Nutritional Label for Online Documents.
SIGIR Forum, 2017

Overview of the Wikidata Vandalism Detection Task at WSDM Cup 2017.
CoRR, 2017

The Argument Reasoning Comprehension Task.
CoRR, 2017

Webis at TREC 2017: Open Search and Core Tracks.
Proceedings of The Twenty-Sixth Text REtrieval Conference, 2017

A Large-Scale Query Spelling Correction Corpus.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017

A Large-scale Analysis of the Mnemonic Password Advice.
Proceedings of the 24th Annual Network and Distributed System Security Symposium, 2017

Spatio-Temporal Analysis of Reverted Wikipedia Edits.
Proceedings of the Eleventh International Conference on Web and Social Media, 2017

The Impact of Modeling Overall Argumentation with Tree Kernels.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

TL;DR: Mining Reddit to Learn Automatic Summarization.
Proceedings of the Workshop on New Frontiers in Summarization, 2017

Patterns of Argumentation Strategies across Topics.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Computational Argumentation Quality Assessment in Natural Language.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

"PageRank" for Argument Relevance.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

WAT-SL: A Customizable Web Annotation Tool for Segment Labeling.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Overview of the Author Identification Task at PAN-2017: Style Breach Detection and Author Clustering.
Proceedings of the Working Notes of CLEF 2017, 2017

Overview of PAN'17 - Author Identification, Author Profiling, and Author Obfuscation.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2017

Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter.
Proceedings of the Working Notes of CLEF 2017, 2017

Overview of the Author Obfuscation Task at PAN 2017: Safety Evaluation Revisited.
Proceedings of the Working Notes of CLEF 2017, 2017

Webis at the CLEF 2017 Dynamic Search Lab.
Proceedings of the Working Notes of CLEF 2017, 2017

Source Retrieval for Web-Scale Text Reuse Detection.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

Building an Argument Search Engine for the Web.
Proceedings of the 4th Workshop on Argument Mining, 2017

Unit Segmentation of Argumentative Texts.
Proceedings of the 4th Workshop on Argument Mining, 2017

Argumentation Quality Assessment: Theory vs. Practice.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Wikidata Vandalism Corpus 2016 (WDVC-16).
Dataset, September, 2016

Webis Clickbait Corpus 2016 (Webis-Clickbait-16).
Dataset, March, 2016

Editorial.
Datenbank-Spektrum, 2016

Webis at TREC 2016: Tasks, Total Recall, and Open Search Tracks.
Proceedings of The Twenty-Fifth Text REtrieval Conference, 2016

Cross-Domain Mining of Argumentative Text through Distant Supervision.
Proceedings of the NAACL HLT 2016, 2016

Clickbait Detection.
Proceedings of the Advances in Information Retrieval, 2016

Who Wrote the Web? Revisiting Influential Author Identification Research Applicable to Information Retrieval.
Proceedings of the Advances in Information Retrieval, 2016

Supporting Scholarly Search with Keyqueries.
Proceedings of the Advances in Information Retrieval, 2016

Topical Sequence Profiling.
Proceedings of the 27th International Workshop on Database and Expert Systems Applications, 2016

Using Argument Mining to Assess the Argumentation Quality of Essays.
Proceedings of the COLING 2016, 2016

A News Editorial Corpus for Mining Argumentation Strategies.
Proceedings of the COLING 2016, 2016

Clustering by Authorship Within and Across Documents.
Proceedings of the Working Notes of CLEF 2016, 2016

Overview of PAN'16 - New Challenges for Authorship Analysis: Cross-Genre Profiling, Clustering, Diarization, and Obfuscation.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2016

Author Obfuscation: Attacking the State of the Art in Authorship Verification.
Proceedings of the Working Notes of CLEF 2016, 2016

Overview of the 4th Author Profiling Task at PAN 2016: Cross-Genre Evaluations.
Proceedings of the Working Notes of CLEF 2016, 2016

Vandalism Detection in Wikidata.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

Axiomatic Result Re-Ranking.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

How Writers Search: Analyzing the Search and Writing Logs of Non-fictional Essays.
Proceedings of the 2016 ACM Conference on Human Information Interaction and Retrieval, 2016

Simulating Ideal and Average Users.
Proceedings of the Information Retrieval Technology, 2016

Keyqueries for Clustering and Labeling.
Proceedings of the Information Retrieval Technology, 2016

2015
PAN15 Author Identification: Verification.
Dataset, September, 2015

Wikidata Vandalism Corpus 2015 (WDVC-15).
Dataset, August, 2015

Webis Known-Item Question Corpus 2013 (Webis-KIQC-13).
Dataset, April, 2015

Debating Technologies (Dagstuhl Seminar 15512).
Dagstuhl Reports, 2015

Visual Assessment of Alleged Plagiarism Cases.
Comput. Graph. Forum, 2015

The Eras and Trends of Automatic Short Answer Grading.
Int. J. Artif. Intell. Educ., 2015

Webis at TREC 2015: Tasks and Total Recall Tracks.
Proceedings of The Twenty-Fourth Text REtrieval Conference, 2015

Towards Vandalism Detection in Knowledge Bases: Corpus Construction and Analysis.
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015

Webis: An Ensemble for Twitter Sentiment Detection.
Proceedings of the 9th International Workshop on Semantic Evaluation, 2015

What was the Query? Generating Queries for Document Sets with Applications in Cluster Labeling.
Proceedings of the Natural Language Processing and Information Systems, 2015

A Shared Task on Argumentation Mining in Newspaper Editorials.
Proceedings of the 2nd Workshop on Argumentation Mining, 2015

Sentiment Flow - A General Model of Web Review Argumentation.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

A Corpus of Realistic Known-Item Topics with Associated Web Pages in the ClueWeb09.
Proceedings of the Advances in Information Retrieval, 2015

Twitter Sentiment Detection via Ensemble Classification Using Averaged Confidence Scores.
Proceedings of the Advances in Information Retrieval, 2015

Overview of the PAN/CLEF 2015 Evaluation Lab.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2015

Overview of the Author Identification Task at PAN 2015.
Proceedings of the Working Notes of CLEF 2015, 2015

Towards Data Submissions for Shared Tasks: First Experiences for the Task of Text Alignment.
Proceedings of the Working Notes of CLEF 2015, 2015

Overview of the 3rd Author Profiling Task at PAN 2015.
Proceedings of the Working Notes of CLEF 2015, 2015

Source Retrieval for Plagiarism Detection from Large Web Corpora: Recent Approaches.
Proceedings of the Working Notes of CLEF 2015, 2015

What Users Ask a Search Engine: Analyzing One Billion Russian Question Queries.
Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015

2014
PAN14 Originality: Text Alignment.
Dataset, September, 2014

Webis Tripad Sentiment Corpus 2013 (Webis-Tripad-13-Sentiment).
Dataset, April, 2014

Retrieval Models.
Encyclopedia of Social Network Analysis and Mining, 2014

Weblog Analysis.
Encyclopedia of Social Network Analysis and Mining, 2014

A Keyquery-Based Classification System for CORE.
D Lib Mag., 2014

Webis at TREC 2014: Web, Session, and Contextual Suggestion Tracks.
Proceedings of The Twenty-Third Text REtrieval Conference, 2014

Dynamic taxonomy composition via keyqueries.
Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

On the Use of Reliable-Negatives Selection Strategies in the PU Learning Approach for Quality Flaws Prediction in Wikipedia.
Proceedings of the 25th International Workshop on Database and Expert Systems Applications, 2014

Modeling Review Argumentation for Robust Sentiment Analysis.
Proceedings of the COLING 2014, 2014

Generating Acrostics via Paraphrasing and Heuristic Search.
Proceedings of the COLING 2014, 2014

Improving Cloze Test Performance of Language Learners Using Web N-Grams.
Proceedings of the COLING 2014, 2014

Overview of the Author Identification Task at PAN 2014.
Proceedings of the Working Notes for CLEF 2014 Conference, 2014

Overview of the 6th International Competition on Plagiarism Detection.
Proceedings of the Working Notes for CLEF 2014 Conference, 2014

Improving the Reproducibility of PAN's Shared Tasks: - Plagiarism Detection, Author Identification, and Author Profiling.
Proceedings of the Information Access Evaluation. Multilinguality, Multimodality, and Interaction, 2014

Overview of the Author Profiling Task at PAN 2014.
Proceedings of the Working Notes for CLEF 2014 Conference, 2014

A Review Corpus for Argumentation Analysis.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2014

2013
Webis Crowd Paraphrase Corpus 2011 (Webis-CPC-11).
Dataset, June, 2013

Webis Search Mission Corpus 2012 (Webis-SMC-12).
Dataset, May, 2013

Paraphrase acquisition via crowdsourcing and machine learning.
ACM Trans. Intell. Syst. Technol., 2013

CLEF 2013: information access evaluation meets multilinguality, multimodality, and visualization.
SIGIR Forum, 2013

Webis at TREC 2013-Session and Web Track.
Proceedings of The Twenty-Second Text REtrieval Conference, 2013

From keywords to keyqueries: content descriptors for the web.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

From search session detection to search mission detection.
Proceedings of the Open research Areas in Information Retrieval, 2013

Learning Overlap Optimization for Domain Decomposition Methods.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2013

Learning Efficient Information Extraction on Heterogeneous Texts.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

Exploratory Search Missions for TREC Topics.
Proceedings of the 3rd European Workshop on Human-Computer Interaction and Information Retrieval, 2013

Overview of the 5th International Competition on Plagiarism Detection.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013

Recent Trends in Digital Text Forensics and Its Evaluation - Plagiarism Detection, Author Identification, and Author Profiling.
Proceedings of the Information Access Evaluation. Multilinguality, Multimodality, and Visualization, 2013

Information extraction as a filtering task.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

Crowdsourcing Interaction Logs to Understand Text Reuse from the Web.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2012
Webis Text Reuse Corpus 2012.
Dataset, September, 2012

PAN Wikipedia Quality Flaw Corpus 2012 (PAN-WQF-12).
Dataset, September, 2012

Webis Simulation Data Mining Bridge Models Corpus 2012 (Webis-SDMbridge-12).
Dataset, June, 2012

Webis Patent Retrieval Corpus 2012 (Webis-PRA-12).
Dataset, April, 2012

Paderborn Genre Analysis Corpus 2012 (PaGA-12).
Dataset, January, 2012

WORDGRAPH: Keyword-in-Context Visualization for NETSPEAK's Wildcard Search.
IEEE Trans. Vis. Comput. Graph., 2012

Information Retrieval in the Commentsphere.
ACM Trans. Intell. Syst. Technol., 2012

The optimum clustering framework: implementing the cluster hypothesis.
Inf. Retr., 2012

Measuring the quality of web content using factual information.
Proceedings of the 2nd Joint WICOW/AIRWeb Workshop on Web Quality, 2012

A breakdown of quality flaws in Wikipedia.
Proceedings of the 2nd Joint WICOW/AIRWeb Workshop on Web Quality, 2012

Webis at the TREC 2012 Session Track.
Proceedings of The Twenty-First Text REtrieval Conference, 2012

ChatNoir: a search engine for the ClueWeb09 corpus.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

Cluster-based one-class ensemble for classification problems in information retrieval.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

Ousting ivory tower research: towards a web framework for providing experiments as a service.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

First Experiences with TIRA for Reproducible Evaluation in Information Retrieval.
Proceedings of the SIGIR 2012 Workshop on Open Source Information Retrieval, 2012

Predicting quality flaws in user-generated content: the case of wikipedia.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

Solving Modeling Problems with Machine Learning -- A Classification Scheme of Model Learning Approaches for Technical Systems.
Proceedings of the Dagstuhl-Workshop MBEES: Modellbasierte Entwicklung eingebetteter Systeme VIII, 2012

Towards realistic known-item topics for the ClueWeb.
Proceedings of the Information Interaction in Context: 2012, 2012

Estimating the Expected Effectiveness of Text Classification Solutions under Subclass Distribution Shifts.
Proceedings of the 12th IEEE International Conference on Data Mining, 2012

The Impact of Spelling Errors on Patent Search.
Proceedings of the EACL 2012, 2012

TIRA: Configuring, Executing, and Disseminating Information Retrieval Experiments.
Proceedings of the 23rd International Workshop on Database and Expert Systems Applications, 2012

Optimal Scheduling of Information Extraction Algorithms.
Proceedings of the COLING 2012, 2012

Overview of the 4th International Competition on Plagiarism Detection.
Proceedings of the CLEF 2012 Evaluation Labs and Workshop, 2012

Overview of the 1st International Competition on Quality Flaw Prediction in Wikipedia.
Proceedings of the CLEF 2012 Evaluation Labs and Workshop, 2012

Search result presentation based on faceted clustering.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Towards optimum query segmentation: in doubt without.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Learning Behavior Models for Hybrid Timed Systems.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011
PAN Wikipedia Vandalism Corpus 2011 (PAN-WVC-11).
Dataset, July, 2011

PAN Plagiarism Corpus 2011 (PAN-PC-11).
Dataset, June, 2011

Cross-Lingual Adaptation Using Structural Correspondence Learning.
ACM Trans. Intell. Syst. Technol., 2011

Fourth international workshop on uncovering plagiarism, authorship, and social software misuse.
SIGIR Forum, 2011

Intrinsic plagiarism analysis.
Lang. Resour. Evaluation, 2011

Cross-language plagiarism detection.
Lang. Resour. Evaluation, 2011

Challenges in Document Mining (Dagstuhl Seminar 11171).
Dagstuhl Reports, 2011

Query segmentation revisited.
Proceedings of the 20th International Conference on World Wide Web, 2011

Towards automatic quality assurance in Wikipedia.
Proceedings of the 20th International Conference on World Wide Web, 2011

Webis at the TREC 2011 Session Track.
Proceedings of The Twentieth Text REtrieval Conference, 2011

Candidate Document Retrieval for Web-Scale Text Reuse Detection.
Proceedings of the String Processing and Information Retrieval, 2011

Applying the User-over-Ranking Hypothesis to Query Formulation.
Proceedings of the Advances in Information Retrieval Theory, 2011

Introducing the User-over-Ranking Hypothesis.
Proceedings of the Advances in Information Retrieval, 2011

Classifying with Co-stems - A New Representation for Information Filtering.
Proceedings of the Advances in Information Retrieval, 2011

Robust Models in Information Retrieval.
Proceedings of the 2011 Database and Expert Systems Applications, 2011

Overview of the 3rd International Competition on Plagiarism Detection.
Proceedings of the CLEF 2011 Labs and Workshop, 2011

Constructing efficient information extraction pipelines.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Beyond precision@10: clustering the long tail of web search results.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Query session detection as a cascade.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Insights into explicit semantic analysis.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Detection of text quality flaws as a one-class classification problem.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Simulation Data Mining for Supporting Bridge Design.
Proceedings of the Ninth Australasian Data Mining Conference, AusDM 2011, Ballarat, 2011

The NETSPEAK WORDGRAPH: Visualizing keywords in context.
Proceedings of the IEEE Pacific Visualization Symposium, 2011

Web Genre Analysis: Use Cases, Retrieval Models, and Implementation Issues.
Proceedings of the Genres on the Web, 2011

Teaching IR: Curricular Considerations.
Proceedings of the Teaching and Learning in Information Retrieval, 2011

2010
Webis-Revenue-10.
Dataset, August, 2010

PAN Wikipedia Vandalism Corpus 2010 (PAN-WVC-10).
Dataset, July, 2010

Webis Query Segmentation Corpus 2010 (Webis-QSeC-10).
Dataset, July, 2010

Webis Open Directory Project Corpus 2010 (Webis-ODP-10).
Dataset, June, 2010

PAN Plagiarism Corpus 2010 (PAN-PC-10).
Dataset, May, 2010

Editorial.
Datenbank-Spektrum, 2010

Towards comment-based cross-media retrieval.
Proceedings of the 19th International Conference on World Wide Web, 2010

Identifying featured articles in wikipedia: writing style matters.
Proceedings of the 19th International Conference on World Wide Web, 2010

Making the Most of a Web Search Session.
Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence, 2010

Webis at the TREC 2010 Sessions Track.
Proceedings of The Nineteenth Text REtrieval Conference, 2010

The power of naive query segmentation.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

Evaluating Humour Features on Web Comments.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Corpus and Evaluation Measures for Automatic Plagiarism Detection.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Capacity-Constrained Query Formulation.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2010

Retrieving Customary Web Language to Assist Writers.
Proceedings of the Advances in Information Retrieval, 2010

Netspeak - Assisting Writers in Choosing Words.
Proceedings of the Advances in Information Retrieval, 2010

Cross-Language High Similarity Search: Why No Sub-linear Time Bound Can Be Expected.
Proceedings of the Advances in Information Retrieval, 2010

Search Strategies for Keyword-based Queries.
Proceedings of the Database and Expert Systems Applications, 2010

Efficient Statement Identification for Automatic Market Forecasting.
Proceedings of the COLING 2010, 2010

An Evaluation Framework for Plagiarism Detection.
Proceedings of the COLING 2010, 2010

Overview of the 1st International Competition on Wikipedia Vandalism Detection.
Proceedings of the CLEF 2010 LABs and Workshops, 2010

Overview of the 2nd International Competition on Plagiarism Detection.
Proceedings of the CLEF 2010 LABs and Workshops, 2010

Cross-Language Text Classification Using Structural Correspondence Learning.
Proceedings of the ACL 2010, 2010

2009
PAN Plagiarism Corpus 2009 (PAN-PC-09).
Dataset, September, 2009

Information Retrieval: Concepts and Practical Considerations for Teaching a Rising Topic.
Datenbank-Spektrum, 2009

The ESA retrieval model revisited.
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009

Using Models for Dynamic System Diagnosis: A Case Study in Automotive Engineering.
Proceedings of the Dagstuhl-Workshop MBEES: Modellbasierte Entwicklung eingebetteter Systeme V, 2009

Collection-Relative Representations: A Unifying View to Retrieval Models.
Proceedings of the Database and Expert Systems Applications, 2009

Evaluating Cross-Language Explicit Semantic Analysis and Cross Querying at TEL@CLEF 2009.
Proceedings of the Working Notes for CLEF 2009 Workshop co-located with the 13th European Conference on Digital Libraries (ECDL 2009), Corfù, Greece, September 30, 2009

Evaluating Cross-Language Explicit Semantic Analysis and Cross Querying.
Proceedings of the Multilingual Information Access Evaluation I. Text Retrieval Experiments, 2009

2008
Model Construction for Knowledge-Intensive Engineering Tasks.
Proceedings of the Advances of Computational Intelligence in Industrial Systems, 2008

Coping with large design spaces: design problem solving in fluidic engineering.
Int. J. Softw. Tools Technol. Transf., 2008

Retrieval Models for Genre Classification.
Scand. J. Inf. Syst., 2008

Retrieval-Technologien für die Plagiaterkennung in Programmen.
Proceedings of the LWA 2008, 2008

Automatic Vandalism Detection in Wikipedia.
Proceedings of the Advances in Information Retrieval, 2008

A Wikipedia-Based Multilingual Retrieval Model.
Proceedings of the Advances in Information Retrieval, 2008

Meta Analysis within Authorship Verification.
Proceedings of the 19th International Workshop on Database and Expert Systems Applications (DEXA 2008), 2008

2007
Webis Wikipedia Vandalism Corpus (Webis-WVC-07).
Dataset, January, 2007

Plagiarism analysis, authorship identification, and near-duplicate detection PAN'07.
SIGIR Forum, 2007

Topic Identification.
Künstliche Intell., 2007

Strategies for retrieving plagiarized documents.
Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2007), 2007

Intrinsic Plagiarism Analysis with Meta Learning.
Proceedings of the SIGIR 2007 International Workshop on Plagiarism Analysis, 2007

Principles of hash-based text retrieval.
Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2007), 2007

New Issues in Near-duplicate Detection.
Proceedings of the Data Analysis, Machine Learning and Applications, 2007

An MDA Approach to Implement Personal IR Tools.
Proceedings of the 18th International Workshop on Database and Expert Systems Applications (DEXA 2007), 2007

2006
Realization of Web-based simulation services.
Comput. Ind., 2006

Hashing-basierte Indizierung: Anwendungsszenarien, Theorie und Methoden.
Proceedings of the LWA 2006: Lernen - Wissensentdeckung - Adaptivität, Hildesheim, Deutschland, October 9th-11th 2006, joint workshop event of several interest groups of the German Society for Informatics (GI) - 14th Workshop on Adaptivity and User Modeling in Interactive Systems (ABIS 2006) - Workshop Information Retrieval 2006 of the Special Interest Group Information Retrieval (FGIR 2006) - Workshop on Knowledge and Experience Management (FGWM 2006), 2006

Service-orientierte Architekturen für Information Retrieval.
Proceedings of the LWA 2006: Lernen - Wissensentdeckung - Adaptivität, Hildesheim, Deutschland, October 9th-11th 2006, joint workshop event of several interest groups of the German Society for Informatics (GI) - 14th Workshop on Adaptivity and User Modeling in Interactive Systems (ABIS 2006) - Workshop Information Retrieval 2006 of the Special Interest Group Information Retrieval (FGIR 2006) - Workshop on Knowledge and Experience Management (FGWM 2006), 2006

Putting Successor Variety Stemming to Work.
Proceedings of the Advances in Data Analysis, 2006

Plagiarism Detection Without Reference Collections.
Proceedings of the Advances in Data Analysis, 2006

Intrinsic Plagiarism Detection.
Proceedings of the Advances in Information Retrieval, 2006

Is Web Genre Identification Feasible?
Proceedings of the ECAI 2006, 17th European Conference on Artificial Intelligence, August 29, 2006

Phonetic Spelling and Heuristic Search.
Proceedings of the ECAI 2006, 17th European Conference on Artificial Intelligence, August 29, 2006

AI and Music: Toward a Taxonomy of Problem Classes.
Proceedings of the ECAI 2006, 17th European Conference on Artificial Intelligence, August 29, 2006

Speeding Up Model-based Diagnosis by a Heuristic Approach to Solving SAT.
Proceedings of the IASTED International Conference on Artificial Intelligence and Applications, 2006

2005
Near Similarity Search and Plagiarism Analysis.
Proceedings of the From Data and Information Analysis to Knowledge Engineering, 2005

2004
Genre-KI-04.
Dataset, September, 2004

Automatische Kategorisierung für Web-basierte Suche - Einführung, Techniken und Projekte.
Künstliche Intell., 2004

Genre Classification of Web Pages.
Proceedings of the KI 2004: Advances in Artificial Intelligence, 2004

Engineers Don't Search.
Proceedings of the Logic versus Approximation, 2004

2003
Automatic Document Categorization: Interpreting the Performance of Clustering Algorithms.
Proceedings of the KI 2003: Advances in Artificial Intelligence, 2003

2002
Design Problem Solving by Functional Abstraction.
Proceedings of the Workshop Planen und Konfigurieren (PuK-2002), 2002

Analysis of Clustering Algorithms for Web-Based Search.
Proceedings of the Practical Aspects of Knowledge Management, 4th International Conference, 2002

2001
Modeling Design Knowledge on Structure.
Proceedings of the Modellierung 2001, 2001

Generation of Similarity Measures from Different Sources.
Proceedings of the Engineering of Intelligent Systems, 2001

Visualization of traffic structures.
Proceedings of the IEEE International Conference on Communications, 2001

2000
Hybrid Constraints in Automated Model Synthesis and Model Processing.
Electron. Notes Discret. Math., 2000

On automated design in chemical engineering.
Proceedings of the Fourth International Conference on Knowledge-Based Intelligent Information Engineering Systems & Allied Technologies, 2000

A Meta Heuristic for Graph Drawing.
Proceedings of the working conference on Advanced visual interfaces, 2000

1999
Generating Heuristics to Control Configuration Processes.
Appl. Intell., 1999

On the Nature of Structure and Its Identification.
Proceedings of the Graph-Theoretic Concepts in Computer Science, 1999

On Adaptation in Case-based Design.
Proceedings of the Third ICSC Symposia on Intelligent Industrial Automation (IIA'99) and Soft Computing (SOCO'99), 1999

1998
Selection of Numerical Methods in Specific Simulation Applications.
Proceedings of the Tasks and Methods in Applied Artificial Intelligence, 1998

1995
Entwurfsunterstützung in der Hydraulik mit dem System art deco.
Künstliche Intell., 1995

Functional models in configuration systems.
PhD thesis, 1995

1992
A Theoretical Framework for Configuration.
Proceedings of the Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, 1992

1991
Problemklassen in Expertensystemen: Einordnung und Diskussion des GWAI'90-Workshops.
Künstliche Intell., 1991

