Felix Naumann

Orcid: 0000-0002-4483-1389

Affiliations:
  • Hasso Plattner Institute, Potsdam, Germany


According to our database, Felix Naumann authored at least 269 papers between 1998 and 2025.

Bibliography

2025
PRISMA: A Privacy-Preserving Schema Matcher using Functional Dependencies.
Proceedings of the 28th International Conference on Extending Database Technology, 2025

2024
AutoTSAD: Unsupervised Holistic Anomaly Detection for Time Series Data.
Proc. VLDB Endow., July, 2024

Determining the Largest Overlap between Tables.
Proc. ACM Manag. Data, February, 2024

Discovering Functional Dependencies through Hitting Set Enumeration.
Proc. ACM Manag. Data, February, 2024

Repairing Databases over Metric Spaces with Coincidence Constraints.
CoRR, 2024

Enabling Data Dependency-based Query Optimization.
CoRR, 2024

Data Quality Assessment: Challenges and Opportunities.
CoRR, 2024

Overlap-Based Duplicate Table Detection.
Proceedings of the 32nd Symposium of Advanced Database Systems, 2024

Discovering Denial Constraints in Dynamic Datasets.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Efficient Discovery of Temporal Inclusion Dependencies in Wikipedia Tables.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

TASHEEH: Repairing Row-Structure in Raw CSV Files.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

2023
Editorial: Special Issue for Selected Papers of VLDB 2021.
VLDB J., November, 2023

Correction to: Data dependencies for query optimization: a survey.
VLDB J., March, 2023

BrewER: Entity Resolution On-Demand.
Proc. VLDB Endow., 2023

Pollock: A Data Loading Benchmark.
Proc. VLDB Endow., 2023

Discovering Similarity Inclusion Dependencies.
Proc. ACM Manag. Data, 2023

Matching Roles from Temporal Data: Why Joe Biden is not only President, but also Commander-in-Chief.
Proc. ACM Manag. Data, 2023

Preface QDB.
Proceedings of the Joint Proceedings of Workshops at the 49th International Conference on Very Large Data Bases (VLDB 2023), Vancouver, Canada, August 28, 2023

BCNF* - From Normalized- to Star-Schemas and Back Again.
Proceedings of the Companion of the 2023 International Conference on Management of Data, 2023

Entity Resolution On-Demand for Querying Dirty Datasets.
Proceedings of the 31st Symposium of Advanced Database Systems, 2023

Detecting Stale Data in Wikipedia Infoboxes.
Proceedings of the 26th International Conference on Extending Database Technology, 2023

MORPHER: Structural Transformation of Ill-formed Rows.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

ExtracTable: Extracting Tables from Raw Data Files.
Proceedings of the Datenbanksysteme für Business, 2023

2022
Data dependencies for query optimization: a survey.
VLDB J., 2022

Diversity and Inclusion Activities in Database Conferences: A 2021 Report.
SIGMOD Rec., 2022

Entity Resolution On-Demand.
Proc. VLDB Endow., 2022

Fast Algorithms for Denial Constraint Discovery.
Proc. VLDB Endow., 2022

Frost: A Platform for Benchmarking and Exploring Data Matching Results.
Proc. VLDB Endow., 2022

AI Compliance - Challenges of Bridging Data Science and Law.
ACM J. Data Inf. Qual., 2022

Data Errors: Symptoms, Causes and Origins.
IEEE Data Eng. Bull., 2022

The Effects of Data Quality on ML-Model Performance.
CoRR, 2022

Mondrian: Spreadsheet Layout Detection.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Mining Change Rules.
Proceedings of the 25th International Conference on Extending Database Technology, 2022

SURAGH: Syntactic Pattern Matching to Identify Ill-Formed Records.
Proceedings of the 25th International Conference on Extending Database Technology, 2022

Aggregation Detection in CSV Files.
Proceedings of the 25th International Conference on Extending Database Technology, 2022

Exploring and Analyzing Change: The Janus Project.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

Workload-driven, Lazy Discovery of Data Dependencies for Query Optimization.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022

2021
Discovering Relaxed Functional Dependencies Based on Multi-Attribute Dominance.
IEEE Trans. Knowl. Data Eng., 2021

VLDB 2021: Designing a Hybrid Conference.
SIGMOD Rec., 2021

How Inclusive are We?
SIGMOD Rec., 2021

Detecting Layout Templates in Complex Multiregion Files.
Proc. VLDB Endow., 2021

Fast Detection of Denial Constraint Violations.
Proc. VLDB Endow., 2021

Front Matter.
Proc. VLDB Endow., 2021

Knowledge Transfer for Entity Resolution with Siamese Neural Networks.
ACM J. Data Inf. Qual., 2021

Ein Data Engineering Kurs für 10.000 Teilnehmer.
Datenbank-Spektrum, 2021

Frost: Benchmarking and Exploring Data Matching Results.
CoRR, 2021

Few-Shot Knowledge Validation using Rules.
Proceedings of the WWW '21: The Web Conference 2021, 2021

The Secret Life of Wikipedia Tables.
Proceedings of the 2nd Workshop on Search, 2021

Evaluation of Duplicate Detection Algorithms: From Quality Measures to Test Data Generation.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Relational Header Discovery using Similarity Search in a Table Corpus.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Discovering Relaxed Functional Dependencies based on Multi-attribute Dominance [Extended Abstract].
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Structured Object Matching across Web Page Revisions.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Structure Detection in Verbose CSV Files.
Proceedings of the 24th International Conference on Extending Database Technology, 2021

2020
RHEEMix in the data jungle: a cost-based optimizer for cross-platform systems.
VLDB J., 2020

Efficient Discovery of Matching Dependencies.
ACM Trans. Database Syst., 2020

Data Preparation: A Survey of Commercial Tools.
SIGMOD Rec., 2020

MDedup: Duplicate Detection with Matching Dependencies.
Proc. VLDB Endow., 2020

Hitting Set Enumeration with Partial Information for Unique Column Combination Discovery.
Proc. VLDB Endow., 2020

Holistic primary key and foreign key detection.
J. Intell. Inf. Syst., 2020

Data Preparation for Duplicate Detection.
ACM J. Data Inf. Qual., 2020

Transforming Pairwise Duplicates to Entity Clusters for High-quality Duplicate Detection.
ACM J. Data Inf. Qual., 2020

Explainable AI under contract and tort law: legal incentives and technical challenges.
Artif. Intell. Law, 2020

Natural Key Discovery in Wikipedia Tables.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020

Data Profiling in the Relational World (invited paper).
Proceedings of the Joint Proceedings of Workshops AI4LEGAL2020, 2020

Sense Tree: Discovery of New Word Senses with Graph-based Scoring.
Proceedings of the Conference "Lernen, 2020

Discovering Biased News Articles Leveraging Multiple Human Annotations.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Efficient Detection of Data Dependency Violations.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

2019
Discovery of Approximate (and Exact) Denial Constraints.
Proc. VLDB Endow., 2019

Editorial.
Datenbank-Spektrum, 2019

Exploring Change.
Proceedings of the 27th Italian Symposium on Advanced Database Systems, 2019

A Scoring-based Approach for Data Preparator Suggestion.
Proceedings of the Conference on "Lernen, Wissen, Daten, Analysen", Berlin, Germany, September 30, 2019

Optimizing Cross-Platform Data Movement.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

DynFD: Functional Dependency Discovery in Dynamic Datasets.
Proceedings of the Advances in Database Technology, 2019

Inclusion Dependency Discovery: An Experimental Evaluation of Thirteen Algorithms.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

DBChEx: Interactive Exploration of Data and Schema Change.
Proceedings of the 9th Biennial Conference on Innovative Data Systems Research, 2019

The relational database management systems genealogy.
Proceedings of the Making Databases Work: the Pragmatic Wisdom of Michael Stonebraker, 2019

2018
Data Profiling
Synthesis Lectures on Data Management, Morgan & Claypool Publishers, ISBN: 978-3-031-01865-7, 2018

Exploring Change - A New Dimension of Data Analytics.
Proc. VLDB Endow., 2018

Discovery of Genuine Functional Dependencies from Relational Data with Missing Values.
Proc. VLDB Endow., 2018

Efficient Discovery of Approximate Dependencies.
Proc. VLDB Endow., 2018

Experience: Enhancing Address Matching with Geocoding and Similarity Measure Selection.
ACM J. Data Inf. Qual., 2018

Data Change Exploration Using Time Series Clustering.
Datenbank-Spektrum, 2018

RHEEMix in the Data Jungle - A Cross-Platform Query Optimizer -.
CoRR, 2018

Where in the World Is Carmen Sandiego?: Detecting Person Locations via Social Media Discussions.
Proceedings of the 10th ACM Conference on Web Science, 2018

The Challenges of Creating, Maintaining and Exploring Graphs of Financial Entities.
Proceedings of the Fourth International Workshop on Data Science for Macro-Modeling with Financial and Economic Datasets, 2018

Towards Progressive Search-driven Entity Resolution.
Proceedings of the 26th Italian Symposium on Advanced Database Systems, 2018

Dissecting Company Names using Sequence Labeling.
Proceedings of the Conference "Lernen, Wissen, Daten, Analysen", 2018

Piggyback Profiling: Enhancing Query Results with Metadata.
Proceedings of the Conference "Lernen, Wissen, Daten, Analysen", 2018

CurEx: A System for Extracting, Curating, and Exploring Domain-Specific Knowledge Graphs from Text.
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018

2017
Detecting Inclusion Dependencies on Very Many Tables.
ACM Trans. Database Syst., 2017

Data Quality: The Role of Empiricism.
SIGMOD Rec., 2017

Cardinality Estimation: An Experimental Survey.
Proc. VLDB Endow., 2017

Efficient Denial Constraint Discovery with Hydra.
Proc. VLDB Endow., 2017

Das Fachgebiet "Informationssysteme" am Hasso-Plattner-Institut.
Datenbank-Spektrum, 2017

What was Hillary Clinton doing in Katy, Texas?
Proceedings of the 26th International Conference on World Wide Web Companion, 2017

Enabling Change Exploration: Vision Paper.
Proceedings of the ExploreDB'17, Chicago, IL, USA, May 19, 2017

Data Profiling: A Tutorial.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

Uncovering Business Relationships: Context-sensitive Relationship Extraction for Difficult Relationship Types.
Proceedings of the Lernen, 2017

Identifying Media Bias by Analyzing Reported Speech.
Proceedings of the 2017 IEEE International Conference on Data Mining, 2017

Data-driven Schema Normalization.
Proceedings of the 20th International Conference on Extending Database Technology, 2017

Improving Company Recognition from Unstructured Text by using Dictionaries.
Proceedings of the 20th International Conference on Extending Database Technology, 2017

Metacrate: Organize and Analyze Millions of Data Profiles.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

A Hybrid Approach for Efficient Unique Column Combination Discovery.
Proceedings of the Datenbanksysteme für Business, 2017

Fast Approximate Discovery of Inclusion Dependencies.
Proceedings of the Datenbanksysteme für Business, 2017

2016
CohEEL: Coherent and efficient named entity linking through random walks.
J. Web Semant., 2016

Efficient order dependency detection.
VLDB J., 2016

The Information Systems Group at HPI.
SIGMOD Rec., 2016

Data Anamnesis: Admitting Raw Data into an Organization.
IEEE Data Eng. Bull., 2016

Which Answer is Best?: Predicting Accepted Answers in MOOC Forums.
Proceedings of the 25th International Conference on World Wide Web, 2016

A Hybrid Approach to Functional Dependency Discovery.
Proceedings of the 2016 International Conference on Management of Data, 2016

RDFind: Scalable Conditional Inclusion Dependency Discovery in RDF Datasets.
Proceedings of the 2016 International Conference on Management of Data, 2016

Topic Shifts in StackOverflow: Ask it Like Socrates.
Proceedings of the Natural Language Processing and Information Systems, 2016

Cluster-Based Sorted Neighborhood for Efficient Duplicate Detection.
Proceedings of the IEEE International Conference on Data Mining Workshops, 2016

Data profiling.
Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016

Holistic Data Profiling: Simultaneous Discovery of Various Metadata.
Proceedings of the 19th International Conference on Extending Database Technology, 2016

Combination of Rule-based and Textual Similarity Approaches to Match Financial Entities.
Proceedings of the Second International Workshop on Data Science for Macro-Modeling, 2016

Approximate Discovery of Functional Dependencies for Large Datasets.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

2015
Profiling relational data: a survey.
VLDB J., 2015

Progressive Duplicate Detection.
IEEE Trans. Knowl. Data Eng., 2015

Divide & Conquer-based Inclusion Dependency Discovery.
Proc. VLDB Endow., 2015

Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms.
Proc. VLDB Endow., 2015

Data Profiling with Metanome.
Proc. VLDB Endow., 2015

Front Matter.
Proc. VLDB Endow., 2015

SOFA: An extensible logical optimizer for UDF-heavy data flows.
Inf. Syst., 2015

Who wants a computer to be a millionaire?
Inf. Process. Lett., 2015

Uniqueness, Density, and Keyness: Exploring Class Hierarchies.
Proceedings of the 6th International Workshop on Consuming Linked Data (COLD 2015) co-located with 14th International Semantic Web Conference (ISWC 2015), 2015

Exploring Linked Data Graph Structures.
Proceedings of the ISWC 2015 Posters & Demonstrations Track co-located with the 14th International Semantic Web Conference (ISWC-2015), 2015

A Serendipity Model for News Recommendation.
Proceedings of the KI 2015: Advances in Artificial Intelligence, 2015

Estimating Data Integration and Cleaning Effort.
Proceedings of the 18th International Conference on Extending Database Technology, 2015

Scaling Out the Discovery of Inclusion Dependencies.
Proceedings of the Datenbanksysteme für Business, 2015

2014
The Stratosphere platform for big data analytics.
VLDB J., 2014

Reach for gold: An annealing standard to evaluate duplicate detection results.
ACM J. Data Inf. Qual., 2014

Editorial.
ACM J. Data Inf. Qual., 2014

Ein Datenbankkurs mit 6000 Teilnehmern - Erfahrungen auf der openHPI MOOC Plattform.
Inform. Spektrum, 2014

Semi-Supervised Consensus Clustering: Reducing Human Effort.
Proceedings of the 2014 IEEE International Conference on Data Mining Workshops, 2014

Bootstrapping Wikipedia to answer ambiguous person name queries.
Proceedings of the Workshops Proceedings of the 30th International Conference on Data Engineering Workshops, 2014

Detecting unique column combinations on dynamic data.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014

Profiling and mining RDF data with ProLOD++.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014

LODOP - Multi-Query Optimization for Linked Data Profiling Queries.
Proceedings of the 1st International Workshop on Dataset PROFIling & fEderated Search for Linked Data co-located with the 11th Extended Semantic Web Conference, 2014

Amending RDF Entities with New Facts.
Proceedings of the 3rd Workshop on Knowledge Discovery and Data Mining Meets Linked Open Data co-located with 11th Extended Semantic Web Conference (ESWC 2014), 2014

BEL: Bagging for Entity Linking.
Proceedings of the COLING 2014, 2014

Estimating the Number and Sizes of Fuzzy-Duplicate Clusters.
Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, 2014

DFD: Efficient Functional Dependency Discovery.
Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, 2014

2013
Topic modeling for expert finding using latent Dirichlet allocation.
WIREs Data Mining Knowl. Discov., 2013

Data profiling revisited.
SIGMOD Rec., 2013

Scalable Discovery of Unique Column Combinations.
Proc. VLDB Endow., 2013

Fusion Cubes: Towards Self-Service Business Intelligence.
Int. J. Data Warehous. Min., 2013

Cross-lingual entity matching and infobox alignment in Wikipedia.
Inf. Syst., 2013

Cost-aware query planning for similarity search.
Inf. Syst., 2013

Improving RDF Data Through Association Rule Mining.
Datenbank-Spektrum, 2013

SOFA: An Extensible Logical Optimizer for UDF-heavy Dataflows.
CoRR, 2013

Bootstrapped Grouping of Results to Ambiguous Person Name Queries.
CoRR, 2013

Analyzing and predicting viral tweets.
Proceedings of the 22nd International World Wide Web Conference, 2013

Bulk sorted access for efficient top-k retrieval.
Proceedings of the Conference on Scientific and Statistical Database Management, 2013

On choosing thresholds for duplicate detection.
Proceedings of the 18th International Conference on Information Quality, 2013

Systematic ETL management - Experiences with high-level operators.
Proceedings of the 18th International Conference on Information Quality, 2013

Caching and Prefetching Strategies for SPARQL Queries.
Proceedings of the Semantic Web: ESWC 2013 Satellite Events, 2013

Detecting SPARQL Query Templates for Data Prefetching.
Proceedings of the Semantic Web: Semantics and Big Data, 10th International Conference, 2013

Synonym Analysis for Predicate Expansion.
Proceedings of the Semantic Web: Semantics and Big Data, 10th International Conference, 2013

Duplicate Detection on GPUs.
Proceedings of the Datenbanksysteme für Business, 2013

2012
Integrating open government data with stratosphere for more transparency.
J. Web Semant., 2012

Scalable Iterative Graph Duplicate Detection.
IEEE Trans. Knowl. Data Eng., 2012

The data analytics group at the qatar computing research institute.
SIGMOD Rec., 2012

Holistic and Scalable Ontology Alignment for Linked Open Data.
Proceedings of the WWW2012 Workshop on Linked Data on the Web, 2012

GovWILD: integrating open government data for transparency.
Proceedings of the 21st World Wide Web Conference, 2012

Efficient Similarity Search in Very Large String Sets.
Proceedings of the Scientific and Statistical Database Management, 2012

The Quality of Web Data.
Proceedings of the 17th International Conference on Information Quality, 2012

Adaptive Windows for Duplicate Detection.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

Scalable peer-to-peer-based RDF management.
Proceedings of the I-SEMANTICS 2012 - 8th International Conference on Semantic Systems, 2012

Schema Decryption for Large Extract-Transform-Load Systems.
Proceedings of the Conceptual Modeling, 2012

LINDA: distributed web-of-data-scale entity matching.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Latent topics in graph-structured data.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Discovering conditional inclusion dependencies.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Reconciling ontologies and the web of data.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

2011
Creating voiD descriptions for Web-scale data.
J. Web Semant., 2011

Eliminating NULLs with Subsumption and Complementation.
IEEE Data Eng. Bull., 2011

Projektseminar "Similarity Search Algorithms".
Datenbank-Spektrum, 2011

Kurz erklärt: Datenfusion.
Datenbank-Spektrum, 2011

Instance-Based 'One-to-Some' Assignment of Similarity Measures to Attributes - (Short Paper).
Proceedings of the On the Move to Meaningful Internet Systems: OTM 2011, 2011

A generalization of blocking and windowing algorithms for duplicate detection.
Proceedings of the 2011 International Conference on Data and Knowledge Engineering, 2011

Dr. Crowdsource: or how i learned to stop worrying and love web data.
Proceedings of the 2nd International Workshop on Business intelligencE and the WEB, 2011

SPRINT: ranking search results by paths.
Proceedings of the EDBT 2011, 2011

Extreme web data integration.
Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management, 2011

Black swan: augmenting statistics with event data.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Efficient similarity search: arbitrary similarity measures, arbitrary composition.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Frequency-aware similarity measures: why Arnold Schwarzenegger is always a duplicate.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Advancing the discovery of unique column combinations.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Improving Service Discovery through Enriched Service Descriptions.
Proceedings of the Datenbanksysteme für Business, 2011

2010
An Introduction to Duplicate Detection
Synthesis Lectures on Data Management, Morgan & Claypool Publishers, ISBN: 978-3-031-01835-0, 2010

13th international workshop on the web and databases: WebDB 2010.
SIGMOD Rec., 2010

Graph-based concept identification and disambiguation for enterprise search.
Proceedings of the 19th International Conference on World Wide Web, 2010

ECIR - A Lightweight Approach for Entity-Centric Information Retrieval.
Proceedings of The Nineteenth Text REtrieval Conference, 2010

Towards a diamond SOA operational model.
Proceedings of the IEEE International Conference on Service-Oriented Computing and Applications, 2010

Collecting, Annotating, and Classifying Public Web Services.
Proceedings of the On the Move to Meaningful Internet Systems: OTM 2010, 2010

Profiling linked open data with ProLOD.
Proceedings of the Workshops Proceedings of the 26th International Conference on Data Engineering, 2010

Complement union for data integration.
Proceedings of the Workshops Proceedings of the 26th International Conference on Data Engineering, 2010

Linking open government data: what journalists wish they had known.
Proceedings of the 6th International Conference on Semantic Systems, 2010

Towards Granular Data Placement Strategies for Cloud Platforms.
Proceedings of the 2010 IEEE International Conference on Granular Computing, 2010

Subsumption and complementation as data fusion operators.
Proceedings of the EDBT 2010, 2010

Dynamic tags for dynamic data web services.
Proceedings of the 5th Workshop on Emerging Web Services Technology, 2010

Extracting structured information from Wikipedia articles to populate infoboxes.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

2009
Data fusion - Resolving Data Conflicts for Integration.
Proc. VLDB Endow., 2009

Guest Editorial for the Special Issue on Data Quality in Databases.
ACM J. Data Inf. Qual., 2009

A Machine Learning Approach to Foreign Key Discovery.
Proceedings of the 12th International Workshop on the Web and Databases, 2009

METL: Managing and Integrating ETL Processes.
Proceedings of the VLDB 2009 PhD Workshop. Co-located with the 35th International Conference on Very Large Data Bases (VLDB 2009). Lyon, 2009

Encapsulating Multi-stepped Web Forms as Web Services.
Proceedings of the Service-Oriented Computing. ICSOC/ServiceWave 2009 Workshops, 2009

Information Quality.
Proceedings of the Database Technologies: Concepts, 2009

2008
Industry-scale duplicate detection.
Proc. VLDB Endow., 2008

A research agenda for query processing in large-scale peer data management systems.
Inf. Syst., 2008

Data fusion.
ACM Comput. Surv., 2008

Managing ETL Processes.
Proceedings of the International Workshop on New Trends in Information Integration, 2008

Scaling up duplicate detection in graph data.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

2007
Datenqualität.
Inform. Spektrum, 2007

Peer-Daten-Management-Systems - PDMS (Kurz erklärt).
Datenbank-Spektrum, 2007

FuSem - Exploring Different Semantics of Data Fusion.
Proceedings of the 33rd International Conference on Very Large Data Bases, 2007

Networked PIM Using PDMS.
Proceedings of the Third International Workshop on Networking Meets Databases, 2007

Rule-Based Measurement Of Data Quality In Nominal Data.
Proceedings of the 12th International Conference on Information Quality, 2007

Emergent Data Quality Annotation And Visualization.
Proceedings of the 12th International Conference on Information Quality, 2007

Efficiently Detecting Inclusion Dependencies.
Proceedings of the 23rd International Conference on Data Engineering, 2007

System P: Completeness-driven Query Answering in Peer Data Management Systems.
Proceedings of the Datenbanksysteme in Business, 2007

Schema- und Metadatenmanagement in Peer Data Management Systemen.
Proceedings of the Datenbanksysteme in Business, 2007

A Classification of Schema Mappings and Analysis of Mapping Tools.
Proceedings of the Datenbanksysteme in Business, 2007

Informationsintegration - Architekturen und Methoden zur Integration verteilter und heterogener Datenquellen.
dpunkt.verlag, 2007

2006
Data Fusion in Three Steps: Resolving Schema, Tuple, and Value Inconsistencies.
IEEE Data Eng. Bull., 2006

Detecting Duplicates in Complex XML Data.
Proceedings of the 22nd International Conference on Data Engineering, 2006

XStruct: Efficient Schema Extraction from Multiple and Large XML Documents.
Proceedings of the 22nd International Conference on Data Engineering Workshops, 2006

Efficiently Computing Inclusion Dependencies for Schema Discovery.
Proceedings of the 22nd International Conference on Data Engineering Workshops, 2006

XML Duplicate Detection Using Sorted Neighborhoods.
Proceedings of the Advances in Database Technology, 2006

Query Planning in the Presence of Overlapping Sources.
Proceedings of the Advances in Database Technology, 2006

Assessing the Completeness of Sensor Data.
Proceedings of the Database Systems for Advanced Applications, 2006

Informationsintegration: Architekturen und Methoden zur Integration verteilter und heterogener Datenquellen.
dpunkt, ISBN: 3-89864-400-6, 2006

2005
Ein Data-Quality-Wettbewerb.
Datenbank-Spektrum, 2005

A Data Model and Query Language to Explore Enhanced Links and Paths in Life Science Sources.
Proceedings of the Eight International Workshop on the Web & Databases (WebDB 2005), 2005

Automatic Data Fusion with HumMer.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

DogmatiX Tracks down Duplicates in XML.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

Clio: A Schema Mapping Tool for Information Integration.
Proceedings of the 8th International Symposium on Parallel Architectures, 2005

Schema Matching using Duplicates.
Proceedings of the 21st International Conference on Data Engineering, 2005

Benefit and Cost of Query Answering in PDMS.
Proceedings of the Databases, 2005

(Almost) Hands-Off Information Integration for the Life Sciences.
Proceedings of the Second Biennial Conference on Innovative Data Systems Research, 2005

Self-Extending Peer Data Management.
Proceedings of the Datenbanksysteme in Business, 2005

Declarative Data Fusion - Syntax, Semantics, and Implementation.
Proceedings of the Advances in Databases and Information Systems, 2005

2004
BioFast: Challenges in Exploring Linked Life Science Sources.
SIGMOD Rec., 2004

Completeness of integrated information sources.
Inf. Syst., 2004

Eine Übung zur Vorlesung Informationsintegration.
Datenbank-Spektrum, 2004

Detecting Duplicate Objects in XML Documents.
Proceedings of the IQIS 2004, 2004

Information Quality: How Good Are Off-The-Shelf DBMS?
Proceedings of the Ninth International Conference on Information Quality (ICIQ 2004), 2004

Qualitäts- und Semantik-gesteuerte Anfragebearbeitung für Peer-basierte Datenmanagementsysteme (PDMS).
Proceedings of the 34. Jahrestagung der Gesellschaft für Informatik, 2004

FUSE BY: Syntax und Semantik zur Informationsfusion in SQL.
Proceedings of the 34. Jahrestagung der Gesellschaft für Informatik, 2004

Links and Paths through Life Sciences Data Sources.
Proceedings of the Data Integration in the Life Sciences, First International Workshop, 2004

Labeling and Enhancing Life Sciences Links.
Proceedings of the 3rd International IEEE Computer Society Computational Systems Bioinformatics Conference, 2004

2003
Qualitätsgesteuerte Anfragebearbeitung für Integrierte Informationssysteme.
it Inf. Technol., 2003

Data Quality in Genome Databases.
Proceedings of the Eighth International Conference on Information Quality (ICIQ 2003), 2003

Exploring Life Sciences Data Sources.
Proceedings of IJCAI-03 Workshop on Information Integration on the Web (IIWeb-03), 2003

Super-Fast XML Wrapper Generation in DB2: A Demonstration.
Proceedings of the 19th International Conference on Data Engineering, 2003

Semantic Overlay Clusters within Super-Peer Networks.
Proceedings of the Databases, 2003

2002
Schema Management.
IEEE Data Eng. Bull., 2002

Declarative Data Merging with Conflict Resolution.
Proceedings of the Seventh International Conference on Information Quality (ICIQ 2002), 2002

Attribute Classification Using Feature Analysis.
Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26, 2002

Mapping XML and Relational Schemas with Clio.
Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26, 2002

Quality-Driven Query Answering for Integrated Information Systems
Lecture Notes in Computer Science 2261, Springer, ISBN: 3-540-43349-X, 2002

2001
From Databases to Information Systems - Information Quality Makes the Difference.
Proceedings of the Sixth Conference on Information Quality (IQ 2001), 2001

2000
Assessment Methods for Information Quality Criteria.
Proceedings of the Fifth Conference on Information Quality (IQ 2000), 2000

Qualitätsgesteuerte Anfragebearbeitung für Integrierte Informationssysteme.
Proceedings of the Ausgezeichnete Informatikdissertationen 2000, 2000

Query Planning with Information Quality Bounds.
Proceedings of the Flexible Query Answering Systems, 2000

Quality-driven Query Planning.
Proceedings of the 7th EDBT 2000 PhD Workshop, March 31 - April 1, 2000. Konstanz, Germany, 2000

1999
Quality-driven Integration of Heterogenous Information Systems.
Proceedings of the VLDB'99, 1999

Do Metadata Models meet IQ Requirements?
Proceedings of the Fourth Conference on Information Quality (IQ 1999), 1999

Density Scores for Cooperative Query Answering.
Proceedings of the 4. Workshop Föderierte Datenbanken, 1999

1998
Quality Driven Source Selection Using Data Envelope Analysis.
Proceedings of the Third Conference on Information Quality (IQ 1998), 1998

