Renée J. Miller

Orcid: 0000-0002-1484-4787

Affiliations:
  • Northeastern University, Boston, MA, USA
  • University of Toronto, Canada (former)


According to our database, Renée J. Miller authored at least 163 papers between 1993 and 2025.

Awards

ACM Fellow

ACM Fellow 2009, "For innovations in metadata management, especially the creation of tools to integrate, transform, query and analyze information."

Bibliography

2025
Fantastic Tables and Where to Find Them: Table Search in Semantic Data Lakes.
Proceedings of the 28th International Conference on Extending Database Technology, 2025

2024
Model Lakes.
CoRR, 2024

ALT-GEN: Benchmarking Table Union Search using Large Language Models.
Proceedings of Workshops at the 50th International Conference on Very Large Data Bases, 2024

Finding Support for Tabular LLM Outputs.
Proceedings of Workshops at the 50th International Conference on Very Large Data Bases, 2024

A Large Scale Test Corpus for Semantic Table Search.
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Comparing Incomplete Database Instances.
Proceedings of the 32nd Symposium of Advanced Database Systems, 2024

Gen-T: Table Reclamation in Data Lakes.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Similarity Measures For Incomplete Database Instances.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

2023
DomainNet: Homograph Detection and Understanding in Data Lake Disambiguation.
ACM Trans. Database Syst., September, 2023

Data Lake Organization.
IEEE Trans. Knowl. Data Eng., 2023

Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V.
Proc. VLDB Endow., 2023

Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning.
Proc. VLDB Endow., 2023

SANTOS: Relationship-based Semantic Table Union Search.
Proc. ACM Manag. Data, 2023

Blend: A Unified Data Discovery System.
CoRR, 2023

Generative Benchmark Creation for Table Union Search.
CoRR, 2023

Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V (Technical Report).
CoRR, 2023

DIALITE: Discover, Align and Integrate Open Data Tables.
Proceedings of the Companion of the 2023 International Conference on Management of Data, 2023

Table Discovery in Data Lakes: State-of-the-art and Future Directions.
Proceedings of the Companion of the 2023 International Conference on Management of Data, 2023

2022
Integrating Data Lake Tables.
Proc. VLDB Endow., 2022

2021
RONIN: Data Lake Exploration.
Proc. VLDB Endow., 2021

Towards Knowledge Exchange: State-of-the-Art and Open Problems.
Proceedings of the SOFSEM 2021: Theory and Practice of Computer Science, 2021

DomainNet: Homograph Detection for Data Lake Disambiguation.
Proceedings of the 24th International Conference on Extending Database Technology, 2021

2020
Pytheas: Pattern-based Table Discovery in CSV Files.
Proc. VLDB Endow., 2020

Knowledge Translation.
Proc. VLDB Endow., 2020

Knowledge Translation: Extended Technical Report.
CoRR, 2020

Organizing Data Lakes for Navigation.
Proceedings of the 2020 International Conference on Management of Data, 2020

2019
Thematic issue on data management for graphs.
VLDB J., 2019

A Collective, Probabilistic Approach to Schema Mapping Using Diverse Noisy Evidence.
IEEE Trans. Knowl. Data Eng., 2019

Data Lake Management: Challenges and Opportunities.
Proc. VLDB Endow., 2019

VISE: Vehicle Image Search Engine with Traffic Camera.
Proc. VLDB Endow., 2019

JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes.
Proceedings of the 2019 International Conference on Management of Data, 2019

Towards a Benchmark for Knowledge Base Exchange.
Proceedings of the 1st International Workshop on Challenges and Experiences from Data Integration to Knowledge Graphs co-located with the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD 2019), 2019

2018
Schema Mapping.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Table Union Search on Open Data.
Proc. VLDB Endow., 2018

Open Data Integration.
Proc. VLDB Endow., 2018

Making Open Data Transparent: Data Discovery on Open Data.
IEEE Data Eng. Bull., 2018

Optimizing Organizations for Navigating Data Lakes.
CoRR, 2018

Let's Make It Dirty with BART!
Proceedings of the 26th Italian Symposium on Advanced Database Systems, 2018

2017
Data Quality: The Role of Empiricism.
SIGMOD Rec., 2017

Interactive Navigation of Open Data Linkages.
Proc. VLDB Endow., 2017

A Collective, Probabilistic Approach to Schema Mapping: Appendix.
CoRR, 2017

The Future of Data Integration.
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

VIQS: Visual Interactive Exploration of Query Semantics.
Proceedings of the 2017 ACM Workshop on Exploratory Search and Interactive Data Analytics, 2017

A Collective, Probabilistic Approach to Schema Mapping.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

DeepSea: Progressive Workload-Aware Partitioning of Materialized Views in Scalable Data Analytics.
Proceedings of the 20th International Conference on Extending Database Technology, 2017

Second annual workshop on data driven knowledge mobilization.
Proceedings of the 27th Annual International Conference on Computer Science and Software Engineering, 2017

2016
Data Driven Discovery of Attribute Dictionaries.
Trans. Comput. Collect. Intell., 2016

LSH Ensemble: Internet-Scale Domain Search.
Proc. VLDB Endow., 2016

Benchmarking Data Curation Systems.
IEEE Data Eng. Bull., 2016

BART in Action: Error Generation and Empirical Evaluations of Data-Cleaning Systems.
Proceedings of the 2016 International Conference on Management of Data, 2016

Data-driven knowledge mobilization.
Proceedings of the 26th Annual International Conference on Computer Science and Software Engineering, 2016

2015
Combining Quantitative and Logical Data Cleaning.
Proc. VLDB Endow., 2015

Messing Up with BART: Error Generation for Evaluating Data-Cleaning Algorithms.
Proc. VLDB Endow., 2015

The iBench Integration Metadata Generator.
Proc. VLDB Endow., 2015

Gain Control over your Integration Evaluations.
Proc. VLDB Endow., 2015

VizCurator: A Visual Tool for Curating Open Data.
Proceedings of the 24th International Conference on World Wide Web Companion, 2015

LinkedCT Live: Platform for Online Curation of Clinical Trials Data.
Proceedings of the ISWC 2015 Posters & Demonstrations Track co-located with the 14th International Semantic Web Conference (ISWC-2015), 2015

Automatic Curation of Clinical Trials Data in LinkedCT.
Proceedings of the Semantic Web - ISWC 2015, 2015

LabBook: Metadata-driven social collaborative data analysis.
Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29, 2015

2014
Continuous data cleaning.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014

VoidWiz: Resolving incompleteness using network effects.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014

Big Data Curation.
Proceedings of the 20th International Conference on Management of Data, 2014

2013
Perspectives on Business Intelligence.
Synthesis Lectures on Data Management, Morgan & Claypool Publishers, ISBN: 978-3-031-01848-0, 2013

Modeling the execution semantics of stream processing engines with SECRET.
VLDB J., 2013

Publishing bibliographic data on the Semantic Web using BibBase.
Semantic Web, 2013

Discovering Linkage Points over Web Data.
Proc. VLDB Endow., 2013

Provenance for Data Mining.
Proceedings of the 5th Workshop on the Theory and Practice of Provenance, 2013

Value invention in data exchange.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

Using SQL for Efficient Generation and Querying of Provenance Information.
Proceedings of the In Search of Elegance in the Theory and Practice of Computation, 2013

2012
Automated dictionary discovery for the online marketplace.
Proceedings of the iConference 2012, Toronto, Ontario, Canada, February 7-10, 2012

AutoDict: Automated Dictionary Discovery.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

The Vivification Problem in Real-Time Business Intelligence: A Vision.
Proceedings of the Enabling Real-Time Business Intelligence - 6th International Workshop, 2012

Semantic Link Discovery over Relational Data.
Proceedings of the Semantic Search over the Web, 2012

2011
Debugging Data Exchange with Vagabond.
Proc. VLDB Endow., 2011

Linking Semistructured Data on the Web.
Proceedings of the 14th International Workshop on the Web and Databases 2011, 2011

Reexamining Some Holy Grails of Data Provenance.
Proceedings of the 3rd Workshop on the Theory and Practice of Provenance, 2011

Active repair of data quality rules.
Proceedings of the 16th International Conference on Information Quality, 2011

A unified model for data and constraint repair.
Proceedings of the 27th International Conference on Data Engineering, 2011

NSERC business intelligence network: selected topics.
Proceedings of the Center for Advanced Studies on Collaborative Research, 2011

2010
Exploring XML web collections with DescribeX.
ACM Trans. Web, 2010

Just-in-time Data Integration in Action.
Proc. VLDB Endow., 2010

TRAMP: Understanding the Behavior of Schema Mappings through Provenance.
Proc. VLDB Endow., 2010

SECRET: A Model for Analysis of the Execution Semantics of Stream Processing Systems.
Proc. VLDB Endow., 2010

Publishing Bibliographic Data on the Semantic Web using BibBase.
Proceedings of the ISWC 2010 Posters & Demonstrations Track: Collected Abstracts, 2010

Composing local-as-view mappings: closure and applications.
Proceedings of the Database Theory, 2010

A first step towards integration independence.
Proceedings of the Workshops Proceedings of the 26th International Conference on Data Engineering, 2010

BibBase triplified.
Proceedings of the 6th International Conference on Semantic Systems, 2010

Stream schema: providing and exploiting static metadata for data stream processing.
Proceedings of the EDBT 2010, 2010

Online annotation of text streams with structured entities.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

Information Integration: a Vision for Integration Independence and Linking Open Data.
Proceedings of the 4th Alberto Mendelzon International Workshop on Foundations of Data Management, 2010

2009
Schema Mapping.
Proceedings of the Encyclopedia of Database Systems, 2009

Creating probabilistic databases from duplicated data.
VLDB J., 2009

Linkage Query Writer.
Proc. VLDB Endow., 2009

Framework for Evaluating Clustering Algorithms in Duplicate Detection.
Proc. VLDB Endow., 2009

LinkedCT: A Linked Data Space for Clinical Trials.
CoRR, 2009

Schema AND Data: A Holistic Approach to Mapping, Resolution and Fusion in Information Integration.
Proceedings of the Conceptual Modeling, 2009

A framework for semantic link discovery over relational data.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

YAM: a schema matcher factory.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

(Not) yet another matcher.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

Clio: Schema Mapping Creation and Data Exchange.
Proceedings of the Conceptual Modeling: Foundations and Applications, 2009

2008
Guest editorial: special issue on metadata management.
VLDB J., 2008

Discovering data quality rules.
Proc. VLDB Endow., 2008

Muse: a system for understanding and designing mappings.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Muse: Mapping Understanding and deSign by Example.
Proceedings of the 24th International Conference on Data Engineering, 2008

2007
First-order query rewriting for inconsistent databases.
J. Comput. Syst. Sci., 2007

Geographically-Sensitive Link Analysis.
Proceedings of the 2007 IEEE / WIC / ACM International Conference on Web Intelligence, 2007

Leveraging data and structure in ontology integration.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

Management of Inconsistent and Uncertain Data.
Proceedings of the Fifth International Workshop on Quality in Databases, 2007

Accuracy of Approximate String Joins Using Grams.
Proceedings of the Fifth International Workshop on Quality in Databases, 2007

Creating Nested Mappings with Clio.
Proceedings of the 23rd International Conference on Data Engineering, 2007

A Semantic Approach to Discovering Schema Mapping Expressions.
Proceedings of the 23rd International Conference on Data Engineering, 2007

Retrospective on Clio: Schema Mapping and Data Exchange in Practice.
Proceedings of the 2007 International Workshop on Description Logics (DL2007), 2007

2006
Peer data exchange.
ACM Trans. Database Syst., 2006

Nested Mappings: Schema Mapping Reloaded.
Proceedings of the 32nd International Conference on Very Large Data Bases, 2006

Clean Answers over Dirty Databases: A Probabilistic Approach.
Proceedings of the 22nd International Conference on Data Engineering, 2006

Authorization-Transparent Access Control for XML Under the Non-Truman Model.
Proceedings of the Advances in Database Technology, 2006

2005
Special issue: Best papers of VLDB 2004.
VLDB J., 2005

Data exchange: semantics and query answering.
Theor. Comput. Sci., 2005

In memoriam Alberto Oscar Mendelzon.
SIGMOD Rec., 2005

Data Sharing in the Hyperion Peer Database System.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

ConQuer: A System for Efficient Querying Over Inconsistent Databases.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

ConQuer: Efficient Management of Inconsistent Databases.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

Representing and Querying Data Transformations.
Proceedings of the 21st International Conference on Data Engineering, 2005

2004
Preserving mapping consistency under schema changes.
VLDB J., 2004

Kanata: Adaptation and Evolution in Data Sharing Systems.
SIGMOD Rec., 2004

Information-Theoretic Tools for Mining Database Structure from Large Data Sets.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2004

ToMAS: A System for Adapting Mappings while Schemas Evolve.
Proceedings of the 20th International Conference on Data Engineering, 2004

LIMBO: Scalable Clustering of Categorical Data.
Proceedings of the Advances in Database Technology, 2004

2003
Mining for empty spaces in large data sets.
Theor. Comput. Sci., 2003

The hyperion project: from data integration to data coordination.
SIGMOD Rec., 2003

Schema Discovery.
IEEE Data Eng. Bull., 2003

Mapping Adaptation under Evolving Schemas.
Proceedings of 29th International Conference on Very Large Data Bases, 2003

Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues.
Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, 2003

Towards Inconsistency Management in Data Integration Systems.
Proceedings of IJCAI-03 Workshop on Information Integration on the Web (IIWeb-03), 2003

Using Categorical Clustering in Schema Discovery.
Proceedings of IJCAI-03 Workshop on Information Integration on the Web (IIWeb-03), 2003

Managing Data Mappings in the Hyperion Project.
Proceedings of the 19th International Conference on Data Engineering, 2003

2002
Reminiscences on Influential Papers.
SIGMOD Rec., 2002

Letter from the Special Issue Editor.
IEEE Data Eng. Bull., 2002

Schema Management.
IEEE Data Eng. Bull., 2002

Translating Web Data.
Proceedings of 28th International Conference on Very Large Data Bases, 2002

Similarity Search Over Time-Series Data Using Wavelets.
Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26, 2002

Mapping XML and Relational Schemas with Clio.
Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26, 2002

2001
The Clio Project: Managing Heterogeneity.
SIGMOD Rec., 2001

Data-Driven Understanding and Refinement of Schema Mappings.
Proceedings of the 2001 ACM SIGMOD international conference on Management of data, 2001

Clio: A Semi-Automatic Tool For Schema Mapping.
Proceedings of the 2001 ACM SIGMOD international conference on Management of data, 2001

Reverse Engineering Meets Data Analysis.
Proceedings of the 9th International Workshop on Program Comprehension (IWPC 2001), 2001

Mining for Empty Rectangles in Large Data Sets.
Proceedings of the Database Theory, 2001

2000
Schema Mapping as Query Discovery.
Proceedings of the VLDB 2000, 2000

Approximate Query Answering in High-Dimensional Data Cubes.
Proceedings of the 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2000

1999
Mining for Program Structure.
Int. J. Softw. Eng. Knowl. Eng., 1999

Transforming Heterogeneous Data with Database Middleware: Beyond Integration.
IEEE Data Eng. Bull., 1999

1998
Querying multimedia presentations.
Comput. Commun., 1998

Using Schematically Heterogeneous Structures.
Proceedings of the SIGMOD 1998, 1998

1997
DataWeb: Customizable Database Publishing for the Web.
IEEE Multim., 1997

Association Rules over Interval Data.
Proceedings of the SIGMOD 1997, 1997

1996
Using Metadata to Address Problems of Semantic Interoperability in Large Object Systems.
Proceedings of the 1st IEEE Metadata Conference 1996, MD 1996, Silver Spring, 1996

1994
Schema equivalence in heterogeneous systems: bridging theory and practice.
Inf. Syst., 1994

Schema Equivalence in Heterogeneous Systems: Bridging Theory and Practice (Extended Abstract).
Proceedings of the Advances in Database Technology, 1994

1993
Desktop Experiment Management.
IEEE Data Eng. Bull., 1993

The Use of Information Capacity in Schema Integration and Translation.
Proceedings of the 19th International Conference on Very Large Data Bases, 1993

Understanding Schemas.
Proceedings of the RIDE-IMS '93, 1993
