2025
Evaluating SQL Understanding in Large Language Models.
Proceedings of the 28th International Conference on Extending Database Technology, 2025
2024
A Data Generator to Explore the Interactions Between Concept Drifts and Anomalies [Demo].
Proceedings of Workshops at the 50th International Conference on Very Large Data Bases, 2024
2023
Data Anonymization With Diversity Constraints.
IEEE Trans. Knowl. Data Eng., April, 2023
Inconsistency Detection with Temporal Graph Functional Dependencies.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023
Will my Flight be on Time? Learning from Part Failures to Predict Future Reliability.
Proceedings of the IEEE International Conference on Big Data, 2023
2022
Contextual Data Cleaning with Ontology Functional Dependencies.
ACM J. Data Inf. Qual., 2022
Discovery of Keys for Graphs [Extended Version].
CoRR, 2022
Confidence Bounded Replica Currency Estimation.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022
Efficient Action Recognition Using Confidence Distillation.
Proceedings of the 26th International Conference on Pattern Recognition, 2022
Discovery of Keys for Graphs.
Proceedings of the Big Data Analytics and Knowledge Discovery, 2022
2021
Temporal Graph Functional Dependencies - Technical Report.
CoRR, 2021
Discovery and Contextual Data Cleaning with Ontology Functional Dependencies.
CoRR, 2021
Preserving Diversity in Anonymized Data.
Proceedings of the 24th International Conference on Extending Database Technology, 2021
Discovery of Temporal Graph Functional Dependencies.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
2020
Privacy-aware data cleaning-as-a-service.
Inf. Syst., 2020
Privacy-Aware Data Cleaning-as-a-Service (Extended Version).
CoRR, 2020
Diversifying Anonymized Data with Diversity Constraints.
CoRR, 2020
2019
Ontology-based Entity Matching in Attributed Graphs.
Proc. VLDB Endow., 2019
Restoring Consistency in Ontological Multidimensional Data Models via Weighted Repairs.
Knowledge-Based and Intelligent Information & Engineering Systems: Proceedings of the 23rd International Conference KES-2019, 2019
CurrentClean: Spatio-Temporal Cleaning of Stale Data.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019
CurrentClean: Interactive Change Exploration and Cleaning of Stale Data.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019
2018
InfoClean: Protecting Sensitive Information in Data Cleaning.
ACM J. Data Inf. Qual., 2018
Contextual Data Cleaning.
Proceedings of the 34th IEEE International Conference on Data Engineering Workshops, 2018
FastOFD: Contextual Data Cleaning with Ontology Functional Dependencies.
Proceedings of the 21st International Conference on Extending Database Technology, 2018
PACAS: Privacy-Aware, Data Cleaning-as-a-Service.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018
2017
Refining Duplicate Detection for Improved Data Quality.
Joint Proceedings of the 1st Workshop on Temporal Dynamics in Digital Libraries (TDDL 2017), 2017
Privacy aware web services in the cloud.
Proceedings of the 2017 IEEE Conference on Communications and Network Security, 2017
Efficient Discovery of Ontology Functional Dependencies.
Proceedings of the 2017 ACM Conference on Information and Knowledge Management, 2017
Quantifying duplication to improve data quality.
Proceedings of the 27th Annual International Conference on Computer Science and Software Engineering, 2017
2016
Data Driven Discovery of Attribute Dictionaries.
Trans. Comput. Collect. Intell., 2016
Unifying Data and Constraint Repairs.
ACM J. Data Inf. Qual., 2016
Efficient Discovery of Ontology Functional Dependencies.
CoRR, 2016
PARC: Privacy-Aware Data Cleaning.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016
2015
Combining Quantitative and Logical Data Cleaning.
Proc. VLDB Endow., 2015
Towards a Unified Framework for Data Cleaning and Data Privacy.
Proceedings of the Web Information Systems Engineering - WISE 2015, 2015
A Data Quality Framework for Customer Relationship Analytics.
Proceedings of the Web Information Systems Engineering - WISE 2015, 2015
2014
Repairing integrity rules for improved data quality.
Int. J. Inf. Qual., 2014
Models for Distributed, Large Scale Data Cleaning.
Proceedings of the Trends and Applications in Knowledge Discovery and Data Mining, 2014
Continuous data cleaning.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014
CONDOR: A System for CONstraint DiscOvery and Repair.
Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, 2014
2013
An Algebraic Approach Towards Data Cleaning.
Proceedings of the 4th International Conference on Emerging Ubiquitous Systems and Pervasive Networks (EUSPN-2013) and the 3rd International Conference on Current and Future Trends of Information and Communication Technologies in Healthcare (ICTH), 2013
2012
Data Quality Through Active Constraint Discovery and Maintenance.
PhD thesis, 2012
Automated dictionary discovery for the online marketplace.
Proceedings of the iConference 2012, Toronto, Ontario, Canada, February 7-10, 2012
AutoDict: Automated Dictionary Discovery.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012
2011
Active repair of data quality rules.
Proceedings of the 16th International Conference on Information Quality, 2011
A unified model for data and constraint repair.
Proceedings of the 27th International Conference on Data Engineering, 2011
2009
Framework for Evaluating Clustering Algorithms in Duplicate Detection.
Proc. VLDB Endow., 2009
2008
Discovering data quality rules.
Proc. VLDB Endow., 2008
An XML index advisor for DB2.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008
XML Index Recommendation with Tight Optimizer Coupling.
Proceedings of the 24th International Conference on Data Engineering, 2008
2007
Seeking Stable Clusters in the Blogosphere.
Proceedings of the 33rd International Conference on Very Large Data Bases, 2007