Steven Euijong Whang

IEEE Data Eng. Bull., 2024

PFGuard: A Generative Framework with Privacy and Fairness Safeguards.

CoRR, 2024

Fair Class-Incremental Learning using Sample Weighting.

Jaeyoung Park

Minsu Kim

CoRR, 2024

ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models.

CoRR, 2024

RC-Mixup: A Data Augmentation Strategy against Noisy Data for Regression Tasks.

Seonghyeon Hwang

Minsu Kim

Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Quilt: Robust Data Segment Selection against Concept Drifts.

Minsu Kim

Seonghyeon Hwang

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Data collection and quality challenges in deep learning: a data-centric AI perspective.

VLDB J., July, 2023

Dr-Fairness: Dynamic Data Ratio Adjustment for Fair Training on Real and Generated Data.

Trans. Mach. Learn. Res., 2023

Front Matter.

Proc. VLDB Endow., 2023

iFlipper: Label Flipping for Individual Fairness.

Proc. ACM Manag. Data, 2023

Personalized DP-SGD using Sampling Mechanisms.

Junseok Seo

CoRR, 2023

Improving Fair Training under Correlation Shifts.

Proceedings of the International Conference on Machine Learning, 2023

XClusters: Explainability-First Clustering.

Hyunseung Hwang

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Redactor: A Data-Centric and Individualized Defense against Inference Attacks.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Redactor: Targeted Disinformation Generation using Probabilistic Decision Boundaries.

CoRR, 2022

2021

A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective.

Yuji Roh

IEEE Trans. Knowl. Data Eng., 2021

Responsible AI Challenges in End-to-end Machine Learning.

IEEE Data Eng. Bull., 2021

MixRL: Data Mixing Augmentation for Regression using Reinforcement Learning.

Seonghyeon Hwang

CoRR, 2021

Slice Tuner: A Selective Data Acquisition Framework for Accurate and Fair Machine Learning Models.

Ki Hyun Tae

Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Sample Selection for Fair and Robust Training.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Machine Learning Robustness, Fairness, and their Convergence.

Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

FairBatch: Batch Selection for Model Fairness.

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Automated Data Slicing for Model Validation: A Big Data - AI Integration Approach.

IEEE Trans. Knowl. Data Eng., 2020

Data Collection and Quality Challenges for Deep Learning.

Jae-Gil Lee

Proc. VLDB Endow., 2020

Inspector Gadget: A Data Programming-based Labeling System for Industrial Images.

Proc. VLDB Endow., 2020

Slice Tuner: A Selective Data Collection Framework for Accurate and Fair Machine Learning Models.

Ki Hyun Tae

CoRR, 2020

FR-Train: A Mutual Information-Based Approach to Fair and Robust Training.

Proceedings of the 37th International Conference on Machine Learning, 2020

Open-World COVID-19 Data Visualization [Extended Abstract].

Hyunseung Hwang

Proceedings of the Heterogeneous Data Management, Polystores, and Analytics for Healthcare, 2020

2019

Data Cleaning for Accurate, Fair, and Robust Models: A Big Data - AI Integration Approach.

Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning, 2019

Data Validation for Machine Learning.

Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

Slice Finder: Automated Data Slicing for Model Validation.

Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

2018

Data Lifecycle Challenges in Production Machine Learning: A Survey.

SIGMOD Rec., 2018

Slice Finder: Automated Data Slicing for Model Validation.

CoRR, 2018

2017

Data Management Challenges in Production Machine Learning.

Proceedings of the 2017 ACM International Conference on Management of Data, 2017

TFX: A TensorFlow-Based Production-Scale Machine Learning Platform.

Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

2016

Managing Google's data lake: an overview of the Goods system.

IEEE Data Eng. Bull., 2016

Discovering Structure in the Universe of Attribute Names.

Proceedings of the 25th International Conference on World Wide Web, 2016

Goods: Organizing Google's Datasets.

Proceedings of the 2016 International Conference on Management of Data, 2016

LONLIES: Estimating Property Values for Long Tail Entities.

Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

2015

Discovering Subsumption Relationships for Web-Based Ontologies.

Dana Movshovitz-Attias

Natalya Fridman Noy

Alon Y. Halevy

Proceedings of the 18th International Workshop on Web and Databases, 2015

2014

Incremental entity resolution on rules and data.

VLDB J., 2014

Biperpedia: An Ontology for Search Applications.

Proc. VLDB Endow., 2014

ReNoun: Fact Extraction for Nominal Attributes.

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

2013

Joint entity resolution on multiple datasets.

VLDB J., 2013

Pay-As-You-Go Entity Resolution.

David Marmaros

IEEE Trans. Knowl. Data Eng., 2013

Question Selection for Crowd Entity Resolution.

Peter Lofgren

Proc. VLDB Endow., 2013

Disinformation techniques for entity resolution.

Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

2012

Data analytics: integration and privacy.

PhD thesis, 2012

A Model for Quantifying Information Leakage.

Proceedings of the Secure Data Management - 9th VLDB Workshop, 2012

Joint Entity Resolution.

Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

2011

Developments in Generic Entity Resolution.

IEEE Data Eng. Bull., 2011

Managing Information Leakage.

Proceedings of the Fifth Biennial Conference on Innovative Data Systems Research, 2011

2010

Entity Resolution with Evolving Rules.

Proc. VLDB Endow., 2010

Evaluating Entity Resolution Results.

David Menestrina

Proc. VLDB Endow., 2010

2009

Generic entity resolution with negative rules.

Omar Benjelloun

VLDB J., 2009

Swoosh: a generic approach to entity resolution.

VLDB J., 2009

Indexing Boolean Expressions.
