Steven Euijong Whang

ORCID: 0000-0001-6419-931X

Affiliations:
  • Korea Advanced Institute of Science and Technology (KAIST), Korea
  • Stanford University, CA, USA (former)


According to our database, Steven Euijong Whang authored at least 62 papers between 2006 and 2024.

Bibliography

2024
Falcon: Fair Active Learning using Multi-armed Bandits.
Proc. VLDB Endow., January, 2024

Letter from the Special Issue Editor.
IEEE Data Eng. Bull., 2024

PFGuard: A Generative Framework with Privacy and Fairness Safeguards.
CoRR, 2024

Fair Class-Incremental Learning using Sample Weighting.
CoRR, 2024

ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models.
CoRR, 2024

RC-Mixup: A Data Augmentation Strategy against Noisy Data for Regression Tasks.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Quilt: Robust Data Segment Selection against Concept Drifts.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Data collection and quality challenges in deep learning: a data-centric AI perspective.
VLDB J., July, 2023

Dr-Fairness: Dynamic Data Ratio Adjustment for Fair Training on Real and Generated Data.
Trans. Mach. Learn. Res., 2023

Front Matter.
Proc. VLDB Endow., 2023

iFlipper: Label Flipping for Individual Fairness.
Proc. ACM Manag. Data, 2023

Personalized DP-SGD using Sampling Mechanisms.
CoRR, 2023

Improving Fair Training under Correlation Shifts.
Proceedings of the International Conference on Machine Learning, 2023

XClusters: Explainability-First Clustering.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Redactor: A Data-Centric and Individualized Defense against Inference Attacks.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Redactor: Targeted Disinformation Generation using Probabilistic Decision Boundaries.
CoRR, 2022

2021
A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective.
IEEE Trans. Knowl. Data Eng., 2021

Responsible AI Challenges in End-to-end Machine Learning.
IEEE Data Eng. Bull., 2021

MixRL: Data Mixing Augmentation for Regression using Reinforcement Learning.
CoRR, 2021

Slice Tuner: A Selective Data Acquisition Framework for Accurate and Fair Machine Learning Models.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Sample Selection for Fair and Robust Training.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Machine Learning Robustness, Fairness, and their Convergence.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

FairBatch: Batch Selection for Model Fairness.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
Automated Data Slicing for Model Validation: A Big Data - AI Integration Approach.
IEEE Trans. Knowl. Data Eng., 2020

Data Collection and Quality Challenges for Deep Learning.
Proc. VLDB Endow., 2020

Inspector Gadget: A Data Programming-based Labeling System for Industrial Images.
Proc. VLDB Endow., 2020

Slice Tuner: A Selective Data Collection Framework for Accurate and Fair Machine Learning Models.
CoRR, 2020

FR-Train: A Mutual Information-Based Approach to Fair and Robust Training.
Proceedings of the 37th International Conference on Machine Learning, 2020

Open-World COVID-19 Data Visualization [Extended Abstract].
Proceedings of the Heterogeneous Data Management, Polystores, and Analytics for Healthcare, 2020

2019
Data Cleaning for Accurate, Fair, and Robust Models: A Big Data - AI Integration Approach.
Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning, 2019

Data Validation for Machine Learning.
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019

Slice Finder: Automated Data Slicing for Model Validation.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

2018
Data Lifecycle Challenges in Production Machine Learning: A Survey.
SIGMOD Rec., 2018

Slice Finder: Automated Data Slicing for Model Validation.
CoRR, 2018

2017
Data Management Challenges in Production Machine Learning.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

TFX: A TensorFlow-Based Production-Scale Machine Learning Platform.
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

2016
Managing Google's data lake: an overview of the Goods system.
IEEE Data Eng. Bull., 2016

Discovering Structure in the Universe of Attribute Names.
Proceedings of the 25th International Conference on World Wide Web, 2016

Goods: Organizing Google's Datasets.
Proceedings of the 2016 International Conference on Management of Data, 2016

LONLIES: Estimating Property Values for Long Tail Entities.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

2015
Discovering Subsumption Relationships for Web-Based Ontologies.
Proceedings of the 18th International Workshop on Web and Databases, 2015

2014
Incremental entity resolution on rules and data.
VLDB J., 2014

Biperpedia: An Ontology for Search Applications.
Proc. VLDB Endow., 2014

ReNoun: Fact Extraction for Nominal Attributes.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

2013
Joint entity resolution on multiple datasets.
VLDB J., 2013

Pay-As-You-Go Entity Resolution.
IEEE Trans. Knowl. Data Eng., 2013

Question Selection for Crowd Entity Resolution.
Proc. VLDB Endow., 2013

Disinformation techniques for entity resolution.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

2012
Data analytics: integration and privacy.
PhD thesis, 2012

A Model for Quantifying Information Leakage.
Proceedings of the Secure Data Management - 9th VLDB Workshop, 2012

Joint Entity Resolution.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

2011
Developments in Generic Entity Resolution.
IEEE Data Eng. Bull., 2011

Managing Information Leakage.
Proceedings of the Fifth Biennial Conference on Innovative Data Systems Research, 2011

2010
Entity Resolution with Evolving Rules.
Proc. VLDB Endow., 2010

Evaluating Entity Resolution Results.
Proc. VLDB Endow., 2010

2009
Generic entity resolution with negative rules.
VLDB J., 2009

Swoosh: a generic approach to entity resolution.
VLDB J., 2009

Indexing Boolean Expressions.
Proc. VLDB Endow., 2009

Entity resolution with iterative blocking.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2009

QuickStart: An Upfront Client-Based Design Advisor for Parallel Data Warehouses.
Proceedings of the 25th International Conference on Data Engineering, 2009

2006
A Practitioner's Approach to Normalizing XQuery Expressions.
Proceedings of the Database Systems for Advanced Applications, 2006

