Chengliang Chai

ORCID: 0009-0003-5386-1330

Affiliations:
  • Tsinghua University, China


According to our database, Chengliang Chai authored at least 68 papers between 2016 and 2024.

Bibliography

2024
LakeCompass: An End-to-End System for Table Maintenance, Search and Analysis in Data Lakes.
Proc. VLDB Endow., August, 2024

The Dawn of Natural Language to SQL: Are We Fully Ready? [Experiment, Analysis & Benchmark].
Proc. VLDB Endow., July, 2024

LakeBench: A Benchmark for Discovering Joinable and Unionable Tables in Data Lakes.
Proc. VLDB Endow., April, 2024

MisDetect: Iterative Mislabel Detection using Early Loss.
Proc. VLDB Endow., February, 2024

PACE: Poisoning Attacks on Learned Cardinality Estimation.
Proc. ACM Manag. Data, February, 2024

Cardinality estimation using normalizing flow.
VLDB J., 2024

Harnessing Diversity for Important Data Selection in Pretraining Large Language Models.
CoRR, 2024

The Dawn of Natural Language to SQL: Are We Fully Ready?
CoRR, 2024

IDE: A System for Iterative Mislabel Detection.
Proceedings of the Companion of the 2024 International Conference on Management of Data, 2024

Separation Is for Better Reunion: Data Lake Storage at Huawei.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

DMRNet: Effective Network for Accurate Discharge Medication Recommendation.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Representation Learning for Entity Alignment in Knowledge Graph: A Design Space Exploration.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Cost-Effective In-Context Learning for Entity Resolution: A Design Space Exploration.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Mitigating Data Scarcity in Supervised Machine Learning Through Reinforcement Learning Guided Data Generation.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

Applications and Challenges for Large Language Models: From Data Management Perspective.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

2023
An enhanced Elo-based student model for polychotomously scored items in adaptive educational system.
Interact. Learn. Environ., December, 2023

HOFD: An Outdated Fact Detector for Knowledge Bases.
IEEE Trans. Knowl. Data Eng., October, 2023

Data Management for Machine Learning: A Survey.
IEEE Trans. Knowl. Data Eng., May, 2023

Learned Data-aware Image Representations of Line Charts for Similarity Search.
Proc. ACM Manag. Data, 2023

HAIPipe: Combining Human-generated and Machine-generated Pipelines for Data Preparation.
Proc. ACM Manag. Data, 2023

GoodCore: Data-effective and Data-efficient Machine Learning through Coreset Selection over Incomplete Data.
Proc. ACM Manag. Data, 2023

Demystifying Artificial Intelligence for Data Preparation.
Proceedings of the Companion of the 2023 International Conference on Management of Data, 2023

Efficient Coreset Selection with Cluster-based Methods.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Database Meets Artificial Intelligence: A Survey (Extended Abstract).
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

AutoCE: An Accurate and Efficient Model Advisor for Learned Cardinality Estimation.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

A Topic-Aware Data Generation Framework for Math Word Problems.
Proceedings of the Database Systems for Advanced Applications, 2023

2022
RNE: computing shortest paths using road network embedding.
VLDB J., 2022

Interactively discovering and ranking desired tuples by data exploration.
VLDB J., 2022

Natural Language to Visualization by Neural Machine Translation.
IEEE Trans. Vis. Comput. Graph., 2022

Database Meets Artificial Intelligence: A Survey.
IEEE Trans. Knowl. Data Eng., 2022

Steerable Self-Driving Data Visualization.
IEEE Trans. Knowl. Data Eng., 2022

Cost-based or Learning-based? A Hybrid Query Optimizer for Query Plan Selection.
Proc. VLDB Endow., 2022

Coresets over Multiple Tables for Feature-rich and Data-efficient Machine Learning.
Proc. VLDB Endow., 2022

DADER: Hands-Off Entity Resolution with Domain Adaptation.
Proc. VLDB Endow., 2022

Selective Data Acquisition in the Wild for Model Charging.
Proc. VLDB Endow., 2022

Preface.
J. Comput. Sci. Technol., 2022

AlphaQO: Robust Learned Query Optimizer.
Int. J. Softw. Informatics, 2022

LearnedSQLGen: Constraint-aware SQL Generation using Reinforcement Learning.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Domain Adaptation for Deep Entity Resolution.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Synthesizing Privacy Preserving Entity Resolution Datasets.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Feature Augmentation with Reinforcement Learning.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

RW-Tree: A Learned Workload-aware Framework for R-tree Construction.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Learned Query Optimizer: At the Forefront of AI-Driven Databases.
Proceedings of the 25th International Conference on Extending Database Technology, 2022

2021
CrowdChart: Crowdsourced Data Extraction From Visualization Charts.
IEEE Trans. Knowl. Data Eng., 2021

A Learned Query Rewrite System using Monte Carlo Tree Search.
Proc. VLDB Endow., 2021

FACE: A Normalizing Flow based Cardinality Estimator.
Proc. VLDB Endow., 2021

Automatic Data Acquisition for Deep Learning.
Proc. VLDB Endow., 2021

A Tree-Based Indexing Approach for Diverse Textual Similarity Search.
IEEE Access, 2021

Synthesizing Natural Language to Visualization (NL2VIS) Benchmarks from NL2SQL Benchmarks.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Ranking Desired Tuples by Database Exploration.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

2020
VisClean: Interactive Cleaning for Progressive Visualization.
Proc. VLDB Endow., 2020

Human-in-the-loop Techniques in Machine Learning.
IEEE Data Eng. Bull., 2020

Interactively Discovering and Ranking Desired Tuples without Writing SQL Queries.
Proceedings of the 2020 International Conference on Management of Data, 2020

Human-in-the-loop Outlier Detection.
Proceedings of the 2020 International Conference on Management of Data, 2020

Reinforcement Learning with Tree-LSTM for Join Order Selection.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Interactive Cleaning for Progressive Visualization through Composite Questions.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Outdated Fact Detection in Knowledge Bases.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Crowdsourcing-based Data Extraction from Visualization Charts.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Manually Detecting Errors for Data Cleaning Using Adaptive Crowdsourcing Strategies.
Proceedings of the 23rd International Conference on Extending Database Technology, 2020

2019
AnalyticDB: Real-time OLAP Database System at Alibaba Cloud.
Proc. VLDB Endow., 2019

Towards Automatic Mathematical Exercise Solving.
Data Sci. Eng., 2019

Crowdsourcing Database Systems: Overview and Challenges.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

2018
A partial-order-based framework for cost-effective crowdsourced entity resolution.
VLDB J., 2018

CDB: A Crowd-Powered Database System.
Proc. VLDB Endow., 2018

Crowd-Powered Data Mining.
CoRR, 2018

Incentive-Based Entity Collection Using Crowdsourcing.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

2017
CDB: Optimizing Queries with Crowd-Based Selections and Joins.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

2016
Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach.
Proceedings of the 2016 International Conference on Management of Data, 2016

