Yeye He

Orcid: 0000-0003-2824-5299

According to our database, Yeye He authored at least 47 papers between 2009 and 2024.

Bibliography

2024
Auto-Tables: Relationalize Tables without Using Examples.
SIGMOD Rec., March, 2024

Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks.
Proc. ACM Manag. Data, 2024

Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations.
Proc. ACM Manag. Data, 2024

Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning.
CoRR, 2024

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models.
CoRR, 2024

Vision Language Models for Spreadsheet Understanding: Challenges and Opportunities.
CoRR, 2024

Encoding Spreadsheets for Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Auto-BI: Automatically Build BI-Models Leveraging Local Join Prediction and Global Schema Graph.
Proc. VLDB Endow., 2023

Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples.
Proc. VLDB Endow., 2023

Predicate Pushdown for Data Science Pipelines.
Proc. ACM Manag. Data, 2023

Ground Truth Inference for Weakly Supervised Entity Matching.
Proc. ACM Manag. Data, 2023

Table-GPT: Table-tuned GPT for Diverse Table Tasks.
CoRR, 2023

Auto-Validate by-History: Auto-Program Data Quality Constraints to Validate Recurring Data Pipelines.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

2022
PACk: An Efficient Partition-based Distributed Agglomerative Hierarchical Clustering Algorithm for Deduplication.
Proc. VLDB Endow., 2022

2021
Auto-Pipeline: Synthesize Data Pipelines By-Target Using Reinforcement Learning and Search.
Proc. VLDB Endow., 2021

Demonstration of Panda: A Weakly Supervised Entity Matching System.
Proc. VLDB Endow., 2021

Auto-Tag: Tagging-Data-By-Example in Data Lakes.
CoRR, 2021

AutoPipeline: Synthesize Data Pipelines By-Target Using Reinforcement Learning and Search.
CoRR, 2021

Auto-Validate: Unsupervised Data Validation Using Data-Domain Patterns Inferred from Data Lakes.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

2020
Auto-Transform: Learning-to-Transform by Patterns.
Proc. VLDB Endow., 2020

Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks.
Proceedings of the 2020 International Conference on Management of Data, 2020

2019
Auto-EM: End-to-end Fuzzy Entity-Matching using Pre-trained Deep Models and Transfer Learning.
Proceedings of the World Wide Web Conference, 2019

Uni-Detect: A Unified Approach to Automated Error Detection in Tables.
Proceedings of the 2019 International Conference on Management of Data, 2019

2018
Transform-Data-by-Example (TDE): An Extensible Search Engine for Data Transformations.
Proc. VLDB Endow., 2018

Synthesizing Type-Detection Logic for Rich Semantic Data Types using Open-source Code.
Proceedings of the 2018 International Conference on Management of Data, 2018

Auto-Detect: Data-Driven Error Detection in Tables.
Proceedings of the 2018 International Conference on Management of Data, 2018

Transform-Data-by-Example (TDE): Extensible Data Transformation in Excel.
Proceedings of the 2018 International Conference on Management of Data, 2018

2017
Auto-Join: Joining Tables by Leveraging Transformations.
Proc. VLDB Endow., 2017

Synthesizing Mapping Relationships Using Table Corpus.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

Discovering Enterprise Concepts Using Spreadsheet Tables.
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

2016
Data services leveraging Bing's data assets.
IEEE Data Eng. Bull., 2016

Automatic Discovery of Attribute Synonyms Using Query Logs and Table Corpora.
Proceedings of the 25th International Conference on World Wide Web, 2016

2015
SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora.
Proc. VLDB Endow., 2015

Annotating Database Schemas to Help Enterprise Search.
Proc. VLDB Endow., 2015

Concept Expansion Using Web Tables.
Proceedings of the 24th International Conference on World Wide Web, 2015

TEGRA: Table Extraction by Global Record Alignment.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

2014
ClusterJoin: A Similarity Joins Framework using Map-Reduce.
Proc. VLDB Endow., 2014

On Load Shedding in Complex Event Processing.
Proceedings of the 17th International Conference on Database Theory (ICDT), 2014

2013
Mining acronym expansions and their meanings using query click log.
Proceedings of the 22nd International World Wide Web Conference, 2013

Crawling deep web entity pages.
Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, 2013

Utility-maximizing event stream suppression.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

2011
SEISA: set expansion by iterative similarity aggregation.
Proceedings of the 20th International Conference on World Wide Web, 2011

On the complexity of privacy-preserving complex event processing.
Proceedings of the 30th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2011

Preventing equivalence attacks in updated, anonymized data.
Proceedings of the 27th International Conference on Data Engineering, 2011

2010
Keyword++: A Framework to Improve Keyword Search Over Entity Databases.
Proc. VLDB Endow., 2010

2009
Anonymization of Set-Valued Data via Top-Down, Local Generalization.
Proc. VLDB Endow., 2009

