Dustin Lange

According to our database, Dustin Lange authored at least 18 papers between 2010 and 2021.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2021
Automated Data Validation in Machine Learning Systems.
IEEE Data Eng. Bull., 2021

2019
DataWig: Missing Value Imputation for Tables.
J. Mach. Learn. Res., 2019

Unit Testing Data with Deequ.
Proceedings of the 2019 International Conference on Management of Data, 2019

Differential Data Quality Verification on Partitioned Data.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

2018
Automating Large-Scale Data Quality Verification.
Proc. VLDB Endow., 2018

"Deep" Learning for Missing Value Imputation in Tables with Non-Numerical Data.
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018

2017
Probabilistic Demand Forecasting at Scale.
Proc. VLDB Endow., 2017

An interpretable latent variable model for attribute applicability in the Amazon catalogue.
CoRR, 2017

2014
Reach for gold: An annealing standard to evaluate duplicate detection results.
ACM J. Data Inf. Qual., 2014

2013
Effective and efficient similarity search in databases.
PhD thesis, 2013

Cross-lingual entity matching and infobox alignment in Wikipedia.
Inf. Syst., 2013

Cost-aware query planning for similarity search.
Inf. Syst., 2013

Bulk sorted access for efficient top-k retrieval.
Proceedings of the Conference on Scientific and Statistical Database Management, 2013

2012
Efficient Similarity Search in Very Large String Sets.
Proceedings of the Conference on Scientific and Statistical Database Management, 2012

2011
Projektseminar "Similarity Search Algorithms".
Datenbank-Spektrum, 2011

Efficient similarity search: arbitrary similarity measures, arbitrary composition.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Frequency-aware similarity measures: why Arnold Schwarzenegger is always a duplicate.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

2010
Extracting structured information from Wikipedia articles to populate infoboxes.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

