Christopher Ré

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers, 2021

PipeMare: Asynchronous Pipeline Parallel DNN Training.

Proceedings of the Fourth Conference on Machine Learning and Systems, 2021

Observational Supervision for Medical Image Classification Using Gaze Data.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021

Catformer: Designing Stable Transformers via Sensitivity Analysis.

Jared Quincy Davis

Albert Gu

Krzysztof Choromanski

Proceedings of the 38th International Conference on Machine Learning, 2021

Mandoline: Model Evaluation under Distribution Shift.

Proceedings of the 38th International Conference on Machine Learning, 2021

HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections.

Proceedings of the 38th International Conference on Machine Learning, 2021

Cut out the annotator, keep the cutout: better segmentation with weak supervision.

Proceedings of the 9th International Conference on Learning Representations, 2021

Model Patching: Closing the Subgroup Performance Gap with Data Augmentation.

Proceedings of the 9th International Conference on Learning Representations, 2021

MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training.

Anshumali Shrivastava

Proceedings of the 9th International Conference on Learning Representations, 2021

Cross-Domain Data Integration for Named Entity Disambiguation in Biomedical Text.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation.

Proceedings of the 11th Conference on Innovative Data Systems Research, 2021

Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation.

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

Creating Hardware Component Knowledge Bases with Training Data Generation and Multi-task Learning.

Philip Alexander Levis

ACM Trans. Embed. Comput. Syst., 2020

Leveraging Organizational Resources to Adapt Models to New Data Modalities.

Proc. VLDB Endow., 2020

Cross-Modal Data Programming Enables Rapid Medical Machine Learning.

Christopher Lee-Messer

Matthew P. Lungren

Daniel L. Rubin

Patterns, 2020

Weak supervision as an efficient approach for automated seizure detection in electroencephalography.

Christopher Lee-Messer

npj Digit. Medicine, 2020

Classifying non-small cell lung cancer types and transcriptomic subtypes using convolutional neural networks.

J. Am. Medical Informatics Assoc., 2020

Sharp Bias-variance Tradeoffs of Hard Parameter Sharing in High-dimensional Linear Regression.

CoRR, 2020

Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings.

CoRR, 2020

Assessing Robustness to Noise: Low-Cost Head CT Triage.

CoRR, 2020

Extracting chemical reactions from text using Snorkel.

Emily K. Mallory

Matthieu de Rochemonteix

BMC Bioinform., 2020

No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

HiPPO: Recurrent Memory with Optimal Polynomial Projections.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Understanding the Downstream Instability of Word Embeddings.

Proceedings of the Third Conference on Machine Learning and Systems, 2020

On the Generalization Effects of Linear Transformations in Data Augmentation.

Proceedings of the 37th International Conference on Machine Learning, 2020

Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods.

Proceedings of the 37th International Conference on Machine Learning, 2020

Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps.

Proceedings of the 8th International Conference on Learning Representations, 2020

Understanding and Improving Information Transfer in Multi-Task Learning.

Sen Wu

Hongyang R. Zhang

Proceedings of the 8th International Conference on Learning Representations, 2020

Sparse Recovery for Orthogonal Polynomial Transforms.

Proceedings of the 47th International Colloquium on Automata, Languages, and Programming, 2020

Overton: A Data System for Monitoring and Improving Machine-Learned Products.

Proceedings of the 10th Conference on Innovative Data Systems Research, 2020

Hidden stratification causes clinically meaningful failures in machine learning for medical imaging.

Proceedings of the ACM CHIL '20: ACM Conference on Health, 2020

Ivy: Instrumental Variable Synthesis for Causal Inference.

Aldo Córdova-Palomera

Jared Dunnmon

James Priest

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Low-Dimensional Hyperbolic Knowledge Graph Embeddings.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Contextual Embeddings: When Are They Worth It?

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark.

ACM SIGOPS Oper. Syst. Rev., 2019

The Seattle Report on Database Research.

SIGMOD Rec., 2019

Medical device surveillance with electronic health records.

Alison Callahan

Jason A. Fries

James I Huddleston III

Nicholas J. Giori

Scott L. Delp

Nigam H. Shah

npj Digit. Medicine, 2019

Rekall: Specifying Video Events using Compositions of Spatiotemporal Labels.

CoRR, 2019

Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices.

CoRR, 2019

Overton: A Data System for Monitoring and Improving Machine-Learned Products.

Christopher Richard Aberger

Feng Niu

Pallavi Gudipati

Charles Srisuwananukorn

CoRR, 2019

Low-Memory Neural Network Training: A Technical Report.

Nimit Sharad Sohoni

Megan Leszczynski

Jian Zhang

Dimitris S. Papailiopoulos

CoRR, 2019

SysML: The New Frontier of Machine Learning Systems.

Alexandros G. Dimakis

Anastasios Kyrillidis

Shivaram Venkataraman

CoRR, 2019

Osprey: Weak Supervision of Imbalanced Extraction Problems without Code.

Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning, 2019

Snorkel DryBell: A Case Study in Deploying Weak Supervision at Industrial Scale.

Proceedings of the 2019 International Conference on Management of Data, 2019

Multi-Resolution Weak Supervision for Sequential Data.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

On the Downstream Performance of Compressed Word Embeddings.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Hyperbolic Graph Convolutional Neural Networks.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Doubly Weak Supervision of Deep Learning Models for Head CT.

Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Automating the generation of hardware component knowledge bases.

Philip Alexander Levis

Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, 2019

Utilizing Weak Supervision to Infer Complex Objects and Situations in Autonomous Driving Data.

Proceedings of the 2019 IEEE Intelligent Vehicles Symposium, 2019

Learning Dependency Structures for Weak Supervision Models.

Proceedings of the 36th International Conference on Machine Learning, 2019

A Kernel Theory of Modern Data Augmentation.

Proceedings of the 36th International Conference on Machine Learning, 2019

Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations.

Proceedings of the 36th International Conference on Machine Learning, 2019

Learning Mixed-Curvature Representations in Product Spaces.

Proceedings of the 7th International Conference on Learning Representations, 2019

A Formal Framework for Probabilistic Unclean Databases.

Proceedings of the 22nd International Conference on Database Theory, 2019

Scene Graph Prediction with Limited Labels.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

The Role of Massively Multi-Task and Weak Supervision in Software 2.0.

Alexander J. Ratner

Braden Hancock

Proceedings of the 9th Biennial Conference on Innovative Data Systems Research, 2019

Classifying Non-Small Cell Lung Cancer Histopathology Types and Transcriptomic Subtypes using Convolutional Neural Networks.

Proceedings of the AMIA 2019, 2019

Low-Precision Random Fourier Features for Memory-constrained Kernel Approximation.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Training Complex Models with Multi-Task Weak Supervision.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

A Relational Framework for Classifier Engineering.

Benny Kimelfeld

SIGMOD Rec., 2018

Knowledge Base Construction in the Machine-learning Era.

Alexander Ratner

ACM Queue, 2018

Snuba: Automating Weak Supervision to Label Training Data.

Paroma Varma

Proc. VLDB Endow., 2018

It's All a Matter of Degree - Using Degree Information to Optimize Multiway Joins.

Sai Vikneshwar Mani Jayaraman

Theory Comput. Syst., 2018

Worst-case Optimal Join Algorithms.

J. ACM, 2018

Hypertree Decompositions Revisited for PGMs.

Aarthy Shivram Arun

Atri Rudra

CoRR, 2018

High-Accuracy Low-Precision Training.

CoRR, 2018

Research for practice: knowledge base construction in the machine-learning era.

Alexander Ratner

Peter Bailis

Commun. ACM, 2018

A Two-pronged Progress in Structured Dense Matrix Vector Multiplication.

Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 2018

Exploring the Utility of Developer Exhaust.

Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, 2018

Snorkel MeTaL: Weak Supervision for Multi-Task Learning.

Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, 2018

Fonduer: Knowledge Base Construction from Richly Formatted Data.

Philip Alexander Levis

Proceedings of the 2018 International Conference on Management of Data, 2018

Machine learning and deep analytics for biocomputing: Call for better explainability.

Dragutin Petkovic

Lester Kobzik

Proceedings of the Biocomputing 2018: Proceedings of the Pacific Symposium, 2018

Learning Compressed Transforms with Low Displacement Rank.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Software 2.0 and Snorkel: Beyond Hand-Labeled Data.

Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Representation Tradeoffs for Hyperbolic Embeddings.

Proceedings of the 35th International Conference on Machine Learning, 2018

Learning Invariance with Compact Transforms.

Proceedings of the 6th International Conference on Learning Representations, 2018

LevelHeaded: A Unified Engine for Business Intelligence and Linear Algebra Querying.

Andrew Lamb

Proceedings of the 34th IEEE International Conference on Data Engineering, 2018

Unraveling the Molecular Basis of Lung Adenocarcinoma Dedifferentiation and Prognosis by Integrating Omics and Histopathology.

Proceedings of the AMIA 2018, 2018

Accelerated Stochastic Power Iteration.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

Training Classifiers with Natural Language Explanations.

Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017

Incremental knowledge base construction using DeepDive.

VLDB J., 2017

EmptyHeaded: A Relational Engine for Graph Processing.

ACM Trans. Database Syst., 2017

Report from the third workshop on Algorithms and Systems for MapReduce and Beyond (BeyondMR'16).

SIGMOD Rec., 2017

HoloClean: Holistic Data Repairs with Probabilistic Inference.

Proc. VLDB Endow., 2017

Snorkel: Rapid Training Data Creation with Weak Supervision.

Proc. VLDB Endow., 2017

Mind the Gap: Bridging Multi-Domain Query Workloads with EmptyHeaded.

Andrew Lamb

Proc. VLDB Endow., 2017

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning.

J. Mach. Learn. Res., 2017

LevelHeaded: Making Worst-Case Optimal Joins Work in the Common Case.

Andrew Lamb

CoRR, 2017

YellowFin and the Art of Momentum Tuning.

Jian Zhang

Ioannis Mitliagkas

CoRR, 2017

SwellShark: A Generative Model for Biomedical Named Entity Recognition without Labeled Data.

CoRR, 2017

Infrastructure for Usable Machine Learning: The Stanford DAWN Project.

CoRR, 2017

DeepDive: declarative knowledge base construction.

Commun. ACM, 2017

Snorkel: Beyond Hand-labeled Data.

Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, 2017

Flipper: A Systematic Approach to Debugging Training Sets.

Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics, 2017

SLiMFast: Guaranteed Results for Data Fusion and Source Reliability.

Theodoros Rekatsinas

Hector Garcia-Molina

Aditya G. Parameswaran

Proceedings of the 2017 ACM International Conference on Management of Data, 2017

Snorkel: Fast Training Set Generation for Information Extraction.

Proceedings of the 2017 ACM International Conference on Management of Data, 2017

Inferring Generative Model Structure with Static Analysis.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Learning to Compose Domain-Specific Transformations for Data Augmentation.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Gaussian Quadrature for Kernel Features.

Tri Dao

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

ShortFuse: Biomedical Time Series Representations in the Presence of Structured Information.

Madalina Fiterau

Suvrat Bhooshan

Jason A. Fries

Charles Bournhonesque

Proceedings of the Machine Learning for Health Care Conference, 2017

Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent.

Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Learning the Structure of Generative Models without Labeled Data.

Proceedings of the 34th International Conference on Machine Learning, 2017

GYM: A Multiround Distributed Join Algorithm.

Proceedings of the 20th International Conference on Database Theory, 2017

Snorkel: A System for Lightweight Extraction.

Proceedings of the 8th Biennial Conference on Innovative Data Systems Research, 2017

Predicting Non-Small Cell Lung Cancer Diagnosis and Prognosis by Fully Automated Microscopic Pathology Image Features.

Proceedings of the AMIA 2017, 2017

2016

Materialization Optimizations for Feature Selection Workloads.

Arun Kumar

Dimitris S. Papailiopoulos

ACM Trans. Database Syst., 2016

Joins via Geometric Resolutions: Worst Case and Beyond.

ACM Trans. Database Syst., 2016

DeepDive: Declarative Knowledge Base Construction.

SIGMOD Rec., 2016

Parallel SGD: When does averaging help?

CoRR, 2016

Socratic Learning.

CoRR, 2016

CYCLADES: Conflict-free Asynchronous Machine Learning.

Xinghao Pan

Maximilian Lam

Stephen Tu

CoRR, 2016

Omnivore: An Optimizer for Multi-device Deep Learning on CPUs and GPUs.

CoRR, 2016

Recurrence Width for Structured Dense Matrix Vector Multiplication.

CoRR, 2016

Large-scale extraction of gene interactions from full-text literature using DeepDive.

Bioinform., 2016

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning.

Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, 2016

Extracting Databases from Dark Data with DeepDive.

Proceedings of the 2016 International Conference on Management of Data, 2016

Data programming with DDLite: putting humans in a different part of the loop.

Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 2016

EmptyHeaded: A Relational Engine for Graph Processing.

Susan Tu

Proceedings of the 2016 International Conference on Management of Data, 2016

AJAR: Aggregations and Joins over Annotated Relations.

Manas R. Joglekar

Rohan Puttagunta

Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 2016

Sub-sampled Newton Methods with Non-uniform Sampling.

Peng Xu

Jiyan Yang

Farbod Roosta-Khorasani

Michael W. Mahoney

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Data Programming: Creating Large Training Sets, Quickly.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

High Performance Parallel Stochastic Gradient Descent in Shared Memory.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016

Wikipedia Knowledge Graph with DeepDive.

Proceedings of the Wiki, 2016

Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling.

Proceedings of the 33rd International Conference on Machine Learning, 2016

Dark Data: Are we solving the right problems?

Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016

Old techniques for new join algorithms: A case study in RDF processing.

Susan Tu

Proceedings of the 32nd IEEE International Conference on Data Engineering Workshops, 2016

Asynchrony begets momentum, with an application to deep learning.

Proceedings of the 54th Annual Allerton Conference on Communication, 2016

2015

Incremental Knowledge Base Construction Using DeepDive.

Proc. VLDB Endow., 2015

Mindtagger: A Demonstration of Data Labeling in Knowledge Base Construction.

Jaeho Shin

Michael J. Cafarella

Proc. VLDB Endow., 2015

An asynchronous parallel stochastic coordinate descent algorithm.

J. Mach. Learn. Res., 2015

The mobilize center: an NIH big data to knowledge center to advance human movement research and improve mobility.

J. Am. Medical Informatics Assoc., 2015

Building a Large-scale Multimodal Knowledge Base for Visual Question Answering.

CoRR, 2015

Incremental Knowledge Base Construction Using DeepDive.

CoRR, 2015

Exploiting Features for Data Source Quality Estimation.

Theodoros Rekatsinas

Hector Garcia-Molina

Aditya G. Parameswaran

CoRR, 2015

Aggregations over Generalized Hypertree Decompositions.

Rohan Puttagunta

CoRR, 2015

EmptyHeaded: Boolean Algebra Based Graph Processing.

Andres Nötzli

CoRR, 2015

Energy-Efficient Abundant-Data Computing: The N3XT 1,000x.

Computer, 2015

DunceCap: Query Plans Using Generalized Hypertree Decompositions.

Susan Tu

Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Machine Learning and Databases: The Sound of Things to Come or a Cacophony of Hype?

Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

DunceCap: Compiling Worst-Case Optimal Query Plans.

Adam Perelman

Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Join Processing for Graph Patterns: An Old Dog with New Tricks.

Proceedings of the Third International Workshop on Graph Data Management Experiences and Systems, 2015

Exploiting Correlations for Expensive Predicate Evaluation.

Hector Garcia-Molina

Aditya G. Parameswaran

Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Caffe con Troll: Shallow Ideas to Speed Up Deep Learning.

Proceedings of the Fourth Workshop on Data analytics in the Cloud, 2015

Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Asynchronous stochastic convex optimization: the noise is in the noise and SGD don't care.

Sorathan Chaturapruek

John C. Duchi

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems.

Proceedings of the 32nd International Conference on Machine Learning, 2015

Jedi: A Storage Manager for SIMD-aware, Worst-case Optimal Join Processing.

Proceedings of the Workshops of the EDBT/ICDT 2015 Joint Conference (EDBT/ICDT), 2015

A Database Framework for Classifier Engineering.

Benny Kimelfeld

Proceedings of the 9th Alberto Mendelzon International Workshop on Foundations of Data Management, Lima, Peru, May 6, 2015

2014

The Beckman Report on Database Research.

SIGMOD Rec., 2014

DimmWitted: A Study of Main-Memory Statistical Analytics.

Proc. VLDB Endow., 2014

Transducing Markov sequences.

Benny Kimelfeld

J. ACM, 2014

Approximation trade-offs in a Markovian stream warehouse: An empirical study.

Inf. Syst., 2014

Feature Engineering for Knowledge Base Construction.

IEEE Data Eng. Bull., 2014

Global Convergence of Stochastic Gradient Descent for Some Nonconvex Matrix Problems.

CoRR, 2014

A machine-compiled macroevolutionary history of Phanerozoic life.

CoRR, 2014

GYM: A Multiround Join Algorithm In MapReduce.

CoRR, 2014

Tradeoffs in Main-Memory Statistical Analytics from Impala to DimmWitted.

Proceedings of the 2nd International Workshop on In Memory Data Management and Analytics, 2014

Beyond worst-case analysis for joins with minesweeper.

Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2014

Parallel Feature Selection Inspired by Group Testing.

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Effectively Creating Weakly Labeled Training Examples via Approximate Domain Knowledge.

Proceedings of the Inductive Logic Programming - 24th International Conference, 2014

The Theory of Zeta Graphs with an Application to Random Networks.

Proceedings of the 17th International Conference on Database Theory (ICDT), 2014

Links between Join Processing and Convex Geometry.

Proceedings of the 17th International Conference on Database Theory (ICDT), 2014

2013

Probabilistic Web Data Management.

World Wide Web, 2013

Skew strikes back: new developments in the theory of join algorithms.

Hung Q. Ngo

Atri Rudra

SIGMOD Rec., 2013

Hazy: Making it Easier to Build and Maintain Big-data Analytics.

Arun Kumar

Feng Niu

ACM Queue, 2013

Feature Selection in Enterprise Analytics: A Demonstration using an R-based Data Analytics System.

Proc. VLDB Endow., 2013

Ringtail: A Generalized Nowcasting System.

Proc. VLDB Endow., 2013

Parallel stochastic gradient algorithms for large-scale matrix completion.

Benjamin Recht

Math. Program. Comput., 2013

Towards Instance Optimal Join Algorithms for Data in Indexes.

CoRR, 2013

An Approximate, Efficient Solver for LP Rounding.

CoRR, 2013

Ringtail: Feature Selection For Easier Nowcasting.

Proceedings of the 16th International Workshop on the Web and Databases 2013, 2013

Bootstrapping Knowledge Base Acceleration.

Proceedings of The Twenty-Second Text REtrieval Conference, 2013

Evaluating Stream Filtering for Entity Profile Updates for TREC 2013.

Proceedings of The Twenty-Second Text REtrieval Conference, 2013

Towards high-throughput Gibbs sampling at scale: a study across storage managers.

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

GeoDeepDive: statistical inference using familiar data-processing languages.

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

An Approximate, Efficient LP Solver for LP Rounding.

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Brainwash: A Data System for Feature Engineering.

Proceedings of the Sixth Biennial Conference on Innovative Data Systems Research, 2013

A Tutorial on Trained Systems: A New Generation of Data Management Systems?

Proceedings of the Big Data - 29th British National Conference on Databases, 2013

Understanding Tables in Context Using Standard NLP Toolkits.

Vidhya Govindaraju

Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

Using Commonsense Knowledge to Automatically Create (Noisy) Training Examples from Text.

Proceedings of the Statistical Relational Artificial Intelligence, 2013

2012

Understanding cardinality estimation using entropy maximization.

ACM Trans. Database Syst., 2012

The MADlib Analytics Library or MAD Skills, the SQL.

Joseph M. Hellerstein

Proc. VLDB Endow., 2012

Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences.

Benjamin Recht

Proceedings of the COLT 2012, 2012

Elementary: Large-Scale Knowledge-Base Construction via Machine Learning and Statistical Inference.

Int. J. Semantic Web Inf. Syst., 2012

DeepDive: Web-scale Knowledge-base Construction using Statistical Learning and Inference.

Proceedings of the Second International Workshop on Searching and Integrating New Web Data Sources, 2012

Building an Entity-Centric Stream Filtering Test Collection for TREC 2012.

Proceedings of The Twenty-First Text REtrieval Conference, 2012

Towards a unified architecture for in-RDBMS analytics.

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012

Worst-case optimal join algorithms: [extended abstract].

Proceedings of the 31st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2012

Factoring nonnegative matrices with linear programs.

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Scaling Inference for Markov Logic via Dual Decomposition.

Proceedings of the 12th IEEE International Conference on Data Mining, 2012

Optimizing Statistical Information Extraction Programs over Evolving Text.

Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

Big Data versus the Crowd: Looking for Relationships in All the Right Places.

Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011

Probabilistic Databases.

Synthesis Lectures on Data Management, Morgan & Claypool Publishers, ISBN: 978-3-031-01879-4, 2011

Tuffy: Scaling up Statistical Inference in Markov Logic Networks using an RDBMS.

Proc. VLDB Endow., 2011

Probabilistic Management of OCR Data using an RDBMS.

Arun Kumar

Proc. VLDB Endow., 2011

Incrementally maintaining classification using an RDBMS.

Mehmet Levent Koc

Proc. VLDB Endow., 2011

Automatic Optimization for MapReduce Programs.

Eaman Jahani

Michael J. Cafarella

Proc. VLDB Endow., 2011

Queries and materialized views on probabilistic databases.

J. Comput. Syst. Sci., 2011

Felix: Scaling Inference for Markov Logic with an Operator-based Approach

CoRR, 2011

Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent.

Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

2010

Manimal: Relational Optimization for Data-Intensive Programs.

Michael J. Cafarella

Proceedings of the 13th International Workshop on the Web and Databases 2010, 2010

Approximation trade-offs in Markovian stream processing: An empirical study.

Proceedings of the 26th International Conference on Data Engineering, 2010

2009

The trichotomy of HAVING queries on a probabilistic database.

VLDB J., 2009

Repeatability & workability evaluation of SIGMOD 2009.

Marios Hadjieleftheriou

Stavros Harizopoulos

Panos Kalnis

Konstantinos Karanasos

SIGMOD Rec., 2009

Lahar Demonstration: Warehousing Markovian Streams.

Proc. VLDB Endow., 2009

Probabilistic databases: diamonds in the dirt.

Commun. ACM, 2009

Query Containment of Tier-2 Queries over a Probabilistic Database.

Proceedings of the Third VLDB workshop on Management of Uncertain Data (MUD2009) in conjunction with VLDB 2009, 2009

Access Methods for Markovian Streams.

Proceedings of the 25th International Conference on Data Engineering, 2009

Large-Scale Deduplication with Constraints Using Dedupalog.

Arvind Arasu

Proceedings of the 25th International Conference on Data Engineering, 2009

General Database Statistics Using Entropy Maximization.

Raghav Kaushik

Proceedings of the Database Programming Languages, 2009

2008

Approximate lineage for probabilistic databases.

Proc. VLDB Endow., 2008

Systems aspects of probabilistic data management.

Magdalena Balazinska

Proc. VLDB Endow., 2008

Challenges for Event Queries over Markovian Streams.

IEEE Internet Comput., 2008

Managing Probabilistic Data with MystiQ: The Can-Do, the Could-Do, and the Can't-Do.

Proceedings of the Scalable Uncertainty Management, Second International Conference, 2008

Event queries on correlated probabilistic streams.

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

A demonstration of Cascadia through a digital diary application.

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Implementing NOT EXISTS Predicates over a Probabilistic Database.

Ting-You Wang

Proceedings of the International Workshop on Quality in Databases and Management of Uncertain Data, 2008

08421 Working Group: Report of the Probabilistic Databases Benchmarking.

Proceedings of the Uncertainty Management in Information Systems, 12.10. - 17.10.2008, 2008

2007

Managing Uncertainty in Social Networks.

Eytan Adar

IEEE Data Eng. Bull., 2007

Materialized Views in Probabilistic Databases for Information Exchange and Query Optimization.

Proceedings of the 33rd International Conference on Very Large Data Bases, 2007

Efficient Top-k Query Evaluation on Probabilistic Data.

Proceedings of the 23rd International Conference on Data Engineering, 2007

Efficient Evaluation of.

Proceedings of the Database Programming Languages, 11th International Symposium, 2007

Management of data with uncertainties.

Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

Structured Querying of Web Text Data: A Technical Challenge.

Proceedings of the Third Biennial Conference on Innovative Data Systems Research, 2007

2006

Query Evaluation on Probabilistic Databases.

IEEE Data Eng. Bull., 2006

A Complete and Efficient Algebraic Compiler for XQuery.

Jérôme Siméon

Mary F. Fernández

Proceedings of the 22nd International Conference on Data Engineering, 2006

XQuery!: An XML Query Language with Side Effects.

Giorgio Ghelli

Jérôme Siméon

Proceedings of the Current Trends in Database Technology - EDBT 2006, 2006

2005

A Framework for XML-Based Integration of Data, Visualization and Analysis in a Biomedical Domain.

Proceedings of the Database and XML Technologies, 2005

MYSTIQ: a system for finding more answers by using probabilities.

Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

Supporting workflow in a course management system.

Proceedings of the 36th SIGCSE Technical Symposium on Computer Science Education, 2005

2003

WS-Membership - Failure Management in a Web-Services World.

Werner Vogels