2024
Proc. ACM Manag. Data, February, 2024
Sub-optimal Join Order Identification with L1-error.
Proc. ACM Manag. Data, February, 2024
Spanning Tree-based Query Plan Enumeration.
CoRR, 2024
Simpli-Squared: Optimizing Without Cardinality Estimates.
Proceedings of the 2nd Workshop on Simplicity in Management of Data, SiMoD 2024, 2024
2023
Multidimensional Array Data Management.
Found. Trends Databases, 2023
Analyzing Query Optimizer Performance in the Presence and Absence of Cardinality Estimates.
CoRR, 2023
2022
Adaptive Optimization for Sparse Data on Heterogeneous GPUs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
2021
Simpli-Squared: A Very Simple Yet Unexpectedly Powerful Join Ordering Algorithm Without Cardinality Estimates.
CoRR, 2021
Adaptive Elastic Training for Sparse Deep Learning on Heterogeneous Multi-GPU Servers.
CoRR, 2021
Online Sketch-based Query Optimization.
CoRR, 2021
DJEnsemble: a Cost-Based Selection and Allocation of a Disjoint Ensemble of Spatio-temporal Models.
Proceedings of the SSDBM 2021: 33rd International Conference on Scientific and Statistical Database Management, 2021
COMPASS: Online Sketch-based Query Optimization for In-Memory Databases.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021
Adaptive Stochastic Gradient Descent for Deep Learning on Heterogeneous CPU+GPU Architectures.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021
2020
DJEnsemble: On the Selection of a Disjoint Ensemble of Deep Learning Black-Box Spatio-temporal Models.
CoRR, 2020
Heterogeneous CPU+GPU Stochastic Gradient Descent Algorithms.
CoRR, 2020
2019
Special issue on scientific and statistical data management.
Distributed Parallel Databases, 2019
In-Depth Benchmarking of Graph Database Systems with the Linked Data Benchmark Council (LDBC) Social Network Benchmark (SNB).
CoRR, 2019
Exact Selectivity Computation for Modern In-Memory Database Query Optimization.
CoRR, 2019
PrivateJobMatch: a privacy-oriented deferred multi-match recommender system for stable employment.
Proceedings of the 13th ACM Conference on Recommender Systems, 2019
Stochastic Gradient Descent on Modern Hardware: Multi-core CPU or GPU? Synchronous or Asynchronous?
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019
Selectivity Computation for In-Memory Query Optimization.
Proceedings of the 9th Biennial Conference on Innovative Data Systems Research, 2019
2018
Automatic identification and classification of Palomar Transient Factory astrophysical objects in GLADE.
Int. J. Comput. Sci. Eng., 2018
Progressive Data Science: Potential and Challenges.
CoRR, 2018
Distributed Caching for Complex Querying of Raw Arrays.
CoRR, 2018
Stochastic Gradient Descent on Highly-Parallel Architectures.
CoRR, 2018
Distributed caching for processing raw arrays.
Proceedings of the 30th International Conference on Scientific and Statistical Database Management, 2018
2017
Scalable Asynchronous Gradient Descent Optimization for Out-of-Core Models.
Proc. VLDB Endow., 2017
Special issue on in-database analytics.
Distributed Parallel Databases, 2017
OLA-RAW: Scalable Exploration over Raw Data.
CoRR, 2017
Dot-Product Join: Scalable In-Database Linear Algebra for Big Model Analytics.
Proceedings of the 29th International Conference on Scientific and Statistical Database Management, 2017
Bi-Level Online Aggregation on Raw Data.
Proceedings of the 29th International Conference on Scientific and Statistical Database Management, 2017
Incremental View Maintenance over Array Data.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017
ArrayUDF: User-Defined Scientific Data Analysis on Arrays.
Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, 2017
Scalable In-Situ Exploration over Raw Data.
Proceedings of the 8th Biennial Conference on Innovative Data Systems Research, 2017
2016
Dot-Product Join: An Array-Relation Join Operator for Big Model Analytics.
CoRR, 2016
Similarity Join over Array Data.
Proceedings of the 2016 International Conference on Management of Data, 2016
Performance Implications of Processing-in-Memory Designs on Data-Intensive Applications.
Proceedings of the 45th International Conference on Parallel Processing Workshops, 2016
2015
Workload-Driven Antijoin Cardinality Estimation.
ACM Trans. Database Syst., 2015
SCANRAW: A Database Meta-Operator for Parallel In-Situ Processing and Loading.
ACM Trans. Database Syst., 2015
Formal representation of the SS-DB benchmark and experimental evaluation in EXTASCID.
Distributed Parallel Databases, 2015
Scalable Analytics Model Calibration with Online Aggregation.
IEEE Data Eng. Bull., 2015
Workload-Driven Vertical Partitioning for Effective Query Processing over Raw Data.
CoRR, 2015
Speculative Approximations for Terascale Analytics.
CoRR, 2015
Vertical partitioning for query processing over raw data.
Proceedings of the 27th International Conference on Scientific and Statistical Database Management, 2015
Speculative Approximations for Terascale Distributed Gradient Descent Optimization.
Proceedings of the Fourth Workshop on Data analytics in the Cloud, 2015
2014
PF-OLA: a high-performance framework for parallel online aggregation.
Distributed Parallel Databases, 2014
Parallel in-situ data processing with speculative loading.
Proceedings of the International Conference on Management of Data, 2014
Implementing the Palomar Transient Factory Real-Time Detection Pipeline in GLADE: Results and Observations.
Proceedings of the Databases in Networked Information Systems - 9th International Workshop, 2014
2013
A Survey on Array Storage, Query Languages, and Systems.
CoRR, 2013
Parallel online aggregation in action.
Proceedings of the Conference on Scientific and Statistical Database Management, 2013
Astronomical data processing in EXTASCID.
Proceedings of the Conference on Scientific and Statistical Database Management, 2013
Scalable I/O-bound parallel incremental gradient descent for big data analytics in GLADE.
Proceedings of the Second Workshop on Data Analytics in the Cloud, 2013
Sampling Estimators for Parallel Online Aggregation.
Proceedings of the Big Data - 29th British National Conference on Databases, 2013
2012
GLADE: a scalable framework for efficient analytics.
ACM SIGOPS Oper. Syst. Rev., 2012
PF-OLA: A High-Performance Framework for Parallel On-Line Aggregation.
CoRR, 2012
GLADE: big data analytics made easy.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012
2009
Turbo-Charging Estimate Convergence in DBO.
Proc. VLDB Endow., 2009
Sketching Sampled Data Streams.
Proceedings of the 25th International Conference on Data Engineering, 2009
2008
Sketches for size of join estimation.
ACM Trans. Database Syst., 2008
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008
2007
Pseudo-random number generation for sketch-based estimations.
ACM Trans. Database Syst., 2007
Statistical analysis of sketch estimators.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007
2006
Fast range-summable random variables for efficient aggregate estimation.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2006