Magdalena Balazinska

Orcid: 0000-0002-6805-0325

Affiliations:
  • University of Washington, Seattle, Washington, USA


According to our database, Magdalena Balazinska authored at least 159 papers between 1999 and 2024.

Awards

ACM Fellow

ACM Fellow 2019, "For contributions to scalable distributed data systems".

Bibliography

2024
Demonstration of MaskSearch: Efficiently Querying Image Masks for Machine Learning Workflows.
Proc. VLDB Endow., August, 2024

RACOON: An LLM-based Framework for Retrieval-Augmented Column Type Annotation with a Knowledge Graph.
CoRR, 2024

Galley: Modern Query Optimization for Sparse Tensor Programs.
CoRR, 2024

Self-Enhancing Video Data Management System for Compositional Events with Large Language Models [Technical Report].
CoRR, 2024

2023
EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions.
Proc. VLDB Endow., 2023

EQUI-VOCAL Demonstration: Synthesizing Video Queries from User Interactions.
Proc. VLDB Endow., 2023

VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building.
Proc. VLDB Endow., 2023

SafeBound: A Practical System for Generating Cardinality Bounds.
Proc. ACM Manag. Data, 2023

MaskSearch: Querying Image Masks at Scale.
CoRR, 2023

EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions [Technical Report].
CoRR, 2023

Degree Sequence Bound for Join Cardinality Estimation.
Proceedings of the 26th International Conference on Database Theory, 2023

2022
Editorial for S.I.: VLDB 2020.
VLDB J., 2022

Cloud Data Systems: What are the Opportunities for the Database Research Community?
Proc. VLDB Endow., 2022

The DB Community vis-à-vis Environmental, Health, and Societal Grand Challenges: Innovation Engine, Plumber, or Bystander?
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

VOCAL: Video Organization and Interactive Compositional AnaLytics.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022

2021
Congratulations! You Have Become a Senior Researcher. Now What?
SIGMOD Rec., 2021

Demonstration of Apperception: A Database Management System for Geospatial Video Data.
Proc. VLDB Endow., 2021

DeepEverest: Accelerating Declarative Top-K Queries for Deep Neural Network Interpretation.
Proc. VLDB Endow., 2021

DeepEverest: Accelerating Declarative Top-K Queries for Deep Neural Network Interpretation [Technical Report].
CoRR, 2021

VSS: A Storage System for Video Analytics [Technical Report].
CoRR, 2021

VSS: A Storage System for Video Analytics.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

TASM: A Tile-Based Storage Manager for Video Analytics.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

2020
EntropyDB: a probabilistic approach to approximate query processing.
VLDB J., 2020

Winds from Seattle: Database Research Directions.
Proc. VLDB Endow., 2020

TASM: A Tile-Based Storage Manager for Video Analytics.
CoRR, 2020

Sample Debiasing in the Themis Open World Database System (Extended Version).
CoRR, 2020

Sampling for Deep Learning Model Diagnosis (Technical Report).
CoRR, 2020

Deluceva: Delta-Based Neural Network Inference for Fast Video Analytics.
Proceedings of the SSDBM 2020: 32nd International Conference on Scientific and Statistical Database Management, 2020

Sample Debiasing in the Themis Open World Database System.
Proceedings of the 2020 International Conference on Management of Data, 2020

The Next 5 Years: What Opportunities Should the Database Community Seize to Maximize its Impact?
Proceedings of the 2020 International Conference on Management of Data, 2020

Toward Sampling for Deep Learning Model Diagnosis.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

Mosaic: A Sample-Based Database System for Open World Query Processing.
Proceedings of the 10th Conference on Innovative Data Systems Research, 2020

VisualWorldDB: A DBMS for the Visual World.
Proceedings of the 10th Conference on Innovative Data Systems Research, 2020

2019
The Seattle Report on Database Research.
SIGMOD Rec., 2019

Front Matter.
Proc. VLDB Endow., 2019

An Empirical Analysis of Deep Learning for Cardinality Estimation.
CoRR, 2019

Vignette: Perceptual Compression for Video Storage and Processing Systems.
CoRR, 2019

Visual Road: A Video Data Management Benchmark.
Proceedings of the 2019 International Conference on Management of Data, 2019

Pessimistic Cardinality Estimation: Tighter Upper Bounds for Intermediate Join Cardinalities.
Proceedings of the 2019 International Conference on Management of Data, 2019

Perceptual Compression for Video Storage and Processing Systems.
Proceedings of the ACM Symposium on Cloud Computing, SoCC 2019, 2019

Databases meet the stream processing era.
Proceedings of the Making Databases Work: the Pragmatic Wisdom of Michael Stonebraker, 2019

2018
Fault Tolerance and High Availability in Data Stream Management Systems.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

LightDB: A DBMS for Virtual Reality Video.
Proc. VLDB Endow., 2018

Cuttlefish: A Lightweight Primitive for Adaptive Query Processing.
CoRR, 2018

SLAOrchestrator: Reducing the Cost of Performance SLAs for Cloud Data Analytics.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

Learning State Representations for Query Optimization with Deep Reinforcement Learning.
Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, 2018

2017
Probabilistic Database Summarization for Interactive Data Exploration.
Proc. VLDB Endow., 2017

Comparative Evaluation of Big-Data Systems on Scientific Image Analytics Workloads.
Proc. VLDB Endow., 2017

Elastic Memory Management for Cloud Data Analytics.
Proceedings of the 2017 USENIX Annual Technical Conference, 2017

VisualCloud Demonstration: A DBMS for Virtual Reality.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

A Demonstration of Interactive Analysis of Performance Measurements with Viska.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017

Keynote: Research with Real Users.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

The Myria Big Data Management and Analytics System and Cloud Services.
Proceedings of the 8th Biennial Conference on Innovative Data Systems Research, 2017

A Visual Cloud for Virtual Reality Applications.
Proceedings of the 8th Biennial Conference on Innovative Data Systems Research, 2017

2016
Price-Optimal Querying with Data APIs.
Proc. VLDB Endow., 2016

PerfEnforce: A Dynamic Scaling Engine for Analytics with Performance Guarantees.
CoRR, 2016

View-Driven Deduplication with Active Learning.
CoRR, 2016

Toward elastic memory management for cloud data analytics.
Proceedings of the 3rd ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond, 2016

PerfEnforce Demonstration: Data Analytics with Performance Guarantees.
Proceedings of the 2016 International Conference on Management of Data, 2016

PipeGen: Data Pipe Generator for Hybrid Analytics.
Proceedings of the Seventh ACM Symposium on Cloud Computing, 2016

The Aurora and Borealis Stream Processing Engines.
Proceedings of the Data Stream Management - Processing High-Speed Data Streams, 2016

2015
The BigDAWG Polystore System.
SIGMOD Rec., 2015

Asynchronous and Fault-Tolerant Recursive Datalog Evaluation in Shared-Nothing Engines.
Proc. VLDB Endow., 2015

A Demonstration of the BigDAWG Polystore System.
Proc. VLDB Endow., 2015

Big Data Research: Will Industry Solve all the Problems?
Proc. VLDB Endow., 2015

Front Matter.
Proc. VLDB Endow., 2015

Query-Based Data Pricing.
J. ACM, 2015

Automated Analysis of Muscle X-ray Diffraction Imaging with MCMC.
Proceedings of the Biomedical Data Management and Graph Online Querying, 2015

Gaussian Mixture Models Use-Case: In-Memory Analysis with Myria.
Proceedings of the 3rd VLDB Workshop on In-Memory Data Management and Analytics, 2015

Efficient iterative processing in the SciDB parallel array engine.
Proceedings of the 27th International Conference on Scientific and Statistical Database Management, 2015

Automatic Enforcement of Data Use Policies with DataLawyer.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Machine Learning and Databases: The Sound of Things to Come or a Cacophony of Hype?
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

From Theory to Practice: Efficient Join Query Evaluation in a Parallel Database System.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Changing the Face of Database Cloud Services with Personalized Service Level Agreements.
Proceedings of the Seventh Biennial Conference on Innovative Data Systems Research, 2015

2014
Public Data and Visualizations: How are Many Eyes and Tableau Public Used for Collaborative Analytics?
SIGMOD Rec., 2014

The database group at the University of Washington.
SIGMOD Rec., 2014

The Beckman Report on Database Research.
SIGMOD Rec., 2014

Support the Data Enthusiast: Challenges for Next-Generation Data-Analysis Systems.
Proc. VLDB Endow., 2014

Approximation trade-offs in a Markovian stream warehouse: An empirical study.
Inf. Syst., 2014

Affordable Analytics on Expensive Data.
Proceedings of the First International Workshop on Bringing the Value of "Big Data" to Users, 2014

Big-Data Management Use-Case: A Cloud Service for Creating and Analyzing Galactic Merger Trees.
Proceedings of the Third Workshop on Data analytics in the Cloud, 2014

Demonstration of the Myria big data management service.
Proceedings of the International Conference on Management of Data, 2014

2013
Hadoop's Adolescence.
Proc. VLDB Endow., 2013

A Demonstration of Iterative Parallel Array Processing in Support of Telescope Image Analysis.
Proc. VLDB Endow., 2013

Squeezing a Big Orange into Little Boxes: The AscotDB System for Parallel Processing of Data on a Sphere.
IEEE Data Eng. Bull., 2013

Managing Skew in Hadoop.
IEEE Data Eng. Bull., 2013

Education and career paths for data scientists.
Proceedings of the Conference on Scientific and Statistical Database Management, 2013

The power of data use management in action.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

A vision for personalized service level agreements in the cloud.
Proceedings of the Second Workshop on Data Analytics in the Cloud, 2013

Toward practical query pricing with QueryMarket.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

Time travel in a scientific array database.
Proceedings of the 29th IEEE International Conference on Data Engineering, 2013

Stop That Query! The Need for Managing Data Use.
Proceedings of the Sixth Biennial Conference on Innovative Data Systems Research, 2013

A Discussion on Pricing Relational Data.
Proceedings of the In Search of Elegance in the Theory and Practice of Computation, 2013

2012
The HaLoop approach to large-scale iterative data analysis.
VLDB J., 2012

How to Price Shared Optimizations in the Cloud.
Proc. VLDB Endow., 2012

SkewTune in Action: Mitigating Skew in MapReduce Applications.
Proc. VLDB Endow., 2012

QueryMarket Demonstration: Pricing for Online Data Markets.
Proc. VLDB Endow., 2012

PerfXplain: Debugging MapReduce Job Performance.
Proc. VLDB Endow., 2012

SkewTune: mitigating skew in mapreduce applications.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012

Poster: Hadoop's Adolescence; A Comparative Workloads Analysis from Three Research Clusters.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Abstract: Hadoop's Adolescence; A Comparative Workloads Analysis from Three Research Clusters.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012

Designing good algorithms for MapReduce and beyond.
Proceedings of the ACM Symposium on Cloud Computing, SOCC '12, 2012

2011
Data Markets in the Cloud: An Opportunity for the Database Community.
Proc. VLDB Endow., 2011

Session-Based Browsing for More Effective Query Reuse.
Proceedings of the Scientific and Statistical Database Management, 2011

Towards Efficient and Precise Queries over Ten Million Asteroid Trajectory Models.
Proceedings of the Scientific and Statistical Database Management, 2011

A latency and fault-tolerance optimizer for online parallel query plans.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

ArrayStore: a storage manager for complex parallel array processing.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

Lineage for Markovian stream event queries.
Proceedings of the Tenth ACM International Workshop on Data Engineering for Wireless and Mobile Access, 2011

Hybrid merge/overlap execution technique for parallel array processing.
Proceedings of the 2011 EDBT/ICDT Workshop on Array Databases, 2011

2010
SnipSuggest: Context-Aware Autocompletion for SQL.
Proc. VLDB Endow., 2010

HaLoop: Efficient Iterative Data Processing on Large Clusters.
Proc. VLDB Endow., 2010

Astronomy in the Cloud: Using MapReduce for Image Coaddition.
CoRR, 2010

Scalable Clustering Algorithm for N-Body Simulations in a Shared-Nothing Cluster.
Proceedings of the Scientific and Statistical Database Management, 2010

ParaTimer: a progress indicator for MapReduce DAGs.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

Specification and Verification of Complex Location Events with Panoramic.
Proceedings of the Pervasive Computing, 8th International Conference, 2010

Estimating the progress of MapReduce pipelines.
Proceedings of the 26th International Conference on Data Engineering, 2010

Approximation trade-offs in Markovian stream processing: An empirical study.
Proceedings of the 26th International Conference on Data Engineering, 2010

Skew-resistant parallel processing of feature-extracting scientific user-defined functions.
Proceedings of the 1st ACM Symposium on Cloud Computing, 2010

2009
Fault-Tolerance and High Availability in Data Stream Management Systems.
Proceedings of the Encyclopedia of Database Systems, 2009

Lahar Demonstration: Warehousing Markovian Streams.
Proc. VLDB Endow., 2009

Believe It or Not: Adding Belief Annotations to Databases.
Proc. VLDB Endow., 2009

A Demonstration of SciDB: A Science-Oriented DBMS.
Proc. VLDB Endow., 2009

Building the Internet of Things Using RFID: The RFID Ecosystem Experience.
IEEE Internet Comput., 2009

Longitudinal study of a building-scale RFID ecosystem.
Proceedings of the 7th International Conference on Mobile Systems, 2009

Access Methods for Markovian Streams.
Proceedings of the 25th International Conference on Data Engineering, 2009

Analyzing massive astrophysical datasets: Can Pig/Hadoop or a relational DBMS help?
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

A Case for A Collaborative Query Management System.
Proceedings of the Fourth Biennial Conference on Innovative Data Systems Research, 2009

2008
Fault-tolerance in the borealis distributed stream processing system.
ACM Trans. Database Syst., 2008

Fault-tolerant stream processing using a distributed, replicated file system.
Proc. VLDB Endow., 2008

Systems aspects of probabilistic data management.
Proc. VLDB Endow., 2008

Challenges for Event Queries over Markovian Streams.
IEEE Internet Comput., 2008

Event queries on correlated probabilistic streams.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

A demonstration of Cascadia through a digital diary application.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Cascadia: a system for specifying, detecting, and managing rfid events.
Proceedings of the 6th International Conference on Mobile Systems, 2008

Clustering Events on Streams Using Complex Context Information.
Proceedings of the Workshops Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), 2008

Probabilistic Event Extraction from RFID Data.
Proceedings of the 24th International Conference on Data Engineering, 2008

2007
Report on the Fourth International Workshop on Data Management for Sensor Networks (DMSN 2007).
SIGMOD Rec., 2007

Physical Access Control for Captured RFID Data.
IEEE Pervasive Comput., 2007

Data Management in the Worldwide Sensor Web.
IEEE Pervasive Comput., 2007

Homeviews: peer-to-peer middleware for personal data sharing applications.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

Challenges for Pervasive RFID-Based Infrastructures.
Proceedings of the Fifth Annual IEEE International Conference on Pervasive Computing and Communications, 2007

On-Demand View Materialization and Indexing for Network Forensic Analysis.
Proceedings of the Third International Workshop on Networking Meets Databases, 2007

Moirae: History-Enhanced Monitoring.
Proceedings of the Third Biennial Conference on Innovative Data Systems Research, 2007

2006
Towards correcting input data errors probabilistically using integrity constraints.
Proceedings of the Fifth ACM International Workshop on Data Engineering for Wireless and Mobile Access, 2006

2005
Fault-tolerance and load management in a distributed stream processing system.
PhD thesis, 2005

High-Availability Algorithms for Distributed Stream Processing.
Proceedings of the 21st International Conference on Data Engineering, 2005

The Design of the Borealis Stream Processing Engine.
Proceedings of the Second Biennial Conference on Innovative Data Systems Research, 2005

2004
Retrospective on Aurora.
VLDB J., 2004

Load Management and High Availability in the Medusa Distributed Stream Processing System.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2004

Contract-Based Load Management in Federated Distributed Systems.
Proceedings of the 1st Symposium on Networked Systems Design and Implementation (NSDI 2004), 2004

2003
The Aurora and Medusa Projects.
IEEE Data Eng. Bull., 2003

Thwarting Web Censorship with Untrusted Messenger Discovery.
Proceedings of the Privacy Enhancing Technologies, Third International Workshop, 2003

Characterizing Mobility and Network Usage in a Corporate Wireless Local-Area Network.
Proceedings of the First International Conference on Mobile Systems, 2003

Scalable Distributed Stream Processing.
Proceedings of the First Biennial Conference on Innovative Data Systems Research, 2003

2002
Infranet: Circumventing Web Censorship and Surveillance.
Proceedings of the 11th USENIX Security Symposium, 2002

INS/Twine: A Scalable Peer-to-Peer Architecture for Intentional Resource Discovery.
Proceedings of the Pervasive Computing, 2002

2000
Advanced Clone-Analysis to Support Object-Oriented System Refactoring.
Proceedings of the Seventh Working Conference on Reverse Engineering, 2000

1999
Partial Redesign of Java Software Systems Based on Clone Analysis.
Proceedings of the Sixth Working Conference on Reverse Engineering, 1999

Measuring Clone Based Reengineering Opportunities.
Proceedings of the 6th IEEE International Software Metrics Symposium (METRICS 1999), 1999

