Nick Koudas

Orcid: 0000-0001-5648-0638

  • University of Toronto, Canada

According to our database, Nick Koudas authored at least 207 papers between 1996 and 2025.

Coping With Data Drift in Online Video Analytics.
Proceedings of the 28th International Conference on Extending Database Technology, 2025

DataSculpt: Cost-Efficient Label Function Design via Prompting Large Language Models.
Proceedings of the 28th International Conference on Extending Database Technology, 2025

Ensembling Object Detectors for Effective Video Query Processing.
Proceedings of the 28th International Conference on Extending Database Technology, 2025

Pythia: A Neural Model for Data Prefetching.
Proceedings of the 28th International Conference on Extending Database Technology, 2025

A Distributed Solution for Efficient K Shortest Paths Computation Over Dynamic Road Networks.
IEEE Trans. Knowl. Data Eng., July, 2024

Optimizing Video Queries with Declarative Clues.
Proc. VLDB Endow., July, 2024

Unstructured Data Fusion for Schema and Data Extraction.
Proc. ACM Manag. Data, 2024

Data Acquisition for Improving Model Confidence.
Proc. ACM Manag. Data, 2024

WeShap: Weak Supervision Source Evaluation with Shapley Values.
CoRR, 2024

AXOLOTL: Fairness through Assisted Self-Debiasing of Large Language Model Outputs.
CoRR, 2024

ActiveDP: Bridging Active Learning and Data Programming.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

Querying For Actions Over Videos.
Proceedings of the 27th International Conference on Extending Database Technology, 2024

Querying for Interactions.
IEEE Trans. Knowl. Data Eng., 2023

dbET: Execution Time Distribution-based Plan Selection.
Proc. ACM Manag. Data, 2023

Can Large Language Models Design Accurate Label Functions?
CoRR, 2023

Marshalling Model Inference in Video Streams.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

SVQ-ACT: Querying for Actions over Videos.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Track Merging for Effective Video Query Processing.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

Video Monitoring Queries.
IEEE Trans. Knowl. Data Eng., 2022

CERTEM: Explaining and Debugging Black-box Entity Resolution Systems with CERTA.
Proc. VLDB Endow., 2022

Spatial and Temporal Constrained Ranked Retrieval over Videos.
Proc. VLDB Endow., 2022

FILA: Online Auditing of Machine Learning Model Accuracy under Finite Labelling Budget.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

Prediction Intervals for Learned Cardinality Estimation: An Experimental Evaluation.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Effective Explanations for Entity Resolution Models.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Ranked Window Query Retrieval over Video Repositories.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

Data Acquisition for Improving Machine Learning Models.
Proc. VLDB Endow., 2021

LES3: Learning-based exact set similarity search.
Proc. VLDB Endow., 2021

Shahin: Faster Algorithms for Generating Explanations for Multiple Predictions.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Evaluating Temporal Queries Over Video Feeds.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Querying for Interactions.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Efficient Construction of Nonlinear Models over Normalized Data.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021

Astrid: Accurate Selectivity Estimation for String Predicates using Deep Learning.
Proc. VLDB Endow., 2020

Efficient Construction of Nonlinear Models over Normalized Data.
CoRR, 2020

Evaluating Temporal Queries Over Video Feeds.
CoRR, 2020

Distributed Processing of k Shortest Path Queries over Dynamic Road Networks.
Proceedings of the 2020 International Conference on Management of Data, 2020

Deep Learning Models for Selectivity Estimation of Multi-Attribute Queries.
Proceedings of the 2020 International Conference on Management of Data, 2020

TQVS: Temporal Queries over Video Streams in Action.
Proceedings of the 2020 International Conference on Management of Data, 2020

SVQ++: Querying for Object Interactions in Video Streams.
Proceedings of the 2020 International Conference on Management of Data, 2020

Approximate Query Processing for Data Exploration using Deep Generative Models.
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020

ApproxML: Efficient Approximate Ad-Hoc ML Models Through Materialization and Reuse.
Proc. VLDB Endow., 2019

Approximate Query Processing using Deep Generative Models.
CoRR, 2019

Multi-Attribute Selectivity Estimation Using Deep Learning.
CoRR, 2019

SVQ: Streaming Video Queries.
Proceedings of the 2019 International Conference on Management of Data, 2019

Top-k Queries over Digital Traces.
Proceedings of the 2019 International Conference on Management of Data, 2019

Interpreting deep learning models for entity resolution: an experience report using LIME.
Proceedings of the Second International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, 2019

Maximizing Gain over Flexible Attributes in Peer to Peer Marketplaces.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2019

An Improved Dynamic Vertical Partitioning Technique for Semi-Structured Data.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019

Nonlinear Models Over Normalized Data.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019

Efficient Construction of Approximate Ad-Hoc ML models Through Materialization and Reuse.
Proc. VLDB Endow., 2018

Assisting Service Providers In Peer-to-peer Marketplaces: Maximizing Gain Over Flexible Attributes.
CoRR, 2017

Efficient Computation of Subspace Skyline over Categorical Domains.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

Processing Analytical Workloads Incrementally.
CoRR, 2015

Reaching a desired set of users via different paths: an online advertising technique on micro-blogging platforms.
Proceedings of the 18th International Conference on Extending Database Technology, 2015

Parallel in-memory trajectory-based spatiotemporal topological join.
Proceedings of the 2015 IEEE International Conference on Big Data (IEEE BigData 2015), Santa Clara, CA, USA, October 29, 2015

Dense subgraph maintenance under streaming edge weight updates for real-time story identification.
VLDB J., 2014

Sharing across Multiple MapReduce Jobs.
ACM Trans. Database Syst., 2014

SerpentTI: flexible analytics of users, boards and domains for pinterest.
Proceedings of the International Conference on Management of Data, 2014

Price trade-offs in social media advertising.
Proceedings of the second ACM conference on Online social networks, 2014

Sampling Online Social Networks.
IEEE Trans. Knowl. Data Eng., 2013

Partitioning and Ranking Tagged Data Sources.
Proc. VLDB Endow., 2013

Some Research Opportunities on Twitter Advertising.
IEEE Data Eng. Bull., 2013

Bursty subgraphs in social networks.
Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, 2013

Peckalytics: analyzing experts and interests on Twitter.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

Information cascade at group scale.
Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013

Pollux: towards scalable distributed real-time search on microblogs.
Proceedings of the Joint 2013 EDBT/ICDT Conferences, 2013

Dense Subgraph Maintenance under Streaming Edge Weight Updates for Real-time Story Identification.
Proc. VLDB Endow., 2012

Letter from the Research Track Co-Chair.
Proc. VLDB Endow., 2011

Efficient diversity-aware search.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

Streaming multiple aggregations using phantoms.
VLDB J., 2010

Transparent anonymization: Thwarting adversaries who know the algorithm.
ACM Trans. Database Syst., 2010

MRShare: Sharing Across Multiple Queries in MapReduce.
Proc. VLDB Endow., 2010

Identifying, Attributing and Describing Spatial Bursts.
Proc. VLDB Endow., 2010

An Access Cost-Aware Approach for Object Retrieval over Multiple Sources.
Proc. VLDB Endow., 2010

Early online identification of attention gathering items in social media.
Proceedings of the Third International Conference on Web Search and Web Data Mining, 2010

TwitterMonitor: trend detection over the twitter stream.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

Crowds, clouds, and algorithms: exploring the human side of "big data" applications.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2010

Efficient identification of coupled entities in document collections.
Proceedings of the 26th International Conference on Data Engineering, 2010

Suffix tree construction algorithms on modern hardware.
Proceedings of the EDBT 2010, 2010

Anytime measures for top-k algorithms on exact and fuzzy data sets.
VLDB J., 2009

The design of a query monitoring system.
ACM Trans. Database Syst., 2009

Optimization Techniques for Reactive Network Monitoring.
IEEE Trans. Knowl. Data Eng., 2009

Improving the Performance of List Intersection.
Proc. VLDB Endow., 2009

Improved Search for Socially Annotated Data.
Proc. VLDB Endow., 2009

Measure-driven Keyword-Query Expansion.
Proc. VLDB Endow., 2009

Distribution-based Microdata Anonymization.
Proc. VLDB Endow., 2009

Finding the K highest-ranked answers in a distributed network.
Comput. Networks, 2009

Query by document.
Proceedings of the Second International Conference on Web Search and Web Data Mining, 2009

Incremental maintenance of length normalized indexes for approximate string matching.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2009

What's on the grapevine?
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2009

Information Cascades in the Blogosphere: A Look Behind the Curtain.
Proceedings of the Third International Conference on Weblogs and Social Media, 2009

Join Reordering by Join Simulation.
Proceedings of the 25th International Conference on Data Engineering, 2009

Metric Functional Dependencies.
Proceedings of the 25th International Conference on Data Engineering, 2009

Interactive query refinement.
Proceedings of the EDBT 2009, 2009

Efficient identification of starters and followers in social media.
Proceedings of the EDBT 2009, 2009

Ranking objects based on relationships and fixed associations.
Proceedings of the EDBT 2009, 2009

Hashed samples: selectivity estimators for set similarity selection queries.
Proc. VLDB Endow., 2008

On space constrained set selection problems.
Data Knowl. Eng., 2008

Adventures in the Blogosphere.
Proceedings of the Scientific and Statistical Database Management, 2008

Categorical skylines for streaming data.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Generating targeted queries for database testing.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Stretch 'n' shrink: resizing queries to user preferences.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Ad-hoc aggregations of ranked lists in the presence of hierarchies.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Fast Indexes and Algorithms for Set Similarity Selection Queries.
Proceedings of the 24th International Conference on Data Engineering, 2008

Validating Multi-column Schema Matchings by Type.
Proceedings of the 24th International Conference on Data Engineering, 2008

Optimizing away joins on data streams.
Proceedings of the 2008 International Workshop on Scalable Stream Processing System, 2008

Efficient sampling of information in social networks.
Proceedings of the 2008 ACM Workshop on Search in Social Media, 2008

Estimating the selectivity of approximate string queries.
ACM Trans. Database Syst., 2007

Editorial: Revisiting the (Machine) Semantic Web: The Missing Layers for the Human Semantic Web.
IEEE Trans. Knowl. Data Eng., 2007

Index structures for matching XML twigs using relational query processors.
Data Knowl. Eng., 2007

BlogScope: spatio-temporal analysis of the blogosphere.
Proceedings of the 16th International Conference on World Wide Web, 2007

Searching the Blogosphere.
Proceedings of the Tenth International Workshop on the Web and Databases, 2007

Ad-hoc Top-k Query Answering for Data Streams.
Proceedings of the 33rd International Conference on Very Large Data Bases, 2007

BlogScope: A System for Online Analysis of High Volume Text Streams.
Proceedings of the 33rd International Conference on Very Large Data Bases, 2007

Seeking Stable Clusters in the Blogosphere.
Proceedings of the 33rd International Conference on Very Large Data Bases, 2007

Anytime Measures for Top-k Algorithms.
Proceedings of the 33rd International Conference on Very Large Data Bases, 2007

Benchmarking declarative approximate selection predicates.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

Aggregate Query Answering on Anonymized Tables.
Proceedings of the 23rd International Conference on Data Engineering, 2007

Finding Skyline and Top-k Bargaining Solutions.
Proceedings of the 23rd International Conference on Data Engineering, 2007

Group Linkage.
Proceedings of the 23rd International Conference on Data Engineering, 2007

A Lightweight Online Framework For Query Progress Indicators.
Proceedings of the 23rd International Conference on Data Engineering, 2007

Propagating Updates in SPIDER.
Proceedings of the 23rd International Conference on Data Engineering, 2007

Fast Identification of Relational Constraint Violations.
Proceedings of the 23rd International Conference on Data Engineering, 2007

Approximation and streaming algorithms for histogram construction problems.
ACM Trans. Database Syst., 2006

Integrating XML data sources using approximate joins.
ACM Trans. Database Syst., 2006

Keyword Proximity Search in XML Trees.
IEEE Trans. Knowl. Data Eng., 2006

Letter from the Special Issue Editor.
IEEE Data Eng. Bull., 2006

Similarity Search: A Matching Based Approach.
Proceedings of the 32nd International Conference on Very Large Data Bases, 2006

Relaxing Join and Selection Queries.
Proceedings of the 32nd International Conference on Very Large Data Bases, 2006

Answering Top-k Queries Using Views.
Proceedings of the 32nd International Conference on Very Large Data Bases, 2006

Record linkage: similarity measures and algorithms.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2006

Using SPIDER: an experience report.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2006

Meta-data indexing for XPath location steps.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2006

Data Stream Query Processing.
Proceedings of the XXI Simpósio Brasileiro de Banco de Dados, 2006

Rapid Identification of Column Heterogeneity.
Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 2006

Syntactic Rule Based Approach to Web Service Composition.
Proceedings of the 22nd International Conference on Data Engineering, 2006

Reasoning About Approximate Match Query Results.
Proceedings of the 22nd International Conference on Data Engineering, 2006

HASE: A Hybrid Approach to Selectivity Estimation for Conjunctive Predicates.
Proceedings of the Advances in Database Technology, 2006

Column Heterogeneity as a Measure of Data Quality.
Proceedings of the First Int'l VLDB Workshop on Clean Databases, 2006

XML & Data Streams.
Proceedings of the Stream Data Management, 2005

Using Datacube Aggregates for Approximate Querying and Deviation Detection.
IEEE Trans. Knowl. Data Eng., 2005

Efficient Handling of Positional Predicates Within XML Query Processing.
Proceedings of the Database and XML Technologies, 2005

Answering order-based queries over XML data.
Proceedings of the 14th international conference on World Wide Web, 2005

Approximate Joins: Concepts and Techniques.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

Indexing Mixed Types for Approximate Retrieval.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

MIX: A Meta-data Indexing System for XML.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

Structure and Content Scoring for XML.
Proceedings of the 31st International Conference on Very Large Data Bases, Trondheim, Norway, August 30, 2005

Multiple Aggregations Over Data Streams.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

SPIDER: flexible matching in databases.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

Monitoring K-Nearest Neighbor Queries Over Moving Objects.
Proceedings of the 21st International Conference on Data Engineering, 2005

Adaptive Processing of Top-K Queries in XML.
Proceedings of the 21st International Conference on Data Engineering, 2005

Data Stream Query Processing.
Proceedings of the 21st International Conference on Data Engineering, 2005

The threshold join algorithm for top-k queries in distributed sensor networks.
Proceedings of the 2nd Workshop on Data Management for Sensor Networks, 2005

Introduction to special issue with best papers from KDD 2002.
Inf. Syst., 2004

Approximate NN queries on Streams with Guaranteed Error/performance Bounds.
(e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31, 2004

Flexible String Matching Against Large Databases in Practice.
(e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31, 2004

Merging the Results of Approximate Match Operations.
(e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31, 2004

Routing XML Queries.
Proceedings of the 20th International Conference on Data Engineering, 2004

LDC: Enabling Search By Partial Distance In A Hyper-Dimensional Space.
Proceedings of the 20th International Conference on Data Engineering, 2004

NNH: Improving Performance of Nearest-Neighbor Searches Using Histograms.
Proceedings of the Advances in Database Technology, 2004

Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Data Sets.
IEEE Trans. Knowl. Data Eng., 2003

Two-dimensional substring indexing.
J. Comput. Syst. Sci., 2003

Generalized substring selectivity estimation.
J. Comput. Syst. Sci., 2003

Text joins in an RDBMS for web data integration.
Proceedings of the Twelfth International World Wide Web Conference, 2003

Data Stream Query Processing: A Tutorial.
Proceedings of the 29th International Conference on Very Large Data Bases, 2003

Efficient Approximation Of Optimization Queries Under Parametric Aggregation Constraints.
Proceedings of the 29th International Conference on Very Large Data Bases, 2003

A System for Keyword Proximity Search on XML Databases.
Proceedings of the 29th International Conference on Very Large Data Bases, 2003

Space Constrained Selection Problems for Data Warehouses and Pervasive Computing.
Proceedings of the 15th International Conference on Scientific and Statistical Database Management (SSDBM 2003), 2003

Panel: Querying Networked Databases.
Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, 2003

Correlating synchronous and asynchronous data streams.
Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 24, 2003

Ranked Join Indices.
Proceedings of the 19th International Conference on Data Engineering, 2003

Index-Based Approximate XML Joins.
Proceedings of the 19th International Conference on Data Engineering, 2003

Text Joins for Data Cleansing and Integration in an RDBMS.
Proceedings of the 19th International Conference on Data Engineering, 2003

Navigation- vs. Index-Based XML Multi-Query Processing.
Proceedings of the 19th International Conference on Data Engineering, 2003

Approximate Matching in XML.
Proceedings of the 19th International Conference on Data Engineering, 2003

Efficient computation of spatial joins with intersection predicates.
Int. J. Geogr. Inf. Sci., 2002

Dynamic multidimensional histograms.
Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, 2002

Approximate XML joins.
Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, 2002

Holistic twig joins: optimal XML pattern matching.
Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, 2002

Fast Algorithms For Hierarchical Range Histogram Construction.
Proceedings of the Twenty-first ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2002

Non-linear dimensionality reduction techniques for classification and visualization.
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002

Stream Data Management: Research Directions and Opportunities.
Proceedings of the International Database Engineering & Applications Symposium, 2002

Approximating a Data Stream for Querying and Estimation: Algorithms and Performance Evaluation.
Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26, 2002

Fast Mining of Massive Tabular Data via Approximate Distance Computations.
Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26, 2002

Structural Joins: A Primitive for Efficient XML Query Pattern Matching.
Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26, 2002

Reminiscences on Influential Papers.
SIGMOD Rec., 2001

DIMACS Summer School Tutorial on New Frontiers in Data Mining.
SIGMOD Rec., 2001

Using q-grams in a DBMS for Approximate String Processing.
IEEE Data Eng. Bull., 2001

Approximate String Joins in a Database (Almost) for Free.
Proceedings of the VLDB 2001, 2001

Data-streams and histograms.
Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, 2001

Entropy Based Approximate Querying and Exploration of Datacubes.
Proceedings of the 13th International Conference on Scientific and Statistical Database Management, 2001

PREFER: A System for the Efficient Execution of Multi-parametric Ranked Queries.
Proceedings of the 2001 ACM SIGMOD international conference on Management of data, 2001

Efficient and Tunable Similar Set Retrieval.
Proceedings of the 2001 ACM SIGMOD international conference on Management of data, 2001

An Efficient Approximation Scheme for Data Mining Tasks.
Proceedings of the 17th International Conference on Data Engineering, 2001

Counting Twig Matches in a Tree.
Proceedings of the 17th International Conference on Data Engineering, 2001

High Dimensional Similarity Joins: Algorithms and Performance Evaluation.
IEEE Trans. Knowl. Data Eng., 2000

Indexing support for spatial joins.
Data Knowl. Eng., 2000

Identifying Representative Trends in Massive Time Series Data Sets Using Sketches.
Proceedings of the VLDB 2000, 2000

On Effective Multi-Dimensional Indexing for Strings.
Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 2000

Optimal Histograms for Hierarchical Range Queries.
Proceedings of the Nineteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2000

Selectivity Estimation for Boolean Queries.
Proceedings of the Nineteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2000

Space Efficient Bitmap Indexing.
Proceedings of the 2000 ACM CIKM International Conference on Information and Knowledge Management, 2000

Mining Deviants in a Time Series Database.
Proceedings of the VLDB'99, 1999

Fast algorithms for spatial and multidimensional joins.
PhD thesis, 1998

Optimal Histograms with Quality Guarantees.
Proceedings of the VLDB'98, 1998

Size Separation Spatial Join.
Proceedings of the SIGMOD 1997, 1997

Filter Trees for Managing Spatial Data over a Range of Size Granularities.
Proceedings of the VLDB'96, 1996

Declustering Spatial Databases on a Multi-Computer Architecture.
Proceedings of the Advances in Database Technology, 1996
