2025
Accelerating machine learning queries with linear algebra query processing.
Distributed Parallel Databases, December, 2025
2024
A survey on the evolution of stream processing systems.
VLDB J., 2024
FeatNavigator: Automatic Feature Augmentation on Tabular Data.
CoRR, 2024
OmniMatch: Effective Self-Supervised Any-Join Discovery in Tabular Data Repositories.
CoRR, 2024
CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024
AutoFeat: Transitive Feature Discovery over Join Paths.
Proceedings of the 40th IEEE International Conference on Data Engineering, 2024
Key Insights from a Feature Discovery User Study.
Proceedings of the 2024 Workshop on Human-In-the-Loop Data Analytics, 2024
Evaluating Stream Processing Autoscalers.
Proceedings of the 18th ACM International Conference on Distributed and Event-based Systems, 2024
LLM-PQA: LLM-enhanced Prediction Query Answering.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024
Human-in-the-Loop Feature Discovery for Tabular Data.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024
2023
Styx: Deterministic Transactional Serverless Functions on Streaming Dataflows.
CoRR, 2023
Metadata Representations for Queryable Repositories of Machine Learning Models.
IEEE Access, 2023
Leveraging Large Language Models for Sequential Recommendation.
Proceedings of the 17th ACM Conference on Recommender Systems, 2023
Optimizing ML Inference Queries Under Constraints.
Proceedings of the Web Engineering - 23rd International Conference, 2023
Macaroni: Crawling and Enriching Metadata from Public Model Zoos.
Proceedings of the Web Engineering - 23rd International Conference, 2023
Topio: An Open-Source Web Platform for Trading Geospatial Data.
Proceedings of the Web Engineering - 23rd International Conference, 2023
An Empirical Performance Comparison between Matrix Multiplication Join and Hash Join on GPUs.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023
Towards Evaluating Stream Processing Autoscalers.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023
Optimizing Machine Learning Inference Queries for Multiple Objectives.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023
Amalur: Data Integration Meets Machine Learning.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023
Topio Marketplace: Search and Discovery of Geospatial Data.
Proceedings of the 26th International Conference on Extending Database Technology, 2023
Adaptive Distributed Streaming Similarity Joins.
Proceedings of the 17th ACM International Conference on Distributed and Event-based Systems, 2023
Stateful Entities: Object-oriented Cloud Applications as Distributed Dataflows.
Proceedings of the 13th Conference on Innovative Data Systems Research, 2023
Automatic Table Union Search with Tabular Representation Learning.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Transactions across serverless functions leveraging stateful dataflows.
Inf. Syst., 2022
Metadata Representations for Queryable ML Model Zoos.
CoRR, 2022
SiMa: Effective and Efficient Data Silo Federation Using Graph Neural Networks.
CoRR, 2022
Bridging the Gap between Data Integration and ML Systems.
CoRR, 2022
S-QUERY: Opening the Black Box of Internal Stream Processor State.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022
Join Path-Based Data Augmentation for Decision Trees.
Proceedings of the 38th IEEE International Conference on Data Engineering Workshops, 2022
Amalur: Next-generation Data Integration in Data Lakes.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022
Data Platforms for Data Spaces.
Proceedings of the Data Spaces - Design, 2022
2021
Scotty: General and Efficient Open-source Window Aggregation for Stream Processing Systems.
ACM Trans. Database Syst., 2021
Valentine in Action: Matching Tabular Data at Scale.
Proc. VLDB Endow., 2021
Hazelcast Jet: Low-latency Stream Processing at the 99.99th Percentile.
Proc. VLDB Endow., 2021
Stateful Entities: Object-oriented Cloud Applications as Distributed Dataflows.
CoRR, 2021
Clonos: Consistent Causal Recovery for Highly-Available Streaming Dataflows.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021
Valentine: Evaluating Matching Techniques for Dataset Discovery.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021
Distributed transactions on serverless stateful functions.
Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems, 2021
2020
Beyond Analytics: The Evolution of Stream Processing Systems.
Proceedings of the 2020 International Conference on Management of Data, 2020
REMA: Graph Embeddings-based Relational Schema Matching.
Proceedings of the Workshops of the EDBT/ICDT 2020 Joint Conference, 2020
2019
Stream Window Aggregation Semantics and Optimization.
Proceedings of the Encyclopedia of Big Data Technologies, 2019
An Intermediate Representation for Optimizing Machine Learning Pipelines.
Proc. VLDB Endow., 2019
Stateful Functions as a Service in Action.
Proc. VLDB Endow., 2019
Muses: Distributed Data Migration System for Polystores.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019
Efficient Window Aggregation with General Stream Slicing.
Proceedings of the Advances in Database Technology, 2019
Operational Stream Processing: Towards Scalable and Consistent Event-Driven Applications.
Proceedings of the Advances in Database Technology, 2019
Generating Reproducible Out-of-Order Data Streams.
Proceedings of the 13th ACM International Conference on Distributed and Event-based Systems, 2019
2018
Benchmarking Distributed Stream Processing Engines.
CoRR, 2018
Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018
Benchmarking Distributed Stream Data Processing Systems.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018
2017
BlockJoin: Efficient Matrix Partitioning Through Joins.
Proc. VLDB Endow., 2017
Optimized on-demand data streaming from sensor nodes.
Proceedings of the 2017 Symposium on Cloud Computing, SoCC 2017, Santa Clara, CA, USA, 2017
Large-Scale Data Stream Processing Systems.
Proceedings of the Handbook of Big Data Technologies, 2017
2016
Implicit Parallelism through Deep Language Embedding.
SIGMOD Rec., 2016
Apache Flink in current research.
it Inf. Technol., 2016
Bridging the gap: towards optimization across linear and relational algebra.
Proceedings of the 3rd ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond, 2016
Emma in Action: Declarative Dataflows for Scalable Data Analysis.
Proceedings of the 2016 International Conference on Management of Data, 2016
Apache Flink: Stream Analytics at Scale.
Proceedings of the 2016 IEEE International Conference on Cloud Engineering Workshop, 2016
Cutty: Aggregate Sharing for User-Defined Windows.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016
2015
Apache Flink™: Stream and Batch Processing in a Single Engine.
IEEE Data Eng. Bull., 2015
Optimistic Recovery for Iterative Dataflows in Action.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015
Implicit Parallelism through Deep Language Embedding.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015
2013
Scalable view-based techniques for web data : algorithms and systems. (Techniques efficaces basées sur des vues matérialisées pour la gestion des données du Web : algorithmes et systèmes).
PhD thesis, 2013
Delta: Scalable Data Dissemination under Capacity Constraints.
Proc. VLDB Endow., 2013
2012
Minersoft: Software retrieval in grid and cloud computing infrastructures.
ACM Trans. Internet Techn., 2012
Materialized view selection for XQuery workloads.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012
ViP2P: Efficient XML Management in DHT Networks.
Proceedings of the Web Engineering - 12th International Conference, 2012
2011
The ViP2P Platform: XML Views in P2P.
CoRR, 2011
2010
Searching for Software on the EGEE Infrastructure.
J. Grid Comput., 2010
LiquidXML: adaptive XML content redistribution.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010
2009
Effective Keyword Search for Software Resources Installed in Large-Scale Grid Infrastructures.
Proceedings of the 2009 IEEE/WIC/ACM International Conference on Web Intelligence, 2009
Harvesting Large-Scale Grids for Software Resources.
Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009