2025
Skyrise: Exploiting Serverless Cloud Infrastructure for Elastic Data Processing.
CoRR, January, 2025
An Empirical Evaluation of Serverless Cloud Infrastructure for Large-Scale Data Processing.
CoRR, January, 2025
Compression in Main Memory Database Systems: Cost and Performance Trade-Offs of Workload-Driven Data Encoding.
Proceedings of the Datenbanksysteme für Business, 2025
A Demonstration of Skyrise: A Serverless Query Processor.
Proceedings of the Datenbanksysteme für Business, 2025
2024
InferDB: In-Database Machine Learning Inference Using Indexes.
Proc. VLDB Endow., April, 2024
Addressing Data Management Challenges for Interoperable Data Science.
Proceedings of Workshops at the 50th International Conference on Very Large Data Bases, 2024
Ghostwriter: a Distributed Message Broker on RDMA and NVM.
Proceedings of Workshops at the 50th International Conference on Very Large Data Bases, 2024
A Three-Tier Buffer Manager Integrating CXL Device Memory for Database Systems.
Proceedings of the 40th International Conference on Data Engineering, ICDE 2024, 2024
Deco: Fast and Accurate Decentralized Aggregation of Count-Based Windows in Large-Scale IoT Applications.
Proceedings of the 27th International Conference on Extending Database Technology, 2024
Surprise Benchmarking: The Why, What, and How.
Proceedings of the Tenth International Workshop on Testing Database Systems, 2024
2023
Erratum to: Reviving the Workshop Series on Testing Database Systems - DBTest.
Datenbank-Spektrum, March, 2023
TPCx-AI - An Industry Standard Benchmark for Artificial Intelligence and Machine Learning Systems.
Proc. VLDB Endow., 2023
Analyzing Vectorized Hash Tables Across CPU Architectures.
Proc. VLDB Endow., 2023
BabelMR: A Polyglot Framework for Serverless MapReduce.
Joint Proceedings of Workshops at the 49th International Conference on Very Large Data Bases (VLDB 2023), Vancouver, Canada, August 28, 2023
Evaluating SIMD Compiler-Intrinsics for Database Systems.
Joint Proceedings of Workshops at the 49th International Conference on Very Large Data Bases (VLDB 2023), Vancouver, Canada, August 28, 2023
Desis: Efficient Window Aggregation in Decentralized Networks.
Proceedings of the 26th International Conference on Extending Database Technology, 2023
Efficient Multi-Model Management.
Proceedings of the 26th International Conference on Extending Database Technology, 2023
RMG Sort: Radix-Partitioning-Based Multi-GPU Sorting.
Proceedings of the Datenbanksysteme für Business, 2023
What We Can Learn from Persistent Memory for CXL.
Proceedings of the Datenbanksysteme für Business, 2023
2022
Reminiscences on Influential Papers.
SIGMOD Rec., December, 2022
Datenbank-Spektrum, November, 2022
Reviving the Workshop Series on Testing Database Systems - DBTest.
Datenbank-Spektrum, November, 2022
LogStore: A Workload-Aware, Adaptable Key-Value Store on Hybrid Storage Systems.
IEEE Trans. Knowl. Data Eng., 2022
Imperative or Functional Control Flow Handling: Why not the Best of Both Worlds?
SIGMOD Rec., 2022
PerMA-Bench: Benchmarking Persistent Memory Access.
Proc. VLDB Endow., 2022
Rethinking Stateful Stream Processing with RDMA.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022
Evaluating Multi-GPU Sorting with Modern Interconnects.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022
Triton Join: Efficiently Scaling to a Large Join State on GPUs with Fast Interconnects.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022
Materialization and Reuse Optimizations for Production Data Science Pipelines.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022
Rhino: Efficient Management of Very Large Distributed State for Stream Processing Engines [Abstract].
Proceedings of the LWDA 2022 Workshops: FGWM, 2022
Efficiently Managing Deep Learning Models in a Distributed Environment.
Proceedings of the 25th International Conference on Extending Database Technology, 2022
Evaluating In-Memory Hash Joins on Persistent Memory.
Proceedings of the 25th International Conference on Extending Database Technology, 2022
DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022
Darwin: Scale-In Stream Processing.
Proceedings of the 12th Conference on Innovative Data Systems Research, 2022
2021
Scotty: General and Efficient Open-source Window Aggregation for Stream Processing Systems.
ACM Trans. Database Syst., 2021
Viper: An Efficient Hybrid PMem-DRAM Key-Value Store.
Proc. VLDB Endow., 2021
The Collaborative Research Center FONDA.
Datenbank-Spektrum, 2021
A Survey of Big Data, High Performance Computing, and Machine Learning Benchmarks.
Proceedings of the Performance Evaluation and Benchmarking, 2021
Maximizing Persistent Memory Bandwidth Utilization for OLAP Workloads.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021
Scale-down experiments on TPCx-HS.
Proceedings of the BiDEDE '21: Proceedings of the International Workshop on Big Data in Emergent Distributed Environments, 2021
LogStore: A Workload-aware, Adaptable Key-Value Store on Hybrid Storage Systems (Extended abstract).
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021
Efficient Control Flow in Dataflow Systems: When Ease-of-Use Meets High Performance.
Proceedings of the 37th IEEE International Conference on Data Engineering, 2021
Drop It In Like It's Hot: An Analysis of Persistent Memory as a Drop-in Replacement for NVMe SSDs.
Proceedings of the 17th International Workshop on Data Management on New Hardware, 2021
2020
Advice from SIGMOD/PODS 2020.
SIGMOD Rec., 2020
Quantifying TPC-H Choke Points and Their Optimizations.
Proc. VLDB Endow., 2020
A distributed data exchange engine for polystores.
it Inf. Technol., 2020
How Fast Can We Insert? A Performance Study of Apache Kafka.
CoRR, 2020
Rhino: Efficient Management of Very Large Distributed State for Stream Processing Engines.
Proceedings of the 2020 International Conference on Management of Data, 2020
Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects.
Proceedings of the 2020 International Conference on Management of Data, 2020
Grizzly: Efficient Stream Processing Through Adaptive Query Compilation.
Proceedings of the 2020 International Conference on Management of Data, 2020
Optimizing Machine Learning Workloads in Collaborative Environments.
Proceedings of the 2020 International Conference on Management of Data, 2020
Disco: Efficient Distributed Window Aggregation.
Proceedings of the 23rd International Conference on Extending Database Technology, 2020
Incremental stream query analytics.
Proceedings of the 14th ACM International Conference on Distributed and Event-based Systems, 2020
2019
Definition of Data Streams.
Proceedings of the Encyclopedia of Big Data Technologies, 2019
Analyzing Efficient Stream Processing on Modern Hardware.
Proc. VLDB Endow., 2019
An Intermediate Representation for Optimizing Machine Learning Pipelines.
Proc. VLDB Endow., 2019
AJoin: Ad-hoc Stream Joins at Scale.
Proc. VLDB Endow., 2019
Particulate Matter Matters - The Data Science Challenge @ BTW 2019.
Datenbank-Spektrum, 2019
SENSE: Scalable Data Acquisition from Distributed Sensors with Guaranteed Time Coherence.
CoRR, 2019
ADABench - Towards an Industry Standard Benchmark for Advanced Analytics.
Proceedings of the Performance Evaluation and Benchmarking for the Era of Cloud(s), 2019
AStream: Ad-hoc Shared Stream Processing.
Proceedings of the 2019 International Conference on Management of Data, 2019
Muses: Distributed Data Migration System for Polystores.
Proceedings of the 35th IEEE International Conference on Data Engineering, 2019
Efficient Window Aggregation with General Stream Slicing.
Proceedings of the Advances in Database Technology, 2019
Continuous Deployment of Machine Learning Pipelines.
Proceedings of the Advances in Database Technology, 2019
Generating Reproducible Out-of-Order Data Streams.
Proceedings of the 13th ACM International Conference on Distributed and Event-based Systems, 2019
Performance Analysis and Automatic Tuning of Hash Aggregation on GPUs.
Proceedings of the 15th International Workshop on Data Management on New Hardware, 2019
Explanation of Air Pollution Using External Data Sources.
Proceedings of the Datenbanksysteme für Business, 2019
An Overview of Hawk: A Hardware-Tailored Code Generator for the Heterogeneous Many Core Age.
Proceedings of the Datenbanksysteme für Business, 2019
On-the-fly Reconfiguration of Query Plans for Stateful Stream Processing Engines.
Proceedings of the Datenbanksysteme für Business, 2019
2018
Generating custom code for efficient query execution on heterogeneous processors.
VLDB J., 2018
Performance Evaluation and Optimization of Multi-Dimensional Indexes in Hive.
IEEE Trans. Serv. Comput., 2018
Dagstuhl Seminar on Big Stream Processing.
SIGMOD Rec., 2018
Data Management Systems Research at TU Berlin.
SIGMOD Rec., 2018
Efficient and Scalable k‑Means on GPUs.
Datenbank-Spektrum, 2018
Labyrinth: Compiling Imperative Control Flow to Parallel Dataflows.
CoRR, 2018
Benchmarking Distributed Stream Processing Engines.
CoRR, 2018
Methods for Quantifying Energy Consumption in TPC-H.
Proceedings of the 2018 ACM/SPEC International Conference on Performance Engineering, 2018
PolyBench: The First Benchmark for Polystores.
Proceedings of the Performance Evaluation and Benchmarking for the Era of Artificial Intelligence, 2018
Benchmarking Distributed Data Processing Systems for Machine Learning Workloads.
Proceedings of the Performance Evaluation and Benchmarking for the Era of Artificial Intelligence, 2018
Scotty: Efficient Window Aggregation for Out-of-Order Stream Processing.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018
Analysis of TPCx-IoT: The First Industry Standard Benchmark for IoT Gateway Systems.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018
Benchmarking Distributed Stream Data Processing Systems.
Proceedings of the 34th IEEE International Conference on Data Engineering, 2018
Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive Windowing.
Proceedings of the 21st International Conference on Extending Database Technology, 2018
Efficient k-means on GPUs.
Proceedings of the 14th International Workshop on Data Management on New Hardware, 2018
ScootR: Scaling R Dataframes on Dataflow Systems.
Proceedings of the ACM Symposium on Cloud Computing, 2018
2017
BlockJoin: Efficient Matrix Partitioning Through Joins.
Proc. VLDB Endow., 2017
Big Stream Processing Systems (Dagstuhl Seminar 17441).
Dagstuhl Reports, 2017
Generating Custom Code for Efficient Query Execution on Heterogeneous Processors.
CoRR, 2017
PEEL: A Framework for Benchmarking Distributed Systems and Algorithms.
Proceedings of the Performance Evaluation and Benchmarking for the Analytics Era, 2017
Query Centric Partitioning and Allocation for Partially Replicated Database Systems.
Proceedings of the 2017 ACM International Conference on Management of Data, 2017
Benchmarking Data Flow Systems for Scalable Machine Learning.
Proceedings of the 4th ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond, 2017
I²: Interactive Real-Time Visualization for Streaming Data.
Proceedings of the 20th International Conference on Extending Database Technology, 2017
PROTEUS: Scalable Online Machine Learning for Predictive Analytics and Real-Time Interactive Visualization.
Proceedings of the Workshops of the EDBT/ICDT 2017 Joint Conference (EDBT/ICDT 2017), 2017
STREAMLINE - Streamlined Analysis of Data at Rest and Data in Motion.
Proceedings of the Workshops of the EDBT/ICDT 2017 Joint Conference (EDBT/ICDT 2017), 2017
Optimized on-demand data streaming from sensor nodes.
Proceedings of the 2017 Symposium on Cloud Computing, SoCC 2017, Santa Clara, CA, USA, 2017
Analysis of TPC-DS: the first standard benchmark for SQL-based big data systems.
Proceedings of the 2017 Symposium on Cloud Computing, SoCC 2017, Santa Clara, CA, USA, 2017
Gilbert: Declarative Sparse Linear Algebra on Massively Parallel Dataflow Systems.
Proceedings of the Datenbanksysteme für Business, 2017
2016
Apache Flink in current research.
it Inf. Technol., 2016
Towards Streamlined Big Data Analytics.
ERCIM News, 2016
From BigBench to TPCx-BB: Standardization of a Big Data Benchmark.
Proceedings of the Performance Evaluation and Benchmarking. Traditional - Big Data - Interest of Things, 2016
2015
Enhancing Data Generation in TPCx-HS with a Non-uniform Random Distribution.
Proceedings of the Performance Evaluation and Benchmarking: Traditional to Big Data to Internet of Things, 2015
Big Data Benchmark Compendium.
Proceedings of the Performance Evaluation and Benchmarking: Traditional to Big Data to Internet of Things, 2015
The Vision of BigBench 2.0.
Proceedings of the Fourth Workshop on Data analytics in the Cloud, 2015
Just can't get enough: Synthesizing Big Data.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015
Die Apache Flink Plattform zur parallelen Analyse von Datenströmen und Stapeldaten.
Proceedings of the LWA 2015 Workshops: KDML, 2015
DualTable: A hybrid storage model for update optimization in Hive.
Proceedings of the 31st IEEE International Conference on Data Engineering, 2015
High performance stream queries in scala.
Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems, 2015
2014
TPC-DI: The First Industry Benchmark for Data Integration.
Proc. VLDB Endow., 2014
DGFIndex for Smart Grid: Enhancing Hive with a Cost-Effective Multidimensional Range Index.
Proc. VLDB Endow., 2014
DGFIndex for Smart Grid: Enhancing Hive with a Cost-Effective Multidimensional Range Index.
CoRR, 2014
DualTable: A Hybrid Storage Model for Update Optimization in Hive.
CoRR, 2014
Towards a Complete BigBench Implementation.
Proceedings of the Big Data Benchmarking - 5th International Workshop, 2014
Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data.
Proceedings of the Performance Characterization and Benchmarking. Traditional to Big Data, 2014
PSBench: a benchmark for content- and topic-based publish/subscribe systems.
Proceedings of the Middleware '14 Posters & Demos Session, 2014
CaSSanDra: An SSD boosted key-value store.
Proceedings of the IEEE 30th International Conference on Data Engineering, Chicago, 2014
Materialized views in Cassandra.
Proceedings of 24th Annual International Conference on Computer Science and Software Engineering, 2014
Optimizing key-value stores for hybrid storage architectures.
Proceedings of 24th Annual International Conference on Computer Science and Software Engineering, 2014
2013
Benchmarking Big Data Systems and the BigData Top100 List.
Big Data, 2013
Variations of the star schema benchmark to test the effects of data skew on query performance.
Proceedings of the ACM/SPEC International Conference on Performance Engineering, 2013
A BigBench Implementation in the Hadoop Ecosystem.
Proceedings of the Advancing Big Data Benchmarks, 2013
Rapid development of data generators using meta generators in PDGF.
Proceedings of the Sixth International Workshop on Testing Database Systems, 2013
BigBench: towards an industry standard benchmark for big data analytics.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013
Poster: MADES - a multi-layered, adaptive, distributed event store.
Proceedings of the 7th ACM International Conference on Distributed Event-Based Systems, 2013
Grand challenge: the bluebay soccer monitoring engine.
Proceedings of the 7th ACM International Conference on Distributed Event-Based Systems, 2013
2012
Solving Big Data Challenges for Enterprise Application Performance Management.
Proc. VLDB Endow., 2012
Landmark-assisted location and tracking in outdoor mobile network.
Multim. Tools Appl., 2012
Efficient update data generation for DBMS benchmarks.
Proceedings of the Third Joint WOSP/SIPEW International Conference on Performance Engineering, 2012
Proceedings of the Specifying Big Data Benchmarks, 2012
BigBench Specification V0.1 - BigBench: An Industry Standard Benchmark for Big Data Analytics.
Proceedings of the Specifying Big Data Benchmarks, 2012
Processing Big Events with Showers and Streams.
Proceedings of the Specifying Big Data Benchmarks, 2012
Setting the Direction for Big Data Benchmark Standards.
Proceedings of the Selected Topics in Performance Evaluation and Benchmarking, 2012
Solving manufacturing equipment monitoring through efficient complex event processing: DEBS grand challenge.
Proceedings of the Sixth ACM International Conference on Distributed Event-Based Systems, 2012
2011
Efficiency in Cluster Database Systems - Dynamic and Workload-Aware Scaling and Allocation.
PhD thesis, 2011
A PDGF Implementation for TPC-H.
Proceedings of the Topics in Performance Evaluation, Measurement and Characterization, 2011
Parallel data generation for performance analysis of large, complex RDBMS.
Proceedings of the Fourth International Workshop on Testing Database Systems, 2011
A protocol for disaster data evacuation.
Proceedings of the ACM SIGCOMM 2011 Conference on Applications, 2011
Demonstration des Parallel Data Generation Framework.
Proceedings of the Datenbanksysteme für Business, 2011
2010
A Data Generator for Cloud-Scale Benchmarking.
Proceedings of the Performance Evaluation, Measurement and Characterization of Complex Systems, 2010
Introducing Scalileo: a Java based scaling framework.
Proceedings of the 1st International Conference on Energy-Efficient Computing and Networking, 2010
2009
Design and Implementation of the Fast Send Protocol.
J. Digit. Inf. Manag., 2009
Generating Shifting Workloads to Benchmark Adaptability in Relational Database Systems.
Proceedings of the Performance Evaluation and Benchmarking, 2009
2008
Interactive TV Services on Mobile Devices.
IEEE Multim., 2008
Dynamic allocation in a self-scaling cluster database.
Concurr. Comput. Pract. Exp., 2008
2007
Proceedings of the 15th International Conference on Multimedia 2007, 2007
Fast Send Protocol - minimizing sending time in high-speed bulk data transfers.
Proceedings of the Second IEEE International Conference on Digital Information Management (ICDIM), 2007