Johannes Gehrke

Orcid: 0009-0006-6293-5209

  • Microsoft, USA
  • Cornell University, Ithaca, USA

According to our database, Johannes Gehrke authored at least 246 papers between 1995 and 2024.

ACM Fellow

ACM Fellow 2014, "For his contributions to data mining and data stream query processing."




NL2Code-Reasoning and Planning with LLMs for Code Development.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

Sparks of Artificial General Intelligence: Early experiments with GPT-4.
CoRR, 2023

Database Gyms.
Proceedings of the 13th Conference on Innovative Data Systems Research, 2023

The DB Community vis-à-vis Environmental, Health, and Societal Grand Challenges: Innovation Engine, Plumber, or Bystander?
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

VLDB Panel Summary: "The Future of Data(base) Education: Is the Cow Book Dead?".
SIGMOD Rec., 2021

DSB: A Decision Support Benchmark for Workload-Driven and Traditional Database Systems.
Proc. VLDB Endow., 2021

Meeting Effectiveness and Inclusiveness in Remote Collaboration.
Proc. ACM Hum. Comput. Interact., 2021

Instance-Optimized Data Layouts for Cloud Analytics Workloads.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

FastVer: Making Data Integrity a Commodity.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021

Technical Perspective: Checking Invariant Confluence, In Whole or In Parts.
SIGMOD Rec., 2020

Resonance: Replacing Software Constants with Context-Aware Models in Real-time Communication.
CoRR, 2020

Programming by Rewards.
CoRR, 2020

Lightweight Inter-transaction Caching with Precise Clocks and Dynamic Self-invalidation.
CoRR, 2020

The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework.
CoRR, 2020

Qd-tree: Learning Data Layouts for Big Data Analytics.
Proceedings of the 2020 International Conference on Management of Data, 2020

ALEX: An Updatable Adaptive Learned Index.
Proceedings of the 2020 International Conference on Management of Data, 2020

Lumos: A Library for Diagnosing Metric Regressions in Web-Scale Applications.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Testing Framework, and Challenge Results.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

iBox: Internet in a Box.
Proceedings of the HotNets '20: The 19th ACM Workshop on Hot Topics in Networks, 2020

Achieving Low Latency Transactions for Geo-replicated Storage with Blotter.
Proceedings of the Encyclopedia of Big Data Technologies, 2019

Letter from the TCDE Awards Committee.
IEEE Data Eng. Bull., 2019

Reinforcement learning for bandwidth estimation and congestion control in real-time communications.
CoRR, 2019

Multi-version Indexing in Flash-based Key-Value Stores.
CoRR, 2019

ALEX: An Updatable Adaptive Learned Index.
CoRR, 2019

An Empirical Analysis of Deep Learning for Cardinality Estimation.
CoRR, 2019

Intrusive and Non-Intrusive Perceptual Speech Quality Assessment Using a Convolutional Neural Network.
Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

Database Systems 2.0.
Proceedings of the VLDB 2019 PhD Workshop, 2019

Supervised Classifiers for Audio Impairments with Noisy Labels.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Scalable Noisy Speech Dataset and Online Subjective Test Framework.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Non-intrusive Speech Quality Assessment Using Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2019

Efficient, Consistent Distributed Computation with Predictive Treaties.
Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, March 25-28, 2019

Veritas: Shared Verifiable Databases and Tables in the Cloud.
Proceedings of the 9th Biennial Conference on Innovative Data Systems Research, 2019

Continuous Queries in Sensor Networks.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Database Techniques to Improve Scientific Simulations.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Randomization Methods to Ensure Data Privacy.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Event and Pattern Detection over Streams.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Scalable Decision Tree Construction.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

DBMS Interface.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

DBMS Component.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

Generating data series query workloads.
VLDB J., 2018

HypDB: A Demonstration of Detecting, Explaining and Resolving Bias in OLAP queries.
Proc. VLDB Endow., 2018

Improving Optimistic Concurrency Control Through Transaction Batching and Operation Reordering.
Proc. VLDB Endow., 2018

HypDB: Detect, Explain And Resolve Bias in OLAP.
CoRR, 2018

Cuttlefish: A Lightweight Primitive for Adaptive Query Processing.
CoRR, 2018

SLAOrchestrator: Reducing the Cost of Performance SLAs for Cloud Data Analytics.
Proceedings of the 2018 USENIX Annual Technical Conference, 2018

Bias in OLAP Queries: Detection, Explanation, and Removal.
Proceedings of the 2018 International Conference on Management of Data, 2018

Learning State Representations for Query Optimization with Deep Reinforcement Learning.
Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, 2018

Special Section on the International Conference on Data Engineering 2015.
IEEE Trans. Knowl. Data Eng., 2017

Blotter: Low Latency Transactions for Geo-Replicated Storage.
Proceedings of the 26th International Conference on World Wide Web, 2017

READY: Completeness is in the Eye of the Beholder.
Proceedings of the 8th Biennial Conference on Innovative Data Systems Research, 2017

Enabling Lightweight Transactions with Precision Time.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

Geo-Replication: Fast If Possible, Consistent If Necessary.
IEEE Data Eng. Bull., 2016

Technical Perspective: Naiad.
Commun. ACM, 2016

Hashtag Recommendation for Enterprise Applications.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

Conclusions and Looking Forward.
Proceedings of the Data Stream Management - Processing High-Speed Data Streams, 2016

Data Stream Management: A Brave New World.
Proceedings of the Data Stream Management - Processing High-Speed Data Streams, 2016

Sketch-Based Multi-Query Processing over Data Streams.
Proceedings of the Data Stream Management - Processing High-Speed Data Streams, 2016

Pricing Queries Approximately Optimally.
CoRR, 2015

The Homeostasis Protocol: Avoiding Transaction Coordination Through Program Analysis.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Query Workloads for Data Series Indexes.
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

Edge-Weighted Personalized PageRank: Breaking A Decade-Old Performance Barrier.
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission.
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

Balancing Isolation and Sharing of Data in Third-Party Extensible App Ecosystems.
Proceedings of the Engineering the Web in the Big Data Era - 15th International Conference, 2015

Guardat: enforcing data policies at the storage layer.
Proceedings of the Tenth European Conference on Computer Systems, 2015

Centiman: elastic, high performance optimistic concurrency control by watermarking.
Proceedings of the Sixth ACM Symposium on Cloud Computing, 2015

Guest Editorial: Special Section on the International Conference on Data Engineering.
IEEE Trans. Knowl. Data Eng., 2014

The Beckman Report on Database Research.
SIGMOD Rec., 2014

Sparse Partially Linear Additive Models.
CoRR, 2014

Writes that Fall in the Forest and Make no Sound: Semantics-Based Adaptive Data Consistency.
CoRR, 2014

Big data and its technical challenges.
Commun. ACM, 2014

Explainable security for relational databases.
Proceedings of the International Conference on Management of Data, 2014

Fast Iterative Graph Computation with Block Updates.
Proc. VLDB Endow., 2013

An Experimental Analysis of Iterated Spatial Joins in Main Memory.
Proc. VLDB Endow., 2013

Front Matter.
Proc. VLDB Endow., 2013

A Quantitative Evaluation Framework for Missing Value Imputation Algorithms.
CoRR, 2013

Fine-grained disclosure control for app ecosystems.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2013

Beyond myopic inference in big data pipelines.
Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013

Accurate intelligible models with pairwise interactions.
Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013

Asynchronous Large-Scale Graph Processing Made Easy.
Proceedings of the Sixth Biennial Conference on Innovative Data Systems Research, 2013

Big Data Pipelines.
Proceedings of the Sixth Biennial Conference on Innovative Data Systems Research, 2013

Secure and customizable web development in the safe activation framework.
Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, 2013

Entangled queries: Enabling declarative data-driven coordination.
ACM Trans. Database Syst., 2012

Publishing Search Logs - A Comparative Study of Privacy Guarantees.
IEEE Trans. Knowl. Data Eng., 2012

ClouDiA: A Deployment Advisor for Public Clouds.
Proc. VLDB Endow., 2012

The Complexity of Social Coordination.
Proc. VLDB Endow., 2012

Crowd-Blending Privacy.
IACR Cryptol. ePrint Arch., 2012

SAFE extensibility of data-driven web applications.
Proceedings of the 21st World Wide Web Conference 2012, 2012

MaskIt: privately releasing user context streams for personalized mobile applications.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2012

Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary.
Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation, 2012

Towards Statistical Queries over Distributed Private User Data.
Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation, 2012

Intelligible models for classification and regression.
Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012

MatchMiner: Efficient Spanning Structure Mining in Large Image Collections.
Proceedings of the Computer Vision - ECCV 2012, 2012

Non-tracking web analytics.
Proceedings of the ACM Conference on Computer and Communications Security, 2012

Proceedings of the Encyclopedia of Cryptography and Security, 2nd Ed., 2011

Load Balancing and Range Queries in P2P Systems Using P-Ring.
ACM Trans. Internet Techn., 2011

Differential Privacy via Wavelet Transforms.
IEEE Trans. Knowl. Data Eng., 2011

Entangled Transactions.
Proc. VLDB Endow., 2011

Nerio: Leader Election and Edict Ordering.
CoRR, 2011

Towards Privacy for Social Networks: A Zero-Knowledge Based Definition of Privacy.
Proceedings of the Theory of Cryptography - 8th Theory of Cryptography Conference, 2011

ATLAS: a probabilistic algorithm for high dimensional similarity search.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

iReduct: differential privacy with reduced relative errors.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

Coordination through querying in the youtopia system.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

Fast checkpoint recovery algorithms for frequently consistent applications.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

BRRL: a recovery library for main-memory applications in the cloud.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2011

Playing games with databases.
Proceedings of the 27th International Conference on Data Engineering, 2011

Declarative data-driven coordination.
Proceedings of the Fifth ACM International Conference on Distributed Event-Based Systems, 2011

Making time-stepped applications tick in the cloud.
Proceedings of the ACM Symposium on Cloud Computing in conjunction with SOSP 2011, 2011

Workload-aware indexing for keyword search in social networks.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Beyond isolation: research opportunities in declarative data-driven coordination.
SIGMOD Rec., 2010

Behavioral Simulations in MapReduce.
Proc. VLDB Endow., 2010

Programming with differential privacy: technical perspective.
Commun. ACM, 2010

Search in social networks with access control.
Proceedings of the Second International Workshop on Keyword Search on Structured Data, 2010

Privacy in data publishing.
Proceedings of the 26th International Conference on Data Engineering, 2010

Continuous Queries in Sensor Networks.
Proceedings of the Encyclopedia of Database Systems, 2009

Database Techniques to Improve Scientific Simulations.
Proceedings of the Encyclopedia of Database Systems, 2009

Randomization Methods to Ensure Data Privacy.
Proceedings of the Encyclopedia of Database Systems, 2009

Event and Pattern Detection over Streams.
Proceedings of the Encyclopedia of Database Systems, 2009

Scalable Decision Tree Construction.
Proceedings of the Encyclopedia of Database Systems, 2009

DBMS Interface.
Proceedings of the Encyclopedia of Database Systems, 2009

DBMS Component.
Proceedings of the Encyclopedia of Database Systems, 2009

Classification and Regression Trees.
Proceedings of the Encyclopedia of Data Warehousing and Mining, Second Edition (4 Volumes), 2009

Special issue: best papers of VLDB 2007.
VLDB J., 2009

An Evaluation of Checkpoint Recovery for Massively Multiplayer Online Games.
Proc. VLDB Endow., 2009

Data Publishing against Realistic Adversaries.
Proc. VLDB Endow., 2009

Multi-query optimization for sketch-based estimation.
Inf. Syst., 2009

Privacy in Search Logs.
CoRR, 2009

Technical perspective - Data stream processing: when you only get one look.
Commun. ACM, 2009

Interactive anonymization of sensitive data.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2009

Database research in computer games.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2009

Scalability for Virtual Worlds.
Proceedings of the 25th International Conference on Data Engineering, 2009

Rule-based multi-query optimization.
Proceedings of the EDBT 2009, 2009

Distributed event stream processing with non-deterministic finite automata.
Proceedings of the Third ACM International Conference on Distributed Event-Based Systems, 2009

Inverted indexes vs. bitmap indexes in decision support systems.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

From Declarative Languages to Declarative Processing in Computer Games.
Proceedings of the Fourth Biennial Conference on Innovative Data Systems Research, 2009

Analyzing Data Streams in Scientific Applications.
Proceedings of the Scientific Data Management - Challenges, Technology, and Deployment, 2009

The Claremont report on database research.
SIGMOD Rec., 2008

Better Scripts, Better Games.
ACM Queue, 2008

Large-scale collaborative analysis and extraction of web data.
Proc. VLDB Endow., 2008

Towards a streaming SQL standard.
Proc. VLDB Endow., 2008

SEMMO: a scalable engine for massively multiplayer online games.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

SGL: a scalable language for data-driven games.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008

Declarative processing for computer games.
Proceedings of the 2008 ACM SIGGRAPH Symposium on Video Games, 2008

Toward Expressive and Scalable Sponsored Search Auctions.
Proceedings of the 24th International Conference on Data Engineering, 2008

Declarative, Domain-Specific Languages - Elegant Simplicity or a Hammer in Search of a Nail?
Proceedings of the 24th International Conference on Data Engineering, 2008

Privacy: Theory meets Practice on the Map.
Proceedings of the 24th International Conference on Data Engineering, 2008

Wave scheduling and routing in sensor networks.
ACM Trans. Sens. Networks, 2007

L-diversity: Privacy beyond k-anonymity.
ACM Trans. Knowl. Discov. Data, 2007

Database research opportunities in computer games.
SIGMOD Rec., 2007

Index structures for matching XML twigs using relational query processors.
Data Knowl. Eng., 2007

A unified platform for data driven web applications with automatic client-server partitioning.
Proceedings of the 16th International Conference on World Wide Web, 2007

Scaling games to epic proportion.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

Massively multi-query join processing in publish/subscribe systems.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

User-centric personalized extensibility for data-driven web applications.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

P-ring: an efficient and robust P2P range index structure.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

Cayuga: a high-performance event processing engine.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

What is "next" in event processing?
Proceedings of the Twenty-Sixth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2007

High-Speed Function Approximation.
Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), 2007

Worst-Case Background Knowledge for Privacy-Preserving Data Publishing.
Proceedings of the 23rd International Conference on Data Engineering, 2007

Cayuga: A General Purpose Event Monitoring System.
Proceedings of the Third Biennial Conference on Innovative Data Systems Research, 2007

Guest Editors' Introduction: Sensor-Network Applications.
IEEE Internet Comput., 2006

Indexing for Function Approximation.
Proceedings of the 32nd International Conference on Very Large Data Bases, 2006

Injecting utility into anonymized datasets.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2006

Automatic client-server partitioning of data-driven web applications.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2006

On the efficiency of checking perfect privacy.
Proceedings of the Twenty-Fifth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2006

Plagiarism Detection in arXiv.
Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 2006

Hilda: A High-Level Language for Data-Driven Web Applications.
Proceedings of the 22nd International Conference on Data Engineering, 2006

Trusted CVS.
Proceedings of the 22nd International Conference on Data Engineering Workshops, 2006

l-Diversity: Privacy Beyond k-Anonymity.
Proceedings of the 22nd International Conference on Data Engineering, 2006

Models and Methods for Privacy-Preserving Data Analysis and Publishing.
Proceedings of the 22nd International Conference on Data Engineering, 2006

Three Case Studies of Large-Scale Data Flows.
Proceedings of the 22nd International Conference on Data Engineering Workshops, 2006

Towards Expressive Publish/Subscribe Systems.
Proceedings of the Advances in Database Technology, 2006

Network scheduling for data archiving applications in sensor networks.
Proceedings of the 3rd Workshop on Data Management for Sensor Networks, 2006

Semantic Approximation of Data Stream Joins.
IEEE Trans. Knowl. Data Eng., 2005

MAFIA: A Maximal Frequent Itemset Algorithm.
IEEE Trans. Knowl. Data Eng., 2005

Automatic Subspace Clustering of High Dimensional Data.
Data Min. Knowl. Discov., 2005

Guaranteeing Correctness and Availability in P2P Range Indices.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2005

Models and methods for privacy-preserving data publishing and analysis: invited tutorial.
Proceedings of the Twenty-fourth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2005

Processing High-Speed Intelligence Feeds in Real-Time.
Proceedings of the Intelligence and Security Informatics, 2005

Multi-query Optimization for Sensor Networks.
Proceedings of the Distributed Computing in Sensor Systems, 2005

Guest Editorial to the special issue on data stream processing.
VLDB J., 2004

Online Scheduling to Minimize Average Stretch.
SIAM J. Comput., 2004

Query Processing in Sensor Networks.
IEEE Pervasive Comput., 2004

Privacy preserving mining of association rules.
Inf. Syst., 2004

A Vision for PetaByte Data Management and Analysis Services for the Arecibo Telescope.
IEEE Data Eng. Bull., 2004

A storage and indexing framework for p2p systems.
Proceedings of the 13th international conference on World Wide Web, 2004

P-tree: a p2p index for resource discovery applications.
Proceedings of the 13th international conference on World Wide Web, 2004

Querying Peer-to-Peer Networks Using P-Trees.
Proceedings of the Seventh International Workshop on the Web and Databases, 2004

Detecting Change in Data Streams.
(e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31, 2004

Approximation Techniques for Spatial Data.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2004

An Indexing Framework for Peer-to-Peer Systems.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2004

The Architecture of the Cornell Knowledge Broker.
Proceedings of the Intelligence and Security Informatics, 2004

Hybrid Push-Pull Query Processing for Sensor Networks.
Proceedings of the 34. Jahrestagung der Gesellschaft für Informatik, 2004

Sketch-Based Multi-query Processing over Data Streams.
Proceedings of the Advances in Database Technology, 2004

WaveScheduling: energy-efficient data dissemination for sensor networks.
Proceedings of the 1st Workshop on Data Management for Sensor Networks, 2004

How to Quickly Find a Witness.
Proceedings of the Constraint-Based Mining and Inductive Databases, 2004

Efficient Approximation of Correlated Sums on Data Streams.
IEEE Trans. Knowl. Data Eng., 2003

Reminiscences on Influential Papers.
SIGMOD Rec., 2003

The Cougar Project: a work-in-progress report.
SIGMOD Rec., 2003

Time management for new faculty.
SIGMOD Rec., 2003

Overview of the 2003 KDD Cup.
SIGKDD Explor., 2003

Letter from the Special Issue Editor.
IEEE Data Eng. Bull., 2003

DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints.
Data Min. Knowl. Discov., 2003

Approximate Join Processing Over Data Streams.
Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, 2003

Limiting privacy breaches in privacy preserving data mining.
Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2003

Gossip-Based Computation of Aggregate Information.
Proceedings of the 44th Symposium on Foundations of Computer Science (FOCS 2003), 2003

MAFIA: A Performance Study of Mining Maximal Frequent Itemsets.
Proceedings of the FIMI '03, 2003

Query Processing in Sensor Networks.
Proceedings of the First Biennial Conference on Innovative Data Systems Research, 2003

Leveraging Non-Uniform Resources for Parallel Query Processing.
Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

Database management systems (3. ed.).
McGraw-Hill, ISBN: 978-0-07-115110-8, 2003

The Cougar Approach to In-Network Query Processing in Sensor Networks.
SIGMOD Rec., 2002

Report on the SIGKDD 2001 Conference Panel "New Research Directions in KDD".
SIGKDD Explor., 2002

Mining Data Streams under Block Evolution.
SIGKDD Explor., 2002

A Framework for Measuring Differences in Data Characteristics.
J. Comput. Syst. Sci., 2002

Scaling mining algorithms to large databases.
Commun. ACM, 2002

Querying and Mining Data Streams: You Only Get One Look.
Proceedings of 28th International Conference on Very Large Data Bases, 2002

Querying and mining data streams: you only get one look a tutorial.
Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, 2002

COUGAR: the network is the database.
Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, 2002

Processing complex aggregate queries over data streams.
Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, 2002

Least Expected Cost Query Optimization: What Can We Expect?
Proceedings of the Twenty-first ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 2002

SECRET: a scalable linear regression tree algorithm.
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002

A theoretical framework for learning from a pool of disparate data sources.
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002

Sequential PAttern mining using a bitmap representation.
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002

GADT: A Probability Space ADT for Representing and Querying the Physical World.
Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26, 2002

DEMON: Mining and Monitoring Evolving Data.
IEEE Trans. Knowl. Data Eng., 2001

Report on the Workshop on Research Issues in Data Mining and Knowledge Discovery Workshop (DMKD 2001).
SIGKDD Explor., 2001

On Computing Correlated Aggregates Over Continual Data Streams.
Proceedings of the 2001 ACM SIGMOD international conference on Management of data, 2001

Query Optimization In Compressed Database Systems.
Proceedings of the 2001 ACM SIGMOD international conference on Management of data, 2001

Towards Sensor Database Systems.
Proceedings of the Mobile Data Management, Second International Conference, 2001

Advances in decision tree construction.
Proceedings of the Tutorial notes of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, 2001

Bias Correction in Classification Tree Construction.
Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases.
Proceedings of the 17th International Conference on Data Engineering, 2001

Querying the physical world.
IEEE Wirel. Commun., 2000

RainForest - A Framework for Fast Decision Tree Construction of Large Datasets.
Data Min. Knowl. Discov., 2000

Data Mining with Decision Trees.
Proceedings of the 16th International Conference on Data Engineering, San Diego, California, USA, February 28, 2000

Rapid Convergence of a Local Load Balancing Algorithm for Asynchronous Rings.
Theor. Comput. Sci., 1999

Mining Very Large Databases.
Computer, 1999

BOAT-Optimistic Decision Tree Construction.
Proceedings of the SIGMOD 1999, 1999

A Framework for Measuring Changes in Data Characteristics.
Proceedings of the Eighteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 31, 1999

Classification and Regression: Money *can* Grow on Trees.
Proceedings of the Tutorial Notes for ACM SIGKDD 1999 International Conference on Knowledge Discovery and Data Mining, 1999

CACTUS - Clustering Categorical Data Using Summaries.
Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1999

Clustering Large Datasets in Arbitrary Metric Spaces.
Proceedings of the 15th International Conference on Data Engineering, 1999

Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications.
Proceedings of the SIGMOD 1998, 1998

Fair On-Line Scheduling of a Dynamic Set of Tasks on a Single Resource.
Inf. Process. Lett., 1997

The BUCKY Object-Relational Benchmark (Experience Paper).
Proceedings of the SIGMOD 1997, 1997

A proportional share resource allocation algorithm for real-time, time-shared systems.
Proceedings of the 17th IEEE Real-Time Systems Symposium (RTSS '96), 1996

Fast scheduling of periodic tasks on multiple resources.
Proceedings of IPPS '95, 1995
