Heikki Mannila

Affiliations:
  • Academy of Finland, President
  • Aalto University


According to our database, Heikki Mannila authored at least 186 papers between 1982 and 2024.

Collaborative distances:
  • Dijkstra number of three.
  • Erdős number of two.

Bibliography

2024
The Hadamard decomposition problem.
Data Min. Knowl. Discov., July, 2024

2016
Significance testing of word frequencies in corpora.
Digit. Scholarsh. Humanit., 2016

2013
Probabilistic Models for Query Approximation with Large Sparse Binary Datasets
CoRR, 2013

2011
Banded structure in binary matrices.
Knowl. Inf. Syst., 2011

Randomization techniques for assessing the significance of gene periodicity results.
BMC Bioinform., 2011

A Shapley Value Approach for Influence Attribution.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

Permutation Structure in 0-1 Data.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

Analyzing Word Frequencies in Large Text Corpora Using Inter-arrival Times and Bootstrapping.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

2010
Evaluation of BIC and Cross Validation for model selection on sequence segmentations.
Int. J. Data Min. Bioinform., 2010

Evaluating Query Result Significance in Databases via Randomizations.
Proceedings of the SIAM International Conference on Data Mining, 2010

Finding effectors in social networks.
Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2010

Gaussian Clusters and Noise: An Approach Based on the Minimum Description Length Principle.
Proceedings of the Discovery Science - 13th International Conference, 2010

A Theory of Inductive Query Answering.
Proceedings of the Inductive Databases and Constraint-Based Data Mining, 2010

2009
Determining Attributes to Maximize Visibility of Objects.
IEEE Trans. Knowl. Data Eng., 2009

ACM TKDD special issue ACM SIGKDD 2007 and ACM SIGKDD 2008.
ACM Trans. Knowl. Discov. Data, 2009

Randomization methods for assessing data analysis results on real-valued matrices.
Stat. Anal. Data Min., 2009

A randomized approximation algorithm for computing bucket orders.
Inf. Process. Lett., 2009

Approximating the Minimum Chain Completion problem.
Inf. Process. Lett., 2009

Complexity control in a mixture model by the Hardy-Weinberg equilibrium.
Comput. Stat. Data Anal., 2009

Query Significance in Databases via Randomizations
CoRR, 2009

Finding Links and Initiators: A Graph-Reconstruction Problem.
Proceedings of the SIAM International Conference on Data Mining, 2009

Low-Entropy Set Selection.
Proceedings of the SIAM International Conference on Data Mining, 2009

Applying Electromagnetic Field Theory Concepts to Clustering with Constraints.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2009

Randomization methods in data mining.
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28, 2009

Tell me something I don't know: randomization strategies for iterative data mining.
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28, 2009

Randomization Methods for Assessing the Significance of Data Mining Results.
Proceedings of the Foundations of Intelligent Systems, 18th International Symposium, 2009

2008
The Discrete Basis Problem.
IEEE Trans. Knowl. Data Eng., 2008

Optimal segmentation using tree models.
Knowl. Inf. Syst., 2008

Determining significance of pairwise co-occurrences of events in bursty sequences.
BMC Bioinform., 2008

Randomization of real-valued matrices for assessing the significance of data mining results.
Proceedings of the SIAM International Conference on Data Mining, 2008

Mining Association Rules of Simple Conjunctive Queries.
Proceedings of the SIAM International Conference on Data Mining, 2008

Finding Subgroups having Several Descriptions: Algorithms for Redescription Mining.
Proceedings of the SIAM International Conference on Data Mining, 2008

Standing Out in a Crowd: Selecting Attributes for Maximum Visibility.
Proceedings of the 24th International Conference on Data Engineering, 2008

Feature Selection in Taxonomies with Applications to Paleontology.
Proceedings of the Discovery Science, 11th International Conference, 2008

Finding Total and Partial Orders from Data for Seriation.
Proceedings of the Algorithmic Learning Theory, 19th International Conference, 2008

Randomization Techniques for Data Mining Methods.
Proceedings of the Advances in Databases and Information Systems, 2008

2007
Clustering aggregation.
ACM Trans. Knowl. Discov. Data, 2007

Assessing data mining results via swap randomization.
ACM Trans. Knowl. Discov. Data, 2007

How to Handle Small Samples: Bootstrap and Bayesian Methods in the Analysis of Linguistic Change.
Lit. Linguistic Comput., 2007

Constrained hidden Markov models for population-based haplotyping.
BMC Bioinform., 2007

Comparing segmentations by applying randomization techniques.
BMC Bioinform., 2007

A random walk approach to sampling hidden databases.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007

Finding Outlying Items in Sets of Partial Rankings.
Proceedings of the Knowledge Discovery in Databases: PKDD 2007, 2007

Nestedness and segmented nestedness.
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007

Finding low-entropy sets and trees from binary data.
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007

Recurrent Predictive Models for Sequence Segmentation.
Proceedings of the Advances in Intelligent Data Analysis VII, 2007

2006
Seriation in Paleontological Data Using Markov Chain Monte Carlo Methods.
PLoS Comput. Biol., 2006

Segmentation and dimensionality reduction.
Proceedings of the Sixth SIAM International Conference on Data Mining, 2006

Finding Trees from Unordered 0-1 Data.
Proceedings of the Knowledge Discovery in Databases: PKDD 2006, 2006

Algorithms for discovering bucket orders from data.
Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006

What is the Dimension of Your Binary Data?
Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 2006

Finding fragments of orders and total orders from 0-1 data.
Proceedings of the Extraction et gestion des connaissances (EGC'2006), 2006

Analysis of Linux Evolution Using Aligned Source Code Segments.
Proceedings of the Discovery Science, 9th International Conference, 2006

2005
Using Markov chain Monte Carlo and dynamic programming for event sequence data.
Knowl. Inf. Syst., 2005

A Hidden Markov Technique for Haplotype Reconstruction.
Proceedings of the Algorithms in Bioinformatics, 5th International Workshop, 2005

Finding partial orders from unordered 0-1 data.
Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2005

Parameter-Free Spatial Data Mining Using MDL.
Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), 2005

Mining Chains of Relations.
Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), 2005

Piecewise Constant Modeling of Sequential Data Using Reversible Jump Markov Chain Monte Carlo.
Proceedings of the Data Mining in Bioinformatics, 2005

2004
Editorial.
Data Min. Knowl. Discov., 2004

Relational link-based ranking.
(e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, August 31, 2004

Geometric and Combinatorial Tiles in 0-1 Data.
Proceedings of the Knowledge Discovery in Databases: PKDD 2004, 2004

Dense itemsets.
Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004

Approximating a collection of frequent sets.
Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004


Boolean Formulas and Frequent Sets.
Proceedings of the Constraint-Based Mining and Inductive Databases, 2004

Hidden Markov Modelling Techniques for Haplotype Analysis.
Proceedings of the Algorithmic Learning Theory, 15th International Conference, 2004

2003
Discovering all most specific sentences.
ACM Trans. Database Syst., 2003

Beyond Independence: Probabilistic Models for Query Approximation on Binary Transaction Data.
IEEE Trans. Knowl. Data Eng., 2003

Mixture Models and Frequent Sets: Combining Global and Local Methods for 0-1 Data.
Proceedings of the Third SIAM International Conference on Data Mining, 2003

Finding recurrent sources in sequences.
Proceedings of the Seventh Annual International Conference on Computational Biology, 2003

An MDL Method for Finding Haplotype Blocks and for Estimating the Strength of Haplotype Block Boundaries.
Proceedings of the 8th Pacific Symposium on Biocomputing, 2003

A Simple Algorithm for Topic Identification in 0-1 Data.
Proceedings of the Knowledge Discovery in Databases: PKDD 2003, 2003

The Pattern Ordering Problem.
Proceedings of the Knowledge Discovery in Databases: PKDD 2003, 2003

Rule Discovery and Probabilistic Modeling for Onomastic Data.
Proceedings of the Knowledge Discovery in Databases: PKDD 2003, 2003

Fragments of order.
Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 24, 2003

2002
Emerging scientific applications in data mining.
Commun. ACM, 2002

Long-range control of expression in yeast.
Bioinform., 2002

Combining Pattern Discovery and Probabilistic Modeling in Data Mining.
Proceedings of the Algorithm Theory, 2002

Topics in 0-1 data.
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002

A Theory of Inductive Query Answering.
Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), 2002

OSSM: A Segmentation Approach to Optimize Frequency Counting.
Proceedings of the 18th International Conference on Data Engineering, San Jose, CA, USA, February 26, 2002

Local and Global Methods in Data Mining: Basic Techniques and Open Problems.
Proceedings of the Automata, Languages and Programming, 29th International Colloquium, 2002

Genome segmentation using piecewise constant intensity models and reversible jump MCMC.
Proceedings of the European Conference on Computational Biology (ECCB 2002), 2002

2001
Time-Series Similarity Problems and Well-Separated Geometric Sets.
Nord. J. Comput., 2001

Finding similar situations in sequences of events via random projections.
Proceedings of the First SIAM International Conference on Data Mining, 2001

Decomposition of Event Sequences into Independent Components.
Proceedings of the First SIAM International Conference on Data Mining, 2001

Combining Discrete Algorithmic and Probabilistic Approaches in Data Mining.
Proceedings of the Principles of Data Mining and Knowledge Discovery, 2001

Finding simple intensity descriptions from event sequence data.
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001

Probabilistic modeling of transaction data with applications to profiling, visualization, and prediction.
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001

Random projection in dimensionality reduction: applications to image and text data.
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001

Time Series Segmentation for Context Recognition in Mobile Devices.
Proceedings of the 2001 IEEE International Conference on Data Mining, 29 November, 2001

Principles of Data Mining
MIT Press, ISBN: 9780262332521, 2001

2000
Theoretical Frameworks for Data Mining.
SIGKDD Explor., 2000

Probabilistic Models for Query Approximation with Large Sparse Binary Data Sets.
UAI '00: Proceedings of the 16th Conference in Uncertainty in Artificial Intelligence, Stanford University, Stanford, California, USA, June 30, 2000

Context-Based Similarity Measures for Categorical Databases.
Proceedings of the Principles of Data Mining and Knowledge Discovery, 2000

Global partial orders from sequential data.
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 2000

Approximate Query Answering with Frequent Sets and Maximum Entropy.
Proceedings of the 16th International Conference on Data Engineering, San Diego, California, USA, February 28, 2000

Gene Mapping by Haplotype Pattern Mining.
Proceedings of the 1st IEEE International Symposium on Bioinformatics and Biomedical Engineering, 2000

1999
Rule Discovery in Telecommunication Alarm Data.
J. Netw. Syst. Manag., 1999

Borders: An Efficient Algorithm for Association Generation in Dynamic Databases.
J. Intell. Inf. Syst., 1999

Interactive exploration of interesting findings in the Telecommunication Network Alarm Sequence Analyzer (TASA).
Inf. Softw. Technol., 1999

Reasoning with Examples: Propositional Formulae and Database Dependencies.
Acta Informatica, 1999

Association Rule Selection in a Data Mining Environment.
Proceedings of the Principles of Data Mining and Knowledge Discovery, 1999

Prediction with Local Patterns using Cross-Entropy.
Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1999

Inductive Databases (Abstract).
Proceedings of the Inductive Logic Programming, 9th International Workshop, 1999

Similarity between Event Types in Sequences.
Proceedings of the Data Warehousing and Knowledge Discovery, 1999

Modeling KDD Processes within the Inductive Database Framework.
Proceedings of the Data Warehousing and Knowledge Discovery, 1999

1998
Querying Inductive Databases: A Case Study on the MINE RULE Operator.
Proceedings of the Principles of Data Mining and Knowledge Discovery, 1998

Similarity of Attributes by External Probes.
Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), 1998

Rule Discovery from Time Series.
Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), 1998

Learning, Mining, or Modeling? A Case Study from Paleoecology.
Proceedings of the Discovery Science, 1998

Frailty Factors and Time-dependent Hazards in Modelling Ear Infections in Children Using BASSIST.
Proceedings of the COMPSTAT 1998, 1998

1997
Disjunctive Datalog.
ACM Trans. Database Syst., 1997

Discovery of Frequent Episodes in Event Sequences.
Data Min. Knowl. Discov., 1997

Levelwise Search and Borders of Theories in Knowledge Discovery.
Data Min. Knowl. Discov., 1997

Distance Measures for Point Sets and their Computation.
Acta Informatica, 1997

Similarity of Event Sequences.
Proceedings of the 4th International Workshop on Temporal Representation and Reasoning, 1997

Inductive Databases and Condensed Representations for Data Mining.
Proceedings of the Logic Programming, 1997

Data mining, Hypergraph Transversals, and Machine Learning.
Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 1997

Finding Similar Time Series.
Proceedings of the Principles of Data Mining and Knowledge Discovery, 1997

Methods and Problems in Data Mining.
Proceedings of the Database Theory, 1997

Discovering All Most Specific Sentences by Randomized Algorithms.
Proceedings of the Database Theory, 1997

Efficient Algorithms for Discovering Frequent Sets in Incremental Databases.
Proceedings of the Workshop on Research Issues on Data Mining and Knowledge Discovery, 1997

A Data Mining Methodology and Its Application to Semi-automatic Knowledge Acquisition.
Proceedings of the Eighth International Workshop on Database and Expert Systems Applications, 1997

1996
A Database Perspective on Knowledge Discovery.
Commun. ACM, 1996

Data Mining: Machine Learning, Statistics, and Databases.
Proceedings of the Eighth International Conference on Scientific and Statistical Database Management, 1996

TASA: Telecommunication Alarm Sequence Analyzer or how to enjoy faults in your network.
Proceedings of the 1996 Network Operations and Management Symposium, 1996

Multiple Uses of Frequent Sets and Condensed Representations (Extended Abstract).
Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), 1996

Discovering Generalized Episodes Using Minimal Occurrences.
Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), 1996

Data Mining and Machine Learning (Abstract).
Proceedings of the Machine Learning, 1996

Knowledge Discovery from Telecommunication Network Alarm Databases.
Proceedings of the Twelfth International Conference on Data Engineering, February 26, 1996

Schema Design and Knowledge Discovery (Abstract).
Proceedings of the Conceptual Modeling, 1996

Fast Discovery of Association Rules.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 1996

1995
Approximate Inference of Functional Dependencies from Relations.
Theor. Comput. Sci., 1995

Ordered and Unordered Tree Inclusion.
SIAM J. Comput., 1995

Recognizing Renamable Generalized Propositional Horn Formulas Is NP-complete.
Discret. Appl. Math., 1995

Discovering Frequent Episodes in Sequences.
Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), 1995

A Perspective on Databases and Data Mining.
Proceedings of the First International Conference on Knowledge Discovery and Data Mining (KDD-95), 1995

Aspects of Knowledge Discovery and Data Mining.
Proceedings of the Kurzfassungen 7. Workshop Grundlagen von Datenbanken, 1995

MDL learning of unions of simple pattern languages from positive examples.
Proceedings of the Computational Learning Theory, Second European Conference, 1995

1994
Algorithms for Inferring Functional Dependencies from Relations.
Data Knowl. Eng., 1994

The Power of Sampling in Knowledge Discovery.
Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 1994

Adding Disjunction to Datalog.
Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 1994

Efficient Algorithms for Discovering Association Rules.
Proceedings of the Knowledge Discovery in Databases: Papers from the 1994 AAAI Workshop, 1994

Expressive Power and Complexity of Disjunctive Datalog under the Stable Model Semantics.
Proceedings of the Management and Processing of Complex Data Structures, Third Workshop on Information Systems and Artificial Intelligence, Hamburg, Germany, February 28, 1994

Forming Grammars for Structured Documents: an Application of Grammatical Inference.
Proceedings of the Grammatical Inference and Applications, Second International Colloquium, 1994

Disjunctive Logic Programming over Finite Structures.
Proceedings of the Innovationen bei Rechen- und Kommunikationssystemen, Eine Herausforderung für die Informatik, 24. GI-Jahrestagung im Rahmen des 13th World Computer Congress, IFIP Congress '94, Hamburg, 28. August, 1994

An Algorithm for Learning Hierarchical Classifiers.
Proceedings of the Machine Learning: ECML-94, 1994

Query Primitives for Tree-Structured Data.
Proceedings of the Combinatorial Pattern Matching, 5th Annual Symposium, 1994

Finding Interesting Rules from Large Sets of Discovered Association Rules.
Proceedings of the Third International Conference on Information and Knowledge Management (CIKM'94), Gaithersburg, Maryland, USA, November 29, 1994

1993
Right Invariant Metrics and Measures of Presortedness.
Discret. Appl. Math., 1993

Retrieval from Hierarchical Texts by Partial Patterns.
Proceedings of the 16th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Pittsburgh, PA, USA, June 27, 1993

Learning rules with local exceptions.
Proceedings of the First European Conference on Computational Learning Theory, 1993

1992
Discovering functional and inclusion dependencies in relational databases.
Int. J. Intell. Syst., 1992

On the Complexity of Inferring Functional Dependencies.
Discret. Appl. Math., 1992

Approximate Dependency Inference from Relations.
Proceedings of the Database Theory, 1992

Grammatical Tree Matching.
Proceedings of the Combinatorial Pattern Matching, Third Annual Symposium, 1992

Learning Hierarchical Rule Sets.
Proceedings of the Fifth Annual ACM Conference on Computational Learning Theory, 1992

Design of Relational Databases
Addison-Wesley, ISBN: 0-201-56523-4, 1992

1991
The Tree Inclusion Problem.
TAPSOFT'91: Proceedings of the International Joint Conference on Theory and Practice of Software Development, 1991

1990
Unifications, Deunifications, and Their Complexity.
BIT, 1990

Generation of test cases for simple prolog programs.
Acta Cybern., 1990

1989
Automatic Generation of Test Data for Relational Queries.
J. Comput. Syst. Sci., 1989

Practical Algorithms for Finding Prime Attributes and Testing Normal Forms.
Proceedings of the Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 1989

1988
Time Parameter and Arbitrary Deunions in the Set Union Problem.
Proceedings of the SWAT 88, 1988

1987
Dependency Inference.
Proceedings of the VLDB'87, 1987

Flow Analysis of Prolog Programs.
Proceedings of the 1987 Symposium on Logic Programming, San Francisco, California, USA, August 31, 1987

1986
Design by Example: An Application of Armstrong Relations.
J. Comput. Syst. Sci., 1986

Timestamped Term Representation for Implementing Prolog.
Proceedings of the 1986 Symposium on Logic Programming, 1986

Test Data for Relational Queries.
Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, 1986

On the Complexity of Unification Sequences.
Proceedings of the Third International Conference on Logic Programming, 1986

Inclusion Dependencies in Database Design.
Proceedings of the Second International Conference on Data Engineering, 1986

The Set Union Problem with Backtracking.
Proceedings of the Automata, Languages and Programming, 13th International Colloquium, 1986

1985
On the Suitability of Trace Semantics for Modular Proofs of Communicating Processes.
Theor. Comput. Sci., 1985

Measures of Presortedness and Optimal Sorting Algorithms.
IEEE Trans. Computers, 1985

A Fast Algorithm for Renaming a Set of Clauses as a Horn Set.
Inf. Process. Lett., 1985

Small Armstrong Relations for Database Design.
Proceedings of the Fourth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, 1985

1984
A Simple Linear-Time Algorithm for in Situ Merging.
Inf. Process. Lett., 1984

A Semantic Approach to Program Modularity
Inf. Control., 1984

Measures of Presortedness and Optimal Sorting Algorithms (Extended Abstract).
Proceedings of the Automata, 1984

1983
A topological characterization of (λ, μ)*-compactness.
Ann. Pure Appl. Log., 1983

On the Relationship of Minimum and Optimum Covers for a Set of Functional Dependencies.
Acta Informatica, 1983

Derivation of Efficient DAG Marking Algorithms.
Proceedings of the Conference Record of the Tenth Annual ACM Symposium on Principles of Programming Languages, 1983

1982
A Refinement of Kahn's Semantics to Handle Non-Determinism and Communication (Extended Abstract).
Proceedings of the ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, 1982

Locality in Modular Systems.
Proceedings of the Automata, 1982

