HAN
03-toc-ix-xviii-9780123814791
2011/6/1
3:32
Page xvi
#8
xvi
Contents
9.7.2
Semi-Supervised Classification
432
9.7.3
Active Learning
433
9.7.4
Transfer Learning
434
9.8
Summary
436
9.9
Exercises
438
9.10
Bibliographic Notes
439
Chapter 10 Cluster Analysis: Basic Concepts and Methods
443
10.1
Cluster Analysis
444
10.1.1 What Is Cluster Analysis?
444
10.1.2 Requirements for Cluster Analysis
445
10.1.3 Overview of Basic Clustering Methods
448
10.2
Partitioning Methods
451
10.2.1 k-Means: A Centroid-Based Technique
451
10.2.2 k-Medoids: A Representative Object-Based Technique
454
10.3
Hierarchical Methods
457
10.3.1 Agglomerative versus Divisive Hierarchical Clustering
459
10.3.2 Distance Measures in Algorithmic Methods
461
10.3.3 BIRCH: Multiphase Hierarchical Clustering Using Clustering Feature Trees
462
10.3.4 Chameleon: Multiphase Hierarchical Clustering Using Dynamic Modeling
466
10.3.5 Probabilistic Hierarchical Clustering
467
10.4
Density-Based Methods
471
10.4.1 DBSCAN: Density-Based Clustering Based on Connected Regions with High Density
471
10.4.2 OPTICS: Ordering Points to Identify the Clustering Structure
473
10.4.3 DENCLUE: Clustering Based on Density Distribution Functions
476
10.5
Grid-Based Methods
479
10.5.1 STING: STatistical INformation Grid
479
10.5.2 CLIQUE: An Apriori-like Subspace Clustering Method
481
10.6
Evaluation of Clustering
483
10.6.1 Assessing Clustering Tendency
484
10.6.2 Determining the Number of Clusters
486
10.6.3 Measuring Clustering Quality
487
10.7
Summary
490
10.8
Exercises
491
10.9
Bibliographic Notes
494
Chapter 11 Advanced Cluster Analysis
497
11.1
Probabilistic Model-Based Clustering
497
11.1.1 Fuzzy Clusters
499
11.1.2 Probabilistic Model-Based Clusters
501
11.1.3 Expectation-Maximization Algorithm
505
11.2
Clustering High-Dimensional Data
508
11.2.1 Clustering High-Dimensional Data: Problems, Challenges, and Major Methodologies
508
11.2.2 Subspace Clustering Methods
510
11.2.3 Biclustering
512
11.2.4 Dimensionality Reduction Methods and Spectral Clustering
519
11.3
Clustering Graph and Network Data
522
11.3.1 Applications and Challenges
523
11.3.2 Similarity Measures
525
11.3.3 Graph Clustering Methods
528
11.4
Clustering with Constraints
532
11.4.1 Categorization of Constraints
533
11.4.2 Methods for Clustering with Constraints
535
11.5
Summary
538
11.6
Exercises
539
11.7
Bibliographic Notes
540
Chapter 12 Outlier Detection
543
12.1
Outliers and Outlier Analysis
544
12.1.1 What Are Outliers?
544
12.1.2 Types of Outliers
545
12.1.3 Challenges of Outlier Detection
548
12.2
Outlier Detection Methods
549
12.2.1 Supervised, Semi-Supervised, and Unsupervised Methods
549
12.2.2 Statistical Methods, Proximity-Based Methods, and Clustering-Based Methods
551
12.3
Statistical Approaches
553
12.3.1 Parametric Methods
553
12.3.2 Nonparametric Methods
558
12.4
Proximity-Based Approaches
560
12.4.1 Distance-Based Outlier Detection and a Nested Loop Method
561
12.4.2 A Grid-Based Method
562
12.4.3 Density-Based Outlier Detection
564
12.5
Clustering-Based Approaches
567
12.6
Classification-Based Approaches
571
12.7
Mining Contextual and Collective Outliers
573
12.7.1 Transforming Contextual Outlier Detection to Conventional Outlier Detection
573
12.7.2 Modeling Normal Behavior with Respect to Contexts
574
12.7.3 Mining Collective Outliers
575
12.8
Outlier Detection in High-Dimensional Data
576
12.8.1 Extending Conventional Outlier Detection
577
12.8.2 Finding Outliers in Subspaces
578
12.8.3 Modeling High-Dimensional Outliers
579
12.9
Summary
581
12.10
Exercises
582
12.11
Bibliographic Notes
583
Chapter 13 Data Mining Trends and Research Frontiers
585
13.1
Mining Complex Data Types
585
13.1.1 Mining Sequence Data: Time-Series, Symbolic Sequences, and Biological Sequences
586
13.1.2 Mining Graphs and Networks
591
13.1.3 Mining Other Kinds of Data
595
13.2
Other Methodologies of Data Mining
598
13.2.1 Statistical Data Mining
598
13.2.2 Views on Data Mining Foundations
600
13.2.3 Visual and Audio Data Mining
602
13.3
Data Mining Applications
607
13.3.1 Data Mining for Financial Data Analysis
607
13.3.2 Data Mining for Retail and Telecommunication Industries
609
13.3.3 Data Mining in Science and Engineering
611
13.3.4 Data Mining for Intrusion Detection and Prevention
614
13.3.5 Data Mining and Recommender Systems
615
13.4
Data Mining and Society
618
13.4.1 Ubiquitous and Invisible Data Mining
618
13.4.2 Privacy, Security, and Social Impacts of Data Mining
620
13.5
Data Mining Trends
622
13.6
Summary
625
13.7
Exercises
626
13.8
Bibliographic Notes
628
Bibliography
633
Index
673