HAN
03-toc-ix-xviii-9780123814791
2011/6/1
3:32
Page xiv
#6
xiv
Contents
Chapter 7 Advanced Pattern Mining  279
7.1 Pattern Mining: A Road Map  279
7.2 Pattern Mining in Multilevel, Multidimensional Space  283
7.2.1 Mining Multilevel Associations  283
7.2.2 Mining Multidimensional Associations  287
7.2.3 Mining Quantitative Association Rules  289
7.2.4 Mining Rare Patterns and Negative Patterns  291
7.3 Constraint-Based Frequent Pattern Mining  294
7.3.1 Metarule-Guided Mining of Association Rules  295
7.3.2 Constraint-Based Pattern Generation: Pruning Pattern Space and Pruning Data Space  296
7.4 Mining High-Dimensional Data and Colossal Patterns  301
7.4.1 Mining Colossal Patterns by Pattern-Fusion  302
7.5 Mining Compressed or Approximate Patterns  307
7.5.1 Mining Compressed Patterns by Pattern Clustering  308
7.5.2 Extracting Redundancy-Aware Top-k Patterns  310
7.6 Pattern Exploration and Application  313
7.6.1 Semantic Annotation of Frequent Patterns  313
7.6.2 Applications of Pattern Mining  317
7.7 Summary  319
7.8 Exercises  321
7.9 Bibliographic Notes  323
Chapter 8 Classification: Basic Concepts  327
8.1 Basic Concepts  327
8.1.1 What Is Classification?  327
8.1.2 General Approach to Classification  328
8.2 Decision Tree Induction  330
8.2.1 Decision Tree Induction  332
8.2.2 Attribute Selection Measures  336
8.2.3 Tree Pruning  344
8.2.4 Scalability and Decision Tree Induction  347
8.2.5 Visual Mining for Decision Tree Induction  348
8.3 Bayes Classification Methods  350
8.3.1 Bayes’ Theorem  350
8.3.2 Naïve Bayesian Classification  351
8.4 Rule-Based Classification  355
8.4.1 Using IF-THEN Rules for Classification  355
8.4.2 Rule Extraction from a Decision Tree  357
8.4.3 Rule Induction Using a Sequential Covering Algorithm  359
8.5 Model Evaluation and Selection  364
8.5.1 Metrics for Evaluating Classifier Performance  364
8.5.2 Holdout Method and Random Subsampling  370
8.5.3 Cross-Validation  370
8.5.4 Bootstrap  371
8.5.5 Model Selection Using Statistical Tests of Significance  372
8.5.6 Comparing Classifiers Based on Cost–Benefit and ROC Curves  373
8.6 Techniques to Improve Classification Accuracy  377
8.6.1 Introducing Ensemble Methods  378
8.6.2 Bagging  379
8.6.3 Boosting and AdaBoost  380
8.6.4 Random Forests  382
8.6.5 Improving Classification Accuracy of Class-Imbalanced Data  383
8.7 Summary  385
8.8 Exercises  386
8.9 Bibliographic Notes  389
Chapter 9 Classification: Advanced Methods  393
9.1 Bayesian Belief Networks  393
9.1.1 Concepts and Mechanisms  394
9.1.2 Training Bayesian Belief Networks  396
9.2 Classification by Backpropagation  398
9.2.1 A Multilayer Feed-Forward Neural Network  398
9.2.2 Defining a Network Topology  400
9.2.3 Backpropagation  400
9.2.4 Inside the Black Box: Backpropagation and Interpretability  406
9.3 Support Vector Machines  408
9.3.1 The Case When the Data Are Linearly Separable  408
9.3.2 The Case When the Data Are Linearly Inseparable  413
9.4 Classification Using Frequent Patterns  415
9.4.1 Associative Classification  416
9.4.2 Discriminative Frequent Pattern–Based Classification  419
9.5 Lazy Learners (or Learning from Your Neighbors)  422
9.5.1 k-Nearest-Neighbor Classifiers  423
9.5.2 Case-Based Reasoning  425
9.6 Other Classification Methods  426
9.6.1 Genetic Algorithms  426
9.6.2 Rough Set Approach  427
9.6.3 Fuzzy Set Approaches  428
9.7 Additional Topics Regarding Classification  429
9.7.1 Multiclass Classification  430