HAN
22-ind-673-708-9780123814791
2011/6/1
3:27
Page 676
#4
676
Index
biclusters, 511
with coherent values, 516
with coherent values on rows, 516
with constant values, 515
with constant values on columns, 515
with constant values on rows, 515
as submatrix, 515
types of, 515–516
bimodal, 47
bin boundaries, 89
binary attributes, 41, 79
asymmetric, 42, 70
as Boolean, 41
contingency table for, 70
dissimilarity between, 71–72
example, 41–42
proximity measures, 70–72
symmetric, 42, 70–71
See also attributes
binning
discretization by, 115
equal-frequency, 89
smoothing by bin boundaries, 89
smoothing by bin means, 89
smoothing by bin medians, 89
biological sequences, 586, 624
alignment of, 590–591
analysis, 590
BLAST, 590
hidden Markov model, 591
as mining trend, 624
multiple sequence alignment, 590
pairwise alignment, 590
phylogenetic tree, 590
substitution matrices, 590
bipartite graphs, 523
BIRCH, 458, 462–466
CF-trees, 462–463, 464, 465–466
clustering feature, 462, 463, 464
effectiveness, 465
multiphase clustering technique, 464–465
See also hierarchical methods
bitmap indexing, 160–161, 179
bitmapped join indexing, 163, 179
bivariate distribution, 40
BLAST. See Basic Local Alignment Search Tool
BOAT. See Bootstrapped Optimistic Algorithm for Tree construction
Boolean association rules, 281
Boolean attributes, 41
boosting, 380
accuracy, 382
AdaBoost, 380–382
bagging versus, 381–382
weight assignment, 381
bootstrap method, 371, 386
bottom-up design approach, 133, 151–152
bottom-up subspace search, 510–511
boxplots, 49
computation, 50
example, 50
five-number summary, 49
illustrated, 50
in outlier visualization, 555
BUC, 200–204, 235
for 3-D data cube computation, 200
algorithm, 202
Apriori property, 201
bottom-up construction, 201
iceberg cube construction, 201
partitioning snapshot, 203
performance, 204
top-down processing order, 200, 201
business intelligence (BI), 27
business metadata, 135
business query view, 151
C
C4.5, 332, 385
class-based ordering, 358
gain ratio use, 340
greedy approach, 332
pessimistic pruning, 345
rule extraction, 358
See also decision tree induction
cannot-link constraints, 533
CART, 332, 385
cost complexity pruning algorithm, 345
Gini index use, 341
greedy approach, 332
See also decision tree induction
case updating, 404
case-based reasoning (CBR), 425–426
challenges, 426
categorical attributes, 41
CBA. See Classification Based on Associations
CBLOF. See cluster-based local outlier factor
CELL method, 562, 563
cells, 10–11
aggregate, 189
ancestor, 189
base, 189
descendant, 189
dimensional, 189
exceptions, 231
residual value, 234
central tendency measures, 39, 44, 45–47
mean, 45–46
median, 46–47
midrange, 47
for missing values, 88
models, 47
centroid distance, 108
CF-trees, 462–463, 464
nodes, 465
parameters, 464
structure illustration, 464
CHAID, 343
Chameleon, 459, 466–467
clustering illustration, 466
relative closeness, 467
relative interconnectivity, 466–467
See also hierarchical methods
Chernoff faces, 60
asymmetrical, 61
illustrated, 62
ChiMerge, 117
chi-square test, 95
chunking, 195
chunks, 195
2-D, 197
3-D, 197
computation of, 198
scanning order, 197
CLARA. See Clustering Large Applications
CLARANS. See Clustering Large Applications based upon Randomized Search
class comparisons, 166, 175, 180
attribute-oriented induction for, 175–178
mining, 176
presentation of, 175–176
procedure, 175–176
class conditional independence, 350
class imbalance problem, 384–385, 386
ensemble methods for, 385
on multiclass tasks, 385
oversampling, 384–385, 386
threshold-moving approach, 385
undersampling, 384–385, 386
class label attributes, 328
class-based ordering, 357
class/concept descriptions, 15
classes, 15, 166
contrasting, 15
equivalence, 427
target, 15
classification, 18, 327–328, 385
accuracy, 330
accuracy improvement techniques, 377–385
active learning, 433–434
advanced methods, 393–442
applications, 327
associative, 415, 416–419, 437
automatic, 445
backpropagation, 393, 398–408, 437
bagging, 379–380
basic concepts, 327–330
Bayes methods, 350–355
Bayesian belief networks, 393–397, 436
boosting, 380–382
case-based reasoning, 425–426
of class-imbalanced data, 383–385
confusion matrix, 365–366, 386
costs and benefits, 373–374
decision tree induction, 330–350
discriminative frequent pattern-based, 437
document, 430
ensemble methods, 378–379
evaluation metrics, 364–370
example, 19
frequent pattern-based, 393, 415–422, 437
fuzzy set approaches, 428–429, 437
general approach to, 328
genetic algorithms, 426–427, 437
heterogeneous networks, 593
homogeneous networks, 593
IF-THEN rules for, 355–357
interpretability, 369
k-nearest-neighbor, 423–425
lazy learners, 393, 422–426
learning step, 328
model representation, 18
model selection, 364, 370–377
multiclass, 430–432, 437
in multimedia data mining, 596
neural networks for, 19, 398–408
pattern-based, 282, 318
perception-based, 348–350
precision measure, 368–369
as prediction problem, 328
process, 328
process illustration, 329
random forests, 382–383
recall measure, 368–369
robustness, 369
rough set approach, 427–428, 437