Data Mining: Concepts and Techniques, 3rd Edition

HAN 22-ind-673-708-9780123814791

HAN

22-ind-673-708-9780123814791

2011/6/1

3:27

Page 678

#6

678

Index

classification (Continued)

rule-based, 355–363, 386

scalability, 369

semi-supervised, 432–433, 437

sentiment, 434

spatial, 595

speed, 369

support vector machines (SVMs), 393, 408–415, 437

transfer learning, 434–436

tree pruning, 344–347, 385

web-document, 435

Classification Based on Associations (CBA), 417

Classification based on Multiple Association Rules (CMAR), 417–418

Classification based on Predictive Association Rules (CPAR), 418–419

classification-based outlier detection, 571–573, 582

one-class model, 571–572

semi-supervised learning, 572

See also outlier detection

classifiers, 328

accuracy, 330, 366

bagged, 379–380

Bayesian, 350, 353

case-based reasoning, 425–426

comparing with ROC curves, 373–377

comparison aspects, 369

decision tree, 331

error rate, 367

k-nearest-neighbor, 423–425

Naive Bayesian, 351–352

overfitting data, 330

performance evaluation metrics, 364–370

recognition rate, 366–367

rule-based, 355

Clementine, 603, 606

CLIQUE, 481–483

clustering steps, 481–482

effectiveness, 483

strategy, 481

See also cluster analysis; grid-based methods

closed data cubes, 192

closed frequent itemsets, 247, 308

example, 248

mining, 262–264

shortcomings for compression, 308–309

closed graphs, 591

closed patterns, 280

top-k most frequent, 307

closure checking, 263–264

cloud computing, 31

cluster analysis, 19–20, 443–495

advanced, 497–541

agglomerative hierarchical clustering, 459–461

applications, 444, 490

attribute types and, 446

as automatic classification, 445

biclustering, 511, 512–519

BIRCH, 458, 462–466

Chameleon, 458, 466–467

CLIQUE, 481–483

clustering quality measurement, 484, 487–490

clustering tendency assessment, 484–486

constraint-based, 447, 497, 532–538

correlation-based, 511

as data redundancy technique, 108

as data segmentation, 445

DBSCAN, 471–473

DENCLUE, 476–479

density-based methods, 449, 471–479, 491

in derived space, 519–520

dimensionality reduction methods, 519–522

discretization by, 116

distance measures, 461–462

distance-based, 445

divisive hierarchical clustering, 459–461

evaluation, 483–490, 491

example, 20

expectation-maximization (EM) algorithm, 505–508

graph and network data, 497, 522–532

grid-based methods, 450, 479–483, 491

heterogeneous networks, 593

hierarchical methods, 449, 457–470, 491

high-dimensional data, 447, 497, 508–522

homogeneous networks, 593

in image recognition, 444

incremental, 446

interpretability, 447

k-means, 451–454

k-medoids, 454–457

k-modes, 454

in large databases, 445

as learning by observation, 445

low-dimensional, 509

methods, 448–451

multiple-phase, 458–459

number of clusters determination, 484, 486–487

OPTICS, 473–476

orthogonal aspects, 491

for outlier detection, 445

outlier detection and, 543

partitioning methods, 448, 451–457, 491

pattern, 282, 308–310

probabilistic hierarchical clustering, 467–470

probability model-based, 497–508

PROCLUS, 511

requirements, 445–448, 490–491

scalability, 446

in search results organization, 444

spatial, 595

spectral, 519–522

as standalone tool, 445

STING, 479–481

subspace, 318–319, 448

subspace search methods, 510–511

taxonomy formation, 20

techniques, 443, 444

as unsupervised learning, 445

usability, 447

use of, 444

cluster computing, 31

cluster samples, 108–109

cluster-based local outlier factor (CBLOF), 569–570

clustering. See cluster analysis

clustering features, 462, 463, 464

Clustering Large Applications based upon Randomized Search (CLARANS), 457

Clustering Large Applications (CLARA), 456–457

clustering quality measurement, 484, 487–490

cluster completeness, 488

cluster homogeneity, 487–488

extrinsic methods, 487–489

intrinsic methods, 487, 489–490

rag bag, 488

silhouette coefficient, 489–490

small cluster preservation, 488

clustering space, 448

clustering tendency assessment, 484–486

homogeneous hypothesis, 486

Hopkins statistic, 484–485

nonhomogeneous hypothesis, 486

nonuniform distribution of data, 484

See also cluster analysis

clustering with obstacles problem, 537

clustering-based methods, 552, 567–571

example, 553

See also outlier detection

clustering-based outlier detection, 567–571, 582

approaches, 567

distance to closest cluster, 568–569

fixed-width clustering, 570

intrusion detection by, 569–570

objects not belonging to a cluster, 568

in small clusters, 570–571

weakness of, 571

clustering-based quantitative associations, 290–291

clusters, 66, 443, 444, 490

arbitrary shape, discovery of, 446

assignment rule, 497–498

completeness, 488

constraints on, 533

cuts and, 529–530

density-based, 472

determining number of, 484, 486–487

discovery of, 318

fuzzy, 499–501

graph clusters, finding, 528–529

on high-dimensional data, 509

homogeneity, 487–488

merging, 469, 470

ordering, 474–475, 477

pattern-based, 516

probabilistic, 502–503

separation of, 447

shapes, 471

small, preservation, 488

CMAR. See Classification based on Multiple Association Rules

CN2, 359, 363

collaborative recommender systems, 610, 617, 618

collective outlier detection, 548, 582

categories of, 576

contextual outlier detection versus, 575

on graph data, 576

structure discovery, 575

collective outliers, 575, 581

mining, 575–576

co-location patterns, 319, 595

colossal patterns, 302, 320

core descendants, 305, 306

core patterns, 304–305

illustrated, 303

mining challenge, 302–303

Pattern-Fusion mining, 302–307

combined significance, 312

complete-linkage algorithm, 462

completeness

data, 84–85

data mining algorithm, 22

complex data types, 166

biological sequence data, 586, 590–591

graph patterns, 591–592

mining, 585–598, 625

networks, 591–592

in science applications, 612
