HAN
22-ind-673-708-9780123814791
2011/6/1
3:27
Page 690
#18
690
Index
icon-based visualization (Continued)
stick figure technique, 61–63
See also data visualization
ID3, 332, 385
greedy approach, 332
information gain, 336
See also decision tree induction
IF-THEN rules, 355–357
accuracy, 356
conflict resolution strategy, 356
coverage, 356
default rule, 357
extracting from decision tree, 357
form, 355
rule antecedent, 355
rule consequent, 355
rule ordering, 357
satisfied, 356
triggered, 356
illustrated, 149
image data analysis, 319
imbalance problem, 367
imbalance ratio (IR), 270
skewness, 271
inconvertible constraints, 300
incremental data mining, 31
indexes
bitmapped join, 163
composite join, 162
Gini, 332, 341–343
inverted, 212, 213
indexing
bitmap, 160–161, 179
bitmapped join, 179
frequent pattern mining for, 319
join, 161–163, 179
OLAP, 160–163
inductive databases, 601
inferential statistics, 24
information age, moving toward, 1–2
information extraction systems, 430
information gain, 336–340
decision tree induction using, 338–339
ID3 use of, 336
pattern frequency support versus, 421
single feature plot, 420
split-point, 340
information networks
analysis, 592–593
evolution of, 594
link prediction in, 593–594
mining, 623
OLAP in, 594
role discovery in, 593–594
similarity search in, 594
information processing, 153
information retrieval (IR), 26–27
challenges, 27
language model, 26
topic model, 26–27
informativeness model, 535
initial working relations, 168, 169, 177
instance-based learners. See lazy learners
instances, constraints on, 533, 539
integrated data warehouses, 126
integrators, 127
intelligent query answering, 618
interactive data mining, 604, 607
interactive mining, 30
intercuboid query expansion, 221
example, 224–225
method, 223–224
interdimensional association rules, 288
interestingness, 21–23
assessment methods, 23
components of, 21
expected, 22
objective measures, 21–22
strong association rules, 264–265
subjective measures, 22
threshold, 21–22
unexpected, 22
interestingness constraints, 294
application of, 297
interpretability
backpropagation and, 406–408
classification, 369
cluster analysis, 447
data, 85
data quality and, 85
probabilistic hierarchical clustering, 469
interquartile range (IQR), 49, 555
interval-scaled attributes, 43, 79
intracuboid query expansion, 221
example, 223
method, 221–223
value usage, 222
intradimensional association rules, 287
intrusion detection, 569–570
anomaly-based, 614
data mining algorithms, 614–615
discriminative classifiers, 615
distributed data mining, 615
signature-based, 614
stream data analysis, 615
visualization and query tools, 615
inverted indexes, 212, 213
invisible data mining, 33, 618–620, 625
IQR. See interquartile range
IR. See information retrieval
item merging, 263
item skipping, 263
items, 13
itemsets, 246
candidate, 251, 252
dependent, 266
dynamic counting, 256
imbalance ratio (IR), 270, 271
negatively correlated, 292
occurrence independence, 266
strongly negatively correlated, 292
See also frequent itemsets
iterative Pattern-Fusion, 306
iterative relocation techniques, 448
J
Jaccard coefficient, 71
join indexing, 161–163, 179
K
k-anonymity method, 621–622
Karush-Kuhn-Tucker (KKT) conditions, 412
k-distance neighborhoods, 565
kernel density estimation, 477–478
kernel function, 415
k-fold cross-validation, 370–371
k-means, 451–454
algorithm, 452
application of, 454
CLARANS, 457
within-cluster variation, 451, 452
clustering by, 453
drawback of, 454–455
functioning of, 452
scalability, 454
time complexity, 453
variants, 453–454
k-means clustering, 536
k-medoids, 454–457
absolute-error criterion, 455
cost function for, 456
PAM, 455–457
k-nearest-neighbor classification, 423
closeness, 423
distance-based comparisons, 425
editing method, 425
missing values and, 424
number of neighbors, 424–425
partial distance method, 425
speed, 425
knowledge
background, 30–31
mining, 29
presentation, 8
representation, 33
transfer, 434
knowledge bases, 5, 8
knowledge discovery
data mining in, 7
process, 8
knowledge discovery from data (KDD), 6
knowledge extraction. See data mining
knowledge mining. See data mining
knowledge type constraints, 294
k-predicate sets, 289
Kulczynski measure, 268, 272
negatively correlated pattern based on, 293–294
L
language model, 26
Laplacian correction, 355
lattice of cuboids, 139, 156, 179, 188–189, 234
lazy learners, 393, 422–426, 437
case-based reasoning classifiers, 425–426
k-nearest-neighbor classifiers, 423–425
l-diversity method, 622
learning
active, 430, 433–434, 437
backpropagation, 400
as classification step, 328
connectionist, 398
by examples, 445
by observation, 445
rate, 397
semi-supervised, 572
supervised, 330
transfer, 430, 434–436, 438
unsupervised, 330, 445, 490
learning rates, 403–404
leave-one-out, 371
lift, 266, 272
correlation analysis with, 266–267
likelihood ratio statistic, 363
linear regression, 90, 105
multiple, 106
linearly, 412–413
linearly inseparable data, 413–415