HAN
22-ind-673-708-9780123814791
2011/6/1
3:27
Page 692
#20
692
Index
link mining, 594
link prediction, 594
load, in back-end tools/utilities, 134
loan payment prediction, 608–609
local outlier factor, 566–567
local proximity-based outliers, 564–565
logistic function, 402
log-linear models, 106
lossless compression, 100
lossy compression, 100
lower approximation, 427
M
machine learning, 24–26
active, 25
data mining similarities, 26
semi-supervised, 25
supervised, 24
unsupervised, 25
Mahalanobis distance, 556
majority voting, 335
Manhattan distance, 72–73
MaPle, 519
margin, 410
market basket analysis, 244–246, 271–272
example, 244
illustrated, 244
Markov chains, 591
materialization
full, 159, 179, 234
iceberg cubes, 319
no, 159
partial, 159–160, 192, 234
semi-offline, 226
max patterns, 280
max_confidence measure, 268, 272
maximal frequent itemsets, 247, 308
example, 248
mining, 262–264
shortcomings for compression, 308–309
maximum marginal hyperplane (MMH), 409
SVM finding, 412
maximum normed residual test, 555
mean, 39, 45
bin, smoothing by, 89
example, 45
for missing values, 88
trimmed, 46
weighted arithmetic, 45
measures, 145
accuracy-based, 369
algebraic, 145
all_confidence, 272
antimonotonic, 194
attribute selection, 331
categories of, 145
of central tendency, 39, 44, 45–47
correlation, 266
data cube, 145
dispersion, 48–51
distance, 72–74, 461–462
distributive, 145
holistic, 145
Kulczynski, 272
max_confidence, 272
of multidimensional databases, 146
null-invariant, 272
pattern evaluation, 267–271
precision, 368–369
proximity, 67, 68–72
recall, 368–369
sensitivity, 367
significance, 312
similarity/dissimilarity, 65–78
specificity, 367
median, 39, 46
bin, smoothing by, 89
example, 46
formula, 46–47
for missing values, 88
metadata, 92, 134, 178
business, 135
importance, 135
operational, 135
repositories, 134–135
metarule-guided mining
of association rules, 295–296
example, 295–296
metrics, 73
classification evaluation, 364–370
microeconomic view, 601
midrange, 47
MineSet, 603, 605
minimal interval size, 116
minimal spanning tree algorithm, 462
minimum confidence threshold, 18, 245
Minimum Description Length (MDL), 343–344
minimum support threshold, 18, 190
association rules, 245
count, 246
Minkowski distance, 73
min-max normalization, 114
missing values, 88–89
mixed-effect models, 600
mixture models, 503, 538
EM algorithm for, 507–508
univariate Gaussian, 504
mode, 39, 47
example, 47
model selection, 364
with statistical tests of significance, 372–373
models, 18
modularity
of clustering, 530
use of, 539
MOLAP. See multidimensional OLAP
monotonic constraints, 298
motifs, 587
moving-object data mining, 595–596, 623–624
multiclass classification, 430–432, 437
all-versus-all (AVA), 430–431
error-correcting codes, 431–432
one-versus-all (OVA), 430
multidimensional association rules, 17, 283, 288, 320
hybrid-dimensional, 288
interdimensional, 288
mining, 287–289
mining with static discretization of quantitative attributes, 288
with no repeated predicates, 288
See also association rules
multidimensional data analysis
in cube space, 227–234
in multimedia data mining, 596
spatial, 595
of top-k results, 226
multidimensional data mining, 11–13, 34, 155–156, 179, 187, 227, 235
data cube promotion of, 26
dimensions, 33
example, 228–229
retail industry, 610
multidimensional data model, 135–146, 178
data cube as, 136–139
dimension table, 136
dimensions, 142–144
fact constellation, 141–142
fact table, 136
snowflake schema, 140–141
star schema, 139–140
multidimensional databases
measures of, 146
querying with starnet model, 149–150
multidimensional histograms, 108
multidimensional OLAP (MOLAP), 132, 164, 179
multifeature cubes, 227, 230, 235
complex query support, 231
examples, 230–231
multilayer feed-forward neural networks, 398–399
example, 405
illustrated, 399
layers, 399
units, 399
multilevel association rules, 281, 283, 284, 320
ancestors, 287
concept hierarchies, 285
dimensions, 281
group-based support, 286
mining, 283–287
reduced support, 285, 286
redundancy, checking, 287
uniform support, 285–286
multimedia data, 14
multimedia data analysis, 319
multimedia data mining, 596
multimodal, 47
multiple linear regression, 90, 106
multiple sequence alignment, 590
multiple-phase clustering, 458–459
multitier data warehouses, 134
multivariate outlier detection, 556
with Mahalanobis distance, 556
with multiple clusters, 557
with multiple parametric distributions, 557
with χ²-statistic, 556
multiway array aggregation, 195, 235
for full cube computation, 195–199
minimum memory requirements, 198
must-link constraints, 533, 536
mutation operator, 426
mutual information, 315–316
mutually exclusive rules, 358
N
naive Bayesian classification, 385
class label prediction with, 353–355
functioning of, 351–352
nearest-neighbor clustering algorithm, 461
near-match patterns/rules, 281
negative correlation, 55, 56
negative patterns, 280, 283, 320
example, 291–292
mining, 291–294
negative transfer, 436
negative tuples, 364
negatively skewed data, 47