HAN
22-ind-673-708-9780123814791
2011/6/1
3:27
Page 680
#8
680
Index
complex data types (Continued)
summary, 586
symbolic sequence data, 586, 588–590
time-series data, 586, 587–588
composite join indices, 162
compressed patterns, 281
mining, 307–312
mining by pattern clustering, 308–310
compression, 100, 120
lossless, 100
lossy, 100
theory, 601
computer science applications, 613
concept characterization, 180
concept comparison, 180
concept description, 166, 180
concept hierarchies, 142, 179
for generalizing data, 150
illustrated, 143, 144
implicit, 143
manual provision, 144
multilevel association rule mining with, 285
multiple, 144
for nominal attributes, 284
for specializing data, 150
concept hierarchy generation, 112, 113, 120
based on number of distinct values, 118
illustrated, 112
methods, 117–119
for nominal data, 117–119
with prespecified semantic connections, 119
schema, 119
conditional probability table (CPT), 394, 395–396
confidence, 21
association rule, 21
interval, 219–220
limits, 373
rule, 245, 246
conflict resolution strategy, 356
confusion matrix, 365–366, 386
illustrated, 366
connectionist learning, 398
consecutive rules, 92
Constrained Vector Quantization Error (CVQE) algorithm, 536
constraint-based clustering, 447, 497, 532–538, 539
categorization of constraints and, 533–535
hard constraints, 535–536
methods, 535–538
soft constraints, 536–537
speeding up, 537–538
See also cluster analysis
constraint-based mining, 294–301, 320
interactive exploratory mining/analysis, 295
as mining trend, 623
constraint-based patterns/rules, 281
constraint-based sequential pattern mining, 589
constraint-guided mining, 30
constraints
antimonotonic, 298, 301
association rule, 296–297
cannot-link, 533
on clusters, 533
coherence, 535
conflicting, 535
convertible, 299–300
data, 294
data-antimonotonic, 300
data-pruning, 300–301, 320
data-succinct, 300
dimension/level, 294, 297
hard, 534, 535–536, 539
inconvertible, 300
on instances, 533, 539
interestingness, 294, 297
knowledge type, 294
monotonic, 298
must-link, 533, 536
pattern-pruning, 297–300, 320
rules for, 294
on similarity measures, 533–534
soft, 534, 536–537, 539
succinct, 298–299
content-based retrieval, 596
context indicators, 314
context modeling, 316
context units, 314
contextual attributes, 546, 573
contextual outlier detection, 546–547, 582
with identified context, 574
normal behavior modeling, 574–575
structures as contexts, 575
summary, 575
transformation to conventional outlier detection, 573–574
contextual outliers, 545–547, 573, 581
example, 546, 573
mining, 573–575
contingency tables, 95
continuous attributes, 44
contrasting classes, 15, 180
initial working relations, 177
prime relation, 175, 177
convertible constraints, 299–300
COP k-means algorithm, 536
core descendants, 305
colossal patterns, 306
merging of core patterns, 306
core patterns, 304–305
core ratio, 305
correlation analysis, 94
discretization by, 117
interestingness measures, 264
with lift, 266–267
nominal data, 95–96
numeric data, 96–97
redundancy and, 94–98
correlation coefficient, 94, 96
numeric data, 96–97
correlation rules, 265, 272
correlation-based clustering methods, 511
correlations, 18
cosine measure, 268
cosine similarity, 77
between two term-frequency vectors, 78
cost complexity pruning algorithm, 345
cotraining, 432–433
covariance, 94, 97
numeric data, 97–98
CPAR. See Classification based on Predictive Association Rules
credit policy analysis, 608–609
CRM. See customer relationship management
crossover operation, 426
cross-validation, 370–371, 386
k-fold, 370
leave-one-out, 371
in number of clusters determination, 487
stratified, 371
cube gradient analysis, 321
cube shells, 192, 211
computing, 211
cube space
discovery-driven exploration, 231–234
multidimensional data analysis in, 227–234
prediction mining in, 227
subspaces, 228–229
cuboid trees, 205
cuboids, 137
apex, 111, 138, 158
base, 111, 137–138, 158
child, 193
individual, 190
lattice of, 139, 156, 179, 188–189, 234, 290
sparse, 190
subset selection, 160
See also data cubes
curse of dimensionality, 158, 179
customer relationship management (CRM), 619
customer retention analysis, 610
CVQE. See Constrained Vector Quantization Error algorithm
cyber-physical systems (CPS), 596, 623–624
D
data
antimonotonicity, 300
archeology, 6
biological sequence, 586, 590–591
complexity, 32
conversion to knowledge, 2
cyber-physical system, 596
for data mining, 8
data warehouse, 13–15
database, 9–10
discrimination, 16
dredging, 6
generalizing, 150
graph, 14
growth, 2
linearly inseparable, 413–415
linearly separated, 409
multimedia, 14, 596
multiple sources, 15, 32
multivariate, 556
networked, 14
overfitting, 330
relational, 10
sample, 219
similarity and dissimilarity measures, 65–78
skewed, 47, 271
spatial, 14, 595
spatiotemporal, 595–596
specializing, 150
statistical descriptions, 44–56
streams, 598
symbolic sequence, 586, 588–589
temporal, 14
text, 14, 596–597
time-series, 586, 587
“tombs,” 5
training, 18
transactional, 13–14
types of, 33
web, 597–598
data auditing tools, 92