Data Mining. Concepts and Techniques, 3rd Edition

HAN 09-ch02-039-082-9780123814791





2.3 Data Visualization

Figure 2.15 Visualization of the Iris data set using a scatter-plot matrix. Variables: sepal length, petal length, sepal width, and petal width (all in mm); points are marked by species (Setosa, Versicolor, Virginica). Source: http://support.sas.com/documentation/cdl/en/grstatproc/61948/HTML/default/images/gsgscmat.gif.

Viewing large tables of data can be tedious. By condensing the data, Chernoff faces make the data easier for users to digest. In this way, they facilitate visualization of regularities and irregularities present in the data, although their power in relating multiple relationships is limited. Another limitation is that specific data values are not shown. Furthermore, facial features vary in perceived importance. This means that the similarity of two faces (representing two multidimensional data points) can vary depending on the order in which dimensions are assigned to facial characteristics. Therefore, this mapping should be carefully chosen. Eye size and eyebrow slant have been found to be important.

Asymmetrical Chernoff faces were proposed as an extension to the original technique. Since a face has vertical symmetry (along the y-axis), the left and right sides of a face are identical, which wastes space. Asymmetrical Chernoff faces double the number of facial characteristics, thus allowing up to 36 dimensions to be displayed.
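The mapping step can be sketched in a few lines of Python. This is a minimal illustration, not the original Chernoff procedure: the feature list and its ordering by salience are assumptions (the text names only eye size and eyebrow slant as important), the function names are hypothetical, and the three sample records are illustrative Iris-style measurements in mm.

```python
# Hypothetical facial features, listed from most to least perceptually salient.
# Only eye size and eyebrow slant are named in the text; the rest are illustrative.
FEATURES = ["eye_size", "eyebrow_slant", "mouth_curvature", "nose_length", "face_width"]


def normalize(column):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0.5."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.5 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def chernoff_parameters(records, dimension_order, asymmetric=False):
    """Map each record (a dict of numeric dimensions) to facial-feature parameters.

    dimension_order lists dimensions from most to least important, so the
    most important dimension lands on the most salient feature.
    With asymmetric=True, each feature gets separate left and right slots,
    doubling the number of dimensions one face can carry.
    """
    if asymmetric:
        slots = [f"{side}_{name}" for name in FEATURES for side in ("left", "right")]
    else:
        slots = list(FEATURES)
    if len(dimension_order) > len(slots):
        raise ValueError("more dimensions than available facial features")

    scaled = {d: normalize([r[d] for r in records]) for d in dimension_order}
    return [
        {slot: scaled[d][i] for slot, d in zip(slots, dimension_order)}
        for i in range(len(records))
    ]


# Three illustrative records (measurements in mm).
flowers = [
    {"sepal_length": 51, "sepal_width": 35, "petal_length": 14, "petal_width": 2},
    {"sepal_length": 70, "sepal_width": 32, "petal_length": 47, "petal_width": 14},
    {"sepal_length": 63, "sepal_width": 33, "petal_length": 60, "petal_width": 25},
]
faces = chernoff_parameters(
    flowers, ["petal_length", "petal_width", "sepal_length", "sepal_width"]
)
```

Reordering `dimension_order` changes which dimension drives the most salient feature, which is why two records can look more or less alike depending on the assignment.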

Figure 2.16 Here is a visualization that uses parallel coordinates. Source: www.stat.columbia.edu/∼cook/movabletype/archives/2007/10/parallel coordi.thml.

Figure 2.17 Chernoff faces. Each face represents an n-dimensional data point (n ≤ 18).

The stick figure visualization technique maps multidimensional data to five-piece stick figures, where each figure has four limbs and a body. Two dimensions are mapped to the display (x and y) axes and the remaining dimensions are mapped to the angle and/or length of the limbs. Figure 2.18 shows census data, where age and income are mapped to the display axes, and the remaining dimensions (gender, education, and so on) are mapped to stick figures. If the data items are relatively dense with respect to the two display dimensions, the resulting visualization shows texture patterns, reflecting data trends.
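The geometry of one five-piece stick figure can be sketched as follows. This is a minimal sketch under stated assumptions: the body and limb lengths are fixed constants chosen here, only limb angles vary, the four limb values are assumed to be pre-scaled to [0, 1], and the attachment layout (two limbs at the top of the body, two at the bottom) is one plausible choice rather than the layout used for Figure 2.18.

```python
import math

BODY_LENGTH = 1.0   # assumed fixed body length, in display units
LIMB_LENGTH = 0.6   # assumed fixed limb length; only angles vary in this sketch


def stick_figure(x, y, limb_values):
    """Return the five line segments of one stick figure.

    x, y         -- the two dimensions mapped to the display axes
    limb_values  -- four values in [0, 1], one per limb, mapped to limb angles
    Each segment is ((x0, y0), (x1, y1)). The body is a vertical segment
    centered at (x, y); two limbs attach to its top and two to its bottom.
    """
    if len(limb_values) != 4:
        raise ValueError("a five-piece figure has exactly four limbs")
    top = (x, y + BODY_LENGTH / 2)
    bottom = (x, y - BODY_LENGTH / 2)
    segments = [(bottom, top)]
    # Limbs are ordered top-left, top-right, bottom-left, bottom-right.
    anchors = [top, top, bottom, bottom]
    directions = [-1, 1, -1, 1]
    for value, anchor, direction in zip(limb_values, anchors, directions):
        # Map [0, 1] to an angle between -90 and +90 degrees from horizontal.
        angle = math.radians(-90 + 180 * value)
        end = (
            anchor[0] + direction * LIMB_LENGTH * math.cos(angle),
            anchor[1] + LIMB_LENGTH * math.sin(angle),
        )
        segments.append((anchor, end))
    return segments


# Hypothetical record: x and y stand in for age and income.
segments = stick_figure(x=35.0, y=50.0, limb_values=[0.5, 0.5, 0.0, 1.0])
```

Drawing the segments for every record on shared axes produces the texture effect described above when records are dense, because neighboring figures with similar limb angles form visually uniform regions.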


Figure 2.18 Census data represented using stick figures (axes: age and income). Source: Professor G. Grinstein, Department of Computer Science, University of Massachusetts at Lowell.

2.3.4 Hierarchical Visualization Techniques

The visualization techniques discussed so far focus on visualizing multiple dimensions simultaneously. However, for a large data set of high dimensionality, it would be difficult to visualize all dimensions at the same time. Hierarchical visualization techniques partition all dimensions into subsets (i.e., subspaces). The subspaces are visualized in a hierarchical manner.

“Worlds-within-Worlds,” also known as n-Vision, is a representative hierarchical visualization method. Suppose we want to visualize a 6-D data set, where the dimensions are $F, X_1, \ldots, X_5$. We want to observe how dimension $F$ changes with respect to the other dimensions. We can first fix the values of dimensions $X_3, X_4, X_5$ to some selected values, say, $c_3, c_4, c_5$. We can then visualize $F, X_1, X_2$ using a 3-D plot, called a world, as shown in Figure 2.19. The position of the origin of the inner world is located at the point $(c_3, c_4, c_5)$ in the outer world, which is another 3-D plot using dimensions $X_3, X_4, X_5$. A user can interactively change, in the outer world, the location of the origin of the inner world. The user then views the resulting changes of the inner world. Moreover, a user can vary the dimensions used in the inner world and the outer world. Given more dimensions, more levels of worlds can be used, which is why the method is called “worlds-within-worlds.”
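The idea of fixing the outer dimensions and plotting the rest can be sketched as below. This is an illustrative sketch only, assuming the inner world plots F against two dimensions (called x1 and x2 here) while the other three are held at a chosen point (c3, c4, c5); the function names and the response function `example_f` are hypothetical and not part of n-Vision.

```python
def inner_world(f, outer_point, x1_values, x2_values):
    """Sample F over (x1, x2) with the three outer dimensions held fixed.

    f            -- function of five arguments returning the value of F
    outer_point  -- (c3, c4, c5): origin of the inner world in the outer world
    Returns a list of (x1, x2, F) triples, the points of the inner 3-D plot.
    """
    c3, c4, c5 = outer_point
    return [
        (x1, x2, f(x1, x2, c3, c4, c5))
        for x1 in x1_values
        for x2 in x2_values
    ]


def example_f(x1, x2, x3, x4, x5):
    """Hypothetical response function, chosen only for illustration."""
    return x1 * x3 + x2 * x4 + x5


grid = [0, 1, 2]
world_a = inner_world(example_f, (1, 1, 0), grid, grid)
# Moving the inner world's origin in the outer world changes the inner surface.
world_b = inner_world(example_f, (2, 0, 5), grid, grid)
```

Comparing `world_a` and `world_b` mimics what a user sees when dragging the inner world's origin to a new position in the outer world: the same two inner axes, but a different surface for F.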

As another example of hierarchical visualization methods, tree-maps display hierarchical data as a set of nested rectangles. For example, Figure 2.20 shows a tree-map visualizing Google news stories. All news stories are organized into seven categories, each shown in a large rectangle of a unique color. Within each category (i.e., each rectangle at the top level), the news stories are further partitioned into smaller subcategories.
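One simple way to compute nested rectangles is the slice-and-dice layout, sketched below. This is an assumption-laden illustration: the text does not say which layout the tree-map in Figure 2.20 uses, the function name is hypothetical, and the categories and story counts are made up for the example.

```python
def slice_and_dice(node, x, y, width, height, depth=0):
    """Lay out a hierarchy as nested rectangles.

    node is (name, size, children); a leaf has an empty children list and an
    internal node's size is the sum of its children's sizes.
    Returns a list of (name, depth, x, y, width, height), parents first.
    The split direction alternates with depth: children are placed side by
    side at even depth and stacked at odd depth.
    """
    name, size, children = node
    rects = [(name, depth, x, y, width, height)]
    offset = 0.0  # fraction of the parent already consumed by earlier children
    for child in children:
        share = child[1] / size if size else 0.0
        if depth % 2 == 0:
            rects += slice_and_dice(
                child, x + offset * width, y, share * width, height, depth + 1
            )
        else:
            rects += slice_and_dice(
                child, x, y + offset * height, width, share * height, depth + 1
            )
        offset += share
    return rects


# Hypothetical categories and story counts, for illustration only.
news = ("news", 10, [
    ("world", 6, [("europe", 4, []), ("asia", 2, [])]),
    ("sports", 3, []),
    ("tech", 1, []),
])
layout = slice_and_dice(news, 0, 0, 100, 50)
```

Each rectangle's area is proportional to its node's size, so the leaf rectangles tile the root rectangle exactly. Slice-and-dice tends to produce thin strips for deep hierarchies; other layouts trade this simplicity for better aspect ratios.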
