Data Mining: Concepts and Techniques, 3rd Edition

HAN 04-fore-xix-xxii-9780123814791

HAN

04-fore-xix-xxii-9780123814791

2011/6/1

3:32

Page xix

#1

Foreword

Analyzing large amounts of data is a necessity. Even popular science books, like “Super Crunchers,” give compelling cases where large amounts of data yield discoveries and intuitions that surprise even experts. Every enterprise benefits from collecting and analyzing its data: Hospitals can spot trends and anomalies in their patient records, search engines can do better ranking and ad placement, and environmental and public health agencies can spot patterns and abnormalities in their data. The list continues, with cybersecurity and computer network intrusion detection; monitoring of the energy consumption of household appliances; pattern analysis in bioinformatics and pharmaceutical data; financial and business intelligence data; spotting trends in blogs, Twitter, and many more. Storage is inexpensive and getting even cheaper, as are data sensors. Thus, collecting and storing data is easier than ever before.

The problem then becomes how to analyze the data. This is exactly the focus of this Third Edition of the book. Jiawei, Micheline, and Jian give encyclopedic coverage of all the related methods, from the classic topics of clustering and classification, to database methods (e.g., association rules, data cubes), to more recent and advanced topics (e.g., SVD/PCA, wavelets, support vector machines).

The exposition is extremely accessible to beginners and advanced readers alike. The book gives the fundamental material first and the more advanced material in follow-up chapters. It also has numerous rhetorical questions, which I found extremely helpful for maintaining focus.

We have used the first two editions as textbooks in data mining courses at Carnegie Mellon and plan to continue to do so with this Third Edition. The new version has significant additions: Notably, it has more than 100 citations to works from 2006 onward, focusing on more recent material such as graphs and social networks, sensor networks, and outlier detection. This book has a new section for visualization, has expanded outlier detection into a whole chapter, and has separate chapters for advanced methods: for example, pattern mining with top-k patterns and more, and clustering methods with biclustering and graph clustering.

Overall, it is an excellent book on classic and modern data mining methods, and it is ideal not only for teaching but also as a reference book.

Christos Faloutsos

Carnegie Mellon University

Foreword to Second Edition

We are deluged by data: scientific data, medical data, demographic data, financial data, and marketing data. People have no time to look at this data. Human attention has become the precious resource. So, we must find ways to automatically analyze the data, to automatically classify it, to automatically summarize it, to automatically discover and characterize trends in it, and to automatically flag anomalies. This is one of the most active and exciting areas of the database research community. Researchers in areas including statistics, visualization, artificial intelligence, and machine learning are contributing to this field. The breadth of the field makes it difficult to grasp the extraordinary progress over the last few decades.

Six years ago, Jiawei Han and Micheline Kamber’s seminal textbook organized and presented Data Mining. It heralded a golden age of innovation in the field. This revision of their book reflects that progress; more than half of the references and historical notes are to recent work. The field has matured with many new and improved algorithms, and has broadened to include many more data types: streams, sequences, graphs, time-series, geospatial, audio, images, and video. We are certainly not at the end of the golden age (indeed, research and commercial interest in data mining continues to grow), but we are all fortunate to have this modern compendium.

The book gives quick introductions to database and data mining concepts with particular emphasis on data analysis. It then covers, in a chapter-by-chapter tour, the concepts and techniques that underlie classification, prediction, association, and clustering. These topics are presented with examples, a tour of the best algorithms for each problem class, and pragmatic rules of thumb about when to apply each technique. The Socratic presentation style is both very readable and very informative. I certainly learned a lot from reading the first edition and got re-educated and updated in reading the second edition.

Jiawei Han and Micheline Kamber have been leading contributors to data mining research. This is the text they use with their students to bring them up to speed on the field. The field is evolving very rapidly, but this book is a quick way to learn the basic ideas, and to understand where the field is today. I found it very informative and stimulating, and believe you will too.

Jim Gray

In his memory
