HAN
05-pref-xxiii-xxx-9780123814791
2011/6/1
3:35
Page xxvii
#5
Preface xxvii
Figure P.1
A suggested sequence of chapters for a short introductory course: Chapter 1. Introduction; Chapter 2. Getting to Know Your Data; Chapter 3. Data Preprocessing; Chapter 6. Mining Frequent Patterns, .... Basic Concepts ...; Chapter 8. Classification: Basic Concepts; Chapter 10. Cluster Analysis: Basic Concepts and Methods.
Depending on the length of the instruction period, the background of students, and
your interests, you may select subsets of chapters to teach in various sequential
orderings. For example, if you would like to give students only a short introduction to
data mining, you may follow the suggested sequence in Figure P.1. Depending on
the need, you can also omit some sections or subsections of a chapter.
Depending on the length of the course and its technical scope, you may choose to
selectively add more chapters to this preliminary sequence. For example, instructors
who are more interested in advanced classification methods may first add “Chapter 9.
Classification: Advanced Methods”; those more interested in pattern mining may choose
to include “Chapter 7. Advanced Pattern Mining”; whereas those interested in OLAP
and data cube technology may like to add “Chapter 4. Data Warehousing and Online
Analytical Processing” and “Chapter 5. Data Cube Technology.”
Alternatively, you may choose to teach the whole book in a two-course sequence that
covers all of the chapters in the book, plus, when time permits, some advanced topics
such as graph and network mining. Material for such advanced topics may be selected
from the companion chapters available from the book’s web site, accompanied by a
set of selected research papers.
Individual chapters in this book can also be used for tutorials or for special topics in
related courses, such as machine learning, pattern recognition, data warehousing, and
intelligent data analysis.
Each chapter ends with a set of exercises, suitable as assigned homework. The exer-
cises are either short questions that test basic mastery of the material covered, longer
questions that require analytical thinking, or implementation projects. Some exercises
can also be used as research discussion topics. The bibliographic notes at the end of each
chapter can be used to find the research literature that contains the origin of the concepts
and methods presented, in-depth treatment of related topics, and possible extensions.
To the Student
We hope that this textbook will spark your interest in the young yet fast-evolving field of
data mining. We have attempted to present the material in a clear manner, with careful
explanation of the topics covered. Each chapter ends with a summary describing the
main points. We have included many figures and illustrations throughout the text to
make the book more enjoyable and reader-friendly. Although this book was designed as
a textbook, we have tried to organize it so that it will also be useful to you as a reference
book or handbook, should you later decide to perform in-depth research in the related
fields or pursue a career in data mining.
What do you need to know to read this book?
You should have some knowledge of the concepts and terminology associated with
statistics, database systems, and machine learning. However, we do try to provide
enough background on the basics so that, if you are not familiar with these fields
or your memory is a bit rusty, you will have no trouble following the discussions in
the book.
You should have some programming experience. In particular, you should be able to
read pseudocode and understand simple data structures such as multidimensional
arrays.
To the Professional
This book was designed to cover a wide range of topics in the data mining field. As a
result, it is an excellent handbook on the subject. Because each chapter is designed to be
as standalone as possible, you can focus on the topics that most interest you. The book
can be used by application programmers and information service managers who wish
to learn about the key ideas of data mining on their own. The book would also be useful
for technical data analysis staff in banking, insurance, medicine, and retailing industries
who are interested in applying data mining solutions to their businesses. Moreover, the
book may serve as a comprehensive survey of the data mining field, which may also
benefit researchers who would like to advance the state-of-the-art in data mining and
extend the scope of data mining applications.
The techniques and algorithms presented are of practical utility. Rather than selecting
algorithms that perform well on small “toy” data sets, we describe algorithms that
are geared toward the discovery of patterns and knowledge hidden in large, real data
sets. Algorithms presented in the book are illustrated in pseudocode. The pseudocode
is similar to the C programming language, yet is designed to be easy to follow for
programmers unfamiliar with C or C++. If you wish to implement any of the
algorithms, you should find the translation of our pseudocode into the programming
language of your choice to be a fairly straightforward task.
Book Web Sites with Resources
The book has a web site at www.cs.uiuc.edu/~hanj/bk3 and another with Morgan Kaufmann
Publishers at www.booksite.mkp.com/datamining3e. These web sites contain many
supplemental materials for readers of this book or anyone else with an interest in data
mining. The resources include the following:
Slide presentations for each chapter. Lecture notes in Microsoft PowerPoint slides
are available for each chapter.