HAN
20-ch13-585-632-9780123814791
2011/6/1
3:26
Page 630
#46
630
Chapter 13 Data Mining Trends and Research Frontiers
Shim [NRS99]; and Zaïane, Han, and Zhu [ZHZ00]). An overview of image mining
methods is given by Hsu, Lee, and Zhang [HLZ02].
Text data analysis has been studied extensively in information retrieval, with
many textbooks and survey articles such as Croft, Metzler, and Strohman [CMS09];
Büttcher, Clarke, and Cormack [BCC10]; Manning, Raghavan, and Schütze
[MRS08]; Grossman and Frieder [GR04]; Baeza-Yates and Ribeiro-Neto [BYRN11];
Zhai [Zha08]; Feldman and Sanger [FS06]; Berry [Ber03]; and Weiss, Indurkhya, Zhang,
and Damerau [WIZD04]. Text mining is a fast-developing field with numerous papers
published in recent years, covering many topics such as topic models (e.g., Blei and
Lafferty [BL09]); sentiment analysis (e.g., Pang and Lee [PL07]); and contextual text
mining (e.g., Mei and Zhai [MZ06]).
Web mining is another focused theme, with books like Chakrabarti [Cha03a], Liu
[Liu06], and Berry [Ber03]. Web mining has substantially improved search engines with
a few influential milestone works, such as Brin and Page [BP98]; Kleinberg [Kle99];
Chakrabarti, Dom, Kumar, et al. [CDK+99]; and Kleinberg and Tomkins [KT99].
Numerous results have been generated since then, such as search log mining (e.g.,
Silvestri [Sil10]); blog mining (e.g., Mei, Liu, Su, and Zhai [MLSZ06]); and mining
online forums (e.g., Cong, Wang, Lin, et al. [CWL+08]).
Books and surveys on stream data systems and stream data processing include Babu
and Widom [BW01]; Babcock, Babu, Datar, et al. [BBD+02]; Muthukrishnan [Mut05];
and Aggarwal [Agg06].
Stream data mining research covers stream cube models (e.g., Chen, Dong, Han,
et al. [CDH+02]), stream frequent pattern mining (e.g., Manku and Motwani [MM02]
and Karp, Papadimitriou, and Shenker [KPS03]), stream classification (e.g., Domingos
and Hulten [DH00]; Wang, Fan, Yu, and Han [WFYH03]; Aggarwal, Han, Wang, and
Yu [AHWY04b]), and stream clustering (e.g., Guha, Mishra, Motwani, and O’Callaghan
[GMMO00] and Aggarwal, Han, Wang, and Yu [AHWY03]).
There are many books that discuss data mining applications. For financial data
analysis and financial modeling, see, for example, Benninga [Ben08] and Higgins
[Hig08]. For retail data mining and customer relationship management, see, for example,
books by Berry and Linoff [BL04] and Berson, Smith, and Thearling [BST99]. For
telecommunication-related data mining, see, for example, Horak [Hor08]. There are
also books on scientific data analysis, such as Grossman, Kamath, Kegelmeyer, et al.
[GKK+01] and Kamath [Kam09].
Issues in the theoretical foundations of data mining have been addressed by many
researchers. For example, Mannila presents a summary of studies on the foundations of
data mining in [Man00]. The data reduction view of data mining is summarized in The
New Jersey Data Reduction Report by Barbará, DuMouchel, Faloutsos, et al. [BDF+97].
The data compression view can be found in studies on the minimum description length
principle, such as Grünwald and Rissanen [GR07].
The pattern discovery point of view of data mining is addressed in numerous
machine learning and data mining studies, ranging from association mining to decision
tree induction, sequential pattern mining, and clustering. The probability
theory point of view is popular in the statistics and machine learning literature, such
as Bayesian networks and hierarchical Bayesian models in Chapter 9, and probabilistic
graphical models (e.g., Koller and Friedman [KF09]). Kleinberg, Papadimitriou, and
Raghavan [KPR98] present a microeconomic view, treating data mining as an optimization
problem. Studies on the inductive database view include Imielinski and Mannila
[IM96] and de Raedt, Guns, and Nijssen [RGN10].
Statistical methods for data analysis are described in many books, such as
Hastie, Tibshirani, and Friedman [HTF09]; Freedman, Pisani, and Purves [FPP07]; Devore
[Dev03]; Kutner, Nachtsheim, Neter, and Li [KNNL04]; Dobson [Dob01]; Breiman,
Friedman, Olshen, and Stone [BFOS84]; Pinheiro and Bates [PB00]; Johnson and
Wichern [JW02b]; Huberty [Hub94]; Shumway and Stoffer [SS05]; and Miller [Mil98].
For visual data mining, popular books on the visual display of data and information
include those by Tufte [Tuf90, Tuf97, Tuf01]. A summary of techniques for visualizing
data is presented in Cleveland [Cle93]. A dedicated visual data mining book, Visual
Data Mining: Techniques and Tools for Data Visualization and Mining, is by Soukup and
Davidson [SD02]. The book Information Visualization in Data Mining and Knowledge
Discovery, edited by Fayyad, Grinstein, and Wierse [FGW01], contains a collection of
articles on visual data mining methods.
Ubiquitous and invisible data mining has been discussed in many texts, including
John [Joh99] and some articles in a book edited by Kargupta, Joshi, Sivakumar, and
Yesha [KJSY04]. The book Business @ the Speed of Thought: Succeeding in the Digital
Economy by Gates [Gat00] discusses e-commerce and customer relationship management
and provides an interesting perspective on data mining in the future. Mena
[Men03] has an informative book on the use of data mining to detect and prevent
crime. It covers topics ranging from fraud detection, money laundering, insurance crimes,
and identity crimes to intrusion detection.
Data mining issues regarding privacy and data security are widely addressed
in the literature. Books on privacy and security in data mining include Thuraisingham
[Thu04]; Aggarwal and Yu [AY08]; Vaidya, Clifton, and Zhu [VCZ10]; and Fung,
Wang, Fu, and Yu [FWFY10]. Research articles include Agrawal and Srikant [AS00];
Evfimievski, Srikant, Agrawal, and Gehrke [ESAG02]; and Vaidya and Clifton [VC03].
Differential privacy was introduced by Dwork [Dwo06] and has been studied by many others, such as
Hay, Rastogi, Miklau, and Suciu [HRMS10].
There have been many discussions on trends and research directions of data mining
in various forums. Several books are collections of articles on these issues, such as
Kargupta, Han, Yu, et al. [KHY+08].