With effect from the academic year 2015-16
IT 322
DATA WAREHOUSING AND DATA MINING
Instruction per week 4 Periods
Duration of End-Semester Examination 3 Hours
End-Semester Examination 75 Marks
Sessional 25 Marks
Credits 3
Course Objectives:
- To introduce the basic concepts of Data Warehouse and Data Mining techniques.
- To examine the types of data to be mined and apply preprocessing methods to raw data.
- To discover interesting patterns, analyze supervised and unsupervised models, and estimate the accuracy of the algorithms.
Course Outcomes:
Students who complete this course should be able to
- Process raw data to make it suitable for various data mining algorithms.
- Discover and measure interesting patterns from different kinds of databases.
- Apply the techniques of clustering, classification, association finding, feature selection and visualization to real world data.
Prerequisites:
Basic Programming, Mathematics-Statistics, Database Concepts
UNIT-I
Introduction: Introduction to Data Mining, Data Mining Functionalities, Classification of Data Mining Systems, Major Issues in Data Mining.
Getting to know your data: Data Objects and Attribute Types, Basic Statistical Descriptions of Data, Measuring Data Similarity and Dissimilarity.
Data Preprocessing: An Overview, Data Cleaning, Data Integration, Data Reduction, Data Transformation and Data Discretization.
UNIT-II
Data Warehousing and Online Analytical Processing
Data Warehouse: Basic Concepts, Data Warehouse Modeling: Data Cube and OLAP, Data Warehouse Design and Usage: A Business Analysis Framework for Data Warehouse Design, Data Warehouse Design Process, Data Warehouse Usage for Information Processing, Data Warehouse Implementation.
Mining Frequent Patterns, Associations and Correlations: Basic Concepts, Frequent Item Set Mining Methods, Interesting Patterns, Pattern Evaluation Methods, Pattern Mining in Multilevel and Multidimensional Space.
UNIT-III
Classification: Basic Concepts, Decision Tree Induction, Bayes Classification Methods, Rule-Based Classification, Model Evaluation and Selection, Techniques to Improve Classification Accuracy: Introducing Ensemble Methods, Bagging, Boosting and AdaBoost.
Classification: Advanced Methods
Bayesian Belief Networks, Classification by Backpropagation, Support Vector Machines, Lazy Learners (or Learning from Your Neighbors), Other Classification Methods.
UNIT-IV
Cluster Analysis: Basic Concepts and Methods, Overview of Basic Clustering Methods, Partitioning Methods, Hierarchical Methods: Agglomerative versus Divisive Hierarchical Clustering, Distance Measures in Algorithmic Methods, BIRCH: Multiphase Hierarchical Clustering Using Clustering Feature Trees.
Density-Based Methods: DBSCAN: Density-Based Clustering Based on Connected Regions with High Density, OPTICS: Ordering Points to Identify the Clustering Structure, Grid-Based Methods.
Evaluation of Clustering: Assessing Clustering Tendency, Determining the Number of Clusters, Measuring Clustering Quality.
UNIT-V
Outlier Detection: Outliers and Outlier Analysis, Outlier Detection Methods, Statistical Approaches, Proximity-Based Approaches.
Data Mining Trends and Research Frontiers:
Mining Complex Data Types: Mining Sequence Data: Time-Series, Symbolic Sequences and Biological Sequences, Mining Other Kinds of Data, Data Mining Applications, Data Mining and Society, Data Mining Trends.
Text Book:
- Han J & Kamber M, “Data Mining: Concepts and Techniques”, Third Edition, Elsevier, 2011.
Suggested Reading:
- Pang-Ning Tan, Michael Steinbach, Vipin Kumar, “Introduction to Data Mining”, Pearson Education, 2008.
- M. Humphries, M. Hawkins, M. Dy, “Data Warehousing: Architecture and Implementation”, Pearson Education, 2009.
- Anahory, Murray, “Data Warehousing in the Real World”, Pearson Education, 2008.
- Kargupta, Joshi, et al., “Data Mining: Next Generation Challenges and Future Directions”, Prentice Hall of India Pvt Ltd, 2007.