MGS 8040: Data Mining

Syllabus for Fall 2014

Instructor: Dr. Satish Nargundkar 
Office: 827 College of Business 

Office Hours:  By appointment 
Phone: (678) 644 6838  

E-Mail :

CRN: 83254, Buckhead Center, Room 406

Thursday 4:30 – 7:00 PM

Prerequisites: MBA 7025 or equivalent or permission of instructor. You must already have knowledge of basic statistics, including Regression Analysis, to succeed in this course.

Required Text:

  1. Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, 3rd Edition, by Gordon Linoff and Michael Berry, Wiley. ISBN-10: 0470650931; ISBN-13: 978-0470650936.

The following optional books/sites may also be helpful.

  1. Making Sense of Data II by Glenn Myatt & Wayne Johnson, John Wiley & Sons, 2009.

  2. Multivariate Data Analysis by Hair, Anderson, Tatham, & Black, Prentice Hall.


  4. The Little SAS Book by Delwiche and Slaughter.

Course Catalog Description

This course covers various analytical techniques to extract managerial information from large data warehouses. A number of well-defined data mining tasks such as classification, estimation, prediction, affinity grouping and clustering, and data visualization are discussed. Design and implementation issues for corporate data warehousing are also addressed.

Detailed Course Description

Data mining supports decision making by detecting patterns, devising rules, identifying new decision alternatives and making predictions. This course is organized around a number of well-defined data mining tasks: description, classification, estimation, prediction, and affinity grouping and clustering. Students will learn to use techniques such as Rule Induction (classification trees), Logistic Regression, Discriminant Analysis, and Neural Networks. Data visualization techniques will be used whenever possible to reveal patterns and relationships. Students will use commercially available software tools to mine large databases. Team-based projects will be conducted.

The course is organized into 3 broad areas as follows:

  1. Context: Decision Support for Strategic/Tactical Decision-making. Data/Information Organization. Data Warehouse Design.

  2. Exploratory Analysis: Segmentation Techniques

  3. Forecasting/Segmentation: Modeling Techniques, Transforming analysis into actions

Learning Outcomes/Course Objectives

Upon completion of the course, students will be able to:

  1. Explain in their own words a general framework for decision support within organizations.

  2. Discuss the sources of data, problems with data, and how to overcome them (Data Cleaning).

  3. Understand business requirements, organization structure, and how data mining projects may fit into a client’s organization to meet their decision support needs.

  4. Explain the data mining methodology; use it to analyze a dataset.

  5. Use visual techniques to describe data.

  6. Explain the assumptions of various techniques such as Cluster Analysis, Multiple Regression, Discriminant Analysis, Logistic Regression, and Artificial Neural Networks.

  7. Build multiple regression, discriminant analysis, and logistic regression models for forecasting.

  8. Validate models using the Kolmogorov-Smirnov (K-S) test.

  9. Compare and contrast Neural Networks with statistical techniques.

  10. Interpret Classification trees.

  11. Use interaction detection methods such as CART and CHAID for classification.

  12. Segment data using Cluster Analysis, and interpret the output.

  13. Identify underlying factors using Factor Analysis, and interpret the results.

  14. Discuss issues of implementation of the results of various techniques.

  15. Develop methods to monitor the ongoing performance of implemented models.
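
Outcome 8 above refers to validating models with the K-S test. As a rough illustration only, assuming a scoring model with a binary outcome, the two-sample K-S statistic can be computed as the largest gap between the cumulative score distributions of the two outcome groups. The function name ks_statistic and the scores below are hypothetical and are not taken from the course materials.

```python
from bisect import bisect_right


def ks_statistic(scores_a, scores_b):
    """Two-sample K-S statistic: the maximum absolute difference
    between the empirical CDFs of two groups of model scores."""
    a = sorted(scores_a)
    b = sorted(scores_b)
    if not a or not b:
        raise ValueError("both groups need at least one score")
    ks = 0.0
    # The empirical CDFs only change at observed scores, so it is
    # enough to compare them at every distinct score value.
    for cut in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, cut) / len(a)  # share of group A at or below cut
        cdf_b = bisect_right(b, cut) / len(b)  # share of group B at or below cut
        ks = max(ks, abs(cdf_a - cdf_b))
    return ks


# Hypothetical scores for the two outcome groups (for example, bad vs. good accounts).
bad_scores = [0.10, 0.20, 0.25, 0.40, 0.55]
good_scores = [0.35, 0.50, 0.60, 0.70, 0.80, 0.90]
print(round(ks_statistic(bad_scores, good_scores), 3))
```

A value near 0 means the model's scores barely separate the two groups, while a value near 1 means the score distributions hardly overlap.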

Methods of Instruction:

The course will combine lectures and discussion, plus guest lectures from industry experts. The team-based project will be emphasized, and case studies will be discussed.


Graded components: Tests (2), Team Project, Final Exam.

Grades are based on the Course Average: 94-96 earns an A and 97+ an A+; the lowest band is a Course Average of less than 60.

Late work will receive partial credit only, with 10% deducted for each day of delay.

Software: Students are encouraged to do project work in SAS or R in order to develop a marketable skill. You may choose other software (SPSS is available at GSU) if you wish. SAS will be discussed in class.

General Policies:

  1. Students are expected to attend each class (who knows, you may actually enjoy the class!), arrive on time and participate in class discussions.

  2. Turn off cell phones, pagers, stereos, TVs, etc. when in class. Treat the instructor and each other with courtesy.

Course Assessment:

Your constructive assessment of this course plays an indispensable role in shaping education at Georgia State. Upon completing the course, please take the time to fill out the online course evaluation.

MGS 8040 Data Mining Tentative Schedule – Fall 2014





Overview / Understanding Data

Week 1: 8/28

Introduction – DM Overview


Week 2: 9/4

Regression Review

Understanding Credit Data –

Equifax / Experian / TransUnion

Notes – Simple Regression

Notes – Multiple Regression


Review Regression Analysis Notes

Week 3: 9/11

The Initial Client Meeting

Notes – Initial Client Meeting

Hair Chapter 2

Sample Design Exercise

Solution to Exercise

Data Cleaning

1. Application – Dep. Var, Outcome, Sample time frame

Week 4: 9/18

Introduction to SAS
SAS Training at UCLA
Notes – Basic SAS Analysis
The Little SAS Book by Delwiche & Slaughter

Data1 subset in Excel

2. SAS assignment
Folder Instructions

Week 5: 9/25

Guest Lecture: State of the art of Analytics and Big Data.

Bill Franks, Chief Analytics Officer, Teradata Corp.

Week 6: 10/2

Data Cleaning

Dummy Variable Definition

Class Handout

Data Warehouse introduction

Books by Edward Tufte.

Gallery of Data Visualization

WHO visualization

3. Crosstabs, Dummy decisions

Week 7: 10/9

Test 1


Week 8: 10/16

Discriminant Analysis

Validation – KS Test

SAS Programs for Reg/Scoring

Hair, Chapter 4

4. Discrim, KS
SAS Programs for Regression/Scoring

Week 9: 10/23

Guest Lecture: Logistic Regression and Classification Trees

Gregg Weldon, Chief Analytics Officer, Analytics IQ Inc.

Intro to Logistic Regression, Logistic Regression, Classification Trees

Week 10: 10/30

Effectiveness of models – A review of methods

Neural Networks

Excel file (demo of logic)

Research Paper on Model Effectiveness


Week 11: 11/6


Cluster Analysis

SPSS Output (Cluster)

Memory Based Reasoning

Hair, Cluster Analysis

Factor Analysis

Clustering Paper

5. Clustering

Project Progress Report (informal, oral)

Week 12: 11/13

Test 2

Week 13: 11/20

Project Presentations


Thanksgiving Break

Week 14: 12/4

Monitoring Reports Review

Project Reports Due [Guidelines]

Sample Final Project

Week 15: 12/11

Final Exam – Comprehensive – 4:15 – 6:45 PM.

Appendix A
MSA Program Goals and Objectives

Students completing the MS in Analytics will:

G1 Understand organizational problems in general and associated analytical problems in particular.

G2 Be proficient in the management of data needed for decision-making.

G3 Be proficient with the methodological skills needed for data-driven decision-making.

G4 Understand the implementation issues that accompany analytical problem solving.

G5 Be able to demonstrate the positive impact of analytics on organizations.


Objectives/Learning Outcomes (LO): After finishing the program, students are expected to have mastered the knowledge and skills to carry out the following analytical tasks:


LO1 Frame Business Problems (G1)  MSA students will properly frame a business problem. 

LO2 Frame Analytical Problems (G1) MSA students will demonstrate the ability to properly solve analytical problems.

LO3 Data Management (G2)  MSA students will effectively acquire, clean, and manage both structured and unstructured data. 

LO4 Methodology (G3) MSA students will identify and apply the appropriate methodology for the business and analytical problem(s) identified.  

LO5 Modeling (G3, G4).  MSA students will build and deploy analytical models across organizations that fit the underlying organizational needs and the analytical problem(s) identified.

LO6 Programming (G4). MSA students will solve analytical problems by utilizing computer programming, both by employing available tools where possible and by developing customized solutions where necessary.

LO7 Life Cycle Management (G3, G4). MSA student(s) will develop adaptable models that allow for continued organizational improvement of productivity and quality.

LO8  Organizational Impact (G5) MSA student(s) will effectively communicate the positive, strategic impact of a model on the firm to which it is being applied.
