Data Warehouses and Data Mining

Course Code: 676440
Course Outline: 
An Najah National University

Information Technology Faculty – CIS Department

 

Course title and number: Data Mining 133467

Instructor(s) name(s): Mohammed Abdel Khaliq Dwikat

Contact information: dwikatmo@najah.edu, office: 14G2260, phone ext: 2259

Semester and academic year: First Semester 2011/2012

Compulsory / Elective: Elective

Prerequisites: 131353

Course Contents (description)

This course introduces students to the methods used in data mining. It is intended to enable students to prepare and analyze data using predictive and descriptive methods, and to understand and interpret the results.

 

Course Objectives

 

 

Intended Learning Outcomes and Competences

At the end of this course, students should be able to:

1- Understand the purpose of data mining, data marts, and data analysis

2- Know the steps to perform data mining

3- Detect outliers, extreme values, and missing values, and clean the data

4- Use predictive and descriptive methods

5- Analyze data using decision trees

6- Analyze data using clustering

7- Analyze data using association

8- Demonstrate self-confidence

Textbook and References

(Online Resources)

1. (Compulsory) Discovering Knowledge in Data: An Introduction to Data Mining, Daniel T. Larose, 2005, John Wiley & Sons.

2. Data Mining: Concepts, Models, Methods, and Algorithms, Mehmed Kantardzic.

3. Advanced Data Mining Techniques, David L. Olson & Dursun Delen.

4. Microsoft Excel 2010, WEKA, SPSS.

Assessment Criteria

Activity                                       Percent (%)
First Exam                                     25
Second Exam                                    25
Homework and quizzes                           10
Other criteria (Research, Discussion, etc.)
Final Exam                                     40


 

 

Week  Subject
1     Introduction: concepts, roots, data sets, Data Mining vs. Data Warehouse & Data Marts, OLTP vs. OLAP
2     CRISP-DM vs. SEMMA processes
      SEMMA: Sample, Explore, Modify, Model, Assess
3     Data Preparation: Data Types
      Categorical/Nonparametric/Discrete: Nominal vs. Ordinal
      Parametric/Continuous measures: Integer/Interval vs. Ratio
4     Data Preparation: raw data, transformation, missing data, outlier analysis
      Time-dependent data
5     Data Reduction: Dimensions of Large Data Sets, Features Reduction
6     Data Reduction: Entropy Measure, Gini Index, Chi-Square
7     MIDTERM EXAM 1
8     Data Reduction: Values Reduction, Features Discretization, Case Reduction
9     Predictive Methods: Confusion Matrix
      Interpreting prediction measurements
      Sensitivity, Specificity, True Positive, True Negative, False Positive, False Negative
10    Sequential Pattern Discovery, Association Rule Discovery
11    Classification
12
13    MIDTERM EXAM 2
14    Regression
15    Deviation Detection
16    Final Exam
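
As an illustration of the Week 9 topics (confusion matrix, sensitivity, specificity), the following is a minimal Python sketch and not part of the official course material. The function names, the label encoding (1 = positive, 0 = negative), and the sample data are assumptions made only for this example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier.

    Hypothetical helper for illustration; labels equal to `positive`
    are treated as the positive class, everything else as negative.
    """
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1   # predicted positive, actually positive
            else:
                fp += 1   # predicted positive, actually negative
        else:
            if a == positive:
                fn += 1   # predicted negative, actually positive
            else:
                tn += 1   # predicted negative, actually negative
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN). Assumes at least one actual positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP). Assumes at least one actual negative."""
    return tn / (tn + fp)


# Made-up sample data: 4 actual positives, 6 actual negatives.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print("TP, TN, FP, FN:", tp, tn, fp, fn)
print("Sensitivity:", sensitivity(tp, fn))
print("Specificity:", specificity(tn, fp))
```

With this sample data, the classifier finds 3 of the 4 actual positives (sensitivity 0.75) and correctly rejects 4 of the 6 actual negatives (specificity about 0.67).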