## Data Warehouses and Data Mining

Course Code:
676440
Course Outline:
An Najah National University

Information Technology Faculty – CIS Department

| Item | Details |
|---|---|
| Course title and number | Data Mining 133467 |
| Instructor(s) name(s) | Mohammed Abdel Khaliq Dwikat |
| Contact information | dwikatmo@najah.edu, office: 14G2260, phone ext: 2259 |
| Semester and academic year | First Semester 2011/2012 |
| Compulsory / Elective | Elective |
| Prerequisites | 131353 |
| Course Contents (description) | This course introduces students to the methods used in data mining. It is intended to enable students to prepare and analyze data using predictive and descriptive methods, and to understand and analyze the results. |
| Course Objectives | |

Intended Learning Outcomes and Competences:

At the end of this course, students should be able to:

1. Understand the purpose of data mining, data marts, and data analysis.
2. Know the steps to perform data mining.
3. Detect outliers, extreme values, and missing values, and clean the data.
4. Use predictive and descriptive methods.
5. Analyze data using decision trees.
6. Analyze data using clustering.
7. Analyze data using association.
8. Gain self-confidence.

Textbook and References (Online Resources):

1. (Compulsory) Discovering Knowledge in Data: An Introduction to Data Mining, Daniel T. Larose, 2005, John Wiley & Sons.
2. Data Mining: Concepts, Models, Methods, and Algorithms, Mehmed Kantardzic.
3. Advanced Data Mining Techniques, David Olson & Dursun Delen.
4. Microsoft Excel 2010, WEKA, SPSS.

Assessment Criteria:

| Activity | Percent (%) |
|---|---|
| First Exam | 25 |
| Second Exam | 25 |
| Homework and quizzes | 10 |
| Other criteria (Research, Discussion, etc.) | |
| Final Exam | 40 |

| Week | Subject |
|---|---|
| 1 | Introduction: concepts, roots, data sets, Data Mining vs. Data Warehouse & Data Marts, OLTP vs. OLAP |
| 2 | CRISP-DM vs. SEMMA processes. SEMMA: Sample, Explore, Modify, Model, Assess |
| 3 | Data Preparation: Data Types. Categorical/Nonparametric/Discrete: Nominal vs. Ordinal. Parametric/Continuous measures: Integer/Interval vs. Ratio |
| 4 | Data Preparation: raw data, transformation, missing data, outlier analysis, time-dependent data |
| 5 | Data Reduction: Dimensions of Large Data Sets, Features Reduction |
| 6 | Data Reduction: Entropy Measure, Gini Index, Chi-Square |
| 7 | MIDTERM EXAM 1 |
| 8 | Data Reduction: Values Reduction, Features Discretization, Case Reduction |
| 9 | Predictive Methods: Confusion Matrix. Interpreting prediction measurements: sensitivity, specificity, true positive, true negative, false positive, false negative |
| 10 | Sequential Pattern Discovery, Association Rule Discovery |
| 11 | Classification |
| 12 | |
| 13 | MIDTERM EXAM 2 |
| 14 | Regression |
| 15 | Deviation Detection |
| 16 | Final Exam |
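
As a small illustration of the Week 9 topic, the sketch below shows how sensitivity and specificity can be derived from the four confusion-matrix counts. It is not taken from the course materials: the function name `confusion_measures` and the example counts are hypothetical, and the code assumes a binary classifier with at least one actual positive and one actual negative case.

```python
def confusion_measures(tp, tn, fp, fn):
    """Compute basic prediction measures from confusion-matrix counts.

    tp, tn, fp, fn are the numbers of true positives, true negatives,
    false positives, and false negatives (hypothetical inputs).
    """
    sensitivity = tp / (tp + fn)  # share of actual positives predicted positive
    specificity = tn / (tn + fp)  # share of actual negatives predicted negative
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all cases predicted correctly
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Made-up counts, chosen only for illustration.
measures = confusion_measures(tp=40, tn=45, fp=5, fn=10)
print(measures)
```

With these made-up counts, 40 of the 50 actual positives and 45 of the 50 actual negatives are classified correctly, so sensitivity is 0.8 and specificity is 0.9.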