Module Catalogues, Xi'an Jiaotong-Liverpool University   
 
Module Code: DTS301TC
Module Title: Data Mining
Module Level: Level 3
Module Credits: 5.00
Academic Year: 2022/23
Semester: SEM1
Originating Department: School of AI and Advanced Computing
Pre-requisites: N/A
   
Aims
This module covers the methodology, major software tools and applications of data mining. By introducing the principal ideas of statistical learning, it will help students understand the conceptual underpinnings of data mining methods. The focus is on using existing software packages (mainly in R) rather than on having students develop the algorithms themselves. Students will be required to work on projects to practise applying existing software.
Learning outcomes 
A. Introduce students to the basic concepts and techniques of Data Mining

B. Demonstrate knowledge of statistical data analysis techniques used in decision making

C. Apply principles of Data Mining to the analysis of large-scale problems

D. Develop skills of using recent data mining software for solving practical problems

E. Gain experience of doing independent study and research
Method of teaching and learning 
This module will be delivered by a combination of formal lectures, tutorials, and computer classes.
Syllabus 
Introduction to Data Mining (1 lecture)

Data Warehouse and OLAP (1 lecture)

Data preprocessing – Discretization, Automatic Attribute Selection (1 lecture)

Data mining knowledge representation – tables, linear models, trees, rules, instance-based representation, clusters (1 lecture)

Attribute-oriented analysis (1 lecture)

Data mining algorithms: inferring rudimentary rules, decision trees and their construction, constructing rules, association rules and their mining, clustering, classification rules, prediction (8 lectures)

Mining real data – Data transformation methods, e.g., attribute selection, discretization, projections, sampling, cleansing (3 lectures)

Data mining credibility – training and testing, predicting performance, cross-validation, predicting probabilities, counting the cost, evaluating numeric prediction (5 lectures)

Data mining software and applications – applying data mining, learning from massive datasets, data stream learning, incorporating domain knowledge, text mining, web mining (5 lectures)
Delivery Hours  
                Lectures  Seminars  Tutorials  Lab/Practicals  Fieldwork/Placement  Other (Private study)  Total
Hours/Semester  26        -         -          26              -                    98                     150

Assessment

Sequence  Method                    % of Final Mark
1         Coursework 1 (Groupwork)  20.00
2         Coursework 2              20.00
3         Final Project             60.00

Module Catalogue generated from SITS CUT-OFF: 6/3/2020 1:47:00 AM