Module Catalogues, Xi'an Jiaotong-Liverpool University   
 
Module Code: DTS101TC
Module Title: Introduction to Big Data
Module Level: Level 1
Module Credits: 2.50
Academic Year: 2020/21
Semester: SEM2
Originating Department: School of AI and Advanced Computing
Pre-requisites: N/A
   
Aims
The scientific and industrial importance of Big Data emerged from the recent explosion in the volume of available digitized data. This module sets out the challenges posed by Big Data in the fields of acquisition, storage and analysis. It provides students with a broad perspective on these challenges, and introduces aspects of them that will be addressed in further detail in other modules of the programme.
Learning outcomes 
A. Develop a global perspective on the sources and uses of big data.

B. Engage critically with the technical challenges of data acquisition and management.

C. Develop an understanding of the industrial and commercial applications of big data.

D. Demonstrate an awareness of the quantitative problems posed by the analysis of big data.
Method of teaching and learning 
The module will be delivered through a combination of lectures and seminars, with presentations by industrial partners. In addition to class time, students will be expected to devote their unsupervised time to private study of course material (including written material and electronically distributed data samples).
Syllabus 
General introduction to big data: the three Vs (Volume, Variety, Velocity). (1 lecture)


Technical aspects: (4 lectures)

- Sources (published results, transactions, social networks);

- Retrieval (discussion of confidentiality issues);

- Formats;

- Examples of industrial and commercial applications (safety, health care and consumer behaviour for instance).


Storage and treatment: (4 lectures)

- Memory allocation, algorithmic issues (approximate counting, Morris’s algorithm);

- Database solutions;

- Notions of parallel computing and cloud computing with industrial examples.


Analysis of Big Data: (4 lectures)

- Noise and corruption: cleaning data;

- Learning from data: training and validation data sets;

- Curse of dimensionality.
Delivery Hours  
                Lectures  Seminars  Tutorials  Lab/Practicals  Fieldwork/Placement  Other (Private study)  Total
Hours/Semester  13        13        -          -               -                    49                     75

Assessment

Sequence Method % of Final Mark
1 Assessment Task 1 (Groupwork) 20.00
2 Written Examination 80.00

Module Catalogue generated from SITS CUT-OFF: 6/3/2020 1:28:24 AM