Teaching Responsibility

LJMU Schools involved in Delivery:

Computer Science and Mathematics

Learning Methods

Lecture
Practical

Module Offerings

6124COMP-SEP-CTY

Aims

The aim of this module is to develop the knowledge and skills for working effectively with the large-scale data storage and processing frameworks that underpin data science.

Module Content

Outline Syllabus:
Big Data
  Volume – tracks what happens
  Velocity – real-time
  Variety – text, images, audio, video
Big Data Difficulties
  Variability – inconsistency of data
  Veracity – quality of data
  Complexity – complex data management issues
Big Data Storage and Analysis Tools
Apache Hadoop
  Hadoop provenance
Apache Hadoop Framework
  Common
  Distributed File System (HDFS)
  YARN
  MapReduce
  Job Tracker
  Task Tracker
Apache Hadoop Tools
  Pig (Pig Latin, ETL)
  Hive (data warehousing + SQL) in detail
  Apache Spark (in-memory analytics) in detail
  Apache Mahout (machine learning system) in detail
  Apache SOLR (scalable search tool)
Hadoop in the Cloud - Amazon EC2/S3 Services
Emerging Trends in Big Data storage and processing
Module Overview:
The aim of this module is to develop the knowledge and skills for working effectively with the large-scale data storage and processing frameworks that underpin data science. This module provides both theoretical and practical experience of large-scale data storage considerations and the development of tools to support the processing of that data.

Assessments

Report
Centralised Exam