Teaching Responsibility
LJMU Schools involved in Delivery:
Computer Science and Mathematics
Learning Methods
Lecture
Practical
Module Offerings
6124COMP-SEP-CTY
Aims
The aim of this module is to develop the knowledge and skills for working effectively with the large-scale data storage and processing frameworks that underpin data science.
Module Content
Outline Syllabus: Big Data
Volume – scale of the data generated and stored
Velocity – real-time
Variety – text, images, audio, video
Big Data Difficulties
Variability – inconsistency of data
Veracity – quality of data
Complexity – complex data management issues
Big Data Storage and Analysis Tools
Apache Hadoop
Hadoop provenance
Apache Hadoop Framework
Common
Distributed File System (HDFS)
YARN
MapReduce
JobTracker
TaskTracker
Apache Hadoop Tools
Pig (Pig Latin, ETL)
Hive (data warehousing + SQL) in detail
Apache Spark (in-memory analytics) in detail
Apache Mahout (machine learning system) in detail
Apache Solr (scalable search tool)
Hadoop in the Cloud – Amazon EC2/S3 Services
Emerging Trends in Big Data Storage and Processing
Module Overview:
The aim of this module is to develop the knowledge and skills for working effectively with the large-scale data storage and processing frameworks that underpin data science. This module provides both theoretical and practical experience of large-scale data storage considerations and the development of tools to support the processing of that data.
Additional Information: This module provides both theoretical and practical experience of large-scale data storage considerations and the development of tools to support the processing of that data.
Assessments
Report
Centralised Exam