Summative Assessment for Big Data Analytics
Module Learning Outcomes
1. Create a data set using modern database models and technology.
2. Manipulate a data set to extract statistics and features.
3. Critically evaluate and apply data mining techniques/tools to build a classifier or regression model, and predict values for new examples.
4. Analyse and communicate issues with scaling up to large data sets, and use appropriate techniques to scale up the computation.
5. Critically discuss the need for privacy, identify privacy risks in releasing information, and design techniques to mediate these risks.
Your task is to use that dataset, any information you can find about it elsewhere, and the techniques taught in the module to pose and answer three research questions of your choosing. You will then need to consider how you might store the data relevant to your research questions in a database, how you might distribute a very large version of that data across multiple computers, and what privacy concerns arise and how you might address them. Produce a structured analysis report using the given template. The report consists of seven sections, each containing specific questions that you must answer.