COMP4651: Cloud Computing and Big Data Systems

Code for HKUST COMP4651 (Fall 2018): Cloud Computing and Big Data Systems

Assignments

Assignment 1: Benchmarking AWS EC2 CPU, memory, and network performance across instance types and cluster locations

Assignment 2: Java implementation of copying files between HDFS and the local file system while maintaining the checksum

Assignment 3: MapReduce programming in Java for bigram count and frequency calculation, using the Pairs and Stripes design patterns
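
For readers unfamiliar with the two patterns, here is a minimal in-memory sketch in plain Java. It is not the assignment's solution and uses no Hadoop classes. The class name `BigramSketch`, its method names, and the reading of "frequency" as the relative frequency of a word given the word before it are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: in-memory analogue of the Pairs and Stripes patterns.
public class BigramSketch {

    // Pairs: one counter per bigram, keyed here as "left right".
    public static Map<String, Integer> pairs(String[] tokens) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i + 1 < tokens.length; i++) {
            counts.merge(tokens[i] + " " + tokens[i + 1], 1, Integer::sum);
        }
        return counts;
    }

    // Stripes: one map per left word, counting the words that follow it.
    public static Map<String, Map<String, Integer>> stripes(String[] tokens) {
        Map<String, Map<String, Integer>> result = new TreeMap<>();
        for (int i = 0; i + 1 < tokens.length; i++) {
            result.computeIfAbsent(tokens[i], k -> new TreeMap<>())
                  .merge(tokens[i + 1], 1, Integer::sum);
        }
        return result;
    }

    // Assumed definition: count(left, right) divided by the total count of bigrams starting with left.
    public static double relativeFrequency(Map<String, Map<String, Integer>> stripes,
                                           String left, String right) {
        Map<String, Integer> stripe =
                stripes.getOrDefault(left, Collections.<String, Integer>emptyMap());
        int total = 0;
        for (int c : stripe.values()) {
            total += c;
        }
        return total == 0 ? 0.0 : (double) stripe.getOrDefault(right, 0) / total;
    }

    public static void main(String[] args) {
        String[] tokens = "a b a b a c".split(" ");
        System.out.println(pairs(tokens));
        System.out.println(stripes(tokens));
        System.out.println(relativeFrequency(stripes(tokens), "a", "b"));
    }
}
```

In a MapReduce job, a Pairs mapper would typically emit one key per bigram, while a Stripes mapper would emit one map per left word and leave the merging to the reducer. The stripe form keeps every count for a given left word together, which makes the per-word total easy to compute.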

Assignment 4: Apache Spark

  • Q1: Building a Word Count Application
  • Q2: Web Server Log Analysis

Assignment 5: Power Plant Machine Learning Pipeline Application with Apache Spark

Additional

DataFrame Live Programming: Hands-on Spark DataFrame live-programming tutorial from the Spark SF Meetup 2016

Spark Tutorial: Apache Spark tutorial, heavily adapted from the Spark MOOC

EMR Test: Test for Amazon EMR and S3 instances