HKUST COMP4651 Fall 18/19: Cloud Computing and Big Data Systems

helenli522/COMP4651

COMP4651: Cloud Computing and Big Data Systems

Code for HKUST COMP4651 (Fall 2018): Cloud Computing and Big Data Systems

Assignments

Assignment 1: Benchmarking AWS EC2 CPU, memory, and network performance across instance types and cluster locations

Assignment 2: Java implementation for copying files between HDFS and the local file system while preserving the checksum
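
As a rough illustration of the idea behind Assignment 2, here is a minimal Python sketch of a copy that compares checksums before and after. It is not the assignment's Java/HDFS code: it works on two local paths as a stand-in, and the helper names `file_checksum` and `copy_with_checksum`, as well as the choice of SHA-256, are assumptions made for this example.

```python
import hashlib
import shutil


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files never sit fully in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_checksum(src, dst):
    """Copy src to dst (local paths standing in for HDFS) and verify the digest.

    Returns the hex digest on success; raises IOError if the copy differs.
    """
    expected = file_checksum(src)
    shutil.copyfile(src, dst)
    actual = file_checksum(dst)
    if actual != expected:
        raise IOError(f"checksum mismatch: {expected} != {actual}")
    return actual
```

Calling `copy_with_checksum("input.txt", "copy.txt")` (hypothetical file names) returns the digest shared by both files, or raises if the destination does not match the source.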

Assignment 3: MapReduce programming in Java for bigram count and frequency calculation using the Stripes and Pairs design patterns
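
To show how the two design patterns in Assignment 3 differ, the sketch below simulates them in plain Python rather than Hadoop MapReduce in Java. Pairs keys each count by the full bigram `(w1, w2)`, while Stripes keeps one map per first word. The function names, the whitespace tokenization, and the reading of "frequency" as the count divided by the total for the first word are assumptions for this example, not the assignment's actual specification.

```python
from collections import Counter, defaultdict


def bigrams(line):
    """Yield consecutive word pairs from a whitespace-tokenized line."""
    words = line.split()
    return zip(words, words[1:])


def pairs_counts(lines):
    """Pairs pattern: one key per bigram (w1, w2), summed as a reducer would."""
    counts = Counter()
    for line in lines:
        for w1, w2 in bigrams(line):
            counts[(w1, w2)] += 1
    return counts


def stripes_counts(lines):
    """Stripes pattern: one key per first word, whose value maps w2 -> count."""
    stripes = defaultdict(Counter)
    for line in lines:
        for w1, w2 in bigrams(line):
            stripes[w1][w2] += 1
    return stripes


def relative_frequencies(stripes):
    """Divide each count by its stripe total: count(w1, w2) / count(w1, *)."""
    freqs = {}
    for w1, stripe in stripes.items():
        total = sum(stripe.values())
        for w2, n in stripe.items():
            freqs[(w1, w2)] = n / total
    return freqs
```

With Stripes, the total for a first word is available inside a single stripe, so frequencies need no extra bookkeeping. With Pairs, the same result usually requires a separate marginal count for each first word.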

Assignment 4: Apache Spark

  • Q1: Building a Word Count Application
  • Q2: Web Server Log Analysis
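
The word count in Q1 follows a normalize, split, and count flow. The sketch below shows that flow in plain Python so it runs without a Spark installation. It is not the notebook's Spark code, and the names `normalize` and `word_count`, along with the punctuation rule, are assumptions for this example.

```python
import re
from collections import Counter


def normalize(line):
    """Lowercase a line and drop everything except letters, digits, and whitespace."""
    return re.sub(r"[^a-z0-9\s]", "", line.lower()).strip()


def word_count(lines):
    """Count the words across all lines after normalization."""
    counts = Counter()
    for line in lines:
        counts.update(normalize(line).split())
    return counts
```

`word_count(lines).most_common(10)` then gives the ten most frequent words, which is the same kind of top-N query a Spark job would run over a distributed dataset.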

Assignment 5: Power Plant Machine Learning Pipeline Application with Apache Spark

Additional

DataFrame Live Programming: Hands-on Spark DataFrame live-programming tutorial from Spark SF Meetup 2016

Spark Tutorial: Apache Spark tutorial heavily adapted from Spark MOOC

EMR Test: Tests with Amazon EMR and S3
