geekyspartan/bigdata-mapreduce-spark

Big Data - Map Reduce and Spark

The following tasks are implemented:

  1. WordCount and SetDifference using MapReduce.
     Command to run: python mapReduce.py

  2. WordCount and SetDifference using Spark.
     Command to run: python spark_wordCount_setDifference.py

  3. Frequency of each industry word in the Blog Authorship Corpus.
     Command to run: python spark-industryNameFrequency.py [Directory path for corpus]
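
For readers unfamiliar with the two tasks, here is a minimal, self-contained sketch of how WordCount and SetDifference can be phrased as map and reduce steps. It is not the code from mapReduce.py: the helper map_reduce, the function names wc_map, wc_reduce, sd_map, sd_reduce, the set labels "R" and "S", and the sample data are all assumptions made for illustration, and everything runs in a single process in plain Python.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Tiny in-memory stand-in for a MapReduce run (hypothetical helper)."""
    # Map phase: each (key, value) record may emit any number of pairs.
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            # Shuffle phase: group emitted values by their key.
            groups[out_key].append(out_value)
    # Reduce phase: one reducer call per distinct key, in sorted key order.
    results = []
    for out_key in sorted(groups):
        results.extend(reducer(out_key, groups[out_key]))
    return results


# WordCount: emit (word, 1) for every word, then sum the ones per word.
def wc_map(doc_id, text):
    for word in text.lower().split():
        yield word, 1


def wc_reduce(word, counts):
    yield word, sum(counts)


# SetDifference R - S: emit (element, set label), then keep only the
# elements that were seen in R and never in S.
def sd_map(set_name, elements):
    for element in elements:
        yield element, set_name


def sd_reduce(element, set_names):
    if set(set_names) == {"R"}:
        yield element


# Sample inputs (made up for this sketch).
docs = [(1, "the quick brown fox"), (2, "the lazy dog the end")]
sets = [("R", [1, 2, 3, 4]), ("S", [3, 4, 5])]

word_counts = map_reduce(docs, wc_map, wc_reduce)
difference = map_reduce(sets, sd_map, sd_reduce)

print(word_counts)
print(difference)
```

The same map and reduce functions carry over to Spark in spirit: the map step corresponds to a flatMap over the input, and the grouping plus reduce step corresponds to a per-key aggregation.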
