MongoDB Big-Data-Exploration Project

This project poses, investigates, and answers questions about large datasets, using MongoDB for both storage and computation. Built as a summer internship project, it also shows how to answer such questions with MongoDB's frameworks and connector. Both MongoDB's native aggregation framework and Hadoop were used to explore the data.

The data for this project comes from two major sources:

The Bureau of Transportation Statistics provided our Flights dataset, which contains the domestic flight schedules for the past year.
The Stanford Network Analysis Project provided our Twitter-Memes dataset, which contains blog posts and news articles about the 2008 presidential election.

Roadmap

This project is divided into three sections, each with in-depth wiki pages describing our steps and observations:

Basic-Flights - Basic analysis of the Flights dataset using the MongoDB aggregation framework
PageRank-Flights - Computing PageRank over the Flights dataset using the MongoDB MapReduce framework
Twitter-Memes - Computing PageRank over the Twitter-Memes dataset using Hadoop and associated frameworks and languages (such as Apache Pig and Amazon EMR)
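
To give a feel for what the PageRank sections compute, here is a minimal sketch of PageRank written as repeated map and reduce phases. It is plain Python rather than the MongoDB MapReduce or Hadoop code used in this project. The airport codes, the `ROUTES` graph, the function names, and the damping factor of 0.85 are all assumptions made for illustration and are not taken from the Flights dataset.

```python
from collections import defaultdict

DAMPING = 0.85  # conventional damping factor; an assumption, not from the project


def map_phase(ranks, out_links):
    """Emit (destination, share) pairs: each node splits its rank evenly
    across its outgoing edges."""
    for node, rank in ranks.items():
        targets = out_links.get(node, [])
        for dest in targets:
            yield dest, rank / len(targets)


def reduce_phase(nodes, contributions):
    """Sum the shares arriving at each node and apply the damping formula."""
    totals = defaultdict(float)
    for dest, share in contributions:
        totals[dest] += share
    n = len(nodes)
    return {node: (1 - DAMPING) / n + DAMPING * totals[node] for node in nodes}


def pagerank(out_links, iterations=20):
    """Run a fixed number of map/reduce rounds from a uniform starting rank.

    Simplification: nodes with no outgoing edges are not redistributed,
    so this sketch assumes every node has at least one outgoing edge.
    """
    nodes = set(out_links) | {d for dests in out_links.values() for d in dests}
    ranks = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        ranks = reduce_phase(nodes, map_phase(ranks, out_links))
    return ranks


# Hypothetical toy route graph: origin airport -> list of destination airports.
ROUTES = {
    "JFK": ["LAX", "ORD"],
    "LAX": ["JFK"],
    "ORD": ["JFK", "LAX"],
}

if __name__ == "__main__":
    for airport, rank in sorted(pagerank(ROUTES).items(), key=lambda kv: -kv[1]):
        print(f"{airport}: {rank:.4f}")
```

In this toy graph the airport with the most incoming rank-weighted routes ends up ranked highest. A real run over the datasets would replace `ROUTES` with edges derived from the stored documents and would need to handle nodes without outgoing edges.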
