This is an IPython notebook project that walks through our process of using IMDb data to predict the success of any given movie.
Movie success predictor written by @dhanus, @dtomc, @rmazumdar, & @sbuschbach for a Harvard Data Science final project. We were advised by @lfcampos.
Install the required Python packages:
pip install -r requirements.txt
We did two analyses for this project: Oscar Predictor and Box Office Sales.
####Oscar Predictor
The data scraper for this analysis is in the IPython notebook oscar_scraper.ipynb. It takes the xls file "Academy_Awards_2006.xls" as input and outputs "AAdictfinal", a dataset in dictionary form. ("AAdict.p" can be used to skip a portion of this notebook; the output will still be "AAdictfinal".) To run the data scraper:
ipython notebook oscar_scraper.ipynb
The process notebook is oscar_process_notebook.ipynb. This notebook uses "AAdictfinal". To run the analysis for the Oscar Predictor:
ipython notebook oscar_process_notebook.ipynb
####Box Office Sales
The data scraper for this analysis is in the IPython notebook box_office_scraper.ipynb. This notebook outputs "BOdict", a dataset in dictionary form. To run the data scraper:
ipython notebook box_office_scraper.ipynb
The process notebook is box_office_process_notebook.ipynb. This notebook uses "BOdict". To run the analysis for Box Office Sales:
ipython notebook box_office_process_notebook.ipynb
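
If you want to inspect one of the scraper outputs outside the notebooks, a minimal sketch along these lines may help. It assumes the datasets are saved as pickle files (the ".p" in "AAdict.p" suggests this, but the README does not confirm it). The helper names `load_dataset` and `demo_round_trip`, and the sample record, are hypothetical and only illustrate the idea.

```python
import pickle
import tempfile
from pathlib import Path


def load_dataset(path):
    """Load a pickled dictionary from disk.

    Assumption: the scraper output is a pickle file. Only unpickle
    files you trust, since unpickling can execute arbitrary code.
    """
    with open(path, "rb") as handle:
        data = pickle.load(handle)
    if not isinstance(data, dict):
        raise TypeError(f"expected a dict in {path}, got {type(data).__name__}")
    return data


def demo_round_trip():
    """Write a small stand-in dictionary to a temp file and load it back."""
    # Hypothetical record; the real keys and fields are defined by the scraper notebooks.
    sample = {"Example Movie": {"year": 2006, "winner": False}}
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample_dict.p"
        with open(path, "wb") as handle:
            pickle.dump(sample, handle)
        return load_dataset(path)


if __name__ == "__main__":
    loaded = demo_round_trip()
    print(len(loaded), "entries loaded")
```

To try this on a real output, pass its path to `load_dataset`, for example `load_dataset("AAdict.p")`. If the file was written under Python 2 and fails to load under Python 3, `pickle.load(handle, encoding="latin1")` is a common workaround.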