This is a simple Flask app that recommends movies using three popular recommendation algorithms. The app is deployed on Heroku and can be viewed here.
Clone the repository and navigate to its directory. Set the environment variable `FLASK_APP` by entering `export FLASK_APP=env_var.py` (on Windows cmd, `set FLASK_APP=env_var.py`). Then type `flask run` to run the app on localhost, and go to `localhost:5000/home` to access the app.
Python and pip are the initial requirements. The remaining dependencies are listed in requirements.txt and can be installed with `pip install -r requirements.txt` once you have cloned the repository.
- Data Acquisition: The movie data (Movie Number, Movie Name, Director, Poster) was retrieved from IMDB through the OMDB API and stored in a MongoDB database as a collection. A second collection in the same database stores the dummy users and their ratings, which the algorithms require. The movie collection holds 102 movies, and the dummy user collection holds 135 entries.
- Movie Recommendation System: This was built using three recommendation algorithms: user-user collaborative filtering, item-item collaborative filtering and low-rank matrix factorization. Each user supplies 5 movies and their corresponding ratings. This input is converted to a pandas DataFrame, added to the MongoDB collection of users, and included in the user-item matrix. Similarity matrices are then calculated using a similarity criterion based on Manhattan distance. Next, the predicted ratings over the entire user-item matrix (which now includes the new user's data) are calculated, and the five highest predicted ratings for the current user are retrieved. Movies the user has already rated are excluded from the recommendations.
- Creation of app: With the algorithms written, the next step was to create a Flask web app to run them. The input form is rendered using Jinja2 and HTML. Users are first shown the list of movies in the database and asked to rate them by choosing each movie from a dropdown menu and entering the corresponding rating.
- Deployment on Heroku: The final step was to clean up the code, verify that it runs, and deploy it on Heroku. A GitHub repository was also created for the app.
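
The user-user collaborative filtering step described above can be sketched in a few lines. This is a minimal illustration and not the code in routes.py: it uses plain Python lists instead of pandas so that it is self-contained, and it makes several assumptions that the text does not confirm, namely that a rating of 0 means "not rated", that Manhattan distance is turned into a similarity via `1 / (1 + distance)`, and that predictions are similarity-weighted averages. The function names, movie titles and ratings are made up for the example.

```python
def manhattan_similarity(a, b):
    """Turn the Manhattan (L1) distance between two rating rows into a similarity in (0, 1]."""
    distance = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 / (1.0 + distance)


def predict_user_user(ratings, user):
    """Predict every item's rating for `user` as a similarity-weighted
    average of the ratings given by the other users."""
    # Similarity of the target user to every user; the user's own row gets weight 0.
    weights = [
        0.0 if i == user else manhattan_similarity(ratings[user], row)
        for i, row in enumerate(ratings)
    ]
    predictions = []
    for item in range(len(ratings[user])):
        # Only users who actually rated the item (rating > 0) contribute.
        rated = [(w, row[item]) for w, row in zip(weights, ratings) if row[item] > 0]
        total_weight = sum(w for w, _ in rated)
        weighted_sum = sum(w * r for w, r in rated)
        predictions.append(weighted_sum / total_weight if total_weight else 0.0)
    return predictions


def recommend(ratings, user, titles, n=5):
    """Return the n best-predicted titles that `user` has not rated yet."""
    predictions = predict_user_user(ratings, user)
    unseen = [i for i, r in enumerate(ratings[user]) if r == 0]
    ranked = sorted(unseen, key=lambda i: predictions[i], reverse=True)
    return [titles[i] for i in ranked[:n]]


# Hypothetical data: rows are users, columns are movies, 0 means "not rated".
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
ratings = [
    [5, 4, 0, 0],  # the new user
    [5, 4, 5, 1],  # a dummy user with similar taste
    [1, 2, 1, 5],  # a dummy user with different taste
]

print(recommend(ratings, 0, titles, n=2))  # ['Movie C', 'Movie D']
```

Under the same assumptions, item-item collaborative filtering would compute the similarities between the columns of the matrix (movies) rather than between its rows (users).
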
- routes.py file stores all the routes and views for the app (including the recommendation algorithms).
- forms.py defines all the forms rendered in the app (created using WTForms).
- Templates are stored in the Templates folder. The following template files are used to render the app:
  - base.html contains the basic appearance/structure of the app.
  - reco.html contains the HTML code to render the homepage and form data.
  - results.html provides the results (recommendations) to the user.
- env_var.py is used for setting the environment variable FLASK_APP (defines application instance).
- Some miscellaneous files (not part of the app, but used at various stages of the project) are stored in the Miscellaneous folder.
  - webparser.py contains the code that was used to parse IMDB for movie information.
  - algorithms.py contains the recommendation algorithms and data processing code used for the app (this file is not used directly; its contents were copied into the app views).
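
To give a feel for the low-rank matrix factorization approach among the recommendation algorithms, here is a small sketch. It is not taken from algorithms.py. It assumes the factors are learned by stochastic gradient descent on the observed ratings with a small L2 penalty, which is one common way to do it, and that 0 marks an unrated movie. The function names, hyperparameters and ratings are all illustrative.

```python
import random


def factorize(ratings, rank=2, epochs=5000, lr=0.01, reg=0.02, seed=0):
    """Learn user factors P and item factors Q so that P[u] . Q[i]
    approximates every observed (non-zero) rating."""
    rng = random.Random(seed)
    n_users, n_items = len(ratings), len(ratings[0])
    P = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    observed = [
        (u, i, r)
        for u, row in enumerate(ratings)
        for i, r in enumerate(row)
        if r > 0
    ]
    for _ in range(epochs):
        for u, i, r in observed:
            error = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for k in range(rank):
                p, q = P[u][k], Q[i][k]
                # Gradient step on the squared error with L2 regularisation.
                P[u][k] += lr * (error * q - reg * p)
                Q[i][k] += lr * (error * p - reg * q)
    return P, Q


def predict(P, Q, user, item):
    """Predicted rating: dot product of the user's and the item's factors."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


# Hypothetical data: rows are users, columns are movies, 0 means "not rated".
ratings = [
    [5, 4, 0, 0],
    [5, 4, 5, 1],
    [1, 2, 1, 5],
]

P, Q = factorize(ratings)

# How closely the learned factors reproduce the ratings that were observed.
errors = [
    abs(r - predict(P, Q, u, i))
    for u, row in enumerate(ratings)
    for i, r in enumerate(row)
    if r > 0
]
mean_abs_error = sum(errors) / len(errors)
print(round(mean_abs_error, 3))

# Predictions for the movies the first user has not rated.
print([round(predict(P, Q, 0, i), 2) for i in (2, 3)])
```

The predicted values for the unrated cells are what a recommender would rank to pick the top suggestions.
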
- certifi==2018.11.29
- chardet==3.0.4
- Click==7.0
- Flask==1.0.2
- gunicorn==19.9.0
- idna==2.8
- itsdangerous==1.1.0
- Jinja2==2.10
- MarkupSafe==1.1.0
- numpy==1.16.1
- pandas==0.24.1
- pymongo==3.7.2
- python-dateutil==2.8.0
- pytz==2018.9
- requests==2.21.0
- scikit-learn==0.20.2
- scipy==1.2.0
- flask-wtf==0.14.2
- six==1.12.0
- urllib3==1.24.1
- Werkzeug==0.14.1
- WTForms==2.2.1
- dnspython==1.16.0