📋 Overview
The Movie Recommendation System is a Python application that uses machine learning to recommend movies based on user preferences. It combines content-based filtering and collaborative filtering, and it provides a web interface built with Flask for interactive movie recommendations.
✨ Key Features
- Interactive Movie Search: Input a movie title to receive tailored recommendations.
- Content-Based Filtering: Utilizes movie features like genres, keywords, cast, and director for recommendations.
- Collaborative Filtering: Suggests movies based on user behavior patterns.
- Data Insights: Includes genre distribution, runtime distribution, and other visualizations for exploratory analysis.
Dataset
The system uses a movie dataset (movies.csv) containing features like genres, keywords, tagline, cast, and director.
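The TF-IDF step in Code Highlights reads a single `combined_features` column, so the text columns listed above have to be merged first. The following is a minimal sketch of one way to do that with pandas. The helper name `build_combined_features` and the sample rows are hypothetical, and the project's own preprocessing may differ.

```python
import pandas as pd

# Feature columns named in the dataset description.
FEATURE_COLUMNS = ["genres", "keywords", "tagline", "cast", "director"]


def build_combined_features(movie_data: pd.DataFrame) -> pd.Series:
    """Join the text feature columns into one string per movie (hypothetical helper)."""
    # Replace missing values first so they do not turn into the literal text "nan".
    features = movie_data[FEATURE_COLUMNS].fillna("").astype(str)
    joined = features.apply(lambda row: " ".join(row), axis=1)
    # Collapse the extra spaces left behind by empty fields.
    return joined.str.split().str.join(" ")


# Tiny made-up sample standing in for movies.csv.
sample = pd.DataFrame({
    "title": ["Alpha", "Beta"],
    "genres": ["Action Adventure", "Drama"],
    "keywords": ["space war", None],
    "tagline": ["To the stars", "A quiet life"],
    "cast": ["Jane Doe", "John Roe"],
    "director": ["A. Smith", None],
})
sample["combined_features"] = build_combined_features(sample)
```

Filling missing values before joining matters because a movie with no tagline or director would otherwise contribute the token "nan" to the TF-IDF vocabulary.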
🛠️ Technologies Used
Category | Technologies |
---|---|
Programming Language | Python |
Framework | Flask |
Libraries | pandas, numpy, scikit-learn, nltk, matplotlib, seaborn, pickle |
ML Techniques | TF-IDF Vectorization, Cosine Similarity, Collaborative Filtering |
🚀 Installation
Follow these steps to set up the application:
- Clone the Repository:
git clone https://github.com/YourUsername/Movie-Recommendation-System.git
cd Movie-Recommendation-System
- Create and Activate a Virtual Environment:
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
- Install Dependencies:
pip install -r requirements.txt
- Run the Application:
python app.py
⚙️ How It Works
Step-by-Step Guide
- Upload Data: Load the movie dataset (pre-configured as movies.csv).
- Preprocessing: Data is cleaned, vectorized using TF-IDF, and models are trained.
- Input Search: Provide a movie name to get recommendations.
- Explore Results: Analyze the recommendations with visualizations and download the results as needed.
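The "Input Search" step could be exposed through a Flask route along the following lines. This is only a sketch: the `/recommend` URL, the `title` query parameter, and the hard-coded `RECOMMENDATIONS` lookup are assumptions made for illustration, and the actual `app.py` may be organised differently.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical precomputed lookup; a real app would derive results
# from the similarity matrix instead of a hard-coded dict.
RECOMMENDATIONS = {
    "space raiders": ["Galaxy Quest II", "Star Drift"],
}


@app.route("/recommend")
def recommend_route():
    """Return recommendations for the movie named in ?title=..."""
    title = request.args.get("title", "").strip()
    if not title:
        return jsonify(error="Missing 'title' query parameter"), 400
    results = RECOMMENDATIONS.get(title.lower())
    if results is None:
        return jsonify(error=f"No movie found for '{title}'"), 404
    return jsonify(title=title, recommendations=results)
```

Calling `app.run()` at the end of the file would start the development server, after which a request such as `/recommend?title=Space Raiders` returns the matching list as JSON.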
💻 Code Highlights
Text Vectorization
Transforms text features using TF-IDF for similarity computation.
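from sklearn.feature_extraction.text import TfidfVectorizer
# movie_data is assumed to be the DataFrame loaded from movies.csv,
# with combined_features holding the merged text features for each movie
# max_features=5000 keeps only the 5,000 most frequent terms in the vocabulary
tfidf_vectorizer = TfidfVectorizer(max_features=5000)
tfidf_matrix = tfidf_vectorizer.fit_transform(movie_data['combined_features'])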
Similarity Computation
Calculates cosine similarity between movies.
from sklearn.metrics.pairwise import cosine_similarity
# cosine_sim[i][j] is the similarity between movie i and movie j
cosine_sim = cosine_similarity(tfidf_matrix)
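To show how the similarity matrix can become a list of recommendations, here is a small self-contained sketch. The `recommend` helper, its `top_n` parameter, and the four toy movies are hypothetical. It assumes the DataFrame keeps its default integer index so that row labels match positions in `cosine_sim`.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Made-up toy data standing in for movies.csv.
movie_data = pd.DataFrame({
    "title": ["Space Raiders", "Galaxy Quest II", "Quiet Harbor", "Harbor Lights"],
    "combined_features": [
        "action space alien battle",
        "action space alien comedy",
        "drama romance harbor town",
        "drama harbor family town",
    ],
})

tfidf_vectorizer = TfidfVectorizer(max_features=5000)
tfidf_matrix = tfidf_vectorizer.fit_transform(movie_data["combined_features"])
cosine_sim = cosine_similarity(tfidf_matrix)


def recommend(title: str, top_n: int = 5) -> list[str]:
    """Return up to top_n titles most similar to `title` (hypothetical helper)."""
    # Case-insensitive exact match on the title column.
    matches = movie_data.index[movie_data["title"].str.lower() == title.lower()]
    if len(matches) == 0:
        return []  # unknown title
    idx = matches[0]
    # Similarity of the chosen movie to every other movie, excluding itself.
    scores = pd.Series(cosine_sim[idx], index=movie_data.index).drop(idx)
    top = scores.sort_values(ascending=False).head(top_n)
    return movie_data.loc[top.index, "title"].tolist()
```

For example, `recommend("Space Raiders", top_n=1)` picks the movie that shares the most weighted terms with "Space Raiders". A production version would likely add fuzzy title matching, which this sketch leaves out.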
🎯 Applications
- Movie Streaming Platforms: Provide personalized recommendations to users.
- Entertainment Portals: Enhance user experience with tailored suggestions.
- Data Enthusiasts: Explore data-driven approaches to recommendation systems.
📊 Model and Features
- Dataset: Contains features like genres, keywords, tagline, cast, and director.
- Models Used: TF-IDF Vectorization for content-based filtering and collaborative filtering for user-based recommendations.
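The README does not say which collaborative filtering variant is used, and movies.csv as described contains no user ratings. The sketch below therefore illustrates a generic user-based approach on a made-up ratings matrix: users are compared with cosine similarity, and unrated movies are scored by a similarity-weighted average of other users' ratings. All names and data here are hypothetical.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are movies, 0 means "not rated".
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
ratings = np.array([
    [5, 4, 0, 0],  # user 0
    [5, 5, 4, 1],  # user 1: tastes close to user 0
    [1, 1, 2, 5],  # user 2: different tastes
], dtype=float)


def user_similarity(r: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of user rating vectors."""
    norms = np.linalg.norm(r, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for users with no ratings
    unit = r / norms
    return unit @ unit.T


def recommend_for_user(user: int, top_n: int = 2) -> list[str]:
    """Rank the movies `user` has not rated by predicted rating."""
    sim = user_similarity(ratings)[user].copy()
    sim[user] = 0.0  # a user should not vote for themselves
    rated = ratings > 0
    # Similarity-weighted average, counting only users who rated each movie.
    weighted = sim @ ratings
    weight_sum = sim @ rated.astype(float)
    scores = np.divide(weighted, weight_sum,
                       out=np.zeros_like(weighted), where=weight_sum > 0)
    scores[rated[user]] = -np.inf  # skip movies the user already rated
    order = np.argsort(scores)[::-1][:top_n]
    return [titles[i] for i in order if np.isfinite(scores[i])]
```

With this data, user 0 is most similar to user 1, so the movie user 1 rated highly ("Movie C") ranks ahead of "Movie D".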
📦 Folder Structure
Movie-Recommendation-System/
│
├── app.py # Flask application file
├── static/ # Static assets (CSS, JS, images)
├── templates/ # HTML templates for the app
├── data/ # Dataset folder (e.g., movies.csv)
├── models/ # Saved models (e.g., TF-IDF vectorizer, Pickle files)
├── requirements.txt # List of dependencies
├── exploratory_analysis.ipynb # Jupyter Notebook with EDA
└── README.md # Project documentation
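The models/ folder holds pickled artifacts such as the TF-IDF vectorizer. A minimal sketch of saving and reloading a fitted vectorizer with `pickle` follows. The helper names and the `tfidf_vectorizer.pkl` file name are assumptions, and a temporary directory stands in for models/ so the example leaves no files behind.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer

# Stand-in corpus; the project would fit on movie_data['combined_features'].
corpus = ["action space alien", "drama harbor town"]
vectorizer = TfidfVectorizer(max_features=5000).fit(corpus)


def save_artifact(obj, path: Path) -> None:
    """Pickle obj to path, creating parent folders if needed (hypothetical helper)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obj, f)


def load_artifact(path: Path):
    """Load a pickled object; only use this on files you trust."""
    with path.open("rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "models" / "tfidf_vectorizer.pkl"  # hypothetical name
    save_artifact(vectorizer, model_path)
    restored = load_artifact(model_path)

# The restored vectorizer carries the same fitted vocabulary.
same_vocab = restored.vocabulary_ == vectorizer.vocabulary_
```

Reloading a saved vectorizer lets the app transform new text without refitting on the full dataset at startup. Unpickling can execute arbitrary code, so pickle files from untrusted sources should not be loaded.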
🏗️ Contributing
Contributions are welcome!
- Fork the repository.
- Create a new branch:
git checkout -b feature-name
- Make your changes and commit:
git commit -m "Add feature"
- Push the branch:
git push origin feature-name
- Open a pull request.
🙏 Acknowledgments
- Flask for the web application framework.
- scikit-learn for machine learning capabilities.
- pandas for data manipulation.
- nltk for natural language processing.