IMDb Top Rated Indian Movies Scraper

This script scrapes the list of top-rated Indian movies from IMDb's website and retrieves details such as movie name, rank, release year, rating, and IMDb URL.

Overview

This Python script uses the requests and BeautifulSoup libraries to fetch and parse IMDb's top-rated Indian movies webpage. It extracts the details of each movie entry in the list and returns them as a list of dictionaries, one per movie.

How It Works

The script performs the following steps:

Fetching Data: It sends a request to IMDb's top-rated Indian movies webpage (https://www.imdb.com/india/top-rated-indian-movies/).
Parsing HTML: It uses BeautifulSoup to parse the HTML content of the webpage.
Extracting Movie Details: It locates the relevant HTML elements containing movie details such as name, rank, release year, rating, and IMDb URL.
Formatting Data: It structures the extracted information into a list of dictionaries, where each dictionary represents one movie with the keys position, name, years (the release year), rating, and url.
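
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names fetch_html and parse_movies are hypothetical, and the CSS selectors (tbody.lister-list, td.titleColumn, td.imdbRating) assume an older table-based IMDb layout, so they may need adjusting for the live page. The SAMPLE_HTML string is made-up markup in that assumed layout so the parsing step can be tried without a network request.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

URL = "https://www.imdb.com/india/top-rated-indian-movies/"

# Made-up markup in the ASSUMED table layout; the live page may differ.
SAMPLE_HTML = """
<table><tbody class="lister-list">
<tr>
  <td class="titleColumn">1. <a href="/title/tt0093603/">Nayakan</a> <span>(1987)</span></td>
  <td class="ratingColumn imdbRating"><strong>8.5</strong></td>
</tr>
<tr>
  <td class="titleColumn">2. <a href="/title/tt0367495/">Anbe Sivam</a> <span>(2003)</span></td>
  <td class="ratingColumn imdbRating"><strong>8.5</strong></td>
</tr>
</tbody></table>
"""


def fetch_html(url=URL):
    """Step 1: download the page (hypothetical helper)."""
    # Sending a browser-like User-Agent is an assumption about what the site accepts.
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=10)
    response.raise_for_status()
    return response.text


def parse_movies(html):
    """Steps 2-4: parse the HTML and build a list of movie dictionaries."""
    soup = BeautifulSoup(html, "html.parser")
    movies = []
    for row in soup.select("tbody.lister-list tr"):
        title_cell = row.select_one("td.titleColumn")
        rating_cell = row.select_one("td.imdbRating strong")
        if title_cell is None or rating_cell is None:
            continue  # skip rows that do not match the assumed layout
        link = title_cell.find("a")
        year_span = title_cell.find("span")
        if link is None or year_span is None:
            continue
        # The cell text looks like "1. Nayakan (1987)"; the rank precedes the first dot.
        position = int(title_cell.get_text(" ", strip=True).split(".")[0])
        movies.append({
            "position": position,
            "name": link.get_text(strip=True),
            "years": int(year_span.get_text(strip=True).strip("()")),
            "rating": float(rating_cell.get_text(strip=True)),
            "url": urljoin("https://www.imdb.com", link["href"]),
        })
    return movies


# Parse the offline sample; for the live page, use parse_movies(fetch_html()).
movies = parse_movies(SAMPLE_HTML)
print(movies)
```

Keeping the download and the parsing in separate functions means the parsing logic can be checked against saved HTML without sending a request to IMDb.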

Usage

To use this script:

Clone the repository or download the imdb_scraper.py file.
Make sure you have Python installed on your system.
Install the required libraries using pip:

pip install requests beautifulsoup4

Run the script:

python imdb_scraper.py

The script will print the scraped data or store it in a variable for further processing.

Example Output

Here is an example of the data structure returned by the script:

[
    {
        'position': 1,
        'name': 'Nayakan',
        'years': 1987,
        'rating': 8.5,
        'url': 'https://www.imdb.com/title/tt0093603/'
    },
    {
        'position': 2,
        'name': 'Anbe Sivam',
        'years': 2003,
        'rating': 8.5,
        'url': 'https://www.imdb.com/title/tt0367495/'
    },
]
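
As an illustration of further processing, the sketch below groups a list in this shape by decade. The helper group_by_decade is hypothetical and not part of the script; it only assumes that each dictionary has the 'name' and 'years' keys shown above.

```python
from collections import defaultdict

# The two entries from the example output above.
movies = [
    {
        'position': 1,
        'name': 'Nayakan',
        'years': 1987,
        'rating': 8.5,
        'url': 'https://www.imdb.com/title/tt0093603/',
    },
    {
        'position': 2,
        'name': 'Anbe Sivam',
        'years': 2003,
        'rating': 8.5,
        'url': 'https://www.imdb.com/title/tt0367495/',
    },
]


def group_by_decade(movie_list):
    """Map each decade (e.g. 1980) to the names of the movies released in it."""
    decades = defaultdict(list)
    for movie in movie_list:
        decade = movie['years'] // 10 * 10  # 1987 -> 1980
        decades[decade].append(movie['name'])
    return dict(decades)


by_decade = group_by_decade(movies)
for decade in sorted(by_decade):
    print(decade, by_decade[decade])
```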

Contributing

Contributions are welcome! If you find any issues or want to add improvements, feel free to fork the repository, make your changes, and submit a pull request.
