Web scraper of movies, actors, directors and more

Ninoko/MovieScraper

MovieScraper

Description

This is a portfolio project.

The main objective of this project was to build a relational dataset of information about movies, people in cinema, and the connections between them. To accomplish this, I built a tool that uses the requests and bs4 packages for scraping and a breadth-first search algorithm to crawl between movie and person pages.
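The crawl described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `bfs_crawl`, `get_links`, `max_pages`, and the `fake_site` URLs are all hypothetical names. In a real run, `get_links` would presumably fetch a page with requests and extract movie and person links with bs4; here it is replaced by an in-memory dictionary so the example runs without network access.

```python
from collections import deque
from typing import Callable, Iterable, List


def bfs_crawl(
    start_url: str,
    get_links: Callable[[str], Iterable[str]],
    max_pages: int = 100,
) -> List[str]:
    """Visit pages breadth-first from start_url and return the visit order.

    get_links is a hypothetical callable returning the movie/person URLs
    found on a page (for example, via requests + bs4 in a real scraper).
    """
    visited = {start_url}
    queue = deque([start_url])
    order = []

    while queue and len(order) < max_pages:
        url = queue.popleft()  # FIFO order is what makes this breadth-first
        order.append(url)
        for link in get_links(url):
            if link not in visited:  # skip pages already queued or visited
                visited.add(link)
                queue.append(link)
    return order


# In-memory stand-in for the website (made-up URLs, for illustration only).
fake_site = {
    "/movie/1": ["/person/a", "/person/b"],
    "/person/a": ["/movie/1", "/movie/2"],
    "/person/b": ["/movie/1"],
    "/movie/2": ["/person/a"],
}
crawl_order = bfs_crawl("/movie/1", lambda url: fake_site.get(url, []))
print(crawl_order)
```

Because movie pages link to people and person pages link back to movies, the `visited` set is what keeps the crawl from looping forever, and `max_pages` bounds the total work.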

The output of this project can be used for:

  • Recommender system dataset extension
  • Building a movie database website
  • Graph-based analysis

What I have used

  • Python
  • BeautifulSoup4
  • requests
  • locale
  • csv
  • Breadth-first search algorithm
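As a rough illustration of how the csv module could store a relational dataset like the one described, the sketch below writes a link table that connects movies to people. The column names (`movie_id`, `person_id`, `role`), the sample rows, and the `write_table` helper are assumptions made for this example, not the project's actual schema. An in-memory buffer stands in for a file on disk.

```python
import csv
import io

# Hypothetical link rows; in a relational layout, movies and people would
# each get their own table, and this table connects them by ID.
credits = [
    {"movie_id": 1, "person_id": 10, "role": "actor"},
    {"movie_id": 1, "person_id": 11, "role": "director"},
]


def write_table(rows, fieldnames, out):
    """Write a list of dict rows as CSV with a header line."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# io.StringIO stands in for open("credits.csv", "w", newline="").
credits_buffer = io.StringIO()
write_table(credits, ["movie_id", "person_id", "role"], credits_buffer)
credits_csv = credits_buffer.getvalue()
print(credits_csv)
```

Keeping the connections in a separate table means each movie and each person is stored once, and the link rows can be read directly as edges for graph-based analysis.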

What I have learned

  • How to use bs4 and requests for website scraping
  • How to handle a website's requests-per-second limit
  • How to apply the BFS algorithm to a real-life case
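One common way to stay under a requests-per-second limit is to enforce a minimum delay between consecutive requests. The sketch below shows that idea; the `Throttle` class and its parameters are hypothetical and are not taken from this project. The clock and sleep functions are injectable only so the behavior can be checked without actually waiting.

```python
import time


class Throttle:
    """Allow at most one call every min_interval seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous call, if any

    def wait(self):
        """Sleep just long enough to respect the interval; return the delay."""
        delay = 0.0
        if self._last_call is not None:
            elapsed = self._clock() - self._last_call
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last_call = self._clock()
        return delay


# Intended use inside a scraping loop (shown as comments because it needs
# network access and the third-party requests package):
#
# throttle = Throttle(min_interval=1.0)   # roughly one request per second
# for url in urls:
#     throttle.wait()
#     response = requests.get(url, timeout=10)
```

The right interval depends on the target site, so treat the value above as a placeholder.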
