MovieScraper

Description

This is a portfolio project.

The main objective of this project was to build a relational dataset containing information about movies, people in cinema, and the connections between them. To accomplish this, I created a tool that uses the requests and bs4 packages for scraping and a breadth-first search algorithm to crawl between movie and person pages.
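The breadth-first crawl described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `bfs_crawl`, the `get_links` callback, and the `max_pages` cap are all hypothetical. In a real run, `get_links` would presumably fetch a page with requests and extract movie and person links with bs4; here it is left as a parameter so the traversal logic stands on its own.

```python
from collections import deque


def bfs_crawl(start_url, get_links, max_pages=100):
    """Visit pages breadth-first and return (source, target) link pairs.

    get_links is a hypothetical callback: given a URL, it returns the
    movie/person URLs found on that page.
    """
    visited = {start_url}        # URLs already queued, to avoid revisiting
    queue = deque([start_url])   # FIFO queue gives breadth-first order
    edges = []                   # (page, linked page) rows for the dataset
    pages_fetched = 0

    while queue and pages_fetched < max_pages:
        url = queue.popleft()
        pages_fetched += 1
        for link in get_links(url):
            edges.append((url, link))
            if link not in visited:
                visited.add(link)
                queue.append(link)
    return edges
```

Each returned pair is one movie-person connection, so the list can be written out directly as rows of a relation table (for example with the csv module).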

The output of this project can be used for:

  • Recommender system dataset extension
  • Building a movie-database website
  • Graph-based analysis

What I have used

  • Python
  • BeautifulSoup4
  • requests
  • locale
  • csv
  • Breadth-first search algorithm

What I have learned

  • How to use bs4 and requests for website scraping
  • How to handle a website's requests-per-second limit
  • How to apply the BFS algorithm to a real-life case
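
One common way to stay under a requests-per-second limit is to enforce a minimum interval between calls. The sketch below is an assumption about how this could be done, not a description of what this project does; the `RateLimiter` class and its parameters are hypothetical names introduced for illustration.

```python
import time


class RateLimiter:
    """Blocks so that wait() returns at most max_per_second times per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last = None      # time of the previous permitted call

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon since the last call: pause for the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `limiter.wait()` immediately before each page request keeps the crawler at or below the chosen rate.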