Talabat_WebScraper

Objective:

Extract and store data pertaining to restaurants available from the Talabat website.

This project primarily uses the Beautiful Soup module in Python to parse HTML webpages. Web scraping can be a slow process, so to reduce the run time we use multithreading to write to the CSV file concurrently.
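The concurrent write described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in WebScraper.py: `scrape_page`, `scrape_all`, the two-column `FIELDS` list, and the page names are hypothetical, and the parsing step is replaced by a stand-in so the sketch runs without network access. A lock is used because `csv` writers are not guaranteed to be safe when several threads write at once.

```python
import csv
import io
import threading
from concurrent.futures import ThreadPoolExecutor

# Reduced, illustrative column set; the real CSV has more fields.
FIELDS = ["brand_name", "restaurant_rating"]


def scrape_page(page_url):
    """Hypothetical stand-in for downloading and parsing one listing page.

    A real version would fetch page_url and parse the HTML with Beautiful Soup.
    """
    return [{"brand_name": f"Restaurant from {page_url}", "restaurant_rating": "NA"}]


def scrape_all(page_urls, out_file, max_workers=4):
    """Scrape pages in worker threads and write their rows to one CSV file."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDS)
    writer.writeheader()
    lock = threading.Lock()  # serializes access to the shared writer

    def worker(url):
        rows = scrape_page(url)
        with lock:
            writer.writerows(rows)
        return len(rows)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(worker, page_urls))


# An in-memory buffer stands in for the output file in this sketch.
buffer = io.StringIO()
total = scrape_all(["page-1", "page-2", "page-3"], buffer)
```

Rows from different pages may land in the file in any order, since each thread writes as soon as it finishes parsing.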

The CSV file contains the following data for every restaurant found on the website:

brand_name (string) - Name of the restaurant.
cuisine_tags (array) - List of cuisines served by the restaurant.
restaurant_rating (string) - Talabat user rating of the restaurant (NA for some new restaurants).
delivery time (int) - Time taken to deliver.
service fee (int) - Service fee charged by the restaurant.
minimum order amount (int) - Minimum order amount for delivery.
new_restaurant (bool) - TRUE/FALSE depending on whether the restaurant is new on the website.
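To make the field list concrete, here is a small sketch of how one record with these fields might be written as a CSV row. The sample values are invented for illustration, the column names are copied from the list above, and `to_csv_row` is a hypothetical helper. Encoding `cuisine_tags` as a JSON string is an assumption; the actual script may serialize the list differently.

```python
import csv
import io
import json

# Column names taken from the field list above.
COLUMNS = [
    "brand_name",
    "cuisine_tags",
    "restaurant_rating",
    "delivery time",
    "service fee",
    "minimum order amount",
    "new_restaurant",
]


def to_csv_row(record):
    """Convert a record into CSV-friendly values (hypothetical helper)."""
    row = dict(record)
    # Assumption: store the list of cuisines as a JSON string in one cell.
    row["cuisine_tags"] = json.dumps(record["cuisine_tags"])
    row["new_restaurant"] = "TRUE" if record["new_restaurant"] else "FALSE"
    return row


# Invented sample values, for illustration only.
sample = {
    "brand_name": "Example Grill",
    "cuisine_tags": ["Burgers", "American"],
    "restaurant_rating": "NA",
    "delivery time": 30,
    "service fee": 5,
    "minimum order amount": 20,
    "new_restaurant": True,
}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerow(to_csv_row(sample))
```

Reading the file back with `csv.DictReader` returns every value as a string, so numeric and list fields need converting again on load.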

The given code works with the Jumeirah Lakes Towers (JLT) area. To scrape data for another area, replace the website base URL in the following two places:

# pass the base URL of the area to the pages function
pages("website_base_url")

# store the base URL in url_list
url_list = ["website_base_url"]
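One way to avoid editing the URL in two places is to keep it in a single constant. The sketch below is only an illustration: `BASE_URL` is a name introduced here, the string is the same placeholder used above rather than a real address, and `pages` is a stand-in stub, since the real function's body and return value are not shown in this README.

```python
# Placeholder value; substitute the base URL of the area to scrape.
BASE_URL = "website_base_url"


def pages(base_url):
    """Stand-in stub for the script's pages function (real behavior not shown)."""
    return [base_url]


# Both places that need the base URL now read it from the same constant.
pages(BASE_URL)
url_list = [BASE_URL]
```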

Ritik3111/Talabat_WebScraper
