MaciejTe / recipebot Public


A Python web scraper, based on the Scrapy framework, for obtaining recipe data from the most popular cooking websites.

Repository files:
- cookpad
- .gitignore
- CHANGELOG.md
- LICENSE
- README.md
- requirements.txt
- setup.py


Recipebot

A Python web scraper, based on the Scrapy framework, for obtaining recipe data from the most popular cooking websites.

Supported websites

https://cookpad.com/

How to run

Create Python virtualenv
```
python3 -m venv recipevenv 
```
Activate Python virtualenv
```
source recipevenv/bin/activate
```

Install Linux dependencies (required for SQLAlchemy to work properly)

sudo apt install libpq-dev libffi-dev python3-dev libxml2 libxml2-dev libxslt-dev

Install Python libraries
```
python setup.py install
```
Navigate to the cookpad directory and launch Scrapy
- Save recipes to a PostgreSQL database:
```
scrapy crawl cookpadbot -a category=vegan
```
- Save recipes to a JSON file (to disable saving to the PostgreSQL database, comment out the ITEM_PIPELINES variable in the cookpad/cookpad/settings.py file):
```
scrapy crawl cookpadbot -o vegetarian.json -a category=vegetarian
```
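Once the crawl has written a JSON file, it can be inspected with the standard library alone. The following is a minimal sketch, assuming the file holds the JSON array of items that Scrapy's `-o <name>.json` export produces. The helper names `load_recipes` and `field_counts` are hypothetical and not part of recipebot, and the sketch makes no assumption about which fields a recipe item contains.

```python
import json
from collections import Counter
from pathlib import Path


def load_recipes(path):
    """Load the JSON array written by `scrapy crawl ... -o <file>.json`."""
    with Path(path).open(encoding="utf-8") as handle:
        recipes = json.load(handle)
    # The JSON feed export is expected to be a single top-level array.
    if not isinstance(recipes, list):
        raise ValueError(
            f"expected a JSON array of items, got {type(recipes).__name__}"
        )
    return recipes


def field_counts(recipes):
    """Count how many scraped items contain each field name."""
    counts = Counter()
    for recipe in recipes:
        counts.update(recipe.keys())
    return counts
```

For example, `field_counts(load_recipes("vegetarian.json"))` reports which fields the spider actually filled in. Note that `-o` appends to an existing file, so running the same crawl twice into one `.json` file may leave it unparseable; deleting the file before re-running avoids that.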
To deactivate the virtual environment:
```
deactivate
```

Sample recipes

Sample recipes are available in the cookpad/cookpad/sample_recipes directory.
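To see which sample files ship with the repository, a short listing helper may be enough. This is a sketch that assumes it is run from the repository root. The function name `list_sample_recipes` is hypothetical, and the file format of the samples is not assumed.

```python
from pathlib import Path


def list_sample_recipes(directory="cookpad/cookpad/sample_recipes"):
    """Return the sorted names of regular files in the sample directory."""
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(
            f"{root} is not a directory; run this from the repository root"
        )
    # Subdirectories are skipped; only regular files are reported.
    return sorted(entry.name for entry in root.iterdir() if entry.is_file())
```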
