A Python web scraper, built on the Scrapy framework, that collects recipe data from popular cooking websites.

MaciejTe/recipebot

Recipebot

Supported websites

  1. https://cookpad.com/

How to run

  1. Create a Python virtual environment

    python3 -m venv recipevenv

  2. Activate the virtual environment

    source recipevenv/bin/activate
    
  3. Install the Linux system dependencies (needed for SQLAlchemy to work properly)

    sudo apt install libpq-dev libffi-dev python3-dev libxml2 libxml2-dev libxslt-dev
    
  4. Install the Python libraries

    python setup.py install
    
  5. Change to the cookpad directory and launch Scrapy

    • Save recipes to a PostgreSQL database:
    scrapy crawl cookpadbot -a category=vegan
    • Save recipes to a JSON file (to disable saving to the PostgreSQL database, comment out the ITEM_PIPELINES variable in the cookpad/cookpad/settings.py file):
    scrapy crawl cookpadbot -o vegetarian.json -a category=vegetarian

  6. Deactivate the virtual environment

    deactivate
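
Once the crawl in step 5 has written vegetarian.json, the output can be inspected from Python. The following is a minimal sketch, not part of this repository: it relies only on the fact that Scrapy's JSON feed export writes a single JSON array of items. The helper names load_recipes and summarize_fields are hypothetical, and because the recipe fields are not documented here, the code lists whichever field names it finds instead of assuming a schema.

```python
import json
from collections import Counter
from pathlib import Path


def load_recipes(path):
    """Load a Scrapy JSON feed export, which is one JSON array of items."""
    with Path(path).open(encoding="utf-8") as handle:
        recipes = json.load(handle)
    if not isinstance(recipes, list):
        raise ValueError(f"{path} does not contain a JSON array")
    return recipes


def summarize_fields(recipes):
    """Count how often each field name appears, without assuming a schema."""
    return Counter(
        key
        for recipe in recipes
        if isinstance(recipe, dict)
        for key in recipe
    )


if __name__ == "__main__":
    # File name taken from the command in step 5; adjust if you used another.
    feed = Path("vegetarian.json")
    if feed.exists():
        recipes = load_recipes(feed)
        print(f"{len(recipes)} recipes loaded")
        for field, count in summarize_fields(recipes).most_common():
            print(f"{field}: present in {count} items")
    else:
        print(f"{feed} not found; run the crawl first")
```

Run it from the directory that contains vegetarian.json. Listing the field names first is a cautious way to learn what the spider actually exports before writing code that depends on specific keys.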

Sample recipes

Sample recipes are available in the cookpad/cookpad/sample_recipes directory.
