Moritz

General-purpose Tutti crawler with an optional pipeline that posts to Slack when a new offer matching a search term is published on Tutti.ch.

Scrapinghub

Set up a new Scrapinghub project.
Deploy the spider using shub deploy.
Optional: Set SLACK_WEBHOOK and SCRAPINGHUB_API_KEY in the settings of your project to receive Slack notifications.
Run the spider with the desired searchterm argument on Scrapinghub (manually or periodically).
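The optional Slack notification from the steps above could be implemented roughly as follows. This is a minimal sketch, not the project's actual pipeline: the class name SlackNotificationPipeline, the helper build_slack_payload, and the item fields title and url are assumptions made for illustration. It relies only on the Scrapy convention that an item pipeline exposes process_item(item, spider), and on Slack incoming webhooks accepting a JSON body with a text field.

```python
import json
import os
import urllib.request


def build_slack_payload(item):
    """Format a crawled offer as a Slack incoming-webhook message body."""
    # "title" and "url" are assumed field names, not taken from the project.
    text = "New offer: {} ({})".format(item.get("title", "?"), item.get("url", "?"))
    return json.dumps({"text": text}).encode("utf-8")


class SlackNotificationPipeline:
    """Hypothetical item pipeline that sends one Slack message per item."""

    def __init__(self, webhook=None, send=None):
        # Without a webhook the pipeline is a no-op, which keeps it optional.
        self.webhook = webhook or os.environ.get("SLACK_WEBHOOK")
        # The sender is injectable so the pipeline can be tried offline.
        self.send = send or self._post

    def _post(self, url, body):
        request = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status

    def process_item(self, item, spider):
        if self.webhook:
            self.send(self.webhook, build_slack_payload(item))
        # Always hand the item on so later pipelines and exports still see it.
        return item
```

The sketch notifies for every item it receives; deciding whether an offer is actually new (for example by remembering offer IDs already seen) is left out here.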

Development

Installation

python3 -m venv .venv
. ./.venv/bin/activate
pip install -r requirements.txt

Add an optional .env file

# Optional: Slack Webhook to be called
# SLACK_WEBHOOK=https://hooks.slack.com/services/XXXXXXXX/XXXXXXXX/XXXXXXXX

# Optional: Scrapinghub project & key
# only makes sense for development
# SCRAPINGHUB_API_KEY=xxx
# SCRAPY_PROJECT_ID=xxx
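
How the project reads this file is not shown here; it may well use a library such as python-dotenv. As a minimal standard-library sketch of the idea, the hypothetical helper load_env_file below reads KEY=VALUE lines, ignores blank lines and comments, and leaves variables that are already set in the environment untouched.

```python
import os


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ."""
    loaded = {}
    if not os.path.exists(path):
        return loaded  # the file is optional
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)  # the value may itself contain "="
            loaded[key.strip()] = value.strip()
    # Variables already present in the environment take precedence.
    for key, value in loaded.items():
        os.environ.setdefault(key, value)
    return loaded
```

With the sample file above, every line is commented out, so nothing is loaded until a line such as SLACK_WEBHOOK is uncommented.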

Running the spider to crawl for a searchterm

Example 1: Crawl the latest roomba offers:

scrapy crawl tutti -a searchterm=roomba
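
Scrapy hands every -a NAME=VALUE pair to the spider's constructor as a keyword argument, and the values always arrive as strings. The sketch below shows one way a spider could normalize such arguments. The function name normalize_spider_args and the default of one page are assumptions, not the project's code.

```python
def normalize_spider_args(searchterm=None, pages=None, default_pages=1):
    """Turn raw -a arguments (strings or None) into usable values."""
    # An empty or whitespace-only search term means "no filter".
    term = searchterm.strip() if searchterm else None
    term = term or None
    # Values passed with -a are strings, so convert the page count explicitly.
    page_count = int(pages) if pages is not None else default_pages
    if page_count < 1:
        raise ValueError("pages must be at least 1")
    return term, page_count


print(normalize_spider_args(searchterm="roomba"))  # ('roomba', 1)
print(normalize_spider_args(pages="100"))          # (None, 100)
```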

Example 2: Crawl the latest 100 pages of all offers and dump the results to a JSON file:

scrapy crawl tutti -o offers.json -a pages=100
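
With a .json output file, Scrapy's feed export writes the items as a single JSON array, so the dump can be inspected with a few lines of Python. The sketch below makes no assumption about the item fields: the hypothetical helper offers_mentioning simply searches every string value of each offer.

```python
import json


def load_offers(path="offers.json"):
    """Load the list of offers written by `scrapy crawl ... -o offers.json`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def offers_mentioning(offers, term):
    """Return the offers where any string field contains the term."""
    needle = term.lower()
    return [
        offer
        for offer in offers
        if any(isinstance(value, str) and needle in value.lower()
               for value in offer.values())
    ]
```

For example, offers_mentioning(load_offers(), "roomba") narrows a full dump down to the offers that mention a Roomba anywhere in their text fields.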

Screenshot of Slack integration

Files

README.md

Latest commit

History

README.md

File metadata and controls

Moritz

Scrapinghhub

Development

Screenshot of Slack integration