This repository has been archived by the owner on Apr 4, 2022. It is now read-only.

Scraper News

Scrapes Hacker News and tweets (@ScraperNews) all posts with at least 90 comments.

To run, install Scrapy and Twython (I installed both with pip). You must have your own Twitter handle and a Twitter app (apps.twitter.com) with your phone number attached in order to have read/write access.

Currently a cron job runs every 15 minutes. It calls a Scrapy scraper to populate items.json, then runs post_to_twitter.py, which reads the JSON, checks each post against the recents queue for duplicates, and tweets any posts that have not been posted yet to @ScraperNews.
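
The filter-and-deduplicate step described above could look roughly like the sketch below. This is not the actual post_to_twitter.py. The item fields (`title`, `url`, `comments`), the function names, and the queue length are assumptions made for illustration; only the 90-comment threshold and the items.json filename come from this README.

```python
import json
from collections import deque

MIN_COMMENTS = 90    # threshold stated in this README
RECENTS_SIZE = 200   # assumed queue length, not taken from the project


def load_items(path="items.json"):
    """Read the scraped posts (assumed to be a JSON list of objects)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def select_new_posts(items, recents, min_comments=MIN_COMMENTS):
    """Return posts with enough comments whose URL is not in the recents queue.

    Each selected URL is appended to `recents`, so a post is only
    returned once, even if it shows up again in a later scrape.
    """
    fresh = []
    for item in items:
        if item["comments"] < min_comments:
            continue  # not enough discussion yet
        if item["url"] in recents:
            continue  # already tweeted
        recents.append(item["url"])  # a deque with maxlen drops its oldest entry
        fresh.append(item)
    return fresh


def format_tweet(item):
    """Build the tweet text (hypothetical format: title followed by URL)."""
    return f'{item["title"]} {item["url"]}'
```

In a real run, `recents` would be something like `deque(saved_urls, maxlen=RECENTS_SIZE)` loaded from disk and saved again afterwards, and each string from `format_tweet` would be passed to Twython's `update_status(status=...)`.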

I'm more interested in posts that generate discussion than in posts that might get upvoted just for an interesting title, which is why I chose comments instead of votes as the criterion.