This list is now archived. Look inside https://github.com/sw-yx/brain if you would like an updated list. Thank you.
- fancyhands - US virtual assistant
- gethuman - Help for customer service
- pickfu - Market feedback from US consumers
- Gosquaredaway - military spouse personal assistants
- Self-care Twitter Bots
- Make @hydratebot
- Wikipedia scraper bot
- Year Progress Twitter bot
- 100daysofcode bot
- Tweepy Twitter Bot
- GitHub release -> Twitter bot
- Tracery generative story Twitter bots: BotWiki, Intro to Twitter Bots, Cheap Bots, Done Quick!
- Google Earth stills over time bot
- AWS Markov chain bot - JS version: https://github.com/swang/markovchain
- Ace Attorney bot
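
The Markov chain bot idea in the list above can be sketched in a few lines. This is a minimal word-level sketch in Python, not the API of the linked JS library; `build_chain`, `generate`, and the sample corpus are hypothetical names made up for illustration, and posting the result to Twitter is left out.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting state, stopping at `length` words
    or at a state that was never followed by anything."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a fixed seed is reproducible
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:  # dead end: no word ever followed this state
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


if __name__ == "__main__":
    # Hypothetical corpus; a real bot would train on a much larger text.
    corpus = "drink some water and stretch and drink some tea and rest"
    print(generate(build_chain(corpus), length=12, seed=1))
```

A higher `order` gives more coherent but less surprising output, since each next word is conditioned on more context.
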
small replit twitter bots (usually useless)
- discord.py bots
- Node.js Discord bot
- TypeScript community bot
- discord bot for community moderation - https://top.gg/bot/677184239472607299
- Snowpack community has it
- https://github.com/huginn/huginn (Ruby)
- https://n8n.io/
- https://nodered.org/
- https://github.com/agenda/agenda (Node.js job runner)
- https://airflow.apache.org/
- more https://news.ycombinator.com/item?id=21772610
- https://news.ycombinator.com/item?id=20822637
- python! https://automatetheboringstuff.com/2e/
Info from Checkly/Tim Nolet (podcast)
- Old: Selenium
- New: https://developers.google.com/web/tools/puppeteer (Google), https://playwright.dev/ (from former Puppeteer team members who left for Microsoft). Headless Recorder from Checkly generates scripts for either.
- learn either: https://theheadless.dev
- https://mihaisplace.blog/2021/10/03/the-state-of-web-scraping-in-2021/
misc
- https://github.com/lorien/awesome-web-scraping
- https://simonwillison.net/2020/Nov/14/personal-data-warehouses/
- https://beepb00p.xyz/exports.html
- https://news.ycombinator.com/item?id=23142220
- https://news.ycombinator.com/item?id=22778089
- https://www.scrapingbee.com/blog/web-scraping-javascript/
- https://www.scrapingbee.com/
- https://www.scrapingbee.com/blog/web-scraping-without-getting-blocked/
- their Twitter surfaces good stuff
- setting cookies https://theheadless.dev/posts/managing-cookies/
- https://twitter.com/SahinKevin/status/1216343661459451906?s=20
- https://qoob.cc/web-scraping/
- https://chrome.google.com/webstore/detail/simplescraper-%E2%81%A0%E2%80%94-a-fast-a/lnddbhdmiciimpkbilgpklcglkdegdkg?utm_source=brainpint&utm_medium=email&utm_campaign=its_ok_to_go_off_script&utm_term=2021-07-06
- Beehive - A flexible event/agent & automation system with lots of bees
- DataFire - An open source framework for building and integrating APIs. Each integration provides a set of composable actions. New actions can be built by combining existing actions, JavaScript, and external libraries. They are driven by JavaScript Promises, and can be triggered by a URL, on a schedule, or manually.
- Kibitzr - Get notified when important things happen
- Netflix Scumblr (#1522) - A web application that allows performing periodic searches and storing / taking actions on the identified results.
- Node-RED (#1296) - A tool for wiring together hardware devices, APIs, and online services in new and interesting ways.
- NoFlo - A JavaScript implementation of Flow-Based Programming (FBP). It separates the control flow of software from the actual software logic, which helps you organize large applications more easily than traditional OOP paradigms, especially when importing and modifying large data sets.
- Pico-Engine - A prototype implementation of the pico-engine written in node.js
- Riemann - "A network event stream processing system, in Clojure. Riemann aggregates events from your servers and applications with a powerful stream processing language."
- RSS-Bridge - A PHP application offering a wide selection of feeds for popular and niche services. This includes several services where queries are involved, such as translating Twitter accounts or searches into atom/JSON feeds. Possibly a valuable source of easy data for simpler Huginn scenarios.
- Trigger Happy - "opensource clone of IFTTT, a bridge between your internet services"
- Welcomer Framework - Supports building microservices to flexibly automate your online tasks, putting control of your personal data back in your hands.
- https://apify.com
to scrape Twitter you have to fake Googlebot https://twitter.com/magusnn/status/1339833122456662017?s=20
webscraper.io
- Tray.io https://techcrunch.com/2019/11/26/tray-io-brings-in-50m-more-at-a-600m-valuation-for-its-workflow-automation-tools/
- pipedream
- https://n8n.io/
- https://www.daolf.com/posts/avoiding-being-blocked-while-scraping-ultimate-guide/
- https://tryspider.com/
- https://medium.com/hackernoon/scraping-the-web-with-node-js-f7da67d2f734
- https://github.com/paperswithcode/sota-extractor
- https://github.com/karpathy/arxiv-sanity-preserver
webcrawling architecture
- https://nlp.stanford.edu/IR-book/information-retrieval-book.html
- https://news.ycombinator.com/item?id=24338964
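
One idea from the IR book's web crawling chapter is the URL frontier: a queue of pages still to fetch, plus a record of pages already seen. Below is a minimal breadth-first sketch in Python, assuming a hypothetical in-memory `LINKS` graph in place of real HTTP fetching and HTML parsing. `crawl` and `get_links` are illustrative names, and a real crawler would also need politeness delays, robots.txt handling, and URL normalization.

```python
from collections import deque

# Hypothetical link graph standing in for "fetch the page and extract its links".
LINKS = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b", "https://example.com/"],
    "https://example.com/b": ["https://example.com/c"],
    "https://example.com/c": [],
}


def crawl(seed, get_links, max_pages=100):
    """Breadth-first crawl from `seed`, visiting each URL at most once."""
    frontier = deque([seed])  # URLs discovered but not yet visited
    seen = {seed}             # URLs ever added to the frontier
    visited = []              # URLs in the order they were visited
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


if __name__ == "__main__":
    for url in crawl("https://example.com/", lambda u: LINKS.get(u, [])):
        print(url)
```

Swapping `get_links` for a function that actually downloads and parses a page turns this into a real (if naive) single-threaded crawler.
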