This is a web scraping backend for a short term rental aggregation site. The web scrapers here are used to create a snapshot of city listings for all three short term rental sites being targeted. Airbnb, VRBO, and Sonder. The front-end repo can be found here: STR Aggregator github repo
The web scraper uses a mixture of Puppeteer, cheerio, and axios NPM packages to pull data from the targeted websites and has been deployed to Heroku free tier dynos using a microservices type architecture one city being scraped her dyno. The results are then saved into a mongo DB database with one collection per city.
The fully deployed application can be found at this location: STR Aggregator frontend
Web scrapers can be found at these locations:
https://str-austin.herokuapp.com
https://str-houston.herokuapp.com
https://str-boston.herokuapp.com
https://str-denver.herokuapp.com
- Project Description
- Table of Contents
- Installation Instructions
- Usage
- Guidelines for Contributing
- Tests
- Technologies Used
- Collaborators
- Questions
- Screenshots
- License
Each city requires one deployment of the web scrapers to a Heroku instance. Each Heroku instance will require a Heroku buildpack that can be installed from the Heroku CLI:
heroku buildpacks:add jontewks/puppeteer
You should see the following in settings of your Heroku instance :
You will need an Atlas Mongo DB instance and add the Config Var MONGODB_URI
in Heroku Settings
The web scrapers utilize Heroku free scheduler add-on for the execution. You will need to add a scheduler command and add an interval of your choosing in the scheduler application. You will need to schedule the proper script for the targeted city in the scheduler. Example:
node webscrapers/austin-PopulateAll.js
Please see the front end application reference in the description, to utilize the application
To monitoring the application use the following Heroku CLI commands
View logs and scheduler:
heroku logs --app str-austin --tail
Status and stats on remaining hours quota
heroku ps --app str-austin
Get working shell on dyno for troubleshooting
heroku run bash --app str-austin
Please e-mail one of the contributors at their address listed below with any thoughts on future updates or feature suggestions.
Test early; test often.
Puppeteer Cheerio Mongo DB Heroku Buildpacks Javascript Node js Heroku Scheduler Heroku CLI
This STR Aggregator was conceived, created, and coded by the following group of collaborators:
TEAM | Members |
---|---|
🏈 Vincent Doria, Jr. 🏈 | 🍻 Shane Schilling 🍻 |
🎱 Abraham Spindel 🎱 | 💚 Eric D. Torres 💚 |
Check out our Github profiles:
You can contact any one of use by e-mail the following:
- Vincent Doria, Jr. - vrdphone@gmail.com
- Shane Schilling - trilambda122@gmail.com
- Abraham Spindel - AbrahamSpindel@gmail.com
- Eric D. Torres - etorresntoary@gmail.com
for any additional questions and/ or clarifications you may need about the project.
Sample output from executed webscraper
This application uses the GNU Affero General Public v3.0 License found here.