Get up-to-date Upwork job alerts on Telegram based on specific search terms. Built with Node, Express, TypeScript, Puppeteer, MongoDB, and the Telegram Bot API.
This project is fairly easy to set up and run locally. Follow these steps to get started:
git clone git@github.com:teyim/Upwork-job-scraper.git
Navigate into the project's directory and run the following command to install all dependencies:
npm install
Create a new .env file and copy the example environment variables from the .env.example file into it, then:
- Configure the .env file with your MongoDB URL, database name, and collection name.
- Create a Telegram bot using BotFather on Telegram.
- Get your Telegram bot token from BotFather.
- Set your bot token as the TOKEN env variable.
- Download and start ngrok to forward port 5000.
- Add the ngrok URL as the SERVER_URL env variable.
- Get your bot's chat ID by opening a chat with the bot and sending it a message. You will receive a response such as: Your Chat ID is -- CHAT BOT ID. Copy the chat ID and set it as the CHAT_ID env variable.
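A missing variable is easy to overlook after the steps above, so a small startup check can help. The following is a minimal sketch, not code from this repository: `loadBotConfig` is a hypothetical helper, and it only checks the three variable names mentioned in the steps (TOKEN, SERVER_URL, CHAT_ID). The MongoDB variable names are defined in .env.example and are not assumed here.

```typescript
// Variable names taken from the setup steps; the MongoDB ones are
// intentionally left out because their names live in .env.example.
const REQUIRED_VARS = ["TOKEN", "SERVER_URL", "CHAT_ID"] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type BotConfig = Record<RequiredVar, string>;

// Hypothetical helper: returns the trimmed values, or throws an error
// that lists every variable that is missing or blank.
function loadBotConfig(env: Record<string, string | undefined>): BotConfig {
  const config: Partial<BotConfig> = {};
  const missing: string[] = [];

  for (const name of REQUIRED_VARS) {
    const value = env[name]?.trim();
    if (value) {
      config[name] = value;
    } else {
      missing.push(name);
    }
  }

  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return config as BotConfig;
}
```

In a Node process this could be called once at startup as `loadBotConfig(process.env)`, so a misconfigured .env file fails fast instead of surfacing later as a Telegram error.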
Use the following command to run the project locally:
npm run dev
The project uses Docker. To run it in a Docker container, run the following command in the project's directory:
docker-compose up --build
- Scrapes job listings from Upwork using Puppeteer.
- Filters jobs based on specified keywords.
- Stores job data in MongoDB and compares new jobs to previously stored ones.
- Sends alerts for newly added jobs via a Telegram bot.
- Configurable scraping interval and keywords (to be completed).
- Dockerized for consistent deployment.
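The keyword filtering and the comparison against stored jobs can be sketched as below. This is an illustration under assumptions, not the project's actual code: the `Job` shape, `matchesKeywords`, and `findNewJobs` are hypothetical names, and the set of stored IDs stands in for whatever the MongoDB collection returns.

```typescript
// Hypothetical job shape; the real scraper may store different fields.
interface Job {
  id: string;
  title: string;
  description: string;
}

// True when the job's title or description contains at least one keyword
// (case-insensitive). Blank keywords are ignored so they cannot match everything.
function matchesKeywords(job: Job, keywords: string[]): boolean {
  const text = `${job.title} ${job.description}`.toLowerCase();
  return keywords
    .map((keyword) => keyword.trim().toLowerCase())
    .filter((keyword) => keyword.length > 0)
    .some((keyword) => text.includes(keyword));
}

// Keeps only the jobs that match a keyword and whose ID has not been stored yet.
function findNewJobs(scraped: Job[], storedIds: Set<string>, keywords: string[]): Job[] {
  return scraped.filter(
    (job) => !storedIds.has(job.id) && matchesKeywords(job, keywords)
  );
}
```

Only the jobs returned by `findNewJobs` would then be saved and sent as Telegram alerts, which keeps repeated scrapes from producing duplicate notifications.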
The main challenge with this project is getting Puppeteer to work on a production server. The application runs well locally with Docker, but when hosted on Render or another hosting platform, Puppeteer throws a timeout error.
Solutions attempted:
- Configuring Puppeteer to run in headless mode
- Switching headless mode off
- Setting up a proxy with Puppeteer
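For reference, the three attempts above come down to different launch settings. The sketch below shows one way to build them; `buildLaunchSettings` and `LaunchSettings` are hypothetical names, and the proxy address is a placeholder. It relies only on the `headless` and `args` launch options and on Chromium's `--proxy-server` flag, and it is not presented as a fix for the timeout.

```typescript
// Subset of Puppeteer's launch options used in the attempts listed above.
interface LaunchSettings {
  headless: boolean;
  args: string[];
}

// Hypothetical helper: toggles headless mode and, when a proxy URL is given,
// adds Chromium's --proxy-server flag to the browser arguments.
function buildLaunchSettings(headless: boolean, proxyUrl?: string): LaunchSettings {
  const args: string[] = [];
  if (proxyUrl) {
    args.push(`--proxy-server=${proxyUrl}`);
  }
  return { headless, args };
}

// Example values; the proxy address is a placeholder, not a real endpoint.
const headlessRun = buildLaunchSettings(true);
const proxiedRun = buildLaunchSettings(false, "http://127.0.0.1:8080");
```

The resulting object would be passed to `puppeteer.launch(...)`, which makes it straightforward to switch between the configurations when testing on the hosting platform.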
