This repository contains two scripts that scrape job listings from the devjobsscanner website. Users enter a job title, a remote-work preference, and a sorting preference, and choose how to save the output (CSV, TXT, or both).
job_scraper_static.py:
- Scrapes job listings using the requests library and BeautifulSoup.
- Displays job details in the console.
- Saves job details in CSV and/or TXT format.
- Suitable for static page scraping.
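The static approach listed above can be sketched roughly as follows. This is a minimal illustration and not the code in job_scraper_static.py: the search URL path, the query parameter names, and the CSS classes (job-card, job-title, company) are hypothetical placeholders, and the real site markup will differ.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical endpoint and parameter names, for illustration only.
SEARCH_URL = "https://www.devjobsscanner.com/search/"


def fetch_html(job_title, remote=False, timeout=10):
    """Download one results page and return its HTML text."""
    params = {"search": job_title, "remote": str(remote).lower()}  # assumed names
    response = requests.get(SEARCH_URL, params=params, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_jobs(html):
    """Extract job dictionaries from HTML using assumed CSS classes."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    for card in soup.select("div.job-card"):  # hypothetical selector
        title = card.select_one(".job-title")
        company = card.select_one(".company")
        link = card.select_one("a[href]")
        jobs.append({
            "title": title.get_text(strip=True) if title else "",
            "company": company.get_text(strip=True) if company else "",
            "url": link["href"] if link else "",
        })
    return jobs


# Inline sample so the parser can be tried without network access.
SAMPLE_HTML = """
<div class="job-card">
  <a href="https://example.com/jobs/1"><h2 class="job-title">Python Developer</h2></a>
  <span class="company">Acme</span>
</div>
"""

for job in parse_jobs(SAMPLE_HTML):
    print(f"{job['title']} at {job['company']}: {job['url']}")
```

Keeping the download and the parsing in separate functions means the parser can be checked against saved HTML, which helps when the site changes its markup.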
job_scraper_dynamic.py:
- Enhanced to use SeleniumBase for dynamic page interaction.
- Supports infinite scrolling to load more job listings.
- Users can specify the number of job listings to scrape.
- More robust handling of dynamically loaded content.
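The infinite-scrolling idea listed above might look something like the sketch below. It is an assumption about one reasonable way to do it, not the logic in job_scraper_dynamic.py. The names scroll_until and scrape_dynamic, the div.job-card selector, and the default round limit and pause are all hypothetical.

```python
import time


def scroll_until(count_items, scroll_once, target, max_rounds=30, pause=1.0):
    """Scroll until `target` items are loaded or the page stops growing.

    count_items: callable returning how many job cards are currently loaded.
    scroll_once: callable that scrolls the page to the bottom.
    """
    loaded = count_items()
    for _ in range(max_rounds):
        if loaded >= target:
            break
        scroll_once()
        time.sleep(pause)  # give the page time to append new cards
        now_loaded = count_items()
        if now_loaded == loaded:  # nothing new appeared: assume end of results
            break
        loaded = now_loaded
    return min(loaded, target)


def scrape_dynamic(url, target, card_selector="div.job-card"):
    """Open `url` in a browser and return the text of up to `target` cards.

    The selector is a hypothetical placeholder, not the site's real markup.
    """
    # Imported here so scroll_until can be used without a browser installed.
    from seleniumbase import SB

    with SB(headless=True) as sb:
        sb.open(url)
        scroll_until(
            count_items=lambda: len(sb.find_elements(card_selector)),
            scroll_once=lambda: sb.execute_script(
                "window.scrollTo(0, document.body.scrollHeight);"
            ),
            target=target,
        )
        cards = sb.find_elements(card_selector)[:target]
        return [card.text for card in cards]
```

Stopping when a scroll adds no new cards avoids looping forever when fewer listings exist than the user asked for.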
Requirements:
- Python 3.8+
- beautifulsoup4 library
- requests library
- seleniumbase library
- WebDriver for your browser (e.g., ChromeDriver for Chrome, GeckoDriver for Firefox)
Installation:
- Clone the repository:
  git clone https://github.com/asibhossen897/devJobsScanner-job-scraper.git
  cd devJobsScanner-job-scraper
- Install the required libraries:
  pip install -r requirements.txt
- For job_scraper_dynamic.py, ensure you have the appropriate WebDriver installed and available in your PATH.
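One quick way to confirm that a WebDriver is visible on your PATH is a short check like the one below. It is a sketch that assumes the standard executable names chromedriver and geckodriver; the find_drivers helper is hypothetical and not part of this repository.

```python
import shutil

# Executable names for the drivers mentioned above.
DRIVERS = {"Chrome": "chromedriver", "Firefox": "geckodriver"}


def find_drivers(drivers=DRIVERS):
    """Map each browser name to its driver's full path, or None if not on PATH."""
    return {browser: shutil.which(exe) for browser, exe in drivers.items()}


for browser, path in find_drivers().items():
    status = path if path else "not found on PATH"
    print(f"{browser}: {status}")
```

Only the driver for the browser you intend to use needs to be found.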
Usage (job_scraper_static.py):
- Run the script:
  python job_scraper_static.py
  (If python does not work, use python3.)
- Follow the prompts to input your job search criteria and preferences.
Usage (job_scraper_dynamic.py):
- Run the script:
  python job_scraper_dynamic.py
  (If python does not work, use python3.)
- Follow the prompts to input your job search criteria, number of jobs to scrape, and preferences.
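Answers typed at the prompts usually need some validation before they are used. The sketch below shows one possible way to normalize the save-format answer and the number of jobs. The function names, the default of 20, and the cap of 500 are assumptions made for illustration and are not taken from the scripts.

```python
VALID_FORMATS = {"csv", "txt", "both"}


def parse_output_choice(raw):
    """Normalize the save-format answer; raise ValueError if unrecognized."""
    choice = raw.strip().lower()
    if choice not in VALID_FORMATS:
        raise ValueError(f"expected one of {sorted(VALID_FORMATS)}, got {raw!r}")
    return choice


def parse_job_count(raw, default=20, maximum=500):
    """Turn the 'number of jobs' answer into a bounded positive integer."""
    raw = raw.strip()
    if not raw:
        return default  # empty answer falls back to the (assumed) default
    count = int(raw)  # raises ValueError on non-numeric input
    if count < 1:
        raise ValueError("number of jobs must be at least 1")
    return min(count, maximum)  # cap very large requests
```

In an interactive script these would wrap input(), for example parse_job_count(input("How many jobs? ")), with the prompt repeated whenever a ValueError is raised.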
Files:
- job_scraper_static.py: Script for static job scraping.
- job_scraper_dynamic.py: Script for dynamic job scraping with SeleniumBase.
- requirements.txt: List of required Python libraries.
- outputFiles/: Directory where output files (CSV, TXT) are saved.
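As an illustration of the CSV and TXT output, the sketch below writes a list of job dictionaries into outputFiles/ using only the standard library. The save_jobs helper, the file naming, and the TXT layout are assumptions; the actual scripts may format their output differently.

```python
import csv
from pathlib import Path


def save_jobs(jobs, basename, fmt="both", out_dir="outputFiles"):
    """Write `jobs` (a list of dicts) as CSV and/or TXT; return the paths written."""
    if not jobs:
        return []  # nothing to save
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    written = []

    if fmt in ("csv", "both"):
        csv_path = out / f"{basename}.csv"
        with csv_path.open("w", newline="", encoding="utf-8") as fh:
            # Column order follows the keys of the first job.
            writer = csv.DictWriter(fh, fieldnames=list(jobs[0]))
            writer.writeheader()
            writer.writerows(jobs)
        written.append(csv_path)

    if fmt in ("txt", "both"):
        txt_path = out / f"{basename}.txt"
        with txt_path.open("w", encoding="utf-8") as fh:
            for job in jobs:
                for key, value in job.items():
                    fh.write(f"{key}: {value}\n")
                fh.write("\n")  # blank line between listings
        written.append(txt_path)

    return written
```

For example, save_jobs(jobs, "python_developer", fmt="csv") would create outputFiles/python_developer.csv.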
These scripts are for educational and personal use only. Scraping websites can be against the terms of service of the website being scraped. Always check the website’s terms and conditions before scraping any content. The author is not responsible for any misuse of these scripts. Use at your own risk.
This project is licensed under the MIT License - see the LICENSE file for details.
Asib Hossen
May 21, 2024