Immobilienscout24 Listing Scraper

This project is a web scraper that extracts information from Immobilienscout24 property listings. It uses Playwright for browser automation and saves session cookies so that later runs can reuse them.

Features

- Scrapes detailed information from Immobilienscout24 property listings
- Manages cookies to maintain the session
- Detects CAPTCHAs and lets you solve them manually
- Supports repeated scraping of the same listing to monitor changes

Prerequisites

- Python 3.10.11 (tested version)
- pip (Python package installer)

Project Structure

```
WebScrapingProject/
├── .idea/
├── venv/
├── __pycache__/
├── README.md
├── browser_manager.py
├── config.py
├── cookie-saver.py
├── main.py
├── requirements.txt
└── cookies.json (generated after running cookie-saver.py)
```

Setup

Clone this repository:
```
git clone <repository-url>
cd WebScrapingProject
```

Create a virtual environment:
```
python -m venv venv
```
Activate the virtual environment:
- On Windows:
```
venv\Scripts\activate
```
- On macOS and Linux:
```
source venv/bin/activate
```
Install the required packages:
```
pip install -r requirements.txt
```
Install Playwright browsers:
```
playwright install
```

Usage

First, run the cookie-saver script to set up your session:
```
python cookie-saver.py
```
Follow the prompts to manually accept cookie terms in the browser window.
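
The source of cookie-saver.py is not reproduced in this README. As a rough sketch of the idea, assuming Playwright's sync API and a cookies.json file in the working directory, the save-and-reuse cycle could look like the following. The function names and the start URL are illustrative assumptions, not the project's actual code.

```python
import json
from pathlib import Path

COOKIE_FILE = Path("cookies.json")  # file name taken from the project structure


def save_cookies(cookies, path=COOKIE_FILE):
    """Write a list of cookie dicts (as returned by context.cookies()) to disk."""
    path.write_text(json.dumps(cookies, indent=2), encoding="utf-8")


def load_cookies(path=COOKIE_FILE):
    """Return the saved cookie list, or an empty list if nothing was saved yet."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def capture_cookies(start_url="https://www.immobilienscout24.de/"):
    """Open a visible browser, wait for manual consent, then save the cookies.

    start_url is an assumed entry point; any page that shows the consent
    dialog would do.
    """
    # Imported here so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        context = browser.new_context()
        page = context.new_page()
        page.goto(start_url)
        input("Accept the cookie terms in the browser, then press Enter...")
        save_cookies(context.cookies())
        browser.close()
```

Calling `capture_cookies()` once would produce cookies.json. A later run could then restore the session with `context.add_cookies(load_cookies())` before opening the listing.
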
Set TARGET_URL in config.py to the URL of the Immobilienscout24 listing you want to scrape.
Run the main scraper:
```
python main.py
```
The script will scrape the listing and display the results. Press Enter to scrape again or 'q' to quit.
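
The repeat-or-quit behaviour described above can be sketched as a small loop. This is a minimal illustration, not the code in main.py: `run_scrape_loop` and `scrape_once` are hypothetical names, and the input and output functions are passed in only so the loop can be exercised without a terminal.

```python
def run_scrape_loop(scrape_once, read_line=input, show=print):
    """Call scrape_once() repeatedly until the user answers 'q'.

    scrape_once is a hypothetical callable that returns the scraped fields.
    Returns the number of scrapes performed.
    """
    runs = 0
    while True:
        show(scrape_once())
        runs += 1
        answer = read_line("Press Enter to scrape again or 'q' to quit: ")
        if answer.strip().lower() == "q":
            return runs
```

Comparing the result of one run with the previous one inside `scrape_once` would be one way to notice changes to a listing over time.
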

Configuration

You can modify the FIELDS_TO_FETCH dictionary in config.py to adjust which fields are scraped from the listing.
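
The actual contents of config.py are not shown here. As a hedged sketch, assuming FIELDS_TO_FETCH maps an output field name to a CSS selector, the file and a helper that consumes it might look like this. The selectors, the URL placeholder, and the `extract_fields` helper are all made up for illustration.

```python
# config.py (illustrative sketch; every value below is a placeholder)

# Listing to scrape. Replace with a real listing URL.
TARGET_URL = "https://www.immobilienscout24.de/expose/<listing-id>"

# Maps an output field name to the CSS selector assumed to hold its text.
# These selectors are invented for illustration and will not match the live site.
FIELDS_TO_FETCH = {
    "title": "h1",
    "price": ".example-price",
    "living_space": ".example-living-space",
}


def extract_fields(get_text, fields=FIELDS_TO_FETCH):
    """Build a result dict by looking up each selector with get_text(selector).

    get_text is any callable that returns the text for a selector, or None
    when the selector matches nothing.
    """
    return {name: get_text(selector) for name, selector in fields.items()}
```

With Playwright, `get_text` could wrap `page.query_selector(selector)` and return the element's `text_content()` when a match exists. Adding or removing a key in the dictionary then changes which fields appear in the output.
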

Troubleshooting

If you encounter a CAPTCHA, the script will pause and allow you to solve it manually.
In case of errors, the script will save a screenshot as 'error_screenshot.png' for debugging.
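
How the script detects a CAPTCHA is not documented here. One possible approach, shown only as a sketch, is to look for a marker string in the page HTML, pause for manual solving, and save a screenshot if scraping fails. The marker list and both function names are assumptions; `page` is expected to behave like a Playwright `Page`, which provides `content()` and `screenshot(path=...)`.

```python
# Text fragments assumed to indicate a CAPTCHA page (illustrative only).
CAPTCHA_MARKERS = ("captcha",)


def looks_like_captcha(html, markers=CAPTCHA_MARKERS):
    """Return True if any marker occurs in the page HTML (case-insensitive)."""
    lowered = html.lower()
    return any(marker in lowered for marker in markers)


def scrape_with_guard(page, scrape, read_line=input):
    """Run scrape(page), pausing for a manual CAPTCHA solve first if needed.

    If anything raises, a screenshot is saved as error_screenshot.png and
    the exception is re-raised. scrape is a hypothetical callable.
    """
    try:
        if looks_like_captcha(page.content()):
            read_line("CAPTCHA detected. Solve it in the browser, then press Enter...")
        return scrape(page)
    except Exception:
        page.screenshot(path="error_screenshot.png")
        raise
```

A plain substring check is crude and may produce false positives on pages that merely load a CAPTCHA script, so a real implementation would likely test for a specific element instead.
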

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is open source and available under the MIT License.

Disclaimer

This scraper is for educational purposes only. Always respect the terms of service of the websites you interact with.
