seloger-listings-scraper-tutorial

seloger-listings-scraper-tutorial will let you download all 25 first listings from a Seloger search URL 🏡

With 3 main features per listing

url
price
type

Below, the power of the .py script:

$ python3 seloger-listings-scraper-tutorial.py --url "https://www.seloger.com/list.htm?projects=2&types=2&places=\\[\\{%22divisions%22:\\[2248\\]\\}\\]&mandatorycommodities=0&enterprise=0&qsVersion=1.0&m=search_refine-redirection-search_results."
status code 200
0 420 000 € Appartement
1 545 000 € Appartement
2 324 000 € Appartement
3 599 000 € Appartement
4 399 000 € Appartement
~~ success
 _       _         _            
| |     | |       | |           
| | ___ | |__  ___| |_ __ __  
| |/ _ \| '_ \/ __| __/| '__|
| | (_) | |_) \__ \ |_ | |   
|_|\___/|_.__/|___/\__||_|

You can scrape the data from the very search URL you need, using the dynamic --url argument. And data is immediately exported to a clean and structured .csv file named as seloger-scraper-results.csv, in the same folder of your .py script.

Feature Support

seloger-listings-scraper-tutorial is a Python scraper, developed for demo purpose, which let you download listings from a SeLoger search URL.

25 listings from the first page
3 attributes per listing
elegant @class structure
dynamic --url argument
data exported in structure .csv file

seloger-listings-scraper-tutorial supports Python 3.9.12.

Installation

To run seloger-listings-scraper-tutorial:

Download the .py file
Install dependencies
Go to seloger.com and copy headers from a valid request
Run the .py script

$ pip3 install requests lxml
$ python3 seloger-listings-scraper-tutorial.py

NB: by default, listings are scraped from this specific search URL https://www.seloger.com/list.htm?projects=2&types=2&places=\\[\\{%22divisions%22:\\[2248\\]\\}\\]&mandatorycommodities=0&enterprise=0&qsVersion=1.0&m=search_refine-redirection-search_results.

✨

Notes

This Python script has been written for demo purposes only. You can only collect up to 25 listings i.e. you’ll collect only data from the first page, and there is no pagination handling here.

Furthermore, in order to bypass datadome limitations, you’ll need to go on the home page https://www.seloger.com, open the Network tab of your Chrome dev console, and collect headers which led to the 200 request.

If you need extended ressources on this topic, feel free to check our step-by-step tutorial: https://lobstr.io/blog/comment-scraper-seloger-avec-python-et-requests.

Besides, after a certain amount of requests, you will face consistent bot-mitigation. The website will possibly enter into an endless loop of captcha challenges. No scalability promise of any kind here.

Though, if you need to collect data at scale, please check our ready-made crawler, available here: https://lobstr.io/store/a7e1864ab37570369c69a68d1b943d8b/seloger-iter-listings

It will let you:

paginate
collect +20 attributes per listing
schedule your launch
auto-export data (.csv, s3, googlesheet)
multi-thread
no-code launch

With the possibility to setup and control your scraper from our human-friendly interface, or directly from our developer-ready API.

🦀

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
LICENSE		LICENSE
README.md		README.md
seloger-listings-scraper-tutorial.py		seloger-listings-scraper-tutorial.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

seloger-listings-scraper-tutorial

Feature Support

Installation

Notes

About

Releases

Packages

Languages

License

lobstrio/seloger-listings-scraper-tutorial

Folders and files

Latest commit

History

Repository files navigation

seloger-listings-scraper-tutorial

Feature Support

Installation

Notes

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages