Skip to content

🌎 πŸ–₯ Supercharge your scraper to extract quality page metadata by parsing JSON-LD data via Python's extruct library.

License

Notifications You must be signed in to change notification settings

hackersandslackers/jsonld-scraper-tutorial

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

23 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Structured Data Scraping Tutorial

Python Extruct Requests GitHub Last Commit GitHub Issues GitHub Stars GitHub Forks

Extruct Tutorial

Supercharge your scraper to extract quality page metadata by parsing JSON-LD data via Python's extruct library.

This repository contains source code for the accompanying tutorial on Hackers and Slackers: https://hackersandslackers.com/scrape-metadata-json-ld/

Installation

Installation via requirements.txt:

$ git clone https://github.com/hackersandslackers/jsonld-scraper-tutorial.git
$ cd jsonld-scraper-tutorial
$ python3 -m venv myenv
$ source myenv/bin/activate
$ pip3 install -r requirements.txt
$ python3 main.py

Installation via Pipenv:

$ git clone https://github.com/hackersandslackers/jsonld-scraper-tutorial.git
$ cd jsonld-scraper-tutorial
$ pipenv shell
$ pipenv update
$ python3 main.py

Installation via Poetry:

$ git clone https://github.com/hackersandslackers/jsonld-scraper-tutorial.git
$ cd jsonld-scraper-tutorial
$ poetry shell
$ poetry update
$ poetry run

Usage

To change the URL targeted by this script, update the URL variable in config.py.


Hackers and Slackers tutorials are free of charge. If you found this tutorial helpful, a small donation would be greatly appreciated to keep us in business. All proceeds go towards coffee, and all coffee goes towards more content.

About

🌎 πŸ–₯ Supercharge your scraper to extract quality page metadata by parsing JSON-LD data via Python's extruct library.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages