Structured Data Scraping Tutorial

Supercharge your scraper to extract quality page metadata by parsing JSON-LD data via Python's extruct library.

This repository contains source code for the accompanying tutorial on Hackers and Slackers: https://hackersandslackers.com/scrape-metadata-json-ld/
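
For a rough idea of what "parsing JSON-LD" means, here is a minimal sketch using only the standard library. It is not the extruct API and not the tutorial's code; it only illustrates that JSON-LD lives in script tags of type application/ld+json and can be decoded as ordinary JSON. The names JsonLdParser, extract_json_ld, and SAMPLE_HTML are hypothetical and were made up for this illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects decoded contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Start buffering only when we enter a JSON-LD script tag.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script bodies may arrive in several chunks, so accumulate them.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_json_ld(html: str) -> list:
    """Return every JSON-LD object found in the given HTML string."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical sample page, used here so the sketch runs without network access.
SAMPLE_HTML = """
<html><head>
<title>Example</title>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Example headline"}
</script>
</head><body></body></html>
"""

metadata = extract_json_ld(SAMPLE_HTML)
print(metadata[0]["headline"])
```

In practice a library such as extruct is the better choice, since it also handles other structured-data syntaxes and messy real-world markup.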

Installation

Installation via requirements.txt:

$ git clone https://github.com/hackersandslackers/jsonld-scraper-tutorial.git
$ cd jsonld-scraper-tutorial
$ python3 -m venv myenv
$ source myenv/bin/activate
$ pip3 install -r requirements.txt
$ python3 main.py

Installation via Pipenv:

$ git clone https://github.com/hackersandslackers/jsonld-scraper-tutorial.git
$ cd jsonld-scraper-tutorial
$ pipenv shell
$ pipenv update
$ python3 main.py

Installation via Poetry:

$ git clone https://github.com/hackersandslackers/jsonld-scraper-tutorial.git
$ cd jsonld-scraper-tutorial
$ poetry shell
$ poetry update
$ poetry run python3 main.py

Usage

To change the URL targeted by this script, update the URL variable in config.py.
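
As a rough sketch of what that edit could look like, assuming config.py holds a plain module-level string: the example URL and the is_valid_target helper below are hypothetical and not taken from the repository. The helper is just a quick sanity check to run before pointing the scraper at a new page.

```python
from urllib.parse import urlparse

# Hypothetical contents of config.py; the real file may define other settings.
URL = "https://example.com/some-article/"


def is_valid_target(url: str) -> bool:
    """Return True if the URL has an http(s) scheme and a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_valid_target(URL))
```

A URL without a scheme (for example "example.com/page") fails this check, which is a common cause of request errors when swapping targets.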

Hackers and Slackers tutorials are free of charge. If you found this tutorial helpful, a small donation would be greatly appreciated to keep us in business. All proceeds go towards coffee, and all coffee goes towards more content.
