-
Because of the way Scrapy structures projects, you need to add the path to this repo to your PYTHONPATH.
export PYTHONPATH=/path/to/police-data-trust-scrapers/
-
Create a virtual environment with Python 3.13.0
-
Install requirements
pip install -r requirements_dev.txt
Note: You can pass arguments to a spider by adding -a {argument_name}={argument_value}
to the end of the scrapy crawl command.
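As a rough illustration of where the -a pairs go, the sketch below assembles a scrapy crawl command as an argument list. The helper name build_crawl_command is hypothetical and not part of this repo, and argument_name/argument_value are placeholders; check each spider for the arguments it actually accepts.

```python
import shlex


def build_crawl_command(spider, output_file, spider_args=None):
    """Assemble a `scrapy crawl` command with optional -a arguments.

    Hypothetical helper for illustration; it is not part of this repo.
    """
    command = ["scrapy", "crawl", spider, "-O", output_file]
    for name, value in (spider_args or {}).items():
        # Each spider argument becomes its own `-a name=value` pair.
        command += ["-a", f"{name}={value}"]
    return command


# Placeholder argument, used only to show the syntax.
command = build_crawl_command(
    "officer", "officers.jsonl", {"argument_name": "argument_value"}
)
print(shlex.join(command))
# prints: scrapy crawl officer -O officers.jsonl -a argument_name=argument_value
```

The list form can also be handed to subprocess.run if you prefer to launch a crawl from a Python script instead of the shell.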
-
Go to the fifty_a folder
cd scrapers/fifty_a
-
Run the officer spider
scrapy crawl officer -O officers.jsonl
-
Run the command spider
scrapy crawl command -O commands.jsonl
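With a .jsonl output file, Scrapy should write JSON Lines, one JSON object per line, and -O overwrites the file on each run. A minimal sketch for loading such a file follows; the helper name read_jsonl is made up for this example, and no particular field names are assumed.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return a list with one parsed JSON value per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    output = Path("officers.jsonl")
    if output.exists():
        officers = read_jsonl(output)
        print(f"Loaded {len(officers)} records from {output}")
    else:
        print(f"{output} not found; run the officer spider first")
```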
This is not a web scraper. Instead, it pulls data from the Citizens Police Data Project API endpoint.
From the repo root, run the following command. It pulls the data and creates JSON files
in the data/citizens_police_data_project folder.
python scrapers/citizens_police_data_project.py
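To check what the script produced, a short sketch like the following lists the JSON files in the output folder with their sizes. It assumes it is run from the repo root and that the files use a .json extension; the helper name list_json_files is hypothetical, and the file names themselves are not assumed.

```python
from pathlib import Path


def list_json_files(folder):
    """Return (file name, size in bytes) pairs for *.json files, sorted by name."""
    return [(p.name, p.stat().st_size) for p in sorted(Path(folder).glob("*.json"))]


if __name__ == "__main__":
    for name, size in list_json_files("data/citizens_police_data_project"):
        print(f"{name}: {size} bytes")
```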