1991 poetry parsers #2006

Merged
merged 2 commits into from
Jul 20, 2021
39 changes: 16 additions & 23 deletions ingestion/functions/README.md
@@ -78,51 +78,40 @@ ls -l /usr/bin/python*
```

#### Setup
1. Set up and enter a virtual environment in `ingestion/functions`
1. Set up a virtual environment in `ingestion/functions`

python3.8 -m venv venv
source venv/bin/activate
poetry install

1. For each function you're planning to work with, be sure you have required
modules installed, e.g. via:
*NB:* Be sure you're using Python 3.8, which corresponds to the runtime of the job definitions run using Batch:

```shell
# In each parsing's subdir:
python3.8 -m pip install -r requirements.txt
# In the /ingestion/functions (necessary to run unit tests).
python3.8 -m pip install -r requirements.txt
```

*NB:* Be sure you're using Python 3.8, which corresponds to the runtime of the job definitions run using Batch.
poetry env use 3.8
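
To confirm that the interpreter inside the poetry environment really is Python 3.8, a small check script can help. This is only a sketch: the file name `check_python.py` and the helper names `is_supported` and `describe` are hypothetical and not part of the repository.

```python
import sys

# Version the README says corresponds to the Batch job runtime.
REQUIRED = (3, 8)


def is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True when the major.minor version matches `required`."""
    return tuple(version_info[:2]) == required


def describe(version_info=sys.version_info):
    """Build a one-line, human-readable summary of the version check."""
    found = ".".join(str(part) for part in version_info[:2])
    if is_supported(version_info):
        return f"OK: Python {found}"
    wanted = ".".join(str(part) for part in REQUIRED)
    return f"Mismatch: found Python {found}, expected {wanted}"


# Report on whichever interpreter is running this script.
print(describe())
```

Saved under the hypothetical name above, it could be run with `poetry run python check_python.py` so that the check uses the environment's interpreter rather than the system one.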

#### Manual ingestion

You should be able to run ingestion using the curator UI. This exists as
a fallback if the UI triggers for ingestion are not working.

1. You'll need AWS access; follow the steps in the previous section.
2. Once you've got AWS set up, run the following in `ingestion/functions` after switching
to the virtualenv:
2. Once you've got AWS set up, run the following in `ingestion/functions` after setting up the virtual env:

source venv/bin/activate
python aws.py jobdefs
poetry run python aws.py jobdefs

This should show existing **job definitions**. Job definitions are templates that
tell AWS Batch which parser to run and in which environment (dev or prod). If this
command doesn't work, contact the engineering team to set up access.

3. Check if the ingestion you want to run already has an associated job definition
corresponding to the environment you want to run in:
`python aws.py jobdefs | grep colombia.*prod` to search for Colombia ingestion
`poetry run python aws.py jobdefs | grep colombia.*prod` to search for Colombia ingestion
in prod, which gives

ACTIVE colombia-colombia-ingestor-prod

4. If step 3 shows that a job definition is available, you can **submit** a job:

python aws.py submit colombia-colombia-ingestor-prod
poetry run python aws.py submit colombia-colombia-ingestor-prod

Check the submit help options `python aws.py submit --help`. The most common
Check the submit help options `poetry run python aws.py submit --help`. The most common
options to use are `-t` (or `--timeout`) to specify the maximum number of *minutes*
the ingestion is allowed to run. The default is 60 minutes, which is fine for
daily ingestion, but might not be enough time to run a backfill.
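
Steps 3 and 4 can be sketched in Python as follows. This is an illustration under assumptions, not the behaviour of `aws.py` itself: the helper names are hypothetical, the `-dev` job definition in the sample is invented for contrast, and placing `-t` after the job definition name is assumed to be accepted by the command-line parser. The code only builds the command; it does not run it.

```python
import re


def active_jobdefs(listing, pattern):
    """Return names of ACTIVE job definitions whose name matches `pattern`.

    `listing` is text in the "STATUS name" form shown in step 3.
    """
    names = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "ACTIVE" and re.search(pattern, parts[1]):
            names.append(parts[1])
    return names


def submit_command(jobdef, timeout_minutes=None):
    """Build the argument list for submitting `jobdef`, optionally with a timeout."""
    cmd = ["poetry", "run", "python", "aws.py", "submit", jobdef]
    if timeout_minutes is not None:
        # Assumption: -t takes a whole number of minutes, as described above.
        cmd += ["-t", str(timeout_minutes)]
    return cmd


# Sample listing; the -dev entry is a made-up example, not real output.
sample = (
    "ACTIVE colombia-colombia-ingestor-prod\n"
    "ACTIVE colombia-colombia-ingestor-dev\n"
)

for name in active_jobdefs(sample, r"colombia.*prod"):
    # A backfill may need more than the 60-minute default, so ask for 180.
    print(" ".join(submit_command(name, timeout_minutes=180)))
```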
@@ -163,7 +152,11 @@ if __name__ == "__main__":
event_handler(event)
```

You are free to write the parsers however you like. Use the existing functions as a template to get started.
You are free to write the parsers however you like. Use the existing functions as a template to get started. If you find you need a dependency that isn't supplied in the virtual environment, you can add it like this:

```shell
poetry add packagename
```

### Writing a parser

@@ -253,10 +246,10 @@ Fields and nested structs should be preferably not set (or set to `None`) rather
#### Unit tests

Unit testing is mostly standard `pytest`, with a caveat to be sure that tests
are run with the correct Python version. E.g.,
are run inside the poetry environment. E.g.,

```shell
python3.8 -m pytest test/my_test.py
poetry run pytest test/my_test.py
```
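
As a rough idea of what such a test file might contain, here is a minimal sketch. The helper `parse_age` is hypothetical and defined inline so the example is self-contained; a real test would import the function under test from the parser module instead.

```python
from typing import Optional


def parse_age(raw: str) -> Optional[float]:
    """Hypothetical parser helper: convert a raw age string to a float.

    Blank input yields None, so the field can be left unset downstream.
    """
    raw = raw.strip()
    if not raw:
        return None
    return float(raw)


def test_parse_age_returns_number():
    # Surrounding whitespace is ignored before conversion.
    assert parse_age(" 35 ") == 35.0


def test_parse_age_blank_is_none():
    # Missing values map to None rather than a placeholder.
    assert parse_age("") is None
```

pytest collects the `test_` functions automatically, so `poetry run pytest` pointed at a file like this would run both checks inside the poetry environment.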

#### Integration and End-to-end tests
3 changes: 0 additions & 3 deletions ingestion/functions/common/requirements.txt

This file was deleted.

1 change: 0 additions & 1 deletion ingestion/functions/parsing/cuba/requirements.txt

This file was deleted.

1 change: 0 additions & 1 deletion ingestion/functions/parsing/czechia/requirements.txt

This file was deleted.

1 change: 0 additions & 1 deletion ingestion/functions/parsing/germany/requirements.txt

This file was deleted.

1 change: 0 additions & 1 deletion ingestion/functions/parsing/south_africa/requirements.txt

This file was deleted.

4 changes: 0 additions & 4 deletions ingestion/functions/parsing/taiwan/requirements.txt

This file was deleted.

43 changes: 0 additions & 43 deletions ingestion/functions/requirements.txt

This file was deleted.