Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 1926: fix link checker #1929

Merged
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions data/data-pipeline/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -322,7 +322,9 @@ see [python-markdown docs](https://github.com/ipython-contrib/jupyter_contrib_nb

### Background

<!-- markdown-link-check-disable -->
For this project, we make use of [pytest](https://docs.pytest.org/en/latest/) for testing purposes.
<!-- markdown-link-check-enable-->

To run tests, simply run `poetry run pytest` in this directory (i.e., `justice40-tool/data/data-pipeline`).

Expand Down Expand Up @@ -440,7 +442,9 @@ In the future, we could adopt any of the below strategies to work around this:

1. We could use [pytest-snapshot](https://pypi.org/project/pytest-snapshot/) to automatically store the output of each test as data changes. This would make it so that you could avoid having to generate a pickle for each method - instead, you would only need to call `generate` once , and only when the dataframe had changed.

<!-- markdown-link-check-disable -->
Additionally, you could use a pandas type schema annotation such as [pandera](https://pandera.readthedocs.io/en/stable/schema_models.html?highlight=inputschema#basic-usage) to annotate input/output schemas for given functions, and your unit tests could use these to validate explicitly. This could be of very high value for annotating expectations.
<!-- markdown-link-check-enable-->

Alternatively, or in conjunction, you could move toward using a more strictly-typed container format for read/writes such as SQL/SQLite, and use something like [SQLModel](https://github.com/tiangolo/sqlmodel) to handle more explicit type guarantees.

Expand Down