Backend-agnostic testing #1205

Merged
merged 53 commits into moj-analytical-services:master on May 25, 2023

Conversation

@ADBond (Contributor) commented Apr 26, 2023

This PR provides a framework for backend-agnostic testing, making it straightforward to write tests that can run against all SQL backends. It currently covers duckdb, spark, and sqlite only.

At its core is a decorator, `mark_with_dialects_excluding`, which can be applied to test functions to mark the test with all dialects except those specified, along with any groups those dialects belong to (see below). There is also an inclusive version, `mark_with_dialects_including`, which should be used only for dialect-specific tests.

When tests marked in this way run, they receive a fixture `test_helpers` and a parameter `dialect`, which allow the use of an object `helper = test_helpers[dialect]` whose methods and properties cover any dialect-specific parts of testing, such as `helper.linker`, `helper.cl` (comparison library), and `helper.load_frame_from_csv()`.
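As a rough illustration of this lookup pattern, here is a self-contained sketch; the stub classes below are hypothetical stand-ins for the real per-dialect helpers, which wrap the actual linkers, comparison libraries, and data loaders:

```python
# Minimal sketch of the `test_helpers` lookup pattern (assumption: these
# stubs only illustrate the shape of the interface, not the real helpers).
class DuckDBHelper:
    dialect = "duckdb"

    @property
    def cl(self):
        # stand-in for the duckdb comparison library object
        return f"comparison_library[{self.dialect}]"


class SparkHelper:
    dialect = "spark"

    @property
    def cl(self):
        return f"comparison_library[{self.dialect}]"


# the `test_helpers` fixture exposes something like this mapping
test_helpers = {h.dialect: h for h in (DuckDBHelper(), SparkHelper())}


def comparison_library_for(dialect):
    # what a marked test does with its `dialect` parameter
    helper = test_helpers[dialect]
    return helper.cl
```

A test body then only ever talks to `helper`, so the same code runs unchanged for every dialect it is marked with.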

Dialects can belong to zero or more groups. Currently the only group is the privileged group `default`, containing `spark` and `duckdb`, which runs if no marks are passed on the command line. Groups are just shortcuts for running sets of backends together.
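To make the group mechanics concrete, here is a small self-contained sketch of how the exclusion decorator could compute the marker names for a test. This is an assumption about the shape of the logic; the actual `tests/decorator.py` may differ:

```python
# Hypothetical sketch of the mark computation behind
# mark_with_dialects_excluding (illustrative only).
ALL_DIALECTS = ["duckdb", "spark", "sqlite"]
# each dialect maps to the groups it belongs to; `default` is the
# privileged group that runs when no -m option is given
DIALECT_GROUPS = {"duckdb": ["default"], "spark": ["default"], "sqlite": []}


def marks_for_excluding(*excluded):
    """Marker names a test decorated with
    mark_with_dialects_excluding(*excluded) would carry."""
    marks = set()
    for dialect in ALL_DIALECTS:
        if dialect in excluded:
            continue
        marks.add(dialect)              # selected by e.g. `pytest -m spark`
        marks.add(f"{dialect}_only")    # selected by `pytest -m spark_only`
        marks.update(DIALECT_GROUPS[dialect])  # group shortcuts
    return marks
```

For example, excluding `sqlite` still leaves the test carrying the `default` mark via duckdb and spark, so it runs under a plain `pytest` invocation.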

There are several options for selecting which tests to run. In the following, core tests are precisely those not decorated with one of the new decorators (and should ultimately be the more unit-test-style checks that do not care about the backend, such as `test_input_column` and `test_lat_long_distance`):

  • `pytest -m core` - runs the core tests only
  • `pytest -m spark` - runs the core tests and all spark tests (similarly for other backends)
  • `pytest -m spark_only` - runs only the spark tests (similarly for other backends)
  • `pytest` - equivalent to `pytest -m default` - runs the core tests and those in the `default` group (spark and duckdb)
  • `pytest -m all` - runs everything (core and all covered backends)
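These selection rules hinge on the markers being registered, and on `default` being the fallback, in the pytest configuration. A hypothetical `pyproject.toml` fragment along those lines (the actual file in the PR may differ):

```toml
[tool.pytest.ini_options]
# run the `default` group when no -m option is given
addopts = ["-m default"]
markers = [
    "core",          # backend-irrelevant tests
    "default",       # group: spark + duckdb
    "all",           # every covered backend
    "duckdb",
    "duckdb_only",
    "spark",
    "spark_only",
    "sqlite",
    "sqlite_only",
]
```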

I have updated only a few test scripts with this approach, just to show how this works.

A couple of notes (which hopefully help sell the benefit of this approach for coverage):

  • I initially updated tests/test_charts.py, but one of the tests fails for spark due to a bug in the handling of infinity in Bayes factors, so I will leave that to a separate PR
  • test_estimate_prob_two_rr_match.py::test_prob_rr_valid_range and tests/test_u_train.py::test_u_train_link_only_sample both fail for sqlite, seemingly due to separate bugs. This backend won't run in CI for now (which I haven't touched), so I figure this is okay to go as-is. I will open separate issues for these.

Comment on lines +75 to +76
"duckdb",
"duckdb_only",
@ThomasHepworth (Contributor) commented May 23, 2023

I take it that duckdb and duckdb_only are identical?

Ignore me, you have a note about this in your main description.

Comment on lines +55 to +61
```python
@property
def cll(self):
    return cll_duckdb

@property
def cl(self):
    return cl_duckdb
```
Contributor:

ctl too?

Contributor Author (@ADBond):

added

```toml
addopts = ["-m default"]
markers = [
    # only tests where backend is irrelevant:
    "core",
```
Contributor:

`core` isn't currently running any tests for me

Contributor Author (@ADBond):

Hmmm, is this running without a folder specified? I think in that case it is picking up `conftest.py` from `benchmarking/`, which doesn't include the same logic.

Contributor Author (@ADBond):

As in, are you doing just e.g. `pytest -m core` rather than `pytest -m core tests/`?

If so, would you expect that to run both the tests and benchmarking? In that case I can either copy the functionality to benchmarking, or perhaps lift `conftest` to root level so that it covers both.

```toml
    # only tests where backend is irrelevant:
    "core",
    # see tests/decorator.py::dialect_groups for group details:
    "default",
```
Contributor:

Do we want default to run all of the benchmarks, or do we want this to be its own test mark?

Contributor Author (@ADBond):

I've taken benchmarking to be a separate workstream and consider them as running separately, since they perform different functions. As far as this PR goes, I have just included the changes needed so that benchmarking runs the same as previously.
It might be a good upgrade to do something similar for benchmarking, but that might want some additional things (like slow/not-slow markers) and maybe a rethink of how we are doing the benchmark stuff.

@ThomasHepworth (Contributor) left a comment:

Thanks Andy, I'm happy for this to be merged once Ross is content.

@RossKen (Contributor) left a comment:

Excellent work all round @ADBond! 🎉

The code all works well; the only comments I have left are on the docs, to try and make them as readable as they can be. This isn't necessarily a trivial structure, so I really want to make sure it is as easy as possible to understand. Most of them are just suggestions, so I'm happy to discuss any of them to get to the best solution!


Splink tests can be broadly categorised into three sets:

* 'Core' tests - these are tests which test some specific bit of functionality which does not depend on any specific SQL dialect. They are usually unit tests - examples are testing [`InputColumn`](https://github.com/moj-analytical-services/splink/blob/master/tests/test_input_column.py) and testing the [latitude-longitude distance calculation](https://github.com/moj-analytical-services/splink/blob/master/tests/test_lat_long_distance.py).
Contributor:

Put "Core tests", "Tests for specific backends" and "Backend-agnostic tests" in bold?

* Tests for specific backends - these are tests which run against a specific SQL backend, and test some feature peculiar to this backend. There are not many of these, as Splink is designed to run very similarly independent of the backend used.
Contributor:

peculiar -> particular ?


* 'Core' tests - these are tests which test some specific bit of functionality which does not depend on any specific SQL dialect. They are usually unit tests - examples are testing [`InputColumn`](https://github.com/moj-analytical-services/splink/blob/master/tests/test_input_column.py) and testing the [latitude-longitude distance calculation](https://github.com/moj-analytical-services/splink/blob/master/tests/test_lat_long_distance.py).
* Tests for specific backends - these are tests which run against a specific SQL backend, and test some feature peculiar to this backend. There are not many of these, as Splink is designed to run very similarly independent of the backend used.
* Backend-agnostic tests - these are tests which run against some SQL backend, but which are written in such a way that they can run against many backends by making use of the [backend-agnostic testing framework](#backend-agnostic-testing). The majority of tests are of this type.
Contributor:

Swap bullet 2 and 3 around to reflect the order of the following sections. Also specific backend ones should definitely be last given how small a number of tests fall in this category


### Core tests

Core tests do not need to be handled any differently than ordinary `pytest` tests. Any test is marked as `core` by default, and will only be excluded from being a core test if it is decorated using either `@mark_with_dialects_excluding` or `@mark_with_dialects_including` from the [test decorator file](https://github.com/moj-analytical-services/splink/blob/master/tests/decorator.py).
Contributor:

Remove double negative in first line e.g. "Core tests are treated the same way as ordinary pytest tests."


Note that unlike the exclusive `mark_with_dialects_excluding`, this decorator will _not_ paramaterise the test with the `dialect` argument. This is because usage of the _inclusive_ form is largely designed for single-dialect tests. If you wish to override this behaviour and parameterise the test you can use the argument `pass_dialect`, for example `@mark_with_dialects_including("spark", "sqlite", pass_dialect=True)`, in which case you would need to write the test in a [backend-independent manner](#backend-agnostic-testing).
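A hedged sketch of how the two modes of the inclusive decorator could behave. This is attribute-based and illustrative only; the real `tests/decorator.py` works through pytest marks and parameterisation:

```python
# Hypothetical stand-in for mark_with_dialects_including, showing the
# pass_dialect switch described above (not the real implementation).
def mark_with_dialects_including(*dialects, pass_dialect=False):
    def decorator(test_fn):
        test_fn.dialects = list(dialects)     # dialects the test runs on
        test_fn.parametrised = pass_dialect   # whether `dialect` is injected
        return test_fn
    return decorator


@mark_with_dialects_including("sqlite")
def sqlite_specific_check():
    """Single-dialect test: no `dialect` argument is injected."""


@mark_with_dialects_including("spark", "sqlite", pass_dialect=True)
def multi_dialect_check():
    """Parametrised over spark and sqlite; must be written backend-agnostically."""
```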

## Running tests
Contributor:

Do we think it would be better to have the running tests section above the writing tests? Just thinking through as a user I am probably going to run the tests that already exist at some point before writing any myself. Plus having those commands further up the page helps for ease of use if you are going to use this page as a reference for the commands.

Contributor:

An alternative could be having some sort of TL;DR section at the top with commands to copy/paste to run the testing suite.

Contributor Author (@ADBond):

Yeah, I think that makes sense - running is definitely going to be the more common case, and having a look back over it, there isn't really any context from the writing section needed to understand how to run them.



### Backend-agnostic testing
Contributor:

This section is a fantastic walk-through! 🎉


### Core tests

Core tests do not need to be handled any differently than ordinary `pytest` tests. Any test is marked as `core` by default, and will only be excluded from being a core test if it is decorated using either `@mark_with_dialects_excluding` or `@mark_with_dialects_including` from the [test decorator file](https://github.com/moj-analytical-services/splink/blob/master/tests/decorator.py).
Contributor:

Also, would it be better to have the decorators laid out in bullets with a description of which section they would be included in? E.g.

Core tests do not need to be handled any differently than ordinary pytest tests. Any test is marked as core by default, and will only be excluded from being a core test if it is decorated using either:

  • @mark_with_dialects_excluding for [Backend-agnostic tests](reference to doc section)
  • @mark_with_dialects_including for [Backend Specific tests](reference to doc section)

from the test decorator file.


### Backend-agnostic testing

The majority of tests should be written using the backend-agnostic testing framework. This provides some small tools which allow tests to be written in a backend-independent way, meaning they can then be run against _all_ available SQL backends (or a subset, if some lack _necessary_ features for the test).
Contributor:

I found it slightly unintuitive to go from talking about testing being "Backend-Agnostic" to immediately excluding one - especially when we mark backend agnostic tests with @mark_with_dialects_excluding. I think it would be helpful to start with some reference/example of writing a test that is applicable across all backends (especially to show how to mark those tests) - then move on to excluding sqlite


Contributor:

Do we need a caveat here/somewhere else to explain why Athena isn't included as a backend to test? I.e. because of needing a connection to an AWS account.

Comment on lines 253 to 275
If you want to test Splink against a specific version of python, the easiest method is to utilise docker 🐳.

Docker allows you to quickly and easily install a specific version of python and run the existing test library against it.

This is particularly useful if you're using a python version above 3.9.10 (which is currently in use in our tests GitHub action) and need to run a secondary set of tests.

A pre-built Dockerfile for running tests against python version 3.9.10 can be located within [scripts/run_tests.Dockerfile](https://github.com/moj-analytical-services/splink/blob/master/scripts/run_tests.Dockerfile).

To run, simply use the following docker command from within a terminal and the root folder of a splink clone:
```sh
docker build -t run_tests:testing -f scripts/run_tests.Dockerfile . && docker run --rm --name splink-test run_tests:testing
```

This will both build the image and run the test library.

Feel free to replace `run_tests:testing` with an image name and tag you're happy with.

Reusing the same image and tag will overwrite your existing image.

You can also overwrite the default `CMD` if you want a different set of `pytest` command-line options, for example
```sh
docker run --rm --name splink-test run_tests:testing pytest -W ignore -m spark tests/test_u_train.py
```
Contributor:

Similar to other sections - I don't think this will apply to the majority of people, so I would be tempted to wrap it up in a collapsible block.

@ADBond (Contributor Author) commented May 24, 2023

@RossKen I have made a bunch of changes to the docs page. I think I have addressed all of your suggestions, but let me know if I've missed anything. Also let me know if I've done anything that wasn't what you had in mind, or if you have any further feedback.

@RossKen (Contributor) commented May 24, 2023

Thanks @ADBond - I'm on my phone now but will look at it properly in the morning.

@RossKen (Contributor) commented May 24, 2023

One naming thing. Would it make sense to do "Tests for specific backends" -> "Backend-Specific Tests" to better contrast with "Backend-Agnostic Tests"?

@RossKen (Contributor) left a comment:

As mentioned before - this is fantastic!

Thanks for making those changes to the docs - now even an idiot like me should be able to understand what is going on.

Good to merge from my end 🎉

@ADBond ADBond merged commit 33804a0 into moj-analytical-services:master May 25, 2023
@ADBond ADBond deleted the backend-agnostic-testing branch May 25, 2023 08:47
@ADBond ADBond mentioned this pull request May 25, 2023