Backend-agnostic testing #1205
Conversation
this pattern is probably more in line with server-based dbs
i.e. not including unmarked tests that are independent of dialect - should be useful in e.g. CI
so that `-m default` automatically selects all tests in these cases
"duckdb",
"duckdb_only",
I take it that `duckdb` and `duckdb_only` are identical?
Ignore me, you have a note about this in your main description.
@property
def cll(self):
    return cll_duckdb

@property
def cl(self):
    return cl_duckdb
`ctl` too?
added
addopts = ["-m default"]
markers = [
    # only tests where backend is irrelevant:
    "core",
`core` isn't currently running any tests for me
Hmmm, is this running without a folder specified? I think in that case it is picking up `conftest.py` from `benchmarking/`, which doesn't include the same logic.
As in, are you doing just e.g. `pytest -m core` rather than `pytest -m core tests/`? If so, would you expect that to run both the tests and benchmarking? I can either copy the functionality to `benchmarking/`, or perhaps even lift `conftest.py` to root level so that it covers both.
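Pieced together from the quoted fragments in this thread, the pytest configuration under discussion might look roughly like the sketch below. The `[tool.pytest.ini_options]` table name and the marker entries after `"default"` are assumptions based on the marks named elsewhere in this PR, not a verbatim copy of the repository's config:

```toml
[tool.pytest.ini_options]
addopts = ["-m default"]
markers = [
    # only tests where backend is irrelevant:
    "core",
    # see tests/decorator.py::dialect_groups for group details:
    "default",
    "all",
    # per-backend marks, as described in the PR description (assumed list):
    "duckdb",
    "duckdb_only",
    "spark",
    "spark_only",
    "sqlite",
    "sqlite_only",
]
```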
# only tests where backend is irrelevant:
"core",
# see tests/decorator.py::dialect_groups for group details:
"default",
Do we want `default` to run all of the benchmarks, or do we want this to be its own test mark?
I've taken benchmarking to be a separate workstream and consider the two as running separately, as they perform different functions. As far as this PR goes, I have just included the changes needed so that benchmarking runs the same as previously.
It might be a good upgrade to do something similar for benchmarking, but that might want some additional things (like slow/not-slow type markers) and maybe a think about how we are doing the benchmark stuff.
Thanks Andy, I'm happy for this to be merged once Ross is content.
Excellent work all round @ADBond! 🎉
The code all works well; the only comments I have left are on the docs, to try and make them as readable as they can be. This isn't necessarily a trivial structure, so I really want to make sure it is as easy as possible to understand. Most of them are just suggestions, so I'm happy to discuss any of them to try and get to the best solution!
Splink tests can be broadly categorised into three sets:

* 'Core' tests - these are tests which test some specific bit of functionality which does not depend on any specific SQL dialect. They are usually unit tests - examples are testing [`InputColumn`](https://github.com/moj-analytical-services/splink/blob/master/tests/test_input_column.py) and testing the [latitude-longitude distance calculation](https://github.com/moj-analytical-services/splink/blob/master/tests/test_lat_long_distance.py).
Put "Core tests", "Tests for specific backends" and "Backend-agnostic tests" in bold?
Splink tests can be broadly categorised into three sets:

* 'Core' tests - these are tests which test some specific bit of functionality which does not depend on any specific SQL dialect. They are usually unit tests - examples are testing [`InputColumn`](https://github.com/moj-analytical-services/splink/blob/master/tests/test_input_column.py) and testing the [latitude-longitude distance calculation](https://github.com/moj-analytical-services/splink/blob/master/tests/test_lat_long_distance.py).
* Tests for specific backends - these are tests which run against a specific SQL backend, and test some feature peculiar to this backend. There are not many of these, as Splink is designed to run very similarly independent of the backend used.
peculiar -> particular ?
* 'Core' tests - these are tests which test some specific bit of functionality which does not depend on any specific SQL dialect. They are usually unit tests - examples are testing [`InputColumn`](https://github.com/moj-analytical-services/splink/blob/master/tests/test_input_column.py) and testing the [latitude-longitude distance calculation](https://github.com/moj-analytical-services/splink/blob/master/tests/test_lat_long_distance.py).
* Tests for specific backends - these are tests which run against a specific SQL backend, and test some feature peculiar to this backend. There are not many of these, as Splink is designed to run very similarly independent of the backend used.
* Backend-agnostic tests - these are tests which run against some SQL backend, but which are written in such a way that they can run against many backends by making use of the [backend-agnostic testing framework](#backend-agnostic-testing). The majority of tests are of this type.
Swap bullets 2 and 3 around to reflect the order of the following sections. Also, the specific-backend ones should definitely be last, given how small a number of tests fall into this category.
### Core tests

Core tests do not need to be handled any differently than ordinary `pytest` tests. Any test is marked as `core` by default, and will only be excluded from being a core test if it is decorated using either `@mark_with_dialects_excluding` or `@mark_with_dialects_including` from the [test decorator file](https://github.com/moj-analytical-services/splink/blob/master/tests/decorator.py).
Remove double negative in first line, e.g. "Core tests are treated the same way as ordinary `pytest` tests."
Note that unlike the exclusive `mark_with_dialects_excluding`, this decorator will _not_ parameterise the test with the `dialect` argument. This is because usage of the _inclusive_ form is largely designed for single-dialect tests. If you wish to override this behaviour and parameterise the test you can use the argument `pass_dialect`, for example `@mark_with_dialects_including("spark", "sqlite", pass_dialect=True)`, in which case you would need to write the test in a [backend-independent manner](#backend-agnostic-testing).
## Running tests
Do we think it would be better to have the "Running tests" section above "Writing tests"? Just thinking it through as a user: I am probably going to run the tests that already exist at some point before writing any myself. Plus, having those commands further up the page helps for ease of use if you are going to use this page as a reference for the commands.
An alternative could be having some sort of TL;DR section at the top with commands to copy/paste to run the testing suite.
Yeah, I think that makes sense - running is definitely going to be the more common case, and having a look back over it, there isn't really any context from the writing section needed to understand how to run them.
Core tests do not need to be handled any differently than ordinary `pytest` tests. Any test is marked as `core` by default, and will only be excluded from being a core test if it is decorated using either `@mark_with_dialects_excluding` or `@mark_with_dialects_including` from the [test decorator file](https://github.com/moj-analytical-services/splink/blob/master/tests/decorator.py).

### Backend-agnostic testing
This section is a fantastic walk-through! 🎉
### Core tests

Core tests do not need to be handled any differently than ordinary `pytest` tests. Any test is marked as `core` by default, and will only be excluded from being a core test if it is decorated using either `@mark_with_dialects_excluding` or `@mark_with_dialects_including` from the [test decorator file](https://github.com/moj-analytical-services/splink/blob/master/tests/decorator.py).
Also, would it be better to have the decorators laid out in bullets with a description of what section they would be included in? E.g.

Core tests do not need to be handled any differently than ordinary `pytest` tests. Any test is marked as `core` by default, and will only be excluded from being a core test if it is decorated using either:

* `@mark_with_dialects_excluding` for [Backend-agnostic tests](reference to doc section)
* `@mark_with_dialects_including` for [Backend Specific tests](reference to doc section)

from the test decorator file.
### Backend-agnostic testing

The majority of tests should be written using the backend-agnostic testing framework. This just provides some small tools which allow tests to be written in a backend-independent way. This means the tests can then be run against _all_ available SQL backends (or a subset, if some lack _necessary_ features for the test).
I found it slightly unintuitive to go from talking about testing being "backend-agnostic" to immediately excluding one - especially when we mark backend-agnostic tests with `@mark_with_dialects_excluding`. I think it would be helpful to start with some reference/example of writing a test that is applicable across all backends (especially to show how to mark those tests) - then move on to excluding `sqlite`.
* 'Core' tests - these are tests which test some specific bit of functionality which does not depend on any specific SQL dialect. They are usually unit tests - examples are testing [`InputColumn`](https://github.com/moj-analytical-services/splink/blob/master/tests/test_input_column.py) and testing the [latitude-longitude distance calculation](https://github.com/moj-analytical-services/splink/blob/master/tests/test_lat_long_distance.py).
* Tests for specific backends - these are tests which run against a specific SQL backend, and test some feature peculiar to this backend. There are not many of these, as Splink is designed to run very similarly independent of the backend used.
* Backend-agnostic tests - these are tests which run against some SQL backend, but which are written in such a way that they can run against many backends by making use of the [backend-agnostic testing framework](#backend-agnostic-testing). The majority of tests are of this type.
Do we need a caveat here/somewhere else to explain why Athena isn't included as a backend to test? I.e. because of needing a connection to an AWS account.
If you want to test Splink against a specific version of python, the easiest method is to utilise docker 🐳.

Docker allows you to more quickly and easily install a specific version of python and run the existing test library against it.

This is particularly useful if you're using python > 3.9.10 (which is currently in use in our tests github action) and need to run a secondary set of tests.

A pre-built Dockerfile for running tests against python version 3.9.10 can be located within [scripts/run_tests.Dockerfile](https://github.com/moj-analytical-services/splink/blob/master/scripts/run_tests.Dockerfile).

To run, simply use the following docker command from within a terminal and the root folder of a splink clone:
```sh
docker build -t run_tests:testing -f scripts/run_tests.Dockerfile . && docker run --rm --name splink-test run_tests:testing
```

This will both build and run the tests library.

Feel free to replace `run_tests:testing` with an image name and tag you're happy with.

Reusing the same image and tag will overwrite your existing image.

You can also overwrite the default `CMD` if you want a different set of `pytest` command-line options, for example
```sh
docker run --rm --name splink-test run_tests:testing pytest -W ignore -m spark tests/test_u_train.py
```
Similar to other sections - I don't think this will apply to the majority of people, so I would be tempted to wrap it up in a collapsible block.
…first, then the sqlite-less peculiarities
@RossKen I have made a bunch of changes to the docs page. I think I have addressed all of your suggestions, but let me know if I've missed anything. Also let me know if I've done anything that wasn't what you had in mind, or if you have any further feedback.

Thanks @ADBond, on my phone now but will look at it properly in the morning.

One naming thing: would it make sense to change "Tests for specific backends" -> "Backend-Specific Tests" to better contrast with "Backend-Agnostic Tests"?
As mentioned before - this is fantastic!
Thanks for making those changes to the docs - now even an idiot like me should be able to understand what is going on.
Good to merge from my end 🎉
This PR aims to provide a framework to easily allow backend-agnostic testing, so that it is straightforward to write tests that can run against all SQL backends. This currently covers `duckdb`, `spark`, and `sqlite` only.

The core of this provides a decorator `mark_with_dialects_excluding` which can be used to decorate test functions, and which marks tests with all dialects except those specified, plus any groups those dialects belong to (see below). There is also an inclusive version `mark_with_dialects_including` which should be used only for specific-dialect tests.

When tests that are marked in this way run, they receive a fixture `test_helpers` and a parameter `dialect`, which allows the use of an object `helper = test_helpers[dialect]` which has methods and properties to cover any dialect-specific part of testing, such as `helper.linker`, `helper.cl` (comparison library), and `helper.load_frame_from_csv()`.

Dialects can belong to zero or more groups - currently the only group is the privileged group `default`, containing `spark` and `duckdb`, which runs if no marks are passed on the command line when running tests. These groups are just shortcuts to run sets of backends together.

There are a bunch of different options to run tests depending on which tests are to run. In the following, `core` tests are precisely those which are not decorated with one of the new decorators (and should ultimately end up as more unit-testy stuff that does not care about backend, such as `test_input_column` and `test_lat_long_distance`):

* `pytest -m core` - runs the `core` tests only
* `pytest -m spark` - runs the `core` tests and all `spark` tests (& sim. for other backends)
* `pytest -m spark_only` - runs only `spark` tests (& sim. for other backends)
* `pytest` - equivalent to `pytest -m default` - runs `core` tests, and those in the `default` group (`spark` and `duckdb`)
* `pytest -m all` - runs everything (`core` and all covered backends)

I have updated only a few test scripts with this approach, just to show how this works.

A couple of notes (which hopefully help sell the benefit of this approach for coverage):

* I have converted `tests/test_charts.py`, but one of the tests fails for `spark` due to a bug in handling of infinity in Bayes factors, so will leave that to a separate PR
* `test_estimate_prob_two_rr_match.py::test_prob_rr_valid_range` and `tests/test_u_train.py::test_u_train_link_only_sample` both fail for `sqlite`, seemingly due to separate bugs. This backend won't run for now in CI (which I haven't touched), so figure this is okay to go as-is. Will open separate issues for these.
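The selection rules in the bullet list above can be modelled by one consistent assignment of marks per test type. This is a sketch under assumptions: the mark sets below were chosen to reproduce the listed `pytest -m` behaviour, and are not the actual contents of `tests/decorator.py` or `conftest.py`:

```python
# Assumed mark assignment reproducing the `pytest -m` behaviour listed above.
BACKENDS = ["duckdb", "spark", "sqlite"]
GROUPS = {"default": ["duckdb", "spark"]}


def marks_for_core_test():
    # core tests run under every selection except the "<backend>_only" ones,
    # so give them the core mark plus every backend and group mark
    return {"core", "all", *GROUPS, *BACKENDS}


def marks_for_backend_test(backend):
    # a backend test carries its own mark, its "_only" mark, "all",
    # and a mark for each group the backend belongs to
    groups = {g for g, members in GROUPS.items() if backend in members}
    return {backend, f"{backend}_only", "all", *groups}


def selected(mark, test_marks):
    # models `pytest -m <mark>` for single-mark expressions only
    return mark in test_marks


# `pytest -m spark` runs core tests and spark tests, but not sqlite tests:
print(selected("spark", marks_for_core_test()))             # True
print(selected("spark", marks_for_backend_test("spark")))   # True
print(selected("spark", marks_for_backend_test("sqlite")))  # False
# `pytest -m spark_only` skips the core tests:
print(selected("spark_only", marks_for_core_test()))        # False
```

Under this model, `pytest -m default` picks up core tests and all `duckdb`/`spark` tests while leaving `sqlite` tests out, matching the behaviour described above.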