Is your feature request related to a problem? Please describe.
If a change in Satpy or one of its dependencies leads to an unintentional change in the produced image, we currently have no automated way of detecting this. If the change is small, we might not notice at all. If the change is large, someone might notice it sooner or later, possibly too late to clearly pinpoint the change to a specific cause. For a tool that is primarily used to produce images, it would be desirable to have systematic acceptance / integration / regression tests.
Describe the solution you'd like
I would like Satpy to run systematic acceptance / integration / regression tests that do a pixel-by-pixel comparison of produced images against reference images.
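For illustration only, a minimal sketch of such a check could look like the snippet below; the file paths, image format, and tolerance are placeholders, not an agreed convention.

```python
# Hypothetical example: compare one produced image against its stored reference.
import numpy as np
from PIL import Image


def test_overview_matches_reference():
    # Load both images as signed integers so pixel differences cannot wrap around.
    produced = np.asarray(Image.open("output/overview.png"), dtype=np.int16)
    reference = np.asarray(Image.open("reference/overview.png"), dtype=np.int16)

    # The comparison only makes sense if the shapes match.
    assert produced.shape == reference.shape

    # Allow a small absolute tolerance so harmless rounding differences do not fail the test.
    assert np.allclose(produced, reference, atol=1, rtol=0)
```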
Describe any changes to existing user workflow
It would change the build process and development workflow. Such tests are probably too heavy to run after each commit on GitHub CI, but they could be combined with the performance tests that are already running on the European Weather Cloud (EWC). Perhaps nightly for the main branch and for non-draft PRs that have had new commits since the last run. If differences are reported, we should then identify whether those differences are expected; if they are, the new image would become the new reference.
I don't foresee any changes to the user workflow, except that users might get a more stable product and better documentation when image changes are expected.
Additional context
This idea was triggered by a similar system in NinJo (which reported differences between NinJoTIFF and GeoTIFF images I provided; those turned out to be my fault) and a comment by @djhoese on Slack.
One thing we'd have to decide is whether creating a "basic" set of user-facing examples that produce images, running those, and doing the comparisons would be enough. Otherwise, we'd have to come up with a full set of comparison tests. In P2G I have behavior tests (using the behave package) that generate output and then run a compare script in polar2grid, which extracts the arrays from the images and uses np.isclose with some tolerances to allow a few pixels to change. It reports things like whether the shape of the output changed, what percentage of the pixels are different, whether the expected output file was actually generated for this execution, etc.
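To make the idea concrete, here is a hedged sketch of what such a compare function might look like; it is not the actual polar2grid script, and the tolerance and threshold values are made up.

```python
# Hypothetical compare helper: report shape changes, missing output, and pixel differences.
from pathlib import Path

import numpy as np
from PIL import Image


def compare_image(produced_path, reference_path, atol=1.0, max_diff_fraction=0.01):
    """Compare a produced image against a reference image and report the result."""
    produced_path = Path(produced_path)
    if not produced_path.exists():
        return f"FAIL: expected output {produced_path} was not generated"

    produced = np.asarray(Image.open(produced_path), dtype=np.float64)
    reference = np.asarray(Image.open(reference_path), dtype=np.float64)

    if produced.shape != reference.shape:
        return f"FAIL: shape changed from {reference.shape} to {produced.shape}"

    # Count pixels outside the tolerance and report what fraction of the image changed.
    mismatched = ~np.isclose(produced, reference, atol=atol, rtol=0.0)
    diff_fraction = mismatched.mean()
    if diff_fraction > max_diff_fraction:
        return f"FAIL: {diff_fraction:.2%} of pixels differ by more than {atol}"
    return f"OK: {diff_fraction:.2%} of pixels differ within tolerance"
```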
Another thing we could do to limit how often things are run: if we run them for PRs, run the regular unit tests first, too. This assumes running the integration tests takes a long time. If the unit tests fail, there is no point in running the integration tests.
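As a sketch of that gating (the test paths below are placeholders), the CI job could do something like this:

```python
# Hypothetical CI helper: only run the slow integration tests if the unit tests pass.
import subprocess
import sys

# Run the fast unit test suite first.
unit = subprocess.run([sys.executable, "-m", "pytest", "satpy/tests"])
if unit.returncode != 0:
    # Exit with a message and a non-zero status; the integration tests never start.
    sys.exit("Unit tests failed; skipping the integration tests.")

# Unit tests passed, so spend the time on the image comparison tests.
integration = subprocess.run([sys.executable, "-m", "pytest", "integration_tests"])
sys.exit(integration.returncode)
```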