Faster CI #682

alimanfoo · 2024-12-03T19:56:04Z

As part of #366 we have been moving all Anopheles tests to use simulated data, rather than accessing data from GCS. The action which run these new unit tests (tests.yml) are relatively fast, e.g., running in ~6 minutes. They also have the benefit that they don't need to access GCS data and so can be run from a PR from a fork.

The action which runs the remaining legacy tests that access GCS data are defined in legacy_tests.yml. These tests have been gradually removed during the refactoring work, and at some point we might be able to completely remove this action. Until that time, this action currently takes ~13m and should reduce in time with further refactoring ongoing (e.g., via #665).

The slowest action currently is the notebooks.yml action which runs all Jupyter notebooks in the "notebooks" folder. This takes around 41m and accesses data from GCS so cannot be run in a PR from a fork.

It would be great in general if checks that run in PRs run faster and can be run in a PR from a fork.

Here's a suggested approach:

Move the notebooks.yml action to only run as part of the release.yml action. I.e., we only run the notebooks when trying to make a release. Any notebook failures will block a release being uploaded to PyPI, and so we can then fix any problems and retag the release.
Move the legacy_tests.yml action to only run as part of the release.yml action, for the same reason as above.
At a later date, when all of the Anopheles unit tests that access GCS data have been removed via the refactoring work, maybe remove the legacy_tests.yml action completely. Although possibly some Plasmodium tests will remain that still access GCS data. In which case rename legacy_tests.yml to integration_tests.yml to help better clarify the purpose.

The text was updated successfully, but these errors were encountered:

alimanfoo · 2024-12-04T15:05:21Z

We could also consider running the notebooks and integration tests on merge to master. I.e., these checks only run after a PR is merged, not during commits to an open PR. This would catch any failures earlier than if they were only run on a release. However, this would rely on someone getting notifications of the failures on master.

We could of course do both, i.e., run these checks on merge to master, and as part of the release action to ensure releases aren't made unless these checks are passing.

alimanfoo · 2024-12-06T17:56:00Z

@leehart what do you think? Should the notebooks and integration tests run...

Option 1 - on merge to master
Option 2 - within the release action (preventing upload to PyPI if they fail)
Option 3 - on merge to master and within the release action

leehart · 2024-12-09T17:39:16Z

Hi @alimanfoo. Forgive me, I don't like any of those options!

Generally, things failing after they've been merged to master, resulting in a broken master, are frustrating. Ideally, we only want to merge clean code to master, which is (in my view) the main benefit of the PR CI checks. Ideally problems ought to be resolved during PR, rather than additional issues raised after a PR has been reviewed, approved and merged. It will potentially block releases and can leave a mess for someone else to clean up.

It would be ideal if we could run the notebooks and other GCS-dependent checks for any new set of changes we want, whenever we want to, and update accordingly, before merging into the sacred timeline.

I suppose another option is that we could have a branch running parallel to the main master branch, in which we merge PRs that have not had GCS-dependent checks run, but pass all other checks. Then, when we have run GCS-dependent checks, we can fix it on that branch before merging into master. Maybe the branch could be called "staging".

alimanfoo mentioned this issue Dec 4, 2024

Refactor unit and integration tests and corresponding github actions #686

Merged

leehart added maintenance performance labels Dec 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Faster CI #682

Faster CI #682

alimanfoo commented Dec 3, 2024 •

edited

Loading

alimanfoo commented Dec 4, 2024 •

edited

Loading

alimanfoo commented Dec 6, 2024 •

edited

Loading

leehart commented Dec 9, 2024

Faster CI #682

Faster CI #682

Comments

alimanfoo commented Dec 3, 2024 • edited Loading

alimanfoo commented Dec 4, 2024 • edited Loading

alimanfoo commented Dec 6, 2024 • edited Loading

leehart commented Dec 9, 2024

alimanfoo commented Dec 3, 2024 •

edited

Loading

alimanfoo commented Dec 4, 2024 •

edited

Loading

alimanfoo commented Dec 6, 2024 •

edited

Loading