Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

testing: Making integration tests run conditional #612

Merged

Conversation

a-ovchinnikov
Copy link
Contributor

@a-ovchinnikov a-ovchinnikov commented Aug 27, 2024

Before this change integration tests and e2e tests would run unconditionally on every change. This commit introduces some shortcuts to this process: changes that do not affect code (e.g. changes to text files or to anything outside of main source tree and integration tests tree) will not trigger integration and e2e tests, while changes to individual package managers or package managers tests will trigger just the affected tests. This will reduce the average time to run the tests.

Maintainers will complete the following section

  • Commit messages are descriptive enough
  • Code coverage from testing does not decrease and new code is covered
  • Docs updated (if applicable)
  • Docs links in the code are still valid (if docs were updated)

Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:

@a-ovchinnikov a-ovchinnikov marked this pull request as draft August 27, 2024 21:16
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch 2 times, most recently from 5a6560b to eecd05a Compare August 27, 2024 21:42
Copy link
Member

@eskultety eskultety left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very promising once we iron out the details.

tests/integration/utils.py Outdated Show resolved Hide resolved
tests/integration/utils.py Outdated Show resolved Hide resolved
tests/integration/conftest.py Outdated Show resolved Hide resolved
tests/integration/utils.py Outdated Show resolved Hide resolved
tests/integration/utils.py Outdated Show resolved Hide resolved
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch from 48405a1 to 4be5ff5 Compare August 28, 2024 23:02
@a-ovchinnikov
Copy link
Contributor Author

/retest

tests/integration/conftest.py Outdated Show resolved Hide resolved
tests/integration/utils.py Outdated Show resolved Hide resolved
tests/integration/utils.py Outdated Show resolved Hide resolved
tests/integration/utils.py Show resolved Hide resolved
tests/integration/conftest.py Outdated Show resolved Hide resolved
tests/integration/utils.py Show resolved Hide resolved
tests/integration/utils.py Outdated Show resolved Hide resolved
tests/integration/utils.py Show resolved Hide resolved
tests/integration/utils.py Outdated Show resolved Hide resolved
tests/integration/conftest.py Outdated Show resolved Hide resolved
tests/integration/utils.py Outdated Show resolved Hide resolved
tests/integration/utils.py Outdated Show resolved Hide resolved
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch from ba0a4e9 to 4eed655 Compare August 29, 2024 20:29
tests/integration/utils.py Fixed Show resolved Hide resolved
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch 2 times, most recently from 89fb877 to 4af4e8d Compare August 29, 2024 22:19
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch 5 times, most recently from 86f33c7 to e8d7ce1 Compare September 3, 2024 16:40
tests/integration/utils.py Fixed Show fixed Hide fixed
tests/integration/utils.py Fixed Show fixed Hide fixed
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch 2 times, most recently from 1f90e33 to 9246669 Compare September 3, 2024 18:59
@a-ovchinnikov a-ovchinnikov marked this pull request as ready for review September 3, 2024 19:14
tox.ini Show resolved Hide resolved
tests/integration/conftest.py Outdated Show resolved Hide resolved
tests/integration/conftest.py Outdated Show resolved Hide resolved
Comment on lines 476 to 501
def retrieve_changed_files_from_git(ref: str = "HEAD") -> tuple[Path, ...]:
repo = Repo(".", search_parent_directories=True)
commit = repo.commit(ref)
modified_files = [Path(f) for f in commit.stats.files]
return tuple(modified_files)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that the function returns changes only made in the last commit.
I suppose it should return changes from <current_branch>..main or ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intent was to run this for the latest commit only. Diffing feature branch with main would produce too many changed modules and would effectively negate all the benefits of the change. Suppose one is working on pip in one commit, then on gomod, then on npm. My change addresses the fact that it is not necessary to retest pip when gomod is affected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose one is working on pip in one commit, then on gomod, then on npm. My change addresses the fact that it is not necessary to retest pip when gomod is affected.

This doesn't make much sense to me, IOW if I work on several pkg mgrs at the same time then I need to run the integration test suite against all of them, not just the last one. What you're saying makes sense in an interactive rebase scenario, but not when I'm running the test suite on the whole patch set, so @slimreaper35 is right, this needs to be run against main by default and ref (currently defaulting to "HEAD") should be allowed to be set via an env variable just in case someone needs to further tweak against which ref should the selection be produced against.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit like overengineering to me. It is extremely easy to trigger all tests to run and having the change test anything but the last commit would likely trigger the full set of tests. If this is the desired behavior then we already have an override variable for that. Yet another variable will make the feature harder to use and will require remembering more context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit like overengineering to me. It is extremely easy to trigger all tests to run

But that's exactly not what we want here. To be precise, the scenario where someone would touch all pkg mgrs at the same time is extremely rare. And I specifically don't want to run tests for pkg mgrs I'm not touching and I know that the related tests take too long to execute (gomod, pip), so if I'm touching both RPM and yarn in a single PR, then I don't want the tests to run just for yarn or RPM depending on what the last commit is. And I most certainly don't want to run tests for pip and gomod when I know they take twice as long.

If this is the desired behavior then we already have an override variable for that.

What variable do you mean?

I think the whole point of this PR was to kind of automate the detection of which tests need to be run for a particular change that would work both for one-shot executions as well in interactive rebase scenarios. At least that was my understanding and the value I was seeing in that, i.e. not needing to use any variable (unless absolutely needed for some heavy tweaking) and it'd just do the "right thing".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the last commit in the PR updates only the README.md, but the previous commits are related to one or any number of package managers? We don't want to skip all integration tests in that case.

Very good point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the best of my knowledge it is a merge commit that gets tested on the gate, so it ends up testing the entire commit chain for every PR. It is testing just the last commit locally under an assumption that the last commit contains a testable chunk of work. The change will run zero to all ITs depending on the number of PMs affected by the last commit locally and zero to all ITs on the gate depending on how broad is the change introduced by the entire PR.

Copy link
Contributor Author

@a-ovchinnikov a-ovchinnikov Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that GH tests all commits in a sequence (see a screenshot of the current sequence below)
image
is proven by the latest (at the moment of writing) IT job.

There you can observe that all tests were run (however second commit in the sequence does not affect any code) and there was no failure due to insufficient permissions despite the fact that the necessary permissions are added in the second commit, not the first one. Further, the job does this to fetch the code:

/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* +aa22d80cf78f8d981813bfda262e8ba7f2319589:refs/remotes/pull/612/merge

and checks out the pull request like this:

 /usr/bin/git checkout --progress --force refs/remotes/pull/612/merge

This results in HEAD pointing to a merge commit:

HEAD is now at aa22d80 Merge 7a546e32f90ad33e8e10d95e3164b777af957867 into b3862f2c08d62a901840f33a6a05eabaf46c0365

7a546e3 is my latest commit, b3862f2 is the tip of main at the moment of the last push. It results in all files modified in the feature branch to be counted together.

Edit: formatting.

Copy link
Member

@eskultety eskultety Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further, the job does this to fetch the code:

/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/>origin/* +refs/tags/*:refs/tags/* +aa22d80cf78f8d981813bfda262e8ba7f2319589:refs/remotes/pull/612/merge

and checks out the pull request like this:

/usr/bin/git checkout --progress --force refs/remotes/pull/612/merge

This results in HEAD pointing to a merge commit:

HEAD is now at aa22d80 Merge 7a546e32f90ad33e8e10d95e3164b777af957867 into b3862f2c08d62a901840f33a6a05eabaf46c0365

I never noticed ^this, TIL GitHub automatically creates a "virtual" merge commit as a preview of the changes what it'd look like if the PR were to be merged to main immediately, thank you!

That said, what you modeled according to GitHub's behaviour won't be the case locally where your branch is properly rebased onto main and you won't have this automatic merge commit which you could inspect for changes. Instead, you'll have your individual commits and so based on your current implementation I am either:

  • forced to always run the integration test suite in an interactive rebase (which should be a choice really and a good practice, not a must) OR
  • set an environment variable to always run ALL tests because the last commit would end up yielding no relevant changes

none of which is desirable really because it's both not that much different from the status quo and arguably actually a regression in the behaviour because this solution only really tackles the gating environment but neglects local dev environments. If you instead compare all commits against main instead of just inspecting HEAD, then it should work consistently in both local and gating env.

Edit: while the time it takes to run integration tests should be decreased in general and I'm still in large favour of this PR, I'd argue the major perception of the time penalty paid on the execution happens in local dev envs rather than in the gating CI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this change reduces average local run-time more than average gate run-time. Following a separate discussion I'll make selector behavior more uniform in the next revision.

tests/integration/conftest.py Outdated Show resolved Hide resolved
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch 2 times, most recently from 0fc0b7c to 2549ac9 Compare September 11, 2024 14:27
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch 2 times, most recently from 022a533 to 45c8c7c Compare September 11, 2024 15:44
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch 6 times, most recently from 610b78d to 248ec3a Compare September 12, 2024 18:00
files = repo.git.diff("--name-only", f"{main}..HEAD").split("\n")
# Widest net possible in order not to interfere with testing for no good reason.
except Exception:
# If there is no main and no origin/main then someone is probably doing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please log this!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I log something anywhere within these functions, I see nothing unless I add --log-cli-level=<level> in [testenv:integration]. But the option is already configured below in the file

Is it just me? I don't understand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just me?

It's not, turns out, we need to set log_cli=true for pytest as well then I can see:

2024-09-13 17:44:38 DEBUG HELLO FROM TEST

based on this diff:

diff --git a/tests/integration/utils.py b/tests/integration/utils.py
index 80db1fd1..f7c89398 100644
--- a/tests/integration/utils.py
+++ b/tests/integration/utils.py
@@ -484,6 +484,7 @@ def retrieve_changed_files_from_git() -> tuple[Path, ...]:
     #  Unsupported right operand type for in ("Callable[[], IterableList[Head]]")
     main = "main" if "main" in repo.branches else "origin/main"  # type: ignore
     try:
+        log.debug("HELLO FROM TEST")
         files = repo.git.diff("--name-only", f"{main}..HEAD").split("\n")
     # Widest net possible in order not to interfere with testing for no good reason.
     except Exception as e:

And I've even seen @a-ovchinnikov's warning when I made a typo in there :D:

------------------------------------------ live log collection -------------------------------------------
2024-09-13 17:48:10 WARNING Detection of changed files unexpectedly failed. This either indicates that both 'main' and 'origin/main' branches are missing in a repo or a more fundamental failure. The tool will attempt to recover from this by not filtering any tests out. This is the exception that was encountered: 'Logger' object has no attribute 'DEBUG'
collected 52 items

so yeah, log_cli=true works

Copy link
Member

@slimreaper35 slimreaper35 Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would work, but then tox -e py3 shows countless amount of logs 🤯

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log_cli_level = DEBUG is supposed to log everything. it should be at least INFO for tests, or maybe even just WARNING.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would work, but then tox -e py3 shows countless amount of logs 🤯

It does, do you have a better idea how to handle debug logs from test? I mean, it's kinda expected that once we turn it on, debug logs litter the output, that's their inherent nature, isn't it?

Neither is IMO a problem this PR inherently needs to solve (unless you favour doing it here). We identified the problem and I agree we should decrease the default log level, allowing users to override it from tox's CLI if needed.

Copy link
Member

@slimreaper35 slimreaper35 Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that we could use log_cli only for integration tests, but log-cli is not available in CLI.
We don't have to fix it in this PR.

Copy link
Member

@eskultety eskultety left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from logging the case when the main branch isn't called main I only have one other request - please update the README integration testing section mentioning this behaviour and how it can be overridden to either get a full test or a different subset (-k), otherwise LGTM and we're good to merge.

@a-ovchinnikov
Copy link
Contributor Author

Apart from logging the case when the main branch isn't called main I only have one other request - please update the README integration testing section mentioning this behaviour and how it can be overridden to either get a full test or a different subset (-k), otherwise LGTM and we're good to merge.

Thank you, I have completely forgotten to update README!

@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch from 248ec3a to 2a612b9 Compare September 13, 2024 14:54
@@ -258,6 +258,31 @@ Similarly to unit tests, for finer control over which tests get executed, e.g. t
tox -e integration -- tests/integration/test_package_managers.py::test_packages[gomod_without_deps]
```

Integration tests verify package managers handlers behavior and ensure that it remains consistent.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future reference, README/doc changes should ideally go to separate commits from code changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Before this change integration tests and e2e tests would run
unconditionally on every change. This commit introduces
some shortcuts to this process: changes that do not affect
code (e.g. changes to text files or to anything outside
of main source tree and integration tests tree) will not
trigger integration and e2e tests, while changes to individual
package managers or package managers tests will trigger just the
affected tests. This will reduce the average time to run
the tests.

A full run could still be triggered with an environment variable.

Signed-off-by: Alexey Ovchinnikov <aovchinn@redhat.com>
This commit adds documentation for conditional
test runs.

Signed-off-by: Alexey Ovchinnikov <aovchinn@redhat.com>
git action has to be amended: git refuses to work with
a repository which directory is owned by someone else.
This is expected behavior on the gate (Github CI),
but is not expected to happen locally. Standard remedy
was used to address the issue.

Signed-off-by: Alexey Ovchinnikov <aovchinn@redhat.com>
@a-ovchinnikov a-ovchinnikov force-pushed the integ_tests_experiment_02 branch from 2a612b9 to 81da9ac Compare September 16, 2024 14:47
@a-ovchinnikov a-ovchinnikov added this pull request to the merge queue Sep 16, 2024
Merged via the queue into hermetoproject:main with commit fd6798e Sep 16, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants