-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
testing: Making integration tests run conditional #612
testing: Making integration tests run conditional #612
Conversation
5a6560b
to
eecd05a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very promising once we iron out the details.
48405a1
to
4be5ff5
Compare
/retest |
ba0a4e9
to
4eed655
Compare
89fb877
to
4af4e8d
Compare
86f33c7
to
e8d7ce1
Compare
1f90e33
to
9246669
Compare
tests/integration/utils.py
Outdated
def retrieve_changed_files_from_git(ref: str = "HEAD") -> tuple[Path, ...]: | ||
repo = Repo(".", search_parent_directories=True) | ||
commit = repo.commit(ref) | ||
modified_files = [Path(f) for f in commit.stats.files] | ||
return tuple(modified_files) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that the function returns changes only made in the last commit.
I suppose it should return changes from <current_branch>..main
or ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intent was to run this for the latest commit only. Diffing feature branch with main would produce too many changed modules and would effectively negate all the benefits of the change. Suppose one is working on pip in one commit, then on gomod, then on npm. My change addresses the fact that it is not necessary to retest pip when gomod is affected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppose one is working on pip in one commit, then on gomod, then on npm. My change addresses the fact that it is not necessary to retest pip when gomod is affected.
This doesn't make much sense to me, IOW if I work on several pkg mgrs at the same time then I need to run the integration test suite against all of them, not just the last one. What you're saying makes sense in an interactive rebase scenario, but not when I'm running the test suite on the whole patch set, so @slimreaper35 is right, this needs to be run against main
by default and ref
(currently defaulting to "HEAD") should be allowed to be set via an env variable just in case someone needs to further tweak against which ref should the selection be produced against.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit like overengineering to me. It is extremely easy to trigger all tests to run and having the change test anything but the last commit would likely trigger the full set of tests. If this is the desired behavior then we already have an override variable for that. Yet another variable will make the feature harder to use and will require remembering more context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit like overengineering to me. It is extremely easy to trigger all tests to run
But that's exactly not what we want here. To be precise, the scenario where someone would touch all pkg mgrs at the same time is extremely rare. And I specifically don't want to run tests for pkg mgrs I'm not touching and I know that the related tests take too long to execute (gomod, pip), so if I'm touching both RPM and yarn in a single PR, then I don't want the tests to run just for yarn or RPM depending on what the last commit is. And I most certainly don't want to run tests for pip and gomod when I know they take twice as long.
If this is the desired behavior then we already have an override variable for that.
What variable do you mean?
I think the whole point of this PR was to kind of automate the detection of which tests need to be run for a particular change that would work both for one-shot executions as well in interactive rebase scenarios. At least that was my understanding and the value I was seeing in that, i.e. not needing to use any variable (unless absolutely needed for some heavy tweaking) and it'd just do the "right thing".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the last commit in the PR updates only the README.md, but the previous commits are related to one or any number of package managers? We don't want to skip all integration tests in that case.
Very good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the best of my knowledge it is a merge commit that gets tested on the gate, so it ends up testing the entire commit chain for every PR. It is testing just the last commit locally under an assumption that the last commit contains a testable chunk of work. The change will run zero to all ITs depending on the number of PMs affected by the last commit locally and zero to all ITs on the gate depending on how broad is the change introduced by the entire PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that GH tests all commits in a sequence (see a screenshot of the current sequence below)
is proven by the latest (at the moment of writing) IT job.
There you can observe that all tests were run (however second commit in the sequence does not affect any code) and there was no failure due to insufficient permissions despite the fact that the necessary permissions are added in the second commit, not the first one. Further, the job does this to fetch the code:
/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* +aa22d80cf78f8d981813bfda262e8ba7f2319589:refs/remotes/pull/612/merge
and checks out the pull request like this:
/usr/bin/git checkout --progress --force refs/remotes/pull/612/merge
This results in HEAD pointing to a merge commit:
HEAD is now at aa22d80 Merge 7a546e32f90ad33e8e10d95e3164b777af957867 into b3862f2c08d62a901840f33a6a05eabaf46c0365
7a546e3
is my latest commit, b3862f2
is the tip of main
at the moment of the last push. It results in all files modified in the feature branch to be counted together.
Edit: formatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further, the job does this to fetch the code:
/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/>origin/* +refs/tags/*:refs/tags/* +aa22d80cf78f8d981813bfda262e8ba7f2319589:refs/remotes/pull/612/merge
and checks out the pull request like this:
/usr/bin/git checkout --progress --force refs/remotes/pull/612/merge
This results in HEAD pointing to a merge commit:
HEAD is now at aa22d80 Merge 7a546e32f90ad33e8e10d95e3164b777af957867 into b3862f2c08d62a901840f33a6a05eabaf46c0365
I never noticed ^this, TIL GitHub automatically creates a "virtual" merge commit as a preview of the changes what it'd look like if the PR were to be merged to main immediately, thank you!
That said, what you modeled according to GitHub's behaviour won't be the case locally where your branch is properly rebased onto main and you won't have this automatic merge commit which you could inspect for changes. Instead, you'll have your individual commits and so based on your current implementation I am either:
- forced to always run the integration test suite in an interactive rebase (which should be a choice really and a good practice, not a must) OR
- set an environment variable to always run ALL tests because the last commit would end up yielding no relevant changes
none of which is desirable really because it's both not that much different from the status quo and arguably actually a regression in the behaviour because this solution only really tackles the gating environment but neglects local dev environments. If you instead compare all commits against main instead of just inspecting HEAD, then it should work consistently in both local and gating env.
Edit: while the time it takes to run integration tests should be decreased in general and I'm still in large favour of this PR, I'd argue the major perception of the time penalty paid on the execution happens in local dev envs rather than in the gating CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this change reduces average local run-time more than average gate run-time. Following a separate discussion I'll make selector behavior more uniform in the next revision.
0fc0b7c
to
2549ac9
Compare
022a533
to
45c8c7c
Compare
610b78d
to
248ec3a
Compare
files = repo.git.diff("--name-only", f"{main}..HEAD").split("\n") | ||
# Widest net possible in order not to interfere with testing for no good reason. | ||
except Exception: | ||
# If there is no main and no origin/main then someone is probably doing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please log this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I log something anywhere within these functions, I see nothing unless I add --log-cli-level=<level>
in [testenv:integration]
. But the option is already configured below in the file
Is it just me? I don't understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it just me?
It's not, turns out, we need to set log_cli=true
for pytest as well then I can see:
2024-09-13 17:44:38 DEBUG HELLO FROM TEST
based on this diff:
diff --git a/tests/integration/utils.py b/tests/integration/utils.py
index 80db1fd1..f7c89398 100644
--- a/tests/integration/utils.py
+++ b/tests/integration/utils.py
@@ -484,6 +484,7 @@ def retrieve_changed_files_from_git() -> tuple[Path, ...]:
# Unsupported right operand type for in ("Callable[[], IterableList[Head]]")
main = "main" if "main" in repo.branches else "origin/main" # type: ignore
try:
+ log.debug("HELLO FROM TEST")
files = repo.git.diff("--name-only", f"{main}..HEAD").split("\n")
# Widest net possible in order not to interfere with testing for no good reason.
except Exception as e:
And I've even seen @a-ovchinnikov's warning when I made a typo in there :D:
------------------------------------------ live log collection -------------------------------------------
2024-09-13 17:48:10 WARNING Detection of changed files unexpectedly failed. This either indicates that both 'main' and 'origin/main' branches are missing in a repo or a more fundamental failure. The tool will attempt to recover from this by not filtering any tests out. This is the exception that was encountered: 'Logger' object has no attribute 'DEBUG'
collected 52 items
so yeah, log_cli=true
works
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that would work, but then tox -e py3
shows countless amount of logs 🤯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log_cli_level = DEBUG
is supposed to log everything. it should be at least INFO
for tests, or maybe even just WARNING
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that would work, but then tox -e py3 shows countless amount of logs 🤯
It does, do you have a better idea how to handle debug logs from test? I mean, it's kinda expected that once we turn it on, debug logs litter the output, that's their inherent nature, isn't it?
Neither is IMO a problem this PR inherently needs to solve (unless you favour doing it here). We identified the problem and I agree we should decrease the default log level, allowing users to override it from tox's CLI if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that we could use log_cli
only for integration tests, but log-cli
is not available in CLI.
We don't have to fix it in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from logging the case when the main branch isn't called main
I only have one other request - please update the README integration testing section mentioning this behaviour and how it can be overridden to either get a full test or a different subset (-k), otherwise LGTM and we're good to merge.
Thank you, I have completely forgotten to update README! |
248ec3a
to
2a612b9
Compare
@@ -258,6 +258,31 @@ Similarly to unit tests, for finer control over which tests get executed, e.g. t | |||
tox -e integration -- tests/integration/test_package_managers.py::test_packages[gomod_without_deps] | |||
``` | |||
|
|||
Integration tests verify package managers handlers behavior and ensure that it remains consistent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For future reference, README/doc changes should ideally go to separate commits from code changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Before this change integration tests and e2e tests would run unconditionally on every change. This commit introduces some shortcuts to this process: changes that do not affect code (e.g. changes to text files or to anything outside of main source tree and integration tests tree) will not trigger integration and e2e tests, while changes to individual package managers or package managers tests will trigger just the affected tests. This will reduce the average time to run the tests. A full run could still be triggered with an environment variable. Signed-off-by: Alexey Ovchinnikov <aovchinn@redhat.com>
This commit adds documentation for conditional test runs. Signed-off-by: Alexey Ovchinnikov <aovchinn@redhat.com>
git action has to be amended: git refuses to work with a repository which directory is owned by someone else. This is expected behavior on the gate (Github CI), but is not expected to happen locally. Standard remedy was used to address the issue. Signed-off-by: Alexey Ovchinnikov <aovchinn@redhat.com>
2a612b9
to
81da9ac
Compare
Before this change integration tests and e2e tests would run unconditionally on every change. This commit introduces some shortcuts to this process: changes that do not affect code (e.g. changes to text files or to anything outside of main source tree and integration tests tree) will not trigger integration and e2e tests, while changes to individual package managers or package managers tests will trigger just the affected tests. This will reduce the average time to run the tests.
Maintainers will complete the following section
Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:
/ok-to-test
(as is the standard for Pipelines as Code)