Split functional testings via github action matrix #14078
Force-pushed from 44580e1 to 00a75d1.
1h 49m looks like a great result for a full test suite run, even as a start!
Yes ;-) I have removed the Ubuntu 22.04 sanity and functional tests, so this PR could maybe be merged a bit faster.
Overall thoughts:
I like this idea of splitting it up like this in a matrix. I tried it myself, just didn't end up writing a script. :)
We should use this for macOS and Windows as well... when you have problems with this yaml stuff, I could help a bit.
Heh, I'm usually fine with that stuff, but honestly, if you think you could help with anything, I'd be glad to take it.
I agree, since the sanity tests are just a subset of the full test run there's not much value in running them, unless they're used to gate a full test run. If we parallelize the tests like this we could also split them up in some meaningful way. Perhaps "cli tests" (cli_root, cli_user), "feature tests" (initialize, trim, compression, fallocate, quota, send/recv, snapshots, checkpoints, etc), and "resilience tests" (raidz, redundancy, removal, replacement, slog, online_offline, scrub_mirror, etc). Or some other relatively equal split. We'd just need to settle on the group names and add them to the runfile.

My inclination would be to try and split it only 3 or 4 ways, to cut down a bit on the number of status reports and try and keep it as readable as possible.
Then I will remove it.
Okay, I will try that ;)
We could also collect each summary of the tags and create a big one in the last step.
That would be pretty cool. Personally, what I've always found to be the most useful starting point when a test run fails is the buildbot summary page. It attempts to grab the meaningful log sections for FAILED and KILLED tests so they're easier to pull up and spot check. That's handled by just a little bit of shell here: https://github.com/openzfs/zfs-buildbot/blob/master/scripts/bb-test-zfstests.sh#L33-L38
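As a rough sketch of that approach (the log format below is an assumption, loosely modeled on the buildbot helper linked above, not copied from it), pulling out the log sections for failed tests takes only a few lines of awk:

```shell
# Sketch: extract the log sections for FAILED/KILLED tests from a
# zfstests-style results file. The exact log format is an assumption
# for this example.
cat > /tmp/results.log <<'EOF'
Test: functional/cli_root/zfs_create (run as root) [00:03] [PASS]
Test: functional/trim/trim_basic (run as root) [01:12] [FAIL]
some error detail line 1
some error detail line 2
Test: functional/snapshot/snap_001 (run as root) [00:05] [PASS]
EOF

# Print each failing test header plus its detail lines, i.e. everything
# up to the next "Test:" header.
awk '/^Test:.*\[(FAIL|KILLED)\]/ { show = 1; print; next }
     /^Test:/                    { show = 0 }
     show' /tmp/results.log
```

This keeps only the trim_basic section (the header and its two detail lines) and drops the passing tests, which is the kind of condensed view that makes a summary page useful.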
I created a helper script for modifying the common.run file: https://gist.github.com/mcmilk/acc54a671ad2164ce98a0996a2ad6562 @behlendorf - maybe you want to make some changes to this initial grouping?
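For illustration only (the actual helper lives in the gist above; the file contents and group names here are made up), rewriting the tags assignment of a runfile copy so that only one group of tests runs could look roughly like this:

```shell
# Sketch: limit a runfile copy to one hypothetical group of test tags.
# The [DEFAULT] section layout and the group contents are assumptions
# for this example; the real grouping is in the gist linked above.
cat > /tmp/common.run <<'EOF'
[DEFAULT]
tags = ['functional']
EOF

GROUP1="cli_root,cli_user"
sed -i "s/^tags = .*/tags = ['$GROUP1']/" /tmp/common.run
cat /tmp/common.run
```

Each matrix job would then point the test runner at its own modified runfile copy.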
If we go with hardcoding tags, then we may want to at least find the non-described tags and run them in one of the parts.
Force-pushed from 2850c9b to 51775d7.
Force-pushed from 055230c to 4e88c94.
Force-pushed from 476fe47 to 54b7072.
Okay, the main thing is done now. Sometimes the functional tests (part 3) will fail; this seems unrelated to this functional test splitting. I split the "Cleanup and summary" into 5 calls, because we have a limit of 1 MiB per step which can be put into the summary. The first call does the summary things and the next 4 calls will print debugging details.
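The 1 MiB per-step limit can be illustrated by capping what gets appended to the step summary; a minimal sketch, with `GITHUB_STEP_SUMMARY` faked by a temp file so it runs outside of Actions (the limit headroom and file names are assumptions):

```shell
# Sketch: append a log to the step summary, but stay below GitHub's
# 1 MiB per-step summary limit. GITHUB_STEP_SUMMARY is replaced with a
# temp file here so the snippet runs outside of a workflow.
GITHUB_STEP_SUMMARY=/tmp/step_summary.md
: > "$GITHUB_STEP_SUMMARY"
LIMIT=$((1024 * 1024 - 1024))   # keep some headroom below 1 MiB

append_summary() {
    head -c "$LIMIT" "$1" >> "$GITHUB_STEP_SUMMARY"
}

# A 2 MiB log gets truncated on append instead of failing the step.
yes "debug line" | head -c $((2 * 1024 * 1024)) > /tmp/big.log
append_summary /tmp/big.log
wc -c < "$GITHUB_STEP_SUMMARY"
```

Splitting the output across several steps, as done in this PR, gives each chunk its own 1 MiB budget instead of truncating.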
This commit changes the workflow of the github actions. We split the workflow into different parts:
1) build zfs modules for Ubuntu 20.04 and 22.04 (~25m)
2) 2x zloop test (~10m) + 2x sanity test (~25m)
3) functional testings in parts 1..5 (each ~1h) - these could be triggered when the sanity tests are ok - currently I just start them all at the same time
4) cleanup and create summary
When everything is fine, the full run with all testings should be done in around 2 hours. The codeql.yml and checkstyle.yml are not part of this circle.
The testings are also modified a bit:
- report info about CPU and checksum benchmarks
- reset the debugging logs for each test
- when some error occurred, we call dmesg with -c to get only the log output for the last failed test
- we also empty the dbgsys
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Is this a request for review now? We went down from ...
Having a summary page where we can easily see the logs from tests which failed is a real quality of life improvement! As are the faster test run times. I'm good with this, let's go ahead and merge this and refine as needed. Thanks.
LGTM, let's try and use it! Great work @mcmilk , thank you!
/var/tmp/zloop/*/vdev/
if-no-files-found: ignore

sanity:
IIRC we discussed that now we may not even need the sanity tests, but this PR is already too good to be true (thanks @mcmilk ), so let it be.
Yes, I know, I should remove them. But they give a first "most things seem to work (or do not work)" indication. When some test within sanity fails, the errors are also listed like in the functional tests. In good pull requests this should never happen, so when it does happen, it's a first "something is not okay here" info.
with:
name: Summary Files
path: Summary/
- uses: geekyeggo/delete-artifact@v2
We may leave the artifacts; it may be convenient for someone to download an already built debug/release build. But these are just loud thoughts, never mind for now.
Yes, the generated modules will be deleted, but all logs are saved now. Previously the logs were only saved when something failed; this is changed so that we always log.
Merged. Let's see how it goes and refine as needed. Thanks!
There is an error when the images rotate: https://github.com/actions/runner-images#available-images So currently some runners have 20230313, and some have 20230305 ... I checked this via the file ... I will create a pull request for it... sorry, but this issue could not be tested, because all runners had the same version before ...
@mcmilk that was fast to get into this corner case! Glad we can just rerun the whole pipeline.
The simple fix for this is to add the repo checkout to each functional test ... this could be replaced by one single line within the script ... but that needs to be tested a bit more then.
This commit changes the workflow of the github actions. We split the workflow into different parts:
1) build zfs modules for Ubuntu 20.04 and 22.04 (~25m)
2) 2x zloop test (~10m) + 2x sanity test (~25m)
3) functional testings in parts 1..5 (each ~1h) - these could be triggered when the sanity tests are ok - currently I just start them all at the same time
4) cleanup and create summary
When everything is fine, the full run with all testings should be done in around 2 hours. The codeql.yml and checkstyle.yml are not part of this circle.
The testings are also modified a bit:
- report info about CPU and checksum benchmarks
- reset the debugging logs for each test
- when some error occurred, we call dmesg with -c to get only the log output for the last failed test
- we also empty the dbgsys
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes openzfs#14078
Motivation and Context
The functional tests are timing out more and more ... the max. time for execution is 360m ... this seems to be reached :(
Description
This commit changes the workflow of the github actions.
We split the workflow into different parts:
1) build zfs modules for Ubuntu 20.04 and 22.04 (~25m)
2) 2x zloop test (~10m) + 2x sanity test (~25m)
3) functional testings in parts 1..5 (each ~1h)
4) cleanup and create summary
When everything is fine, the full run with all testings
should be done in around 2 hours.
The codeql.yml and checkstyle.yml are not part of this circle.
The testings are also modified a bit:
- report info about CPU and checksum benchmarks
- reset the debugging logs for each test
- when some error occurred, we call dmesg with -c to get
only the log output for the last failed test
- we also empty the dbgsys
The testings are done this way ...