-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure the device reboots after snap refresh/revert tests (BugFix) #1124
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1124 +/- ##
==========================================
+ Coverage 43.04% 43.14% +0.10%
==========================================
Files 355 355
Lines 38610 38624 +14
Branches 6556 6559 +3
==========================================
+ Hits 16619 16665 +46
+ Misses 21327 21294 -33
- Partials 664 665 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
de5e756
to
1bffbeb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly like what I see here, especially the prints instead of logging, makes way more sense for us else we always need to re-run the job to see what is wrong with it.
Consider taking a look at my comment, I'm a bit confused by that poll function.
573b314
to
c398574
Compare
In order to get more data in the job logs, Snapd._poll_change() is amended to do the following: - Outuput info about the task when it's in the "Wait" status (and return, as it likely means the snap is waiting for a reboot) - Outuput info about the task when it's in the "Error" status - If an ongoing task (status = "Doing") has a progress label assigned, print the current progress in % (similar to what's done in the snap command itself). This is useful to investigate if tasks like snap downloads have frozen at some point. Unit tests for the _poll_change() method are also added.
In order to be more consistent with other methods, these methods now return the response. Unit tests are added for these methods as well.
Script can now be called with the --timeout argument (defaulting to 300 seconds). This is passed to the Snapd() instance, along with the verbosity flag that enables Snapd._info() to output useful information for the job logs.
Print statements make more sense because any information captured during the snap commands should be made available to the Checkbox results later on.
- Reboot commands are moved inside the snapd/snap-[refresh|revert]-* jobs. This is to make sure the device is rebooted right after the command is issued (with previous version, some other jobs might be run between the refresh/revert command and the reboot command, making the test results unreliable) - As a result, the snapd/snap-[refresh|revert]-* are flagged as noreturn - Because Checkbox does not capture the outputs of noreturn jobs, their outputs are stored in the $PLAINBOX_SESSION_SHARE directory using `tee` - Reboot jobs are replaced with attachment jobs to upload the outputs along with the submissions
Following the work done in commit b8befd8, add template-id to the new templates that don't have them set yet.
c398574
to
3bf8bfc
Compare
I've built a snap with the latest version from this branch, and ran it on the Xinyi device: https://certification.canonical.com/hardware/202207-30448/submission/363697/ The outcome looks as expected:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ty for the extra "snapped" validation! +1
…est plan The snap-refresh-revert nested part had been removed from the SRU test plan in cd17d9a following some discoveries that, when executed, the jobs executed would change the system state in the wrong order (e.g. updating the kernel snap without reverting it while running other jobs afterwards). The jobs were modified to prevent this kind of scenario[1], and the use of manifest entries[2] prevents the jobs from being run on all the devices yet (currently, the only project that enables them is the Xinyi project). Fix LM-1522 [1] #1124 [2] https://github.com/canonical/checkbox/tree/main/providers/base/units/snapd#manifest-entries
…est plan (BugFix) (#1176) Bring back kernel/snapd/gadget snaps revert/refresh jobs to the SRU test plan The snap-refresh-revert nested part had been removed from the SRU test plan in cd17d9a following some discoveries that, when executed, the jobs executed would change the system state in the wrong order (e.g. updating the kernel snap without reverting it while running other jobs afterwards). The jobs were modified to prevent this kind of scenario[1], and the use of manifest entries[2] prevents the jobs from being run on all the devices yet (currently, the only project that enables them is the Xinyi project). Fix LM-1522 [1] #1124 [2] https://github.com/canonical/checkbox/tree/main/providers/base/units/snapd#manifest-entries
…canonical#1124) * Provide more information in Snapd._poll_change() In order to get more data in the job logs, Snapd._poll_change() is amended to do the following: - Outuput info about the task when it's in the "Wait" status (and return, as it likely means the snap is waiting for a reboot) - Outuput info about the task when it's in the "Error" status - If an ongoing task (status = "Doing") has a progress label assigned, print the current progress in % (similar to what's done in the snap command itself). This is useful to investigate if tasks like snap downloads have frozen at some point. Unit tests for the _poll_change() method are also added. * Return the response when calling Snapd.install() and remove() In order to be more consistent with other methods, these methods now return the response. Unit tests are added for these methods as well. * Add timeout argument to snap_update_test.py and output more info Script can now be called with the --timeout argument (defaulting to 300 seconds). This is passed to the Snapd() instance, along with the verbosity flag that enables Snapd._info() to output useful information for the job logs. * Replace logging with print statements in snap_update_test.py Print statements make more sense because any information captured during the snap commands should be made available to the Checkbox results later on. * Modify snap-refresh-revert jobs - Reboot commands are moved inside the snapd/snap-[refresh|revert]-* jobs. This is to make sure the device is rebooted right after the command is issued (with previous version, some other jobs might be run between the refresh/revert command and the reboot command, making the test results unreliable) - As a result, the snapd/snap-[refresh|revert]-* are flagged as noreturn - Because Checkbox does not capture the outputs of noreturn jobs, their outputs are stored in the $PLAINBOX_SESSION_SHARE directory using `tee` - Reboot jobs are replaced with attachment jobs to upload the outputs along with the submissions * Replace reboot jobs with log attachment jobs in snap-refresh-revert test plan Fix CHECKBOX-1264 * Fix snapd unit tests * Add unit test to improve coverage * Fix unit test * Add template-id to log-attach jobs Following the work done in commit b8befd8, add template-id to the new templates that don't have them set yet.
…est plan (BugFix) (canonical#1176) Bring back kernel/snapd/gadget snaps revert/refresh jobs to the SRU test plan The snap-refresh-revert nested part had been removed from the SRU test plan in cd17d9a following some discoveries that, when executed, the jobs executed would change the system state in the wrong order (e.g. updating the kernel snap without reverting it while running other jobs afterwards). The jobs were modified to prevent this kind of scenario[1], and the use of manifest entries[2] prevents the jobs from being run on all the devices yet (currently, the only project that enables them is the Xinyi project). Fix LM-1522 [1] canonical#1124 [2] https://github.com/canonical/checkbox/tree/main/providers/base/units/snapd#manifest-entries
When implementing changes for the gadget/snapd/kernel snap refresh[1], one of the job dependencies was not updated properly. The `snapd/reboot-after-snap-refresh-{type}-{name}-to-stable-rev` job does not exist anymore, so the `snapd/snap-verify-after-refresh-{type}-{name}-to-stable-rev` should depend on the refresh job itself (which was modified in the aforementioned PR to do the reboot itself) [1]: #1124
Fix snap refresh verification job dependency When implementing changes for the gadget/snapd/kernel snap refresh[1], one of the job dependencies was not updated properly. The `snapd/reboot-after-snap-refresh-{type}-{name}-to-stable-rev` job does not exist anymore, so the `snapd/snap-verify-after-refresh-{type}-{name}-to-stable-rev` should depend on the refresh job itself (which was modified in the aforementioned PR to do the reboot itself) [1]: #1124
Description
The
snap_update_test.py
script is currently running asynchronous snapd requests and simply waiting 90 seconds before rebooting the device, without checking the status of the snapd command. This leads to very flaky results.This PR updates the script (and parts of
checkbox_support.snapd_utils.snapd
) tonoreturn
jobs)Resolved issues
CHECKBOX-1264
Documentation
Changes are meticulously documented in the git commits. Please read them for more information.
Tests
snap-refresh-revert
test plan on it after making sure the snapd, gadget and pc kernels were all using their respective beta channels. Submission available here. The skipped tests are because the targeted revision is the same as the currently installed revision, which is expected for the gadget snap as the base version, stable version and beta version are all the same revision. Please note that the submission archive has anattachment_files
directory containing the output of thenoreturn
jobs, such as: