Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove "Checking for updates" stage from updater #528

Merged
merged 2 commits into from
Jun 26, 2020

Conversation

eloquence
Copy link
Member

@eloquence eloquence commented Apr 9, 2020

Resolves #478

Status

Ready for review

Description

  • Remove the "Checking for updates" stage from the SecureDrop Workstation preflight updater, as it does not add significant value.
  • Require that the user explicitly start the process, to ensure it can safely run as an uncancellable process.
  • Tweaks to the UI/UX: smaller window, shorter text, headline separate from main body and above progress bar, messaging tweaks, layout tweaks

Test plan

Preparatory steps

(All testing should be done with an existing Qubes install at least at 0.3.0-rpm - dev, prod or staging should make no difference)

  1. Make note of the contents of sdw-last-updated and sdw-update-status in ~/.securedrop_launcher
  2. Ensure your system is up-to-date so you can selectively downgrade in the following scenarios.
  3. Apply the changes in this PR to the launcher versions in /opt/securedrop/launcher and /srv/salt/launcher (if only the /opt copy is overwritten, the updater itself will replace it on the next run).
  4. Read through [0.2.3-rpm] libxenlight failed to create new domain sd-log  #498 - you may encounter it repeatedly while testing this PR, so any additional observations to resolve that issue are appreciated.

Scenario 1: fedora-31 case

  1. Downgrade a package in fedora-31 and reboot the VM to force an actual upgrade to applied (e.g., sudo dnf downgrade zlib).
  2. Run /opt/securedrop/launcher/sdw-launcher.py --skip-delta 0. This forces an updater run.
    • Observe that you are able to launch the client, without a reboot warning.
    • Observe that sdw-update-status and sdw-last-updated contain the expected state (both should include the current timestamp, and sdw-update-status should contain status 0 for success)

Scenario 2: dom0 case

  1. Downgrade a package in dom0 (see instructions below).
  2. Run /opt/securedrop/launcher/sdw-launcher.py --skip-delta 0.
    • Observe that you are prompted to reboot after the update. Keep the window open for now.
    • Observe that sdw-update-status and sdw-last-updated contain the expected state (both should include the current timestamp, and sdw-update-status should contain status 2 for "reboot required").
  3. Click the reboot button and wait for the system to reboot.
  4. Log back in and run /opt/securedrop/launcher/sdw-launcher.py --skip-delta 0.
  5. After updates, observe that you are able to run the client without a reboot.

Downgrading a package in dom0

The method I used for downgrading (all commands require sudo or root shell):

  1. Look at dnf history in dom0 to view recently updated packages.
  2. Inspect a history entry in detail with dnf history info <id>.
  3. Pick the older version of a recently updated package using qubes-dom0-update $package-$version, e.g. qubes-dom0-update python3-qubesimgconverter-4.0.27

That's it -- this doc claims you also have to run dnf downgrade but it seemed to do that automatically for me. Suggestions for a simpler process welcome so we can add them to our testing docs.

Notes on performance

Note that when comparing performance with the current update process, it's important to note that the "applying updates" is never skipped after "checking updates" runs. That's because we always assume updates are required for fedora-31: https://github.com/freedomofpress/securedrop-workstation/blob/master/launcher/sdw_updater_gui/Updater.py#L116

So an actual performance comparison is between "checking for updates + applying updates for 1...n VMs" vs. "always applying updates for all VMs".

Screenshots

Initial dialog

v3-updater-1

After clicking "Start Updates"

v2-updater-2

Updates complete, no reboot needed

v2-updater-3

Updates complete, reboot required (dom0 updates)

(Not including final string tweaks in 9c7fd82)
v2-updater-4

sdlog.info("dom0 update successful")
return UpdateStatus.REBOOT_REQUIRED

if output.find("No packages downloaded") != -1:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emkll This is the most significant business logic change, I think. The return code from qubes-dom0-update does not appear to tell us reliably whether or not updates were applied, and we need to know in order to trigger the reboot logic. So I decided to look at the command output. Based on my review of the updater logic in /usr/lib/qubes/qubes-download-dom0-updates.sh in sys-firewall, my understanding is that this string should be present whenever a dom0 update was successful but yielded no results.

Copy link
Member Author

@eloquence eloquence Apr 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@conorsch suggested maybe running the --check-only stage here after all so we can use its return code to determine the reboot requirement. In time runs that only adds a few seconds as the index is now sufficiently up-to-date for it to use it on the subsequent run. I'll leave it as is for now for your initial review, let me know what you think, and if you have other ideas for how to detect the reboot requirement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to work based on my local testing, though theres also other test there, specifically "Nothing to do.", "Complete!" and the "No packages downloaded" you've identified. If upgrades are available, we can parse the transaction summary (checks if upgrades were performed)

However, I agree that running the command with --check-only prior to running the upgrade will make it much simpler to detect a non-zero error code than parsing output. Given the somewhat long run times, @conorsch 's approach is probably best here (we can recycle _check_updates_dom0)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll switch to using a --check-only run for dom0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done in 0e5d172.

@eloquence eloquence force-pushed the 478-more-automatic-updates branch from e75c7c3 to 436bbf6 Compare April 10, 2020 01:07
@eloquence eloquence marked this pull request as ready for review April 13, 2020 23:14
Copy link
Contributor

@emkll emkll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eloquence did a first pass with time measurements and some basic functional testing, will return for another functional pass based on your test plan, left some comments inline as well

Current iteration (3 updates)

  • 14 minutes for each

Old iteration (3 passes)

  • 5, 5 and 6 minutes to check
  • 15, 5, and 4 minutes to upgrade (15 minutes was when all VMs needed an upgrade)
  • 20, 10 and 10 minutes total time

sdlog.info("dom0 update successful")
return UpdateStatus.REBOOT_REQUIRED

if output.find("No packages downloaded") != -1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to work based on my local testing, though theres also other test there, specifically "Nothing to do.", "Complete!" and the "No packages downloaded" you've identified. If upgrades are available, we can parse the transaction summary (checks if upgrades were performed)

However, I agree that running the command with --check-only prior to running the upgrade will make it much simpler to detect a non-zero error code than parsing output. Given the somewhat long run times, @conorsch 's approach is probably best here (we can recycle _check_updates_dom0)

launcher/sdw_updater_gui/strings.py Show resolved Hide resolved
launcher/sdw_updater_gui/strings.py Outdated Show resolved Hide resolved
"<p>To keep your Workstation safe, daily software updates are required.</p> "
"<p>This typically takes between 5 and 30 minutes. You cannot use the SecureDrop "
"Client or any of is VMs while the updater is running.</p>"
"<p><span style='color:#E62354;'><b>Interrupting software updates may break "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think "break the workstation" is somewhat hyperbolic here, especially when this is written in red. It will "break" the apt cache on the templates, in worst case

Copy link
Member Author

@eloquence eloquence Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are either of the following scenarios likely to require admin intervention to recover from:

  • closing the updater window at any point
  • shutting down the computer?

A system shutdown will tell all processes to terminate. Won't that potentially interrupt a package install, requiring manual intervention to recover?

Anything that requires admin intervention to resolve before the journalist can access SecureDrop again can IMO be fairly described as "breakage" from the journalist's POV.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emkll I did nudge Erik towards simplifying text into very black-and-white terms, here. Trying to keep admins from being over-burdened as user shepherds, felt important in that.

Also, if there's too much text—like, to be "accurate" vs too broad-of-strokes, users will see "wall of text" and not read any of it. UI text is a tough balance, in that regard.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't block here but the word "broken" sounds quite negative to me, through it's true that this message is meant to be dissuasive

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current wording in master is

Any interruption in this process may break Workstation components

The proposed change here is:

Interrupting software updates may break the Workstation

Mainly for brevity and simplicity. Happy to consider alternatives as well. I think whether the word "break" makes sense here mainly depends on the impact in a worst case scenario -- e.g., if the user closes the laptop lid while a package is being installed or a critical dom0 salt sate is being applied.

launcher/sdw_updater_gui/strings.py Show resolved Hide resolved
launcher/sdw_updater_gui/strings.py Show resolved Hide resolved
@eloquence
Copy link
Member Author

Given the significant testing burden of this change, we've agreed to defer this until the next sprint (and probably after the 0.3.0-rpm release) for now, after the current (4/22-5/6) sprint.

@eloquence
Copy link
Member Author

eloquence commented May 6, 2020

We've agreed this is still likely a desirable change, but our focus right now is to get SD 1.3.0 + Client/Proxy/Workstation releases with limited copy/paste support & other fixes already merged, out the door, and prep for the fedora-31 transition. Strong candidate for next sprint (after 5/6-5/20).

@eloquence eloquence force-pushed the 478-more-automatic-updates branch from 1d937db to 220f66e Compare June 4, 2020 00:41
@eloquence
Copy link
Member Author

(Rebased and squashed.)

@eloquence
Copy link
Member Author

This is ready for a pass by a new reviewer (Mickael is out for a couple of weeks); remaining open comments are about wording choices in the UI, happy to kick those around more.

@rmol
Copy link
Contributor

rmol commented Jun 10, 2020

I followed the test plan, with the exception that I modified the Fedora 31 template instead of Fedora 30. The updates were performed properly. I did encounter #498 once.

I think the layout changes in UpdaterAppUI.py need some attention, though. Under both i3 and XFCE, the dialog's title and headline text was often clipped, and I was unable to resize the updater dialog to read the rest of the instructions. I tried on both my 4k external monitor and my T480's internal 1920x1080 display. I modified the X DPI, and the font size in the XFCE settings. Even making the font size too small to comfortably read, there was still occasional clipping.

It could well just be how I've modified my system, since @emkll didn't see any problem with it, and the screenshots obviously indicate it worked properly on your machine. At the moment I don't have a machine on which I can install Qubes from scratch to verify.

Scenario 1: fedora-31 case

  1. Downgrade a package in fedora-31 and reboot the VM to force an actual upgrade to applied (e.g., sudo dnf downgrade zlib).
  2. Run /opt/securedrop/launcher/sdw-launcher.py --skip-delta 0. This forces an updater run.
    • Observe that you are able to launch the client, without a reboot warning.
    • Observe that sdw-update-status and sdw-last-updated contain the expected state (both should include the current timestamp, and sdw-update-status should contain status 0 for success)

Scenario 2: dom0 case

  1. Downgrade a package in dom0 (see instructions below).
  2. Run /opt/securedrop/launcher/sdw-launcher.py --skip-delta 0.
    • Observe that you are prompted to reboot after the update. Keep the window open for now.
    • Observe that sdw-update-status and sdw-last-updated contain the expected state (both should include the current timestamp, and sdw-update-status should contain status 2 for "reboot required").
  3. Click the reboot button and wait for the system to reboot.
  4. Log back in and run /opt/securedrop/launcher/sdw-launcher.py --skip-delta 0.
  5. After updates, observe that you are able to run the client without a reboot.

@eloquence
Copy link
Member Author

Awesome, thanks for the test @rmol . If you're able to repro it consistently, could you take a screenshot that illustrates the clipping problem (tomorrow is fine)? :)

@eloquence
Copy link
Member Author

(Test plan updated to fedora-31)

@rmol
Copy link
Contributor

rmol commented Jun 10, 2020

updater-clipping-xfce

@eloquence
Copy link
Member Author

Wow, that's .. horrifying. I'm curious how current master compares for you, does everything consistently render within its bounds, including button labels etc?

@rmol
Copy link
Contributor

rmol commented Jun 10, 2020

Better (button labels) but still clipping. So it's me, but our Qt apps are the only ones I really have trouble with. I suspect that there are ways we could make them more tolerant of desktop environment variance, or if nothing else, make them resizable.

updater-clipping-xfce-master

@eloquence
Copy link
Member Author

Yeah that's pretty terrible too, so I won't feel too bad, but I agree it should never get into such a pathological state if at all possible :). Once you're on a fresh install, it'd be great to have STR for reproducing this.

rmol
rmol previously approved these changes Jun 16, 2020
Copy link
Contributor

@rmol rmol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the updater on a fresh installation of Qubes 4.0.3 and the dialogs are completely readable, with no clipping of text or button labels. Since the visual horror was limited to my regular development system, I'm approving.

@eloquence
Copy link
Member Author

That's great news @rmol. If you have time, I'd appreciate any steps you can provide for getting into this broken state, as we should really make sure (probably in a separate PR) that our dom0 Qt dialogs scale well if users change their settings.

@rmol
Copy link
Contributor

rmol commented Jun 16, 2020

Sure, let me see if I can ruin another Qubes machine. 🙂

@rmol
Copy link
Contributor

rmol commented Jun 16, 2020

I believe all it takes is adjusting the DPI. An easy way under XFCE is to open System Tools > Appearance and enter a Custom DPI setting on the Fonts tab. On my X1 Carbon with 1920x1080 display, a value of 140 was enough to cause text clipping in the second step of the updater.

updater-140-dpi

@eloquence
Copy link
Member Author

Thanks, will investigate :)

- Shorten messaging, tweak text per UX discussions
- Make important text red
- Add a headline element, so we can render it above the progress bar
- Reduce height
This avoids brittle output parsing, and should not have significant
performance impact due to caching.
Copy link
Contributor

@emkll emkll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eloquence these changes look good to me. Tested using the default DPI settings in Qubes, looks good to me.

  • Verified the UI changes are properly reflected in the python code (after running pyuic and black)
  • Verified the test coverage is 100%
  • Visual review
    Followed the test plan:

Scenario 1: fedora-31 case

  1. Downgrade a package in fedora-31 and reboot the VM to force an actual upgrade to applied (e.g., sudo dnf downgrade zlib).
  2. Run /opt/securedrop/launcher/sdw-launcher.py --skip-delta 0. This forces an updater run.
    • [ x Observe that you are able to launch the client, without a reboot warning.
    • Observe that sdw-update-status and sdw-last-updated contain the expected state (both should include the current timestamp, and sdw-update-status should contain status 0 for success)

Scenario 2: dom0 case

  1. Downgrade a package in dom0 (see instructions below).
  2. Run /opt/securedrop/launcher/sdw-launcher.py --skip-delta 0.
    • Observe that you are prompted to reboot after the update. Keep the window open for now.
    • Observe that sdw-update-status and sdw-last-updated contain the expected state (both should include the current timestamp, and sdw-update-status should contain status 2 for "reboot required").
    • Running the preflight updater again still indicates that a reboot is required
  3. Click the reboot button and wait for the system to reboot.
  4. Log back in and run /opt/securedrop/launcher/sdw-launcher.py --skip-delta 0.
  5. After updates, observe that you are able to run the client without a reboot.

Docs are sufficiently vague to not require any changes: https://workstation.securedrop.org/en/latest/admin/securing_workstation.html#apply-updates-when-prompted, but we should flag these changes to pilot orgs once they are released.

Approving but not immediately merging in case @rmol has any further comments.

"<p>To keep your Workstation safe, daily software updates are required.</p> "
"<p>This typically takes between 5 and 30 minutes. You cannot use the SecureDrop "
"Client or any of is VMs while the updater is running.</p>"
"<p><span style='color:#E62354;'><b>Interrupting software updates may break "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't block here but the word "broken" sounds quite negative to me, through it's true that this message is meant to be dissuasive

launcher/sdw_updater_gui/strings.py Show resolved Hide resolved
Copy link
Contributor

@conorsch conorsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good! Ran through a few iterations of the new logic. On the f31 testing, I consistently saw update times of ~15-20m per run. For the dom0 downgrade test, times were roughly 20m, then reboot, then 15m on a forced update run next time.

As a sidenote, not one did I encounter any VM start errors. Pleased with these changes going in, particularly with the verbose testing reports we have above!

@conorsch conorsch merged commit 1226f9e into master Jun 26, 2020
@conorsch conorsch deleted the 478-more-automatic-updates branch June 26, 2020 22:17
@eloquence eloquence mentioned this pull request Jun 26, 2020
15 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable user to apply available updates without additional prompts
5 participants