Add SecureDrop installation test #25

Merged 3 commits on Nov 20, 2024

deeplow (Contributor) commented Sep 11, 2024

First attempt at adding a test for SecureDrop.

assert_and_click("menu-vm-xterm");


assert_script_run('gpg --keyserver hkps://keys.openpgp.org --recv-key "2359 E653 8C06 13E6 5295 5E6C 188E DD3B 7B22 E6A3"');
Member

assert_script_run depends on seeing serial console output, and the serial console of the "work" VM isn't directly connected to the host's. For this to work you either need to run something like tail -F /var/log/xen/console/guest-work.log >> /dev/hvc0 in dom0 (we do that here), or do all of that from dom0's terminal via qvm-run.

Contributor Author

I see. Would type_string and then "ret" work as well? I'm trying not to deviate too much from the original instructions so it's easy to update.

Member

Yes, that would work, but your test wouldn't detect if any of those commands fails (other than possibly some later step in dom0 failing).

Contributor Author

Fair point 😔. I'll just go ahead and use qvm-run, then.
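
The qvm-run route discussed above might look roughly like the sketch below. It only builds the dom0 command string; in an actual test the result would be handed to assert_script_run on the dom0 console. The helper name in_qube and the qube name "work" in the usage line are illustrative, and the sketch assumes qvm-run's --pass-io option so that the command's output and exit status come back to dom0.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the test suite): wrap a shell command so
# that dom0 runs it inside a qube. Single quotes in the command are escaped
# so the inner command survives dom0's shell intact.
sub in_qube {
    my ($qube, $cmd) = @_;
    (my $quoted = $cmd) =~ s/'/'\\''/g;
    return "qvm-run --pass-io $qube '$quoted'";
}

# Print the dom0 command line for a command that itself contains quotes.
print in_qube('work', "echo 'it works'"), "\n";
```

Because the wrapped command runs from dom0's terminal, a failure inside the qube shows up as a failed step instead of being silently typed past.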

deeplow force-pushed the add-securedrop-test branch 2 times, most recently from 3502fd7 to 2008e3f (September 12, 2024 10:54)

marmarek (Member) commented

Hint: add send_key('alt-f10') to see more output at once in xterm. Not relevant much when everything goes right, but helps quite a bit when debugging.

deeplow force-pushed the add-securedrop-test branch 6 times, most recently from 5c8b79c to 3a2149a (September 12, 2024 12:39)

deeplow (Contributor Author) commented Sep 12, 2024

> Hint: add send_key('alt-f10') to see more output at once in xterm. Not relevant much when everything goes right, but helps quite a bit when debugging.

Thanks for the tip. I had seen that in some places and was wondering about its purpose. I'll add it in the next round.

deeplow force-pushed the add-securedrop-test branch from 3a2149a to deebce7 (September 12, 2024 17:29)
assert_script_run('curl https://raw.githubusercontent.com/freedomofpress/securedrop/d91dc67/securedrop/tests/files/test_journalist_key.sec.no_passphrase | sudo tee /usr/share/securedrop-workstation-dom0-config/sd-journalist.sec');
assert_script_run('sdw-admin --validate');

assert_script_run('xfce4-power-manager -q'); # disable screen blanking during long command
Contributor Author

@marmarek there's a command which takes quite a while, and in the meantime the screen blanks. I don't think it's xscreensaver because I think that's killed at the beginning of the test. Then I tried to disable XFCE's power management, but that didn't help.

Have you encountered this before?

Member

My notes have this line:

x11_start_program('env xset s off', valid => 0);

but I'm not sure if that was enough either.

Contributor Author

Thanks! I had to combine it with env xset -dpms for this to fully work.

And FYI I noticed that with just env xset s off it still blanked for much of the slow command (sdw-admin --apply), but oddly enough the screen came back just for the logs upload command (video). No idea what went on there.

Member

It unblanked on the key press.

Contributor Author

Oh! I totally forgot that it was literally typing each letter. That's why, then.

deeplow (Contributor Author) commented Oct 11, 2024

I recall that the above options were still not working perfectly (the screen was still blanking at some point). What seems to have solved it is enabling presentation mode. I haven't looked at what it's doing under the hood, but it seems to work. And because the setting is persistent, I think all the xscreensaver exits shouldn't be needed anymore.

marmarek (Member) commented

Anyway:

# Test died: command 'sdw-admin --apply' timed out at /usr/lib/os-autoinst/autotest.pm line 412.

So, longer timeout? This is running virtualized, so runs slower than native.

And also, I recommend collecting and uploading logs. For example wrap it with script, or use tee (see https://github.com/QubesOS/openqa-tests-qubesos/blob/main/tests/update2.pm#L110-L111 for example). You can also add a post-fail hook to collect extra info on failure: https://github.com/QubesOS/openqa-tests-qubesos/blob/main/tests/update2.pm#L168-L174
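
One caveat with the tee approach: in a plain pipeline the shell reports tee's exit status, so a failing command can look like a success. Below is a minimal sketch of a wrapper that keeps the failure visible. It assumes the command runs under bash (for set -o pipefail); the helper name with_log and the log path are made up for illustration, and this is not necessarily how the linked update2.pm does it.

```perl
use strict;
use warnings;

# Hypothetical helper: run $cmd, copy stdout and stderr to $log, and keep the
# exit status of $cmd rather than that of tee (relies on bash's pipefail).
sub with_log {
    my ($cmd, $log) = @_;
    return "set -o pipefail; $cmd 2>&1 | tee $log";
}

# Compare exit codes of a failing command with and without pipefail.
my $plain  = system('bash', '-c', 'false 2>&1 | tee /dev/null') >> 8;
my $strict = system('bash', '-c', with_log('false', '/dev/null')) >> 8;
print "without pipefail: $plain, with pipefail: $strict\n";
```

In a test, the string returned by with_log would be passed to assert_script_run together with a larger timeout, and the log file uploaded afterwards.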

deeplow force-pushed the add-securedrop-test branch 8 times, most recently from bf7f90e to fb294a2 (September 13, 2024 16:48)

deeplow (Contributor Author) commented Sep 16, 2024

> So, longer timeout? This is running virtualized, so runs slower than native.

Fair point. I have added some timeout.

Now I am running into another issue. I have created a needle through the web interface and added an assert_and_click for this step. However, when it runs, it's not even listing the needle. Do I need to add the needle's PNG and respective JSON to the commit?

marmarek (Member) commented

Have you restarted the test after adding the needle? Or did you add it via developer mode?

deeplow (Contributor Author) commented Sep 16, 2024

I thought I had restarted it afterwards. But will try again. It for sure wasn't via developer mode. Let's see if it now finds the needle.

marmarek (Member) commented

I see the issue: you haven't added the securedrop-launcher tag, it only has desktop tag (which shouldn't be there I think). I guess you added it by clicking on an earlier screenshot (you can do that too, but then you need to adjust tags manually, as the default will be about that other screenshot).

deeplow (Contributor Author) commented Sep 16, 2024

OK. Makes sense. I was afraid to create new tags. Where can I edit the needle? Or should I create a new one?

marmarek (Member) commented Sep 16, 2024

For this one I just edited it manually.
But generally create a new one, and don't be afraid of adding tags. In fact, do add more of them :) For SD-specific needles add the ENV-securedrop tag (in addition to any others).

deeplow force-pushed the add-securedrop-test branch from fb294a2 to ff78699 (September 16, 2024 18:35)
deeplow force-pushed the add-securedrop-test branch 3 times, most recently from ce75b93 to 502e47b (November 19, 2024 15:20)
next unless /Template/;
s/\|.*//;
$fname = $self->save_and_upload_log("qvm-run --no-gui -ap $_ 'rpm -qa; dpkg -l; pacman -Q; true'",
"template-$_-packages.txt", ('timeout' => 90, 'failok' => 1));
marmarek (Member) commented Nov 19, 2024

I'm not a perl expert by any means, but in other places I see a syntax like this:

Suggested change
- "template-$_-packages.txt", ('timeout' => 90, 'failok' => 1));
+ "template-$_-packages.txt", timeout => 90, failok => 1);

So, without extra quotes or parentheses.

Contributor Author

Thanks! I'll try this tomorrow. If you look at the other save_and_upload_log calls (the ones which have the timeout), they pass {timeout => 90}. The thing here is that save_and_upload_log does not have explicit parameters like timeout and failok but instead takes a kind of "python dict". I'm really tempted to just convert the function to have explicit args to make things easier, and then I think it will work as you suggest.
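
For context on the two spellings: in Perl, parentheses inside an argument list are flattened and a bare word on the left of => is quoted automatically, so ('timeout' => 90, 'failok' => 1) and timeout => 90, failok => 1 arrive at the callee identically. A hash reference such as {timeout => 90} is different: it is a single scalar argument. The sketch below uses two made-up subs, takes_list and takes_hashref, to show the difference; it does not reproduce save_and_upload_log itself.

```perl
use strict;
use warnings;

# Hypothetical sub that reads trailing key/value pairs as named arguments.
sub takes_list {
    my ($cmd, $file, %args) = @_;
    return join ',', map { $_ . '=' . $args{$_} } sort keys %args;
}

# Hypothetical sub that expects one hash reference (the "python dict" style).
sub takes_hashref {
    my ($cmd, $file, $args) = @_;
    return join ',', map { $_ . '=' . $args->{$_} } sort keys %$args;
}

# Both spellings flatten to the same argument list.
print takes_list('cmd', 'out.txt', ('timeout' => 90, 'failok' => 1)), "\n";
print takes_list('cmd', 'out.txt', timeout => 90, failok => 1), "\n";

# A hash reference is one scalar, so it only suits the second sub.
print takes_hashref('cmd', 'out.txt', {timeout => 90, failok => 1}), "\n";
```

So which call style works depends on how the callee unpacks @_, not on the quotes or parentheses at the call site.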

deeplow force-pushed the add-securedrop-test branch 4 times, most recently from 64e7e92 to bd03839 (November 19, 2024 19:16)

marmarek (Member) commented Nov 19, 2024

Test died: Can't open file "ulogs/upload_packages-dom0-packages.txt": No such file or directory at /var/lib/openqa/pool/4/openqa-tests-qubesos/tests/securedrop/upload_packages.pm line 32.

Upload failed after all...

Make it reusable, deduplicate the code (it's already in two places).
While at it, add support for failok argument and log info if it actually failed
to upload.
Diff at openqa webui doesn't show context at all. While it's possible to guess
where a given package comes from based on the version format, let's make it easier
by making it explicit.
marmarek (Member) commented

Ok, I think I got it working: #27, feel free to merge/rebase/whatever

deeplow force-pushed the add-securedrop-test branch from bd03839 to 2b22546 (November 20, 2024 10:26)
deeplow force-pushed the add-securedrop-test branch from 2b22546 to ef7fbce (November 20, 2024 12:10)

marmarek (Member) commented

FYI now that the install job worked, you can click restart on just the second job and not wait for re-install (assuming that install was okay and you don't want to change something at that stage).

marmarek (Member) commented

The failing assert_screen x11 is part of switching back to the graphical console - it expects no windows open... If SD client autostart is expected, then I suggest making a needle with x11 tag (and maybe also some "securedrop-login" or such, even if unused for now?).

deeplow (Contributor Author) commented Nov 20, 2024

> FYI now that the install job worked, you can click restart on just the second job and not wait for re-install (assuming that install was okay and you don't want to change something at that stage).

Thanks! I've been waiting for quite a bit now for the main test to actually pass, so that I can iterate on the second test. That's why it's also failing. I think it's worth dealing with the X11 needle situation in a separate PR, since the most important part is to get this first test across the finish line.

As for the order of operations, I would suggest the following:

  • merging Upload packages #27
  • then I rebase this PR on top of main
  • lastly your final review / merge of this branch

Whatever is left to polish can be done in a subsequent PR as you suggested

marmarek (Member) commented

> then I rebase this PR on top of main

you already did ;)

As for merging this PR, I'd prefer it to be green (even if the second job - the actual test - is more or less empty). For example I see the SECUREDROP_TEST variable was set to wrong value (I fixed it in settings now). If I'm correct about the X11 needle situation, it's just a matter of adding it via needle editor.

deeplow (Contributor Author) commented Nov 20, 2024

That was it - it's now passing. I added the x11 tag to a previous needle with the SecureDrop client. I don't know exactly how the needle matching works, but hopefully this doesn't break any other tests 🤞

So I'd say this is ready for merging.

deeplow (Contributor Author) commented Nov 20, 2024

> > then I rebase this PR on top of main
>
> you already did ;)

I was assuming the break I was proposing would go into #27, but if it can be added on this one, I'm fine with that as well. Then there's no need to rebase anything.

marmarek merged commit ef7fbce into QubesOS:main (Nov 20, 2024)
1 check passed
deeplow deleted the add-securedrop-test branch (November 21, 2024 09:53)
marmarek (Member) commented Jan 9, 2025

@deeplow unrelated to this PR, but still about openqa tests:
https://openqa.qubes-os.org/tests/125341#downloads guest-sd-large-bookworm-template.log:

[2025-01-09 09:19:39] [    0.796063] Run /init as init process
[2025-01-09 09:19:39] Loading, please wait...
[2025-01-09 09:19:39] Starting systemd-udevd version 252.31-1~deb12u1

[2025-01-09 09:19:39] [    1.075869] Invalid max_queues (4), will use default max: 2.
[2025-01-09 09:19:40] [    1.157320] blkfront: xvda: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2025-01-09 09:19:40] [    1.180986]  xvda: xvda1 xvda2 xvda3
[2025-01-09 09:19:40] [    1.195462] blkfront: xvdb: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2025-01-09 09:19:40] [    1.211144] blkfront: xvdc: flush diskcache: enabled; persistent grants: enabled; indirect descriptors: enabled; bounce buffer: enabled
[2025-01-09 09:19:40] Begin: Loading essential drivers ... [    1.497039] device-mapper: core: CONFIG_IMA_DISABLE_HTABLE is disabled. Duplicate IMA measurements will not be recorded in the IMA log.
[2025-01-09 09:19:40] [    1.497183] device-mapper: uevent: version 1.0.3
[2025-01-09 09:19:40] [    1.497935] device-mapper: ioctl: 4.48.0-ioctl (2023-03-01) initialised: dm-devel@redhat.com
[2025-01-09 09:19:40] done.
[2025-01-09 09:19:40] Begin: Running /scripts/init-premount ... done.
[2025-01-09 09:19:40] Begin: Mounting root file system ... Begin: Running /scripts/local-top ... Begin: Waiting for /dev/xvda* devices... ... [2025-01-09 09:19:40] done.
[2025-01-09 09:19:40] Begin: Qubes: Doing R/W setup for TemplateVM... ... [2025-01-09 09:19:40] [    1.903477]  xvdc: xvdc1 xvdc3
[2025-01-09 09:19:40] Setting up swapspace version 1, size = 1073737728 bytes
[2025-01-09 09:19:40] UUID=038db4ec-7460-494c-91b9-1eb34836714b
[2025-01-09 09:19:40] [    1.973225] Adding 1048572k swap on /dev/xvdc1.  Priority:-2 extents:1 across:1048572k SS
[2025-01-09 09:19:41] done.
[2025-01-09 09:19:41] done.
[2025-01-09 09:19:41] Begin: Running /scripts/local-premount ... done.
[2025-01-09 09:19:41] Begin: Waiting for root file system ... [2025-01-09 09:19:42] Begin: Running /scripts/local-block ... done.
[2025-01-09 09:19:43] Begin: Running /scripts/local-block ... done.
[2025-01-09 09:19:44] Begin: Running /scripts/local-block ... done.
[2025-01-09 09:19:45] Begin: Running /scripts/local-block ... done.
[2025-01-09 09:19:46] Begin: Running /scripts/local-block ... done.
[2025-01-09 09:19:47] Begin: Running /scripts/local-block ... done.
...
[2025-01-09 09:20:11] Gave up waiting for root file system device.  Common problems:

Is it something you know about, or maybe need help with? This part is normally quite reliable...

deeplow (Contributor Author) commented Jan 10, 2025

> Is it something you know about, or maybe need help with? This part is normally quite reliable...

Thanks for the prompt response, as always. I ended up not having time yesterday, but I was focusing more on first getting the dependent test right. But this issue is something we may need help with if it happens to be too unreliable and we can't figure it out. But I'll let you know.

FYI we also had sys-net failing in a previous run, but there the xen/console/ logs were less useful, from what I could find.

In any case, I'll let you know if one of these issues is adding too much unreliability. Unless you think it's unwise to take a look at these failures later, of course.

Thanks again.

marmarek (Member) commented

@deeplow I see you are hitting hard job timeout. Not sure why (it seems that install simply takes that long now? I guess Tor is having bad days...), but you can increase the timeout by setting MAX_JOB_TIME (in seconds).

deeplow (Contributor Author) commented Jan 28, 2025

Thanks! Just started a new one with a longer timeout. I'm going to eventually have to parallelize at least the server installation part. Otherwise it just takes too long.

marmarek (Member) commented Feb 1, 2025

FYI the fedora-41 update issue you've hit recently is QubesOS/qubes-issues#9744, it's fixed now.

deeplow (Contributor Author) commented Feb 3, 2025

Thank you.
