nixosTests.keymap.qwertz fails on aarch64-linux #147294

Closed
vcunat opened this issue Nov 24, 2021 · 15 comments · Fixed by #147715 or #148491

Comments

@vcunat
Member

vcunat commented Nov 24, 2021

In combination with 21.11 adding it to the tested job, this blocks the nixos-21.11 channel.

The keymap tests don't have any maintainers. That's quite an issue for channel-blocking jobs, I think.

@vcunat
Member Author

vcunat commented Nov 26, 2021

Well, we could remove keymap.qwertz.aarch64-linux from the blocking set, at least for now. It certainly doesn't feel so critical to me. (Though the qwertz layout is actually used by the vast majority in my country.)

@erictapen
Member

In #144106 (comment) it was discussed whether Linux 5.15 could be the cause, but the way I see it, the test is running with Linux 5.10? 5.15 seems to be used only for the stdenv.

@rnhmjoj
Contributor

rnhmjoj commented Nov 26, 2021

> Well, we could remove keymap.qwertz.aarch64-linux from the blocking set, at least for now. It certainly doesn't feel so critical to me. (Though the qwertz layout is actually used by the vast majority in my country.)

Yes, it is at least surprising that a large part of the channel tests is about keyboard layouts.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-2-4-and-what-s-next/16257/11

@nrdxp

nrdxp commented Nov 27, 2021

If the keymap tests have no maintainers, perhaps they should be removed, or at least disabled for now.

@Artturin
Member

Artturin commented Nov 27, 2021

It looks like it's random which symbol fails:

https://hydra.nixos.org/build/159750551/nixlog/8

machine # [   33.872518] root[1121]: testReader: FAIL: Expected '@|{[]}' but got 'q|{[]}'.

https://hydra.nixos.org/build/159733263/nixlog/17

machine # [   26.037142] root[1041]: testReader: FAIL: Expected '@|{[]}' but got '@|{[]0'.

azerty sometimes fails too:
https://hydra.nixos.org/job/nixos/trunk-combined/nixos.tests.keymap.azerty.aarch64-linux
https://hydra.nixos.org/build/159749753/nixlog/8

machine # [   30.188818] root[1143]: testReader: FAIL: Expected 'az' but got 'æ«'.
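
To see at a glance which symbol went wrong in each of the failures above, the expected and received strings can be compared position by position. This is only a small illustrative helper (the name `mismatches` is made up here and is not part of the NixOS test driver):

```python
from itertools import zip_longest
from typing import List, Optional, Tuple


def mismatches(expected: str, got: str) -> List[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, expected_char, got_char) for every position that differs.

    zip_longest pads the shorter string with None, so length differences
    also show up as mismatches.
    """
    return [
        (i, e, g)
        for i, (e, g) in enumerate(zip_longest(expected, got))
        if e != g
    ]


# The three expected/got pairs from the Hydra logs quoted above.
failures = [("@|{[]}", "q|{[]}"), ("@|{[]}", "@|{[]0"), ("az", "æ«")]
for expected, got in failures:
    print(repr(expected), repr(got), mismatches(expected, got))
```

For the two qwertz logs this reports a single wrong character at a different position each time, while in the azerty log both characters differ.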

@rnhmjoj
Contributor

rnhmjoj commented Nov 27, 2021

This test has been flaky since forever and does not provide much value.
I propose to simply remove it from the tested job.

@tomberek
Contributor

tomberek commented Nov 27, 2021

I'm thinking this is a race condition with sending keys too fast, maybe made worse when a builder is under high load. Something like this perhaps?

diff --git a/nixos/lib/test-driver/test-driver.py b/nixos/lib/test-driver/test-driver.py
index 643446f313e..adacca47abe 100755
--- a/nixos/lib/test-driver/test-driver.py
+++ b/nixos/lib/test-driver/test-driver.py
@@ -904,6 +904,7 @@ class Machine:
     def send_key(self, key: str) -> None:
         key = CHAR_TO_KEY.get(key, key)
         self.send_monitor_command("sendkey {}".format(key))
+        time.sleep(0.01)

     def start(self) -> None:
         if self.booted:
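
The idea behind the patch can be tried outside the test driver with a minimal, self-contained sketch. It assumes a hypothetical `send_chars` helper that takes any `send_key` callable (here a plain list's `append` stands in for `Machine.send_key`) and pauses after each key; it is not the driver's actual code:

```python
import time
from typing import Callable, List


def send_chars(send_key: Callable[[str], None], chars: str, delay: float = 0.01) -> None:
    """Send each character through send_key, sleeping `delay` seconds after each one."""
    for char in chars:
        send_key(char)
        # Give the receiver time to process the key before the next one arrives.
        time.sleep(delay)


# Stand-in for Machine.send_key: just record what was sent.
sent: List[str] = []
start = time.monotonic()
send_chars(sent.append, "@|{[]}", delay=0.01)
elapsed = time.monotonic() - start
print(sent, f"{elapsed:.3f}s")
```

Making the delay a parameter rather than a hard-coded constant would let a slow or heavily loaded builder use a larger value without changing the default.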

@Artturin
Member

Artturin commented Nov 27, 2021

> I'm thinking this is a race condition with sending keys too fast, maybe made worse when a builder is under high load. Something like this perhaps?
>
> diff --git a/nixos/lib/test-driver/test-driver.py b/nixos/lib/test-driver/test-driver.py
> index 643446f313e..adacca47abe 100755
> --- a/nixos/lib/test-driver/test-driver.py
> +++ b/nixos/lib/test-driver/test-driver.py
> @@ -904,6 +904,7 @@ class Machine:
>      def send_key(self, key: str) -> None:
>          key = CHAR_TO_KEY.get(key, key)
>          self.send_monitor_command("sendkey {}".format(key))
> +        time.sleep(0.01)
>
>      def start(self) -> None:
>          if self.booted:

EDIT: It takes way too long to run the test without KVM, so I didn't.
I'll run the test with and without that, and with stress running to mimic high load.

nix build ".#nixosTests.keymap.qwertz" --system aarch64-linux

I had to apply this patch to run the test VM because KVM was failing, but it's very, very slow now.

diff --git a/nixos/lib/qemu-common.nix b/nixos/lib/qemu-common.nix
index 1a1f7531feb..26584bc0f54 100644
--- a/nixos/lib/qemu-common.nix
+++ b/nixos/lib/qemu-common.nix
@@ -24,7 +24,7 @@ rec {
   qemuBinary = qemuPkg: {
     x86_64-linux = "${qemuPkg}/bin/qemu-kvm -cpu max";
     armv7l-linux = "${qemuPkg}/bin/qemu-system-arm -enable-kvm -machine virt -cpu host";
-    aarch64-linux = "${qemuPkg}/bin/qemu-system-aarch64 -enable-kvm -machine virt,gic-version=host -cpu host";
+    aarch64-linux = "${qemuPkg}/bin/qemu-system-aarch64  -machine virt, -cpu max";
     powerpc64le-linux = "${qemuPkg}/bin/qemu-system-ppc64 -machine powernv";
     powerpc64-linux = "${qemuPkg}/bin/qemu-system-ppc64 -machine powernv";
     x86_64-darwin = "${qemuPkg}/bin/qemu-kvm -cpu max";
vm-test-run-keymap-qwertz> Machine state will be reset. To keep it, pass --keep-vm-state
vm-test-run-keymap-qwertz> start all VLans
vm-test-run-keymap-qwertz> start vlan
vm-test-run-keymap-qwertz> running vlan (pid 14)
vm-test-run-keymap-qwertz> (0.03 seconds)
vm-test-run-keymap-qwertz> run the VM test script
vm-test-run-keymap-qwertz> additionally exposed symbols:
vm-test-run-keymap-qwertz>     machine,
vm-test-run-keymap-qwertz>     vlan1,
vm-test-run-keymap-qwertz>     start_all, test_script, machines, vlans, driver, log, os, create_machine, subtest, run_tests, join_all, retry, serial_stdout_off, serial_stdout_on, Machine
vm-test-run-keymap-qwertz> machine: waiting for the X11 server
vm-test-run-keymap-qwertz> machine: waiting for the VM to finish booting
vm-test-run-keymap-qwertz> machine: starting vm
vm-test-run-keymap-qwertz> machine # Formatting '/build/vm-state-machine/machine.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=1073741824 lazy_refcounts=off refcount_bits=16
vm-test-run-keymap-qwertz> machine # kvm version too old
vm-test-run-keymap-qwertz> machine # qemu-system-aarch64: failed to initialize kvm: Function not implemented
vm-test-run-keymap-qwertz> machine: QEMU running (pid 17)
vm-test-run-keymap-qwertz> machine: connected to guest root shell
vm-test-run-keymap-qwertz> machine: (connecting took 0.00 seconds)
vm-test-run-keymap-qwertz> (0.54 seconds)
vm-test-run-keymap-qwertz> cleanup
vm-test-run-keymap-qwertz> kill machine (pid 17)
vm-test-run-keymap-qwertz> (0.00 seconds)
vm-test-run-keymap-qwertz> Traceback (most recent call last):
vm-test-run-keymap-qwertz>   File "/nix/store/kzvmq57993jb0hyx5zn29c623bk7x3hd-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 1336, in <module>
vm-test-run-keymap-qwertz>     driver.run_tests()
vm-test-run-keymap-qwertz>   File "/nix/store/kzvmq57993jb0hyx5zn29c623bk7x3hd-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 1212, in run_tests
vm-test-run-keymap-qwertz>     self.test_script()
vm-test-run-keymap-qwertz>   File "/nix/store/kzvmq57993jb0hyx5zn29c623bk7x3hd-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 1208, in test_script
vm-test-run-keymap-qwertz>     exec(self.tests, symbols, None)
vm-test-run-keymap-qwertz>   File "<string>", line 49, in <module>
vm-test-run-keymap-qwertz>   File "/nix/store/kzvmq57993jb0hyx5zn29c623bk7x3hd-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 997, in wait_for_x
vm-test-run-keymap-qwertz>     retry(check_x)
vm-test-run-keymap-qwertz>   File "/nix/store/kzvmq57993jb0hyx5zn29c623bk7x3hd-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 192, in retry
vm-test-run-keymap-qwertz>     if fn(False):
vm-test-run-keymap-qwertz>   File "/nix/store/kzvmq57993jb0hyx5zn29c623bk7x3hd-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 990, in check_x
vm-test-run-keymap-qwertz>     status, _ = self.execute(cmd)
vm-test-run-keymap-qwertz>   File "/nix/store/kzvmq57993jb0hyx5zn29c623bk7x3hd-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 605, in execute
vm-test-run-keymap-qwertz>     self.shell.send(out_command.encode())
vm-test-run-keymap-qwertz> BrokenPipeError: [Errno 32] Broken pipe

@vcunat
Member Author

vcunat commented Nov 28, 2021

So, at least the simple step for now? #147715

sternenseemann removed the "1.severity: channel blocker" label Nov 28, 2021
@lheckemann
Member

@tomberek the sleep patch does indeed fix the issue, at least when building on my Honeycomb. I do think it's odd though that the test fails in exactly the same way every time I run it. I'd expect it to fail less consistently, or in different ways…

@Artturin
Member

Artturin commented Dec 2, 2021

The azerty test sometimes fails too: https://hydra.nixos.org/job/nixos/release-21.11/nixos.tests.keymap.azerty.aarch64-linux

@lheckemann
Member

While it's far from elegant, I'd be fine with adding the sleep patch. Since it's only 10ms per character, I don't think the performance impact will be significant, and I'd prefer to start verifying the functionality again (major use case: encryption passphrase input, where the results aren't even visible).
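
As a rough back-of-the-envelope check of the "10ms per character" cost, the added time is simply the number of characters times the delay. The input lengths below are hypothetical, apart from the 6-character string seen in the test logs:

```python
def added_seconds(num_chars: int, delay_s: float = 0.01) -> float:
    """Extra wall-clock time from sleeping delay_s after each of num_chars keys."""
    return num_chars * delay_s


# 6 characters as in the '@|{[]}' test string; 32 and 1000 are made-up
# lengths for a passphrase and a long typed script.
for n in (6, 32, 1000):
    print(f"{n:>5} chars -> +{added_seconds(n):.2f} s")
```

Even a thousand typed characters would add only about ten seconds at the 10 ms setting.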

Artturin added a commit to Artturin/nixpkgs that referenced this issue Dec 3, 2021
@Artturin
Member

Artturin commented Dec 3, 2021

#148491

github-actions bot pushed a commit that referenced this issue Dec 4, 2021
attempt to fix #147294

(cherry picked from commit 60422ba)
@Artturin
Member

Artturin commented Dec 9, 2021

> I'm thinking this is a race condition with sending keys too fast, maybe made worse when a builder is under high load. Something like this perhaps?
>
> diff --git a/nixos/lib/test-driver/test-driver.py b/nixos/lib/test-driver/test-driver.py
> index 643446f313e..adacca47abe 100755
> --- a/nixos/lib/test-driver/test-driver.py
> +++ b/nixos/lib/test-driver/test-driver.py
> @@ -904,6 +904,7 @@ class Machine:
>      def send_key(self, key: str) -> None:
>          key = CHAR_TO_KEY.get(key, key)
>          self.send_monitor_command("sendkey {}".format(key))
> +        time.sleep(0.01)
>
>      def start(self) -> None:
>          if self.booted:

I've run some tests on my rpi4b and found that applying the patch below fixes the issue even under load and with no delay (I still think we should leave the delay):

diff --git a/nixos/modules/virtualisation/qemu-vm.nix b/nixos/modules/virtualisation/qemu-vm.nix
index c7c3d747464..fa3e25afb03 100644
--- a/nixos/modules/virtualisation/qemu-vm.nix
+++ b/nixos/modules/virtualisation/qemu-vm.nix
@@ -835,6 +835,7 @@ in

     # FIXME: Consolidate this one day.
     virtualisation.qemu.options = mkMerge [
+      [ "-device virtio-keyboard" ]
       (mkIf pkgs.stdenv.hostPlatform.isx86 [
         "-usb" "-device usb-tablet,bus=usb-bus.0"
       ])
With only the sleep patch, a delay of 0.02 s sometimes works if there is no other load.
At least 0.04 s is needed for the test to sometimes pass while I run `stress -c 4`.
0.1 s is safe.

According to this [paper](https://userinterfaces.aalto.fi/136Mkeystrokes/resources/chi-18-analysis.pdf),
the average human inter-key interval is about 238 ms:

INTER-KEY INTERVALS Average inter-key interval is 238.656 ms
(SD = 111.6). A lower bound of about 60 ms can be observed.
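
The delay values above were found by hand. A sweep like that could be automated roughly as follows. This is a sketch under assumptions: `run_test` is a hypothetical callable that runs the keymap test once with a given delay and returns whether it passed, and `fake_run` is a made-up stand-in whose 0.1 s threshold only mirrors the "0.1 is safe" observation, not a real measurement:

```python
from typing import Callable, Iterable, Optional


def smallest_reliable_delay(
    run_test: Callable[[float], bool],
    delays: Iterable[float],
    repeats: int = 5,
) -> Optional[float]:
    """Return the smallest delay for which run_test passes `repeats` times in a row.

    A flaky test can pass by luck, so each candidate is tried several times;
    all() stops at the first failure for that candidate.
    """
    for delay in sorted(delays):
        if all(run_test(delay) for _ in range(repeats)):
            return delay
    return None


def fake_run(delay: float) -> bool:
    # Stand-in for a real test run: pretend keys get lost below 0.1 s.
    return delay >= 0.1


print(smallest_reliable_delay(fake_run, [0.01, 0.02, 0.04, 0.1]))
```

A real `run_test` would need to run under the same load (for example with `stress -c 4` in the background) for the result to be comparable with the numbers above.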

Artturin added a commit to Artturin/nixpkgs that referenced this issue Dec 9, 2021
By default a PS/2 keyboard input is used, which seems to cause issues
on aarch64-linux when the machine is under high load, causing the keymap
qwertz test to always fail and azerty to sometimes fail.
See NixOS#147294
github-actions bot pushed a commit that referenced this issue Dec 10, 2021
By default a PS/2 keyboard input is used, which seems to cause issues
on aarch64-linux when the machine is under high load, causing the keymap
qwertz test to always fail and azerty to sometimes fail.
See #147294

(cherry picked from commit 39c5525)
Artturin added a commit that referenced this issue Dec 19, 2021
By default a PS/2 keyboard input is used, which seems to cause issues
on aarch64-linux when the machine is under high load, causing the keymap
qwertz test to always fail and azerty to sometimes fail.
See #147294

(cherry picked from commit 39c5525)
9 participants