2.7.7 - Database unlock hangs in virtual machine with 1 CPU #10391

Closed
pican79 opened this issue Mar 11, 2024 · 37 comments · Fixed by #11155

Comments

@pican79

pican79 commented Mar 11, 2024

Overview

On Xubuntu 20.04, I updated from version 2.7.6 to 2.7.7 using the "phoerious PPA". Now, my database doesn't open anymore.

Steps to Reproduce

  1. Launch KeepassXC
  2. Enter Database password
  3. Click Unlock button

Expected Behavior

Database opens and I can use KeepassXC like before.

Actual Behavior

KeepassXC hangs indefinitely on "Unlock Database" screen: endless spinning wheel, password+key file fields greyed out, inactive close & unlock buttons

Context

Tried uninstalling & reinstalling.
Can not downgrade to 2.7.6 as it's no longer available in the PPA for focal.

KeePassXC - 2.7.7
Revision: 68e2dd8

Operating System: Linux
Desktop Env: XFCE
Windowing System: X11

@pican79 pican79 added the bug label Mar 11, 2024
@droidmonkey
Member

droidmonkey commented Mar 11, 2024

Grab the 2.7.6 appimage from here: https://github.com/keepassxreboot/keepassxc/releases/download/2.7.6/KeePassXC-2.7.6-x86_64.AppImage

Unfortunately, we can't possibly fix your specific problem without more information or a debug run.

@pican79
Author

pican79 commented Mar 12, 2024

How can I provide you additional info?
Does KeepassXC generate logs somewhere? How can I run KeepassXC in debug mode?

@droidmonkey
Member

droidmonkey commented Mar 12, 2024

Without providing us your database (don't do that) it is nearly impossible to diagnose the issue. Did the 2.7.6 appimage solve your issue? Can you try the 2.7.7 appimage or flatpak?

@pican79
Author

pican79 commented Mar 12, 2024

Something really weird is going on:

  • I uninstalled 2.7.7 from the ppa again
  • I installed the version from the focal repo (2.4.3+dfsg.1-1build1): I was able to open my database without any issue
  • I uninstalled 2.4.3 and its dependencies, then reinstalled 2.7.7: same issue, i.e. endless spinning wheel on the unlock database screen
  • I clicked the Settings button then clicked the Cancel button: my database was unlocked.

I have to do that last step every time I launch KeepassXC 2.7.7.

I haven't tried the flatpak or appimage yet as I don't like alternative package formats.

@droidmonkey
Member

Focal is a very old distro. This could very well be a library incompatibility problem, which is why I suggest a packaged deployment instead of ppa or native install.

@pican79
Author

pican79 commented Mar 12, 2024

I just tried the 2.7.7 AppImage and got the same behavior as with the "2.7.7 ppa".

2.7.6 AppImage worked OK. I saw these messages in the console though:

OpenType support missing for "Saab", script 13
qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow), sequence: 999, resource id: 18967490, major code: 40 (TranslateCoords), minor code: 0

@michaelk83

Focal is LTS, still under official support. If some specific libraries are too old in the Focal repositories, the PPA should provide updated versions of those libraries.

@droidmonkey
Member

I'm just theorizing reasons for the behavior.

@michaelk83

michaelk83 commented Mar 13, 2024

> How can I provide you additional info?

You can try running a snapshot build with a debugger attached, and post the stack trace that you get when the hang occurs. And just in case, which version of Argon2 and Qt do you have installed?

@pican79
Author

pican79 commented Mar 13, 2024

Qt version is 5.12.8 (according to apt list --installed | grep libqt) and Argon2 version is 0~20171227-0.2 (according to apt list --installed | grep argon2).
Let me know if I should have used other commands to determine that info.

I tried the latest snapshot build from 2024/03/10 with gdb, but the log only contained lines such as:
[New LWP random_number]
or
[LWP random_number exited]

During my tests, I found a new way to exit the "hang state": minimize the window, then "unminimize" it.

@droidmonkey
Member

This seems to be a user interface refresh issue and not a true hang. I tried to replicate on xfce and couldn't at all.

@pican79
Author

pican79 commented Mar 13, 2024

In case it's relevant, my Xubuntu system is a Virtualbox (7.0.14) VM running on a Windows 10 host.

@droidmonkey
Member

So is mine 😅

@norbertj123

I encountered the same issue (hangs forever after password entry) in a VM with Fedora 39 Xfce Spin installed. The issue disappeared when I increased the number of CPUs from 1 to 2. That's very strange as I have successfully used VMs with only 1 CPU for years.

@pican79
Author

pican79 commented Mar 16, 2024

> I encountered the same issue (hangs forever after password entry) in a VM with Fedora 39 Xfce Spin installed. The issue disappeared when I increased the number of CPUs from 1 to 2. That's very strange as I have successfully used VMs with only 1 CPU for years.

Thanks for the tip. After increasing the number of CPUs to 2, the "hang issue" no longer occurred.
I never had problems before with only 1 CPU either.

When using 1 CPU, I also noticed a "side effect" with 2.7.7. When clicking the close button, it didn't close KeepassXC like before.
It simply closed the database. I had to click the close button again to actually close the app.

@the-wolfman

the-wolfman commented Mar 16, 2024

@norbertj123 Very good catch.

In my setup (issue #10425) I did not have any systems with 1 VCPU, but increasing the number of VCPUs from 2 to 3 or higher solved it for me. I thought at first it could be related to database encryption settings, specifically "threads", but I strongly believe that the issue is happening earlier. Reason is I encounter the issue specifically when securing the database with a yubikey and it is not even getting to the challenge-response.

Further testing shows

  • I am having all kinds of issues when running with 1 VCPU
  • I can open databases, but not one secured with a yubikey running with 2 VCPUs (worked earlier)
  • Everything working fine - as far as I tested - running with 3 VCPUs or more

My take at the moment: This is a UI related race condition relying on threads to be available depending on features used. In my case, checking the yubikey and displaying the "please touch ribbon" starts a thread, or tries to, which won't come back as expected.

I believe issues #10391 and #10425 are closely related or even duplicates.

@pican79
Author

pican79 commented Mar 16, 2024

I did a test in a 22.04 Xubuntu VM with only 1 CPU.

With 2.7.7 (ppa or appimage), KeepassXC completely freezes after selecting a database and clicking "I have a key file". I have no choice but to kill the app.
It doesn't happen with the 2.7.6 appimage.

With 2 CPUs, 2.7.7 works OK.

@droidmonkey
Member

Excellent, will debug this one next.

@droidmonkey droidmonkey added this to the v2.7.8 milestone Mar 16, 2024
@droidmonkey droidmonkey self-assigned this Mar 16, 2024
@droidmonkey droidmonkey changed the title from "Can no longer open database after 2.7.7 update" to "2.7.7 - Database unlock hangs in virtual machine with 1 CPU" Mar 16, 2024
@fugtui

fugtui commented Mar 17, 2024

> * I can open databases, but not one secured with a yubikey running with 2 VCPUs (worked earlier)
> * Everything working fine - as far as I tested - running with 3 VCPUs or more

Can confirm the behavior for Hyper-V with 2 vCPUs + YubiKey on Ubuntu 20.04. With 3 vCPUs, the challenge-response and everything else works as expected. Thanks @norbertj123 for the workaround.

@droidmonkey
Member

droidmonkey commented Mar 17, 2024

I find that minimizing and restoring while locked up ends up showing the unlocked database. I can get it to lock up without a yubikey, so that doesn't seem to be relevant.

@the-wolfman

the-wolfman commented Mar 19, 2024

I agree, the yubikey challenge-response is not the reason for the hang but one possible trigger. This may be important. @droidmonkey says it's stuck at building the transformed key. The response from the yubikey should be an integral part of that build process. In my case, the "touch the yubikey" ribbon does not even appear, the challenge is never sent, there is no valid response other than maybe an empty one. Should the Argon build process even run at this point in time? Shouldn't it be waiting for the "second factor" first, whatever that may be, a key file, a security key response, both of them ... and then start building? Does it try to start a "thread per factor" but doesn't have enough resources in form of VCPUs to get them all aligned?

@droidmonkey
Member

Can you test changing your database encryption to AES KDF (database -> database security -> encryption tab) and see if it still hangs?

@the-wolfman

Yes, same behavior. I tried with my yubikey and 2 VCPU setup mentioned earlier. I was able to change encryption, store the db with yubikey chall-resp. When trying to open it again, it hangs. As before I can overcome it by clicking on another database tab which will then trigger the blue ribbon.

@droidmonkey
Member

Ok, well, this is basically a ui problem, but not a ui hang since you can still interact with the app. Everything seems to function properly in the backend. Key transforms happen on encryption settings change and saving.

My only thought left is that something is preventing the event loop from receiving the finished signal and kicking over the ui in some way "releases it". Very strange.

@the-wolfman

Don't want to be finicky, but this may be a hint: If I have only one database showing at the unlock dialog in my setup, the UI hangs. It doesn't do anything, with the only exception of minimizing/maximizing the window, which results in redraw issues of all sorts. I can only kill keepassxc in that case. If I have 2 database tabs, e.g. one of them open without yubikey interaction, then I can select another tab and get things going again.

@the-wolfman

the-wolfman commented Mar 28, 2024

I just came across this comment in #9251 and similar notes on refactoring and GUI/core separation. Sounds to me this issue may influence the refactoring or even be solved by it:
" ... need to abstract the actual opening of a database and key material handling away from the open widget itself. This would include the dance with yubikey code since it is async. We should be able to just bypass the entire widget operation if given key material, basically render it disabled while processing. Then if unlock fails just reset the widget." (#9251 (comment))

@z1atk0

z1atk0 commented Apr 7, 2024

Same problem & symptoms here as the OP. The strange workaround (Settings => Cancel during the unlock hang) also works here, but then a "hard hang" occurs on quitting the application with Ctrl+Q. In that state keepassxc then needs to be kill -KILLed from the command line.

I'm on Slackware64-15.0 on an Intel Celeron 743, which only has one core, and no hyperthreading (being a Celeron and all 🙂). Both AlienBob's SlackPkg and the official AppImage for 2.7.7 show exactly the same behaviour, and both versions/releases of 2.7.6 work just fine.

@xianwenchen

I have the same problem, symptoms, and workarounds as z1atk0 on a Void Linux i686 system with latest packages from Void Linux official repositories.

If I type the password and unlock the database, the UI shows KeePassXC is busy. The CPU usage is almost none. If I then click Settings, the UI is no longer busy. If I then click Cancel, I can use KeePassXC normally.

If I open KeePassXC, do not type anything, and only move the mouse cursor around, the UI becomes busy as well.

If I type the password, unlock the database, and then click anything that is not KeePassXC, KeePassXC freezes. I then have to kill the process.

@droidmonkey droidmonkey modified the milestones: v2.7.8, v2.8.0 May 5, 2024
@z1atk0

z1atk0 commented May 9, 2024

Just for the record, the newly released 2.7.8 still has the same problem. That's probably to be expected with the milestone of this issue set to 2.8.0, but I thought I'd mention it nevertheless, for good measure. 😉

@droidmonkey
Member

We couldn't find a cause, which is rather disheartening.

@ClaraCrazy

+1 Also having this issue. My system runs Qubes OS R4.2.1, and this vault VM is on the stock fedora 39 template.

Giving the VM a second vCPU did indeed fix the issue, so that's good, but I'd love to help diagnose this further. I have to admit I'm short on time lately, but if there's any specific info that might help you guys, or if you have patches that need testing (since some seem to be unable to replicate this), please let me know.

@droidmonkey
Member

I finally fixed my fedora vm so will give this another go in a proper debugger to see what is happening here.

@c4rlo
Contributor

c4rlo commented Jun 16, 2024

FWIW, I wasn't able to reproduce this using

$ taskset --all-tasks --cpu-list 0 keepassxc

on either of my two multi-core machines I just tested this on. The above is meant to pin the process to a single CPU. I've tried pinning it to a few different CPUs; database unlocking always works as normal.

This is with KeePassXC 2.7.8 on Wayland (sway) on Arch Linux.
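
For anyone repeating this test, here is a minimal Python sketch (Linux only, not part of KeePassXC) of what pinning does and does not change. It pins the current process to one CPU, much like the taskset call above, and then compares the number of CPUs in the system with the number the process is allowed to run on. The function name `cpu_counts_when_pinned` is made up for illustration. The assumption being probed, which this thread does not confirm, is that a program sizing its thread pool from the system-wide count would still get more than one worker under taskset, unlike in a VM that really has one CPU.

```python
import os


def cpu_counts_when_pinned() -> dict:
    """Pin this process to one allowed CPU and report what the OS then says."""
    if not hasattr(os, "sched_setaffinity"):
        raise RuntimeError("CPU affinity calls are not available on this platform")
    original = os.sched_getaffinity(0)  # CPUs this process may currently use
    try:
        # Restrict the process to a single CPU it was already allowed to use.
        os.sched_setaffinity(0, {min(original)})
        return {
            "system_cpus": os.cpu_count(),                 # CPUs in the system
            "allowed_cpus": len(os.sched_getaffinity(0)),  # CPUs we may run on
        }
    finally:
        # Restore the original affinity so the caller is not left pinned.
        os.sched_setaffinity(0, original)


if __name__ == "__main__":
    print(cpu_counts_when_pinned())
```

On a multi-core machine this typically reports several system CPUs but exactly one allowed CPU, which would be one possible reason a pinned run behaves differently from a single-CPU VM.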

@d-brasher

d-brasher commented Aug 11, 2024

The first commit where I can reproduce the issue seems to be f20b531

Moreover, when compiling without -DWITH_XC_YUBIKEY, the issue does not appear, both at f20b531 and at the tagged 2.7.9 release.

@droidmonkey
Member

OK, so the hangup occurs with the introduction of the concurrent function call that handles libusb_handle_events_completed interaction. The sequence is roughly:

  1. DB is loaded and initiates the hardware key poll which kicks off the loop that handles the above
  2. Unlocking the database initiates another loop to derive the master key
  3. The master key derivation never starts, basically it appears that the app is stuck processing the first loop
  4. When you minimize keepassxc that causes some sort of "flush" to the loops and the key derivation starts and finishes without issue

Basically this happens because the thread handling the usb events never exits/completes until it is forced to when the window is minimized or you switch database tabs. Since there is only 1 processor, Qt only allocates 1 thread to the global pool and the second thread never starts. Bumping up the thread pool to 2 fixes this problem, but we should still investigate why the libusb function never does what we expect. @phoerious
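
The sequence described above can be imitated with a small Python sketch. It is not KeePassXC or Qt code: `ThreadPoolExecutor` stands in for Qt's global thread pool, and `poll_hardware_keys`, `derive_master_key` and `unlock_with_pool` are hypothetical names. With one worker, the long-running poll task occupies the pool and the queued key derivation cannot start; with two workers it runs immediately.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def unlock_with_pool(pool_size: int, wait_seconds: float = 0.5) -> bool:
    """Return True if key derivation finishes while the poller is still running."""
    stop_polling = threading.Event()

    def poll_hardware_keys() -> None:
        # Stand-in for the USB event loop: it holds its worker thread
        # until something outside tells it to stop.
        stop_polling.wait()

    def derive_master_key() -> str:
        # Stand-in for the key transform; trivially fast in this sketch.
        return "derived"

    pool = ThreadPoolExecutor(max_workers=pool_size)
    try:
        pool.submit(poll_hardware_keys)          # step 1: hardware key poll starts
        unlock = pool.submit(derive_master_key)  # step 2: unlock is requested
        try:
            unlock.result(timeout=wait_seconds)
            return True
        except FutureTimeout:
            return False                         # step 3: derivation never started
    finally:
        stop_polling.set()                       # step 4: the "flush" frees the worker
        pool.shutdown(wait=True)


if __name__ == "__main__":
    print("pool of 1 unlocks:", unlock_with_pool(1))
    print("pool of 2 unlocks:", unlock_with_pool(2))
```

`unlock_with_pool(1)` returns `False` and `unlock_with_pool(2)` returns `True`, which mirrors the observation that a pool size of 2 avoids the hang. It does not explain why the USB event loop fails to return on its own.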

@droidmonkey droidmonkey modified the milestones: v2.8.0, v2.7.10 Aug 11, 2024
pbsds pushed a commit to NixOS/nixpkgs that referenced this issue Nov 8, 2024
See keepassxreboot/keepassxc#10391

An alternative would be to set

    virtualisation.cores = 2;

in the keepassxc nixos test.
TheRedstoneDEV-DE pushed a commit to TheRedstoneDEV-DE/nixpkgs that referenced this issue Nov 9, 2024
See keepassxreboot/keepassxc#10391

An alternative would be to set

    virtualisation.cores = 2;

in the keepassxc nixos test.
yadokani389 pushed a commit to yadokani389/nixpkgs that referenced this issue Nov 10, 2024
See keepassxreboot/keepassxc#10391

An alternative would be to set

    virtualisation.cores = 2;

in the keepassxc nixos test.