NVIDIA: Driver installation fails with error: Running %post for akmod-nvidia #286

Open
samjcarter opened this issue May 21, 2022 · 31 comments
Labels
bug Something isn't working external Issue related to external project not part of Fedora f39 Related to Fedora 39 f40 Related to Fedora 40

samjcarter commented May 21, 2022

Description

I have two workstations running Fedora 36 Silverblue. One has a 6 core, 12 thread Intel CPU, and the other a 16 core, 32 thread AMD CPU. Both use Nvidia graphics cards.

On the 16 core machine only, running rpm-ostree install akmod-nvidia xorg-x11-drv-nvidia-cuda to install the drivers fails with an error, and so does every subsequent use of rpm-ostree install with any other package. Before the attempt to install the Nvidia drivers, other packages installed with rpm-ostree install package succeed without errors.

I can work around the error (and stop it from showing) by using a short bash script to disable some of the CPU cores on the 16 core computer. I must run the script before every use of rpm-ostree install. The error never occurs when carrying out the same steps on the 6 core computer. The only other difference is that the 6 core computer has an older graphics card, and so must rpm-ostree install akmod-nvidia-470xx xorg-x11-drv-nvidia-470xx-cuda instead.

To Reproduce

Please describe the steps needed to reproduce the bug:

  1. Use a 16 core CPU (mine is a 16 core AMD Threadripper 1950X, GPU is a GTX 1080 Ti). Try other high core count CPUs if a 16 core CPU isn't available.
  2. Fresh install of Fedora 36 Silverblue.
  3. Install all available updates: rpm-ostree update
  4. Try installing a layered package with rpm-ostree install htop (htop as an example).
  5. Note the install should finish without issue. You will be prompted to reboot with systemctl reboot.
  6. Add the RPM Fusion repositories (detailed instructions at https://rpmfusion.org/Configuration): sudo rpm-ostree install https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm
  7. Install the Nvidia drivers: sudo rpm-ostree install akmod-nvidia xorg-x11-drv-nvidia-cuda
  8. Notice the install ends with an error message (see the Screenshots / Terminal Output section for the full terminal output and subsequent journal output of this error):
error: Running %post for akmod-nvidia: bwrap(/bin/sh): Child process killed by signal 1; run `journalctl -t 'rpm-ostree(akmod-nvidia.post)'` for more information
  9. Try disabling some CPU cores with the following script. I found the script on Fedora Discussion, in a Silverblue 31 thread: https://discussion.fedoraproject.org/t/fedora-silverblue-31-installing-nvidia-drivers-fails/14160
#!/bin/bash
# Usage: disable-threads.sh <true|false> <first_cpu> <last_cpu>
# true brings the CPUs in the range online, false takes them offline.

do_enable="$1"
cpu_from="$2"
cpu_to="$3"

flag=0
if [ "$do_enable" = true ]; then
    flag=1
elif [ "$do_enable" != false ]; then
    echo "do_enable must be true or false. Its value is $do_enable."
    exit 1
fi

# The C-style for loop is a bash feature, hence the bash shebang above.
for ((i=cpu_from; i<=cpu_to; i++)); do
    echo "$flag" > /sys/devices/system/cpu/cpu"$i"/online
done
  10. Paste the script into a shell file called disable-threads.sh and make it executable with chmod +x disable-threads.sh.
  11. Run the script from the terminal. For a 32 thread CPU, you can switch off all but 9 threads (CPUs 0 to 8) by running it with the following arguments: sudo ./disable-threads.sh false 9 31 (confirm in the System Monitor Resources tab).
  12. Repeat the attempt to install with rpm-ostree install package and it should complete without the error.
  13. Don't forget (assuming your Nvidia install was successful) to run the final command from rpmfusion, which sets the kernel arguments that blacklist nouveau and enable nvidia-drm modesetting: sudo rpm-ostree kargs --append=rd.driver.blacklist=nouveau --append=modprobe.blacklist=nouveau --append=nvidia-drm.modeset=1
  14. Reboot to complete the installation. All your threads should be active again after the reboot.
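
The workaround steps above (take a range of CPUs offline, run the install, bring the CPUs back) could also be wrapped in one helper so the CPUs come back online without a reboot. This is only a sketch under assumptions: `set_cpu_range`, `run_with_cpus_offline` and the `CPU_SYSFS_ROOT` variable are hypothetical names introduced here, not part of rpm-ostree, akmods or RPM Fusion, and it is written in plain POSIX sh rather than bash.

```sh
#!/bin/sh
# Hypothetical helpers; not part of rpm-ostree, akmods, or RPM Fusion.
# CPU_SYSFS_ROOT is an assumption that exists only so the functions can be
# tried against a scratch directory instead of the real sysfs tree.
CPU_SYSFS_ROOT="${CPU_SYSFS_ROOT:-/sys/devices/system/cpu}"

# set_cpu_range STATE FIRST LAST
# Writes STATE (0 = offline, 1 = online) to cpuFIRST..cpuLAST.
set_cpu_range() {
    case "$1" in
        0|1) ;;
        *) echo "state must be 0 or 1, got '$1'" >&2; return 1 ;;
    esac
    _cpu=$2
    while [ "$_cpu" -le "$3" ]; do
        _node="$CPU_SYSFS_ROOT/cpu$_cpu/online"
        if [ -w "$_node" ]; then
            echo "$1" > "$_node"
        else
            # Some CPUs (often cpu0) expose no writable 'online' file.
            echo "skipping cpu$_cpu: $_node is not writable" >&2
        fi
        _cpu=$((_cpu + 1))
    done
}

# run_with_cpus_offline FIRST LAST COMMAND [ARG...]
# Takes cpuFIRST..cpuLAST offline, runs COMMAND, then brings the range back
# online whether or not COMMAND succeeded. Returns COMMAND's exit status.
run_with_cpus_offline() {
    _first=$1
    _last=$2
    shift 2
    set_cpu_range 0 "$_first" "$_last" || return 1
    _status=0
    "$@" || _status=$?
    set_cpu_range 1 "$_first" "$_last"
    return "$_status"
}
```

In a root shell that has sourced this file, `run_with_cpus_offline 9 31 rpm-ostree install htop` would mirror the manual sequence described in the steps. Writing to the real sysfs files needs root, and shell functions are not visible to sudo, which is why a root shell is assumed.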

Expected behavior

No error should interrupt the nvidia driver install on the 16 core computer. All subsequent uses of rpm-ostree install package should also not fail with the same error. The behaviour of the 16 core computer should match the 6 core computer, where the error never appears.

Screenshots / Terminal Output

Fresh Fedora 36 Silverblue install with updates done

[user@fedora ~]$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                    Commit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4

Install a layered package, e.g. htop

[user@fedora ~]$ rpm-ostree install htop
Checking out tree 3d384f5... done
Enabled rpm-md repositories: fedora-cisco-openh264 fedora-modular updates-modular updates fedora phracek-PyCharm google-chrome rpmfusion-nonfree-nvidia-driver rpmfusion-nonfree-steam updates-archive
Importing rpm-md... done
rpm-md repo 'fedora-cisco-openh264' (cached); generated: 2022-04-07T16:52:38Z solvables: 4
rpm-md repo 'fedora-modular' (cached); generated: 2022-05-04T21:12:01Z solvables: 825
rpm-md repo 'updates-modular' (cached); generated: 2022-05-16T00:18:23Z solvables: 1129
rpm-md repo 'updates' (cached); generated: 2022-05-21T01:01:09Z solvables: 9741
rpm-md repo 'fedora' (cached); generated: 2022-05-04T21:16:11Z solvables: 67992
rpm-md repo 'phracek-PyCharm' (cached); generated: 2022-05-13T04:23:58Z solvables: 5
rpm-md repo 'google-chrome' (cached); generated: 2022-05-19T17:44:58Z solvables: 3
rpm-md repo 'rpmfusion-nonfree-nvidia-driver' (cached); generated: 2022-05-13T09:29:28Z solvables: 29
rpm-md repo 'rpmfusion-nonfree-steam' (cached); generated: 2022-02-13T17:48:12Z solvables: 2
rpm-md repo 'updates-archive' (cached); generated: 2022-05-21T02:57:21Z solvables: 8932
Resolving dependencies... done
Will download: 1 package (184.3 kB)
Downloading from 'updates'... done
Importing packages... done
Checking out packages... done
Running pre scripts... done
Running post scripts... done
Running posttrans scripts... done
Writing rpmdb... done
Writing OSTree commit... done
Staging deployment... done
Added:
  htop-3.2.0-1.fc36.x86_64
Changes queued for next boot. Run "systemctl reboot" to start a reboot

Check rpm-ostree status

First, systemctl reboot

State: idle
Deployments:
● fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                BaseCommit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4
           LayeredPackages: htop

  fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                    Commit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4


Remove htop again

rpm-ostree uninstall htop

Reboot and check it's gone:

[user@fedora ~]$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                    Commit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4

  fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                BaseCommit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4
           LayeredPackages: htop

Htop successfully removed.

Add rpmfusion repositories (successful)

Downloading https://mirrors.rpmfusion.org/free/fedora/rpmfusion-free-release-36.noarch.rpm...done
Downloading https://mirrors.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-36.noarch.rpm...done
Checking out tree 3d384f5... done
Enabled rpm-md repositories: fedora-cisco-openh264 fedora-modular updates-modular updates fedora phracek-PyCharm google-chrome rpmfusion-nonfree-nvidia-driver rpmfusion-nonfree-steam updates-archive
Importing rpm-md... done
rpm-md repo 'fedora-cisco-openh264' (cached); generated: 2022-04-07T16:52:38Z solvables: 4
rpm-md repo 'fedora-modular' (cached); generated: 2022-05-04T21:12:01Z solvables: 825
rpm-md repo 'updates-modular' (cached); generated: 2022-05-16T00:18:23Z solvables: 1129
rpm-md repo 'updates' (cached); generated: 2022-05-21T01:01:09Z solvables: 9741
rpm-md repo 'fedora' (cached); generated: 2022-05-04T21:16:11Z solvables: 67992
rpm-md repo 'phracek-PyCharm' (cached); generated: 2022-05-13T04:23:58Z solvables: 5
rpm-md repo 'google-chrome' (cached); generated: 2022-05-19T17:44:58Z solvables: 3
rpm-md repo 'rpmfusion-nonfree-nvidia-driver' (cached); generated: 2022-05-13T09:29:28Z solvables: 29
rpm-md repo 'rpmfusion-nonfree-steam' (cached); generated: 2022-02-13T17:48:12Z solvables: 2
rpm-md repo 'updates-archive' (cached); generated: 2022-05-21T02:57:21Z solvables: 8932
Resolving dependencies... done
Checking out packages... done
Running pre scripts... done
Running post scripts... done
Running posttrans scripts... done
Writing rpmdb... done
Writing OSTree commit... done
Staging deployment... done
Added:
  rpmfusion-free-release-36-1.noarch
  rpmfusion-nonfree-release-36-1.noarch
Changes queued for next boot. Run "systemctl reboot" to start a reboot

Attempt Nvidia driver install (Fails with Error)

[user@fedora ~]$ sudo rpm-ostree install akmod-nvidia xorg-x11-drv-nvidia-cuda
[sudo] password for sam: 
Checking out tree 3d384f5... done
Enabled rpm-md repositories: fedora-cisco-openh264 fedora-modular updates-modular updates fedora rpmfusion-free-updates rpmfusion-free rpmfusion-nonfree-updates rpmfusion-nonfree phracek-PyCharm google-chrome rpmfusion-nonfree-nvidia-driver rpmfusion-nonfree-steam updates-archive
Updating metadata for 'rpmfusion-free-updates'... done
Updating metadata for 'rpmfusion-free'... done
Updating metadata for 'rpmfusion-nonfree-updates'... done
Updating metadata for 'rpmfusion-nonfree'... done
Importing rpm-md... done
rpm-md repo 'fedora-cisco-openh264' (cached); generated: 2022-04-07T16:52:38Z solvables: 4
rpm-md repo 'fedora-modular' (cached); generated: 2022-05-04T21:12:01Z solvables: 825
rpm-md repo 'updates-modular' (cached); generated: 2022-05-16T00:18:23Z solvables: 1129
rpm-md repo 'updates' (cached); generated: 2022-05-21T01:01:09Z solvables: 9741
rpm-md repo 'fedora' (cached); generated: 2022-05-04T21:16:11Z solvables: 67992
rpm-md repo 'rpmfusion-free-updates'; generated: 2022-05-18T15:49:28Z solvables: 10
rpm-md repo 'rpmfusion-free'; generated: 2022-05-04T04:48:11Z solvables: 506
rpm-md repo 'rpmfusion-nonfree-updates'; generated: 2022-05-18T16:10:50Z solvables: 2
rpm-md repo 'rpmfusion-nonfree'; generated: 2022-05-04T05:11:55Z solvables: 225
rpm-md repo 'phracek-PyCharm' (cached); generated: 2022-05-13T04:23:58Z solvables: 5
rpm-md repo 'google-chrome' (cached); generated: 2022-05-19T17:44:58Z solvables: 3
rpm-md repo 'rpmfusion-nonfree-nvidia-driver' (cached); generated: 2022-05-13T09:29:28Z solvables: 29
rpm-md repo 'rpmfusion-nonfree-steam' (cached); generated: 2022-02-13T17:48:12Z solvables: 2
rpm-md repo 'updates-archive' (cached); generated: 2022-05-21T02:57:21Z solvables: 8932
Resolving dependencies... done
Will download: 149 packages (389.5 MB)
Downloading from 'updates'... done
Downloading from 'fedora'... done
Downloading from 'rpmfusion-nonfree'... done
Importing packages... done
Checking out packages... done
Running pre scripts... done
Running post scripts... done
error: Running %post for akmod-nvidia: bwrap(/bin/sh): Child process killed by signal 1; run `journalctl -t 'rpm-ostree(akmod-nvidia.post)'` for more information

Output of journalctl -t 'rpm-ostree(akmod-nvidia.post)'

[user@fedora ~]$ journalctl -t 'rpm-ostree(akmod-nvidia.post)'
May 22 00:49:20 fedora rpm-ostree(akmod-nvidia.post)[3795]: Building /usr/src/akmods/nvidia-kmod-510.68.02-1.fc36.src.rpm for kernel 5.17.8-300.fc36.x86_64
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:    { echo ; echo '/tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5.17.8-300.f>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]: make[1]: *** [Makefile:1841: /tmp/akmodsbuild.A8nzpiPo/BUILD/nvidia-kmod-510.68.02/_kmod_build_5>
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]: make[1]: Leaving directory '/usr/src/kernels/5.17.8-300.fc36.x86_64'
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]: make: *** [Makefile:82: modules] Error 2
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]: error: Bad exit status from /var/tmp/rpm-tmp.J15NC9 (%build)
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]: RPM build errors:
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     Unable to open sqlite database /usr/share/rpm/rpmdb.sqlite: unable to open database file
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     cannot open Packages index using sqlite - Operation not permitted (1)
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     cannot open Packages database in /usr/share/rpm
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     user mockbuild does not exist - using root
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     group mock does not exist - using root
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     user mockbuild does not exist - using root
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     group mock does not exist - using root
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     user mockbuild does not exist - using root
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     group mock does not exist - using root
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     user mockbuild does not exist - using root
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     group mock does not exist - using root
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     Unable to open sqlite database /usr/share/rpm/rpmdb.sqlite: unable to open database file
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     cannot open Packages index using sqlite - Operation not permitted (1)
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     cannot open Packages database in /usr/share/rpm
May 22 00:49:44 fedora rpm-ostree(akmod-nvidia.post)[19865]:     Bad exit status from /var/tmp/rpm-tmp.J15NC9 (%build)
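
Several of the journal lines above end in `>`, which most likely means the pager cut them off, so the actual compiler error is not visible in this paste. A minimal sketch for pulling out the first failing lines, assuming the failure is marked by make's `***` or rpmbuild's `error:` prefix as in the output above; `first_build_errors` is a hypothetical helper name:

```sh
#!/bin/sh
# first_build_errors [N]
# Reads journal text on stdin and prints the first N (default 5) lines
# that look like a make or rpmbuild failure, prefixed with line numbers.
first_build_errors() {
    grep -n -E '(\*\*\*|error:)' | head -n "${1:-5}"
}

# Intended use (--no-pager makes journalctl print whole lines instead of
# handing them to the pager):
#   journalctl --no-pager -t 'rpm-ostree(akmod-nvidia.post)' | first_build_errors
```

The line numbers printed by `grep -n` make it easier to jump to the surrounding context in a saved copy of the log.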

Run our disable-threads.sh script

sudo ./disable-threads.sh false 9 31
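
To confirm from a terminal how many CPUs are still online after running the script, the kernel's `/sys/devices/system/cpu/online` file lists the online CPUs as ranges such as `0-8`. A small sketch that counts the CPUs in such a list; `count_cpus` is a hypothetical helper name:

```sh
#!/bin/sh
# count_cpus LIST
# Prints how many CPUs a sysfs-style list such as "0-8,12,16-23" contains.
# The body runs in a subshell so the IFS change does not leak to the caller.
count_cpus() (
    total=0
    IFS=,
    for part in $1; do
        case "$part" in
            *-*) lo=${part%-*}; hi=${part#*-}; total=$((total + hi - lo + 1)) ;;
            '')  ;;
            *)   total=$((total + 1)) ;;
        esac
    done
    echo "$total"
)

# Intended use:
#   count_cpus "$(cat /sys/devices/system/cpu/online)"
```

With CPUs 9 to 31 offline on a 32 thread machine, the file should read `0-8` and the function would print 9.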

Attempt Nvidia driver install second time (success)

[user@fedora ~]$ sudo rpm-ostree install akmod-nvidia xorg-x11-drv-nvidia-cuda
Checking out tree 3d384f5... done
Enabled rpm-md repositories: fedora-cisco-openh264 fedora-modular updates-modular updates fedora rpmfusion-free-updates rpmfusion-free rpmfusion-nonfree-updates rpmfusion-nonfree phracek-PyCharm google-chrome rpmfusion-nonfree-nvidia-driver rpmfusion-nonfree-steam updates-archive
Importing rpm-md... done
rpm-md repo 'fedora-cisco-openh264' (cached); generated: 2022-04-07T16:52:38Z solvables: 4
rpm-md repo 'fedora-modular' (cached); generated: 2022-05-04T21:12:01Z solvables: 825
rpm-md repo 'updates-modular' (cached); generated: 2022-05-16T00:18:23Z solvables: 1129
rpm-md repo 'updates' (cached); generated: 2022-05-21T01:01:09Z solvables: 9741
rpm-md repo 'fedora' (cached); generated: 2022-05-04T21:16:11Z solvables: 67992
rpm-md repo 'rpmfusion-free-updates' (cached); generated: 2022-05-18T15:49:28Z solvables: 10
rpm-md repo 'rpmfusion-free' (cached); generated: 2022-05-04T04:48:11Z solvables: 506
rpm-md repo 'rpmfusion-nonfree-updates' (cached); generated: 2022-05-18T16:10:50Z solvables: 2
rpm-md repo 'rpmfusion-nonfree' (cached); generated: 2022-05-04T05:11:55Z solvables: 225
rpm-md repo 'phracek-PyCharm' (cached); generated: 2022-05-13T04:23:58Z solvables: 5
rpm-md repo 'google-chrome' (cached); generated: 2022-05-19T17:44:58Z solvables: 3
rpm-md repo 'rpmfusion-nonfree-nvidia-driver' (cached); generated: 2022-05-13T09:29:28Z solvables: 29
rpm-md repo 'rpmfusion-nonfree-steam' (cached); generated: 2022-02-13T17:48:12Z solvables: 2
rpm-md repo 'updates-archive' (cached); generated: 2022-05-21T02:57:21Z solvables: 8932
Resolving dependencies... done
Checking out packages... done
Running pre scripts... done
Running post scripts... done
Running posttrans scripts... done
Writing rpmdb... done
Writing OSTree commit... done
Staging deployment... done
Freed: 45.9 MB (pkgcache branches: 0)
Added:
  akmod-nvidia-3:510.68.02-1.fc36.x86_64
  akmods-0.5.7-8.fc36.noarch
  annobin-docs-10.71-1.fc36.noarch
  annobin-plugin-gcc-10.71-1.fc36.x86_64
  ansible-srpm-macros-1-5.fc36.noarch
  binutils-2.37-27.fc36.x86_64
  binutils-gold-2.37-27.fc36.x86_64
  bison-3.8.2-2.fc36.x86_64
  debugedit-5.0-3.fc36.x86_64
  dwz-0.14-2.fc35.x86_64
  ed-1.14.2-12.fc36.x86_64
  efi-srpm-macros-5-5.fc36.noarch
  egl-gbm-1.1.0-2.fc36.x86_64
  egl-wayland-1.1.9-4.fc36.x86_64
  elfutils-0.187-4.fc36.x86_64
  elfutils-libelf-devel-0.187-4.fc36.x86_64
  fakeroot-1.28-2.fc36.x86_64
  fakeroot-libs-1.28-2.fc36.x86_64
  flex-2.6.4-10.fc36.x86_64
  fonts-srpm-macros-1:2.0.5-7.fc36.noarch
  fpc-srpm-macros-1.3-5.fc36.noarch
  gc-8.0.6-2.fc36.x86_64
  gcc-12.1.1-1.fc36.x86_64
  gcc-plugin-annobin-12.1.1-1.fc36.x86_64
  gdb-minimal-12.1-1.fc36.x86_64
  ghc-srpm-macros-1.5.0-6.fc36.noarch
  glibc-devel-2.35-6.fc36.x86_64
  glibc-headers-x86-2.35-6.fc36.noarch
  gnat-srpm-macros-4-15.fc36.noarch
  go-srpm-macros-3.0.15-1.fc36.noarch
  guile22-2.2.7-5.fc36.x86_64
  http-parser-2.9.4-6.fc36.x86_64
  info-6.8-3.fc36.x86_64
  kernel-devel-5.17.8-300.fc36.x86_64
  kernel-devel-matched-5.17.8-300.fc36.x86_64
  kernel-headers-5.17.6-300.fc36.x86_64
  kernel-srpm-macros-1.0-14.fc36.noarch
  kmodtool-1.1-3.fc36.noarch
  koji-1.28.1-1.fc36.noarch
  libcomps-0.1.18-2.fc36.x86_64
  libgit2-1.3.0-2.fc36.x86_64
  libssh2-1.10.0-4.fc36.x86_64
  libvdpau-1.5-1.fc36.x86_64
  libxcrypt-devel-4.4.28-1.fc36.x86_64
  lua-srpm-macros-1-6.fc36.noarch
  m4-1.4.19-3.fc36.x86_64
  make-1:4.3-7.fc36.x86_64
  nim-srpm-macros-3-6.fc36.noarch
  nvidia-persistenced-3:510.68.02-1.fc36.x86_64
  nvidia-settings-3:510.68.02-1.fc36.x86_64
  ocaml-srpm-macros-6-6.fc36.noarch
  ocl-icd-2.3.1-1.fc36.x86_64
  openblas-srpm-macros-2-11.fc36.noarch
  opencl-filesystem-1.0-15.fc36.noarch
  openssl-1:3.0.2-5.fc36.x86_64
  openssl-devel-1:3.0.2-5.fc36.x86_64
  package-notes-srpm-macros-0.4-14.fc36.noarch
  patch-2.7.6-16.fc36.x86_64
  perl-AutoLoader-5.74-486.fc36.noarch
  perl-B-1.82-486.fc36.x86_64
  perl-Carp-1.52-479.fc36.noarch
  perl-Class-Struct-0.66-486.fc36.noarch
  perl-Data-Dumper-2.183-3.fc36.x86_64
  perl-Digest-1.20-2.fc36.noarch
  perl-Digest-MD5-2.58-479.fc36.x86_64
  perl-DynaLoader-1.50-486.fc36.x86_64
  perl-Encode-4:3.17-485.fc36.x86_64
  perl-Errno-1.33-486.fc36.x86_64
  perl-Exporter-5.76-480.fc36.noarch
  perl-Fcntl-1.14-486.fc36.x86_64
  perl-File-Basename-2.85-486.fc36.noarch
  perl-File-Path-2.18-479.fc36.noarch
  perl-File-Temp-1:0.231.100-479.fc36.noarch
  perl-File-stat-1.09-486.fc36.noarch
  perl-FileHandle-2.03-486.fc36.noarch
  perl-Getopt-Long-1:2.52-479.fc36.noarch
  perl-Getopt-Std-1.13-486.fc36.noarch
  perl-HTTP-Tiny-0.080-2.fc36.noarch
  perl-IO-1.46-486.fc36.x86_64
  perl-IO-Socket-IP-0.41-480.fc36.noarch
  perl-IO-Socket-SSL-2.074-2.fc36.noarch
  perl-IPC-Open3-1.21-486.fc36.noarch
  perl-MIME-Base64-3.16-479.fc36.x86_64
  perl-Mozilla-CA-20211001-2.fc36.noarch
  perl-NDBM_File-1.15-486.fc36.x86_64
  perl-Net-SSLeay-1.92-2.fc36.x86_64
  perl-POSIX-1.97-486.fc36.x86_64
  perl-PathTools-3.80-479.fc36.x86_64
  perl-Pod-Escapes-1:1.07-479.fc36.noarch
  perl-Pod-Perldoc-3.28.01-480.fc36.noarch
  perl-Pod-Simple-1:3.43-3.fc36.noarch
  perl-Pod-Usage-4:2.01-479.fc36.noarch
  perl-Scalar-List-Utils-5:1.62-464.fc36.x86_64
  perl-SelectSaver-1.02-486.fc36.noarch
  perl-Socket-4:2.033-1.fc36.x86_64
  perl-Storable-1:3.25-2.fc36.x86_64
  perl-Symbol-1.09-486.fc36.noarch
  perl-Term-ANSIColor-5.01-480.fc36.noarch
  perl-Term-Cap-1.17-479.fc36.noarch
  perl-Text-ParseWords-3.31-1.fc36.noarch
  perl-Text-Tabs+Wrap-2021.0814-2.fc36.noarch
  perl-Time-Local-2:1.300-479.fc36.noarch
  perl-URI-5.10-1.fc36.noarch
  perl-base-2.27-486.fc36.noarch
  perl-constant-1.33-480.fc36.noarch
  perl-if-0.60.900-486.fc36.noarch
  perl-interpreter-4:5.34.1-486.fc36.x86_64
  perl-libnet-3.13-480.fc36.noarch
  perl-libs-4:5.34.1-486.fc36.x86_64
  perl-mro-1.25-486.fc36.x86_64
  perl-overload-1.33-486.fc36.noarch
  perl-overloading-0.02-486.fc36.noarch
  perl-parent-1:0.238-479.fc36.noarch
  perl-podlators-1:4.14-479.fc36.noarch
  perl-srpm-macros-1-43.fc36.noarch
  perl-subs-1.04-486.fc36.noarch
  perl-vars-1.05-486.fc36.noarch
  python-srpm-macros-3.10-17.fc36.noarch
  python3-argcomplete-2.0.0-2.fc36.noarch
  python3-babel-2.9.1-5.fc36.noarch
  python3-cffi-1.15.0-5.fc36.x86_64
  python3-dateutil-1:2.8.1-8.fc36.noarch
  python3-decorator-5.1.1-2.fc36.noarch
  python3-gssapi-1.7.2-2.fc36.x86_64
  python3-koji-1.28.1-1.fc36.noarch
  python3-libcomps-0.1.18-2.fc36.x86_64
  python3-ply-3.11-15.fc36.noarch
  python3-progressbar2-3.53.2-4.fc36.noarch
  python3-pycparser-2.20-6.fc36.noarch
  python3-pygit2-1.7.1-3.fc36.x86_64
  python3-pytz-2022.1-1.fc36.noarch
  python3-requests-gssapi-1.2.3-4.fc36.noarch
  python3-rpmautospec-0.2.6-1.fc36.noarch
  python3-utils-2.5.6-5.fc36.noarch
  qt5-srpm-macros-5.15.3-1.fc36.noarch
  redhat-rpm-config-219-1.fc36.noarch
  rpm-build-4.17.0-10.fc36.x86_64
  rpmautospec-rpm-macros-0.2.6-1.fc36.noarch
  rpmdevtools-9.6-1.fc36.noarch
  rust-srpm-macros-21-1.fc36.noarch
  systemd-rpm-macros-250.3-8.fc36.noarch
  xorg-x11-drv-nvidia-3:510.68.02-1.fc36.x86_64
  xorg-x11-drv-nvidia-cuda-3:510.68.02-1.fc36.x86_64
  xorg-x11-drv-nvidia-cuda-libs-3:510.68.02-1.fc36.x86_64
  xorg-x11-drv-nvidia-kmodsrc-3:510.68.02-1.fc36.x86_64
  xorg-x11-drv-nvidia-libs-3:510.68.02-1.fc36.x86_64
  xorg-x11-drv-nvidia-power-3:510.68.02-1.fc36.x86_64
  zlib-devel-1.2.11-31.fc36.x86_64
  zstd-1.5.2-1.fc36.x86_64
Changes queued for next boot. Run "systemctl reboot" to start a reboot

Reboot and check status

systemctl reboot

[user@fedora ~]$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                BaseCommit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4
           LayeredPackages: akmod-nvidia xorg-x11-drv-nvidia-cuda
             LocalPackages: rpmfusion-free-release-36-1.noarch rpmfusion-nonfree-release-36-1.noarch

  fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                BaseCommit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4
             LocalPackages: rpmfusion-free-release-36-1.noarch rpmfusion-nonfree-release-36-1.noarch
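
After the reboot, it can also be worth checking that the kernel arguments from the final rpmfusion step (the rpm-ostree kargs command) are in effect. A rough sketch that compares a kernel command line against a list of expected arguments; `missing_kargs` is a hypothetical helper name, and reading `/proc/cmdline` is just one assumed way to obtain the running command line:

```sh
#!/bin/sh
# missing_kargs CMDLINE ARG [ARG...]
# Prints each ARG that does not appear as a whole word in CMDLINE,
# one per line. Prints nothing when every ARG is present.
missing_kargs() {
    _cmdline=$1
    shift
    for _want in "$@"; do
        case " $_cmdline " in
            *" $_want "*) ;;
            *) echo "$_want" ;;
        esac
    done
}

# Intended use:
#   missing_kargs "$(cat /proc/cmdline)" \
#       rd.driver.blacklist=nouveau modprobe.blacklist=nouveau nvidia-drm.modeset=1
```

Empty output would suggest all three arguments were appended. Any argument printed is missing from the running kernel's command line.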

Try to install htop again, without running the disable-threads.sh script (fails with error)

[user@fedora ~]$ rpm-ostree install htop
Checking out tree 3d384f5... done
Enabled rpm-md repositories: fedora-cisco-openh264 fedora-modular updates-modular updates fedora rpmfusion-free-updates rpmfusion-free rpmfusion-nonfree-updates rpmfusion-nonfree phracek-PyCharm google-chrome rpmfusion-nonfree-nvidia-driver rpmfusion-nonfree-steam updates-archive
Importing rpm-md... done
rpm-md repo 'fedora-cisco-openh264' (cached); generated: 2022-04-07T16:52:38Z solvables: 4
rpm-md repo 'fedora-modular' (cached); generated: 2022-05-04T21:12:01Z solvables: 825
rpm-md repo 'updates-modular' (cached); generated: 2022-05-16T00:18:23Z solvables: 1129
rpm-md repo 'updates' (cached); generated: 2022-05-21T01:01:09Z solvables: 9741
rpm-md repo 'fedora' (cached); generated: 2022-05-04T21:16:11Z solvables: 67992
rpm-md repo 'rpmfusion-free-updates' (cached); generated: 2022-05-18T15:49:28Z solvables: 10
rpm-md repo 'rpmfusion-free' (cached); generated: 2022-05-04T04:48:11Z solvables: 506
rpm-md repo 'rpmfusion-nonfree-updates' (cached); generated: 2022-05-18T16:10:50Z solvables: 2
rpm-md repo 'rpmfusion-nonfree' (cached); generated: 2022-05-04T05:11:55Z solvables: 225
rpm-md repo 'phracek-PyCharm' (cached); generated: 2022-05-13T04:23:58Z solvables: 5
rpm-md repo 'google-chrome' (cached); generated: 2022-05-19T17:44:58Z solvables: 3
rpm-md repo 'rpmfusion-nonfree-nvidia-driver' (cached); generated: 2022-05-13T09:29:28Z solvables: 29
rpm-md repo 'rpmfusion-nonfree-steam' (cached); generated: 2022-02-13T17:48:12Z solvables: 2
rpm-md repo 'updates-archive' (cached); generated: 2022-05-21T02:57:21Z solvables: 8932
Resolving dependencies... done
Will download: 1 package (184.3 kB)
Downloading from 'updates'... done
Importing packages... done
Checking out packages... done
Running pre scripts... done
Running post scripts... done
error: Running %post for akmod-nvidia: bwrap(/bin/sh): Child process killed by signal 1; run `journalctl -t 'rpm-ostree(akmod-nvidia.post)'` for more information

Output from journalctl -t 'rpm-ostree(akmod-nvidia.post)'

May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]: RPM build errors:
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     Unable to open sqlite database /usr/share/rpm/rpmdb.sqlite: unable to open database file
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     cannot open Packages index using sqlite - Operation not permitted (1)
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     cannot open Packages database in /usr/share/rpm
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     user mockbuild does not exist - using root
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     group mock does not exist - using root
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     user mockbuild does not exist - using root
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     group mock does not exist - using root
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     user mockbuild does not exist - using root
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     group mock does not exist - using root
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     user mockbuild does not exist - using root
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     group mock does not exist - using root
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     Unable to open sqlite database /usr/share/rpm/rpmdb.sqlite: unable to open database file
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     cannot open Packages index using sqlite - Operation not permitted (1)
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     cannot open Packages database in /usr/share/rpm
May 22 00:59:05 fedora rpm-ostree(akmod-nvidia.post)[31690]:     Bad exit status from /var/tmp/rpm-tmp.NcNSDa (%build)
May 22 01:00:07 fedora rpm-ostree(akmod-nvidia.post)[32056]: Building /usr/src/akmods/nvidia-kmod-510.68.02-1.fc36.src.rpm for kernel 5.17.8-300.fc36.x86_64
May 22 01:01:44 fedora rpm-ostree(akmod-nvidia.post)[32062]: /tmp/akmods-post.9wQT4n7J/results/kmod-nvidia-5.17.8-300.fc36.x86_64-510.68.02-1.fc36.x86_64.rpm
May 22 01:01:58 fedora rpm-ostree(akmod-nvidia.post)[38899]: /lib/modules/5.17.8-300.fc36.x86_64/kernel/arch/x86/crypto/twofish-x86_64.ko.xz needs "twofish_setkey": /lib/modules/5.17.8-300.fc36.x86_64/kernel/crypto/twofish_common.>
May 22 01:01:58 fedora rpm-ostree(akmod-nvidia.post)[38899]: /lib/modules/5.17.8-300.fc36.x86_64/kernel/arch/x86/crypto/twofish-x86_64-3way.ko.xz needs "twofish_dec_blk": /lib/modules/5.17.8-300.fc36.x86_64/kernel/arch/x86/crypto/>
May 22 01:01:58 fedora rpm-ostree(akmod-nvidia.post)[38899]: /lib/modules/5.17.8-300.fc36.x86_64/kernel/arch/x86/crypto/twofish-x86_64-3way.ko.xz needs "twofish_setkey": /lib/modules/5.17.8-300.fc36.x86_64/kernel/crypto/twofish_co>
May 22 01:01:58 fedora rpm-ostree(akmod-nvidia.post)[38899]: /lib/modules/5.17.8-300.fc36.x86_64/kernel/arch/x86/crypto/twofish-avx-x86_64.ko.xz needs "twofish_dec_blk": /lib/modules/5.17.8-300.fc36.x86_64/kernel/arch/x86/crypto/t>
lines 1-74/8539 1%

The journalctl log continues for approximately 8500 lines of similar output, each of the form `/lib/modules/...kernel...needs... "something": /lib/modules/...`

At this point, I saved a copy of all the log items with journalctl -t 'rpm-ostree(akmod-nvidia.post)' > journalctl.txt.
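
With the log saved to a file, a quick way to see how much of it is the repeated `needs` output and how much is something else is to count both. This is only a sketch; `summarize_log` is a hypothetical helper name, and the pattern assumes the lines look like the ones shown above:

```sh
#!/bin/sh
# summarize_log FILE
# Prints the total line count of FILE, how many lines contain ' needs "',
# and how many lines remain once those are set aside.
summarize_log() {
    _file=$1
    # grep -c exits non-zero when nothing matches, so ignore its status.
    _needs=$(grep -c ' needs "' "$_file" || true)
    _total=$(wc -l < "$_file")
    echo "total=$((_total)) needs=$((_needs)) other=$((_total - _needs))"
}

# Intended use:
#   summarize_log journalctl.txt
```

The remaining lines can then be listed with `grep -v ' needs "' journalctl.txt`, which should leave the build messages and RPM build errors.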

Try to install htop again, after running the disable-threads.sh script (succeeds)

sudo disable-threads.sh false 9 31

[user@fedora silverblue-houdini-toolbox]$ rpm-ostree install htop
Checking out tree 3d384f5... done
Enabled rpm-md repositories: fedora-cisco-openh264 fedora-modular updates-modular updates fedora rpmfusion-free-updates rpmfusion-free rpmfusion-nonfree-updates rpmfusion-nonfree phracek-PyCharm google-chrome rpmfusion-nonfree-nvidia-driver rpmfusion-nonfree-steam updates-archive
Importing rpm-md... done
rpm-md repo 'fedora-cisco-openh264' (cached); generated: 2022-04-07T16:52:38Z solvables: 4
rpm-md repo 'fedora-modular' (cached); generated: 2022-05-04T21:12:01Z solvables: 825
rpm-md repo 'updates-modular' (cached); generated: 2022-05-16T00:18:23Z solvables: 1129
rpm-md repo 'updates' (cached); generated: 2022-05-21T01:01:09Z solvables: 9741
rpm-md repo 'fedora' (cached); generated: 2022-05-04T21:16:11Z solvables: 67992
rpm-md repo 'rpmfusion-free-updates' (cached); generated: 2022-05-18T15:49:28Z solvables: 10
rpm-md repo 'rpmfusion-free' (cached); generated: 2022-05-04T04:48:11Z solvables: 506
rpm-md repo 'rpmfusion-nonfree-updates' (cached); generated: 2022-05-18T16:10:50Z solvables: 2
rpm-md repo 'rpmfusion-nonfree' (cached); generated: 2022-05-04T05:11:55Z solvables: 225
rpm-md repo 'phracek-PyCharm' (cached); generated: 2022-05-13T04:23:58Z solvables: 5
rpm-md repo 'google-chrome' (cached); generated: 2022-05-19T17:44:58Z solvables: 3
rpm-md repo 'rpmfusion-nonfree-nvidia-driver' (cached); generated: 2022-05-13T09:29:28Z solvables: 29
rpm-md repo 'rpmfusion-nonfree-steam' (cached); generated: 2022-02-13T17:48:12Z solvables: 2
rpm-md repo 'updates-archive' (cached); generated: 2022-05-21T02:57:21Z solvables: 8932
Resolving dependencies... done
Checking out packages... done
Running pre scripts... done
Running post scripts... done
Running posttrans scripts... done
Writing rpmdb... done
Writing OSTree commit... done
Staging deployment... done
Added:
  htop-3.2.0-1.fc36.x86_64
Changes queued for next boot. Run "systemctl reboot" to start a reboot

Check rpm-ostree status

[user@fedora ~]$ rpm-ostree status
State: idle
Deployments:
● fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                BaseCommit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4
           LayeredPackages: akmod-nvidia htop xorg-x11-drv-nvidia-cuda
             LocalPackages: rpmfusion-free-release-36-1.noarch
                            rpmfusion-nonfree-release-36-1.noarch

  fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                BaseCommit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4
           LayeredPackages: akmod-nvidia xorg-x11-drv-nvidia-cuda
             LocalPackages: rpmfusion-free-release-36-1.noarch
                            rpmfusion-nonfree-release-36-1.noarch

OS version:

[user@fedora ~]$ rpm-ostree status -b
State: idle
BootedDeployment:
● fedora:fedora/36/x86_64/silverblue
                   Version: 36.20220521.0 (2022-05-21T00:42:36Z)
                BaseCommit: 3d384f53a1a46d53a06e9eccc3f52a7a0587cb8147b397298559a59f113a1fed
              GPGSignature: Valid signature by 53DED2CB922D8B8D9E63FD18999F7CBF38AB71F4
           LayeredPackages: akmod-nvidia code google-chrome-stable kmod-nvidia mozilla-openh264 steam xorg-x11-drv-nvidia xorg-x11-drv-nvidia-cuda
             LocalPackages: rpmfusion-free-release-36-1.noarch rpmfusion-nonfree-release-36-1.noarch

Additional context

I've not yet tested this with Fedora 36 Workstation. I may be able to but it will take a little time - I have limited machines and drives to set up the install with and these machines have a lot asked of them ;)

@travier
Member

travier commented May 22, 2022

Thanks for the detailed report. I'll re-read everything later but as a quick workaround, you can try setting the AllowedCPUs option in the rpm-ostreed.service systemd unit to limit the number of cores available during all rpm-ostree operations: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#AllowedCPUs=

@jbirch-atlassian

jbirch-atlassian commented May 23, 2022

I have this issue as well. At some point in the past, the "open file limit" workaround seemed to work for me (https://bugzilla.redhat.com/show_bug.cgi?id=1901218#c1), but it doesn't seem to anymore. I'm unsure whether it was a red herring and I just got lucky, with the kmod build sequenced in a way that happened not to fail, or whether recent changes make it fail even with this workaround applied.

I tried to set some rpmbuild macros and vars (like _smp_build_ncpus) to try and restrict the parallelism of the compilation, but to no avail — it always ended up forking down to make -j36 ... in my case. When I get a moment, I'll see if setting AllowedCPUs for rpm-ostreed.service makes a difference — but I suspect the root cause is that the kmods for nvidia are the things that are cooked, not rpm-ostreed.

edit: I'm pretty sure this is #106 too.

@samjcarter
Author

Thanks for the detailed report. I'll re-read everything later but as a quick workaround, you can try setting the AllowedCPUs option in the rpm-ostreed.service systemd unit to limit the number of cores available during all rpm-ostree operations: https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#AllowedCPUs=

Thanks @travier , I'd love to test out your workaround because it's much more convenient than what I was doing.

On my Silverblue install, I find that unit file at /usr/lib/systemd/system/rpm-ostreed.service. This location is part of the read-only file system (according to nano: [ Error writing rpm-ostreed.service: Read-only file system ]).

How can I add the suggested setting, e.g. AllowedCPUs=0-7, to that file? Can an override file be placed in user-space somewhere? I'm not very familiar with systemd, and less so with Silverblue. Doing my best hehe.

@samjcarter
Author

@jbirch-atlassian I think you are right about #106

At some point in the past, the "open file limit" workaround seemed to work for me (https://bugzilla.redhat.com/show_bug.cgi?id=1901218#c1), but it doesn't seem to anymore

I also tried the open file limit suggestion on that bugzilla, but it didn't work for me either.

@travier
Member

travier commented May 23, 2022

On my Silverblue install, I find that unit file at /usr/lib/systemd/system/rpm-ostreed.service. This location is part of the read-only file system (according to nano: [ Error writing rpm-ostreed.service: Read-only file system ]).

How can I add the suggested setting, e.g. AllowedCPUs=0-7, to that file? Can an override file be placed in user-space somewhere? I'm not very familiar with systemd, and less so with Silverblue. Doing my best hehe.

Use systemctl edit. It will place the override file in the right folder in /etc. Don't forget to add the section header ([Service]) in the override, reload the systemd configuration with systemctl daemon-reload, and restart rpm-ostreed.service before testing.
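
For reference, a minimal sketch of what that override amounts to on disk. `systemctl edit` normally creates the drop-in as `override.conf` under `/etc/systemd/system/rpm-ostreed.service.d/`. The `write_dropin` helper below is hypothetical and writes to a scratch directory so it can be run without root; `AllowedCPUs=0-7` is the example value from the question above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a systemd drop-in holding one [Service] setting.
#   $1: drop-in directory (e.g. .../rpm-ostreed.service.d)
#   $2: the setting line, e.g. AllowedCPUs=0-7
write_dropin() {
    mkdir -p "$1"
    printf '[Service]\n%s\n' "$2" > "$1/override.conf"
}

# Demo against a scratch directory instead of /etc (assumption: safe to run anywhere).
demo_dir="$(mktemp -d)"
write_dropin "$demo_dir/rpm-ostreed.service.d" "AllowedCPUs=0-7"
cat "$demo_dir/rpm-ostreed.service.d/override.conf"
```

On a real system the target would be the `/etc` path above (as root), followed by `sudo systemctl daemon-reload` and `sudo systemctl restart rpm-ostreed.service`.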

@jbirch-atlassian

jbirch-atlassian commented May 23, 2022

Took it for a spin — no dice.

I can see that AllowedCPUs=0-7 is set for the service with systemctl show, but unfortunately the post scripts still eventually fork out to make V=1 -j36 ..., and fail to compile — despite forking from /usr/bin/rpm-ostree start-daemon. I'll keep poking around at this a little bit more to make sure I'm not SystemD'ing wrong. This probably makes sense though — the problem isn't the parallelism from what I can tell, it's the concurrency. That is, it doesn't matter if I have one CPU or 128 CPUs to play with, if I'm still trying to make -j36 it...
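
To make the parallelism/concurrency distinction concrete, here is a small sketch that is not taken from the akmods build itself. The hypothetical `count_concurrent_jobs` helper starts N background jobs, much as `make -jN` keeps N jobs in flight, and reports how many exist at the same moment. The count follows N, not the number of CPUs the shell is allowed to use.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start N background jobs and count how many are alive
# at the same time. `sleep 1` stands in for a compiler invocation.
count_concurrent_jobs() {
    local n="$1" i=0 pids="" alive=0 p
    while [ "$i" -lt "$n" ]; do
        sleep 1 &
        pids="$pids $!"
        i=$((i + 1))
    done
    # kill -0 sends no signal; it only checks that the process still exists.
    for p in $pids; do
        if kill -0 "$p" 2>/dev/null; then
            alive=$((alive + 1))
        fi
    done
    echo "$alive"
    wait
}

echo "CPUs usable by this shell: $(nproc)"
echo "Jobs alive at once: $(count_concurrent_jobs 36)"
```

Restricting the CPUs changes how quickly those jobs finish, but each of them still exists, and holds its own open files, for as long as it runs.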

@travier
Member

travier commented May 23, 2022

Then I would recommend that you report that against the NVIDIA package on RPM Fusion Bugzilla.

@samjcarter
Author

samjcarter commented May 23, 2022

It looks like it's been posted there a few times and marked as invalid.

https://bugzilla.rpmfusion.org/show_bug.cgi?id=5851
https://bugzilla.rpmfusion.org/show_bug.cgi?id=6031

On bugzilla 6031 the OP was using rawhide kernel so it was dismissed. I am not using rawhide but I do see the same error (I think) as described in bugzilla 6031 and perhaps 5851. It's suggested there that:

This not an ostree issue as you are using a rawhide kernel with debug enabled to build the nvidia kernel module. This is impossible and is well enough documented.
https://rpmfusion.org/Howto/NVIDIA#Rawhide

The usual way of doing is to use the fedora-rawhide-kernel-nodebug repository to use a nodebug. But it's unlikely to be possible with ostree.

Your only chance is from time to time rawhide is using a nodebug kernel by default. So you can only pick this one.

I read this bug report before posting here. Having seen the problem on a couple of separate bug reports there, both marked as invalid, I decided to post here.

@travier @jbirch-atlassian What are your thoughts on moving the issue to rpm-ostree? Do you think it's possible or even likely that the problem/solution could be there? Or make the bug report again over on rpmfusion's bugzilla? The nvidia-kmod package, I guess.

@jbirch-atlassian

jbirch-atlassian commented May 24, 2022

What are your thoughts on moving the issue to rpm-ostree? Do you think it's possible or even likely that the problem/solution could be there?

Truthfully, I only have opinions, and they're uninformed of how Silverblue-the-project, rpm-ostree-the-tool, kmod-the-pattern, and Fedora-and-RPMFusion-and-NVidia-the-entities interact and own their components. But here's my best guess based on my personal experience.

Fedora 35 and 36 work with akmod-nvidia just fine out of the box on systems with high CPU core counts. Fedora 36 Silverblue does not. It is unclear to me if this means there's a lingering bug in the kmod that only shows itself reliably when being invoked in Silverblue, or if there's something specific about how Silverblue uses rpm-ostree that isn't quite correct, or if there's a latent issue in rpm-ostree that makes it not work correctly.

To be honest, I think every party involved here "cares" — I just can't hazard a guess as to where the root cause lies (or miscommunication between components, or incorrect assumptions, or whatever).

  • Fedora Silverblue cares in that it first-classes RPMFusion support for users if they choose to opt into it. But choosing to opt in will leave some packages natively unbuildable, and worse, in rpm-ostree land, if they were ever built successfully, now no operation will work, not even updating things like kernel arguments.
  • RPMFusion cares in that Silverblue is a supported target generally, and specifically for NVidia.

So I dunno — my best guess is that this issue should remain open for people to see; right now if you install Silverblue on a 5950x or something with an NVidia GPU, it's likely that you're going to have a bad time if that's also a relatively new or expensive GPU. But ultimately the thing that's throwing the issue is the kmod-nvidia packages when compiled, through whatever environment is set up for them through rpm-ostree in Silverblue (which is a fully supported platform). Which I guess is a long-winded way of saying "¯\_(ツ)_/¯ both I guess?".

There are mitigations that could be added to the out-of-the-box configuration, to only use RPMFusion in a safe way if there are large core counts. If those knobs exist, I'm of the opinion that they would be positive things to exercise in the default Silverblue configuration.

@travier
Member

travier commented May 25, 2022

I'd suggest you try to figure out what makes the build in the akmod-nvidia package use so many cores by default even when you restrict it via systemd. Then we can suggest a change to the RPM package to work around that.

You can also open an issue for rpm-ostree and link this one there but I don't think it's an rpm-ostree issue.

@travier travier added bug Something isn't working help wanted labels May 25, 2022
@jbirch-atlassian

jbirch-atlassian commented May 25, 2022

Remember, it's not the number of cores, but the concurrency of the compile. My understanding is we are restricting the number of cores used by the build, but it's still doing "make -j<a billion>", with its cut-down number of cores.

What isn't clear to me is why I can't get rpmbuild to do -j 6 or something even when I set macros like _smp_build_ncpus, when invoked from rpm-ostreed. Though again, even if I knew how to make that one stick, it would only be a mitigation for the kmod just being... bad at compiling lol.

@travier
Member

travier commented May 25, 2022

If you find where to set the maximum number of threads to use for the compilation, then we can make a tweak in the RPM spec file to set that only for rpm-ostree systems (checking if /run/ostree-booted exists).
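
A rough sketch of the kind of check being suggested, under loud assumptions: the `pick_build_jobs` helper, the cap of 8, and the idea of feeding the result to `make -j` are all hypothetical. The thread has not yet found the knob that controls the job count; only the `/run/ostree-booted` marker comes from the comment above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose a job count, capped only on ostree-based systems.
#   $1: marker file to test (defaults to /run/ostree-booted)
#   $2: cap applied when the marker exists (8 is an arbitrary example value)
pick_build_jobs() {
    local marker="${1:-/run/ostree-booted}" cap="${2:-8}" cpus
    cpus="$(nproc)"
    if [ -e "$marker" ] && [ "$cpus" -gt "$cap" ]; then
        echo "$cap"
    else
        echo "$cpus"
    fi
}

echo "would build with -j$(pick_build_jobs)"
```

On a traditional Fedora install the marker is absent, so the helper falls through to the plain CPU count and behaviour there is unchanged.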

@samjcarter
Author

Hi @jbirch-atlassian @travier ,

Thank you for your thoughts on this so far. I will open the issue on rpm-ostree. It will go against the CoreOS repository for rpm-ostree because there's no issue tracking on the Silverblue one.

I will also post the bug report to bugzilla under the nvidia-kmod package, and add the links back to these issues with some encouragement to view discussion at this issue as context.

@samjcarter
Author

I've posted this Bug report over at rpmfusion bugzilla https://bugzilla.rpmfusion.org/show_bug.cgi?id=6317

@nelsonaloysio

On Silverblue/Kinoite, akmods calls /usr/sbin/akmods-ostree-post in %post to generate the kernel modules right after a package is layered (for example, akmod-nvidia). On my machine, it works, though the modules are unsigned, causing #272.

@ivanvorstanenko

ivanvorstanenko commented Jun 22, 2022

A fix was released for the bug report about this problem in the rpmfusion bugzilla.

However, it concerns the 390 version.

@tpopela tpopela closed this as completed Jun 22, 2022
@jbirch-atlassian

jbirch-atlassian commented Jun 23, 2022

@ivanvorstanenko I had a quick look at that issue, and that doesn't look like the same problem. That looks like a legitimate compilation failure for old versions of the NVidia driver in newer kernels, whereas this issue seems like a race condition caused by trying to compile the drivers with too high a concurrency — the compilation succeeds if done at a lower level of concurrency.

I see you've helpfully gone to all of the linked tickets and mentioned the same thing, and most people have closed the issues, believing them resolved — but I'm not convinced this is the same problem. It's not even the same package — even if that was the same problem and same fix, it's only for the 390.xx versions of the NVidia driver, not the current 510/515. I'm worried that all of these linked tickets have been or will be closed erroneously. Can you confirm if https://bugzilla.rpmfusion.org/show_bug.cgi?id=6337 showed up at the time only when being built with many cores, and that whatever changes happened there have been applied to all versions of the kmod?

@ivanvorstanenko

Can you confirm if https://bugzilla.rpmfusion.org/show_bug.cgi?id=6337 showed up at the time only when being built with many cores...

No, I can't confirm it. I have 8 cores and 8 threads (FX8300) and compilation succeeds.
As I understand it, the problem occurs with 10 or more cores.

and that whatever changes happened there have been applied to all versions of the kmod?

I think not, because the problem of compiling with too high a concurrency and the problem with the 390 driver version are not the same.
But it is an interesting situation: I had the same output as described in the first comment, and after the fix in the 390 driver version, the problem is gone.
Also, the bug report describing this problem for the latest driver version was closed. So I think this bug report also needs to be closed for the same reason :)

@jbirch-atlassian

jbirch-atlassian commented Jun 23, 2022

I'll see what concurrency is reliably required to trigger this bug and return with more details for others. Unfortunately, I'm running on a 5.17 kernel at the moment, so if the other report is related to 5.18 only, it is a very different problem.

edit: 26 threads reliably compiles. 27 threads reliably fails.

There are unfortunately many old bug reports of this long-standing problem that have been closed without the problem being fixed. I think the one you have linked there was closed for the wrong reason, and I commented to that effect a few days ago. I'm hoping to make sure maintainers don't keep closing legitimate issues.

@travier travier reopened this Jun 24, 2022
@samjcarter
Author

@jbirch-atlassian Thank you for testing that. I've been meaning to do a similar check.

@travier @jbirch-atlassian Thanks for keeping these issues open. I have checked the bug report at https://bugzilla.rpmfusion.org/show_bug.cgi?id=6337 again (I read it at some point after posting this issue, and the rpm-ostree duplicate). As you've pointed out it's not the same Nvidia driver version as this report.

I agree that these issues (including rpm-ostree issue 3706) should remain open until we understand (a) what caused it and (b) that it is resolved to a degree that it won't happen again. As of today, it appears no one can confirm (though there are suspicions) in what package or combination of packages the cause lies. Nor can we currently assume it will not reoccur with recent, current or future drivers and hardware.

I think it's worth noting here for context, for anyone arriving here experiencing this issue still, that (with apologies if I am misreading the communication) it appears bugzilla 6337 was closed before investigation due to an admitted lack of interest. Ergo, there was no intention to investigate and an assumption was made that the problem was elsewhere.

Again, thanks to @jbirch-atlassian for offering to help out over there on the rpm-ostree / Silverblue side.

I still have the same problem. I have some new hardware arriving soon which will include some more high core count parts, but different CPU and GPU models to the hardware I used for the original report. I will test just in case and post back with more information (if anything new) once they're up and running.

The tl;dr for anyone experiencing this issue still (especially non-developer users like me just trying to read the room) is it probably isn't fixed yet. Though, I'd love to be wrong.

@jbirch-atlassian

jbirch-atlassian commented Jun 26, 2022

I need to find some time to dig a little further, and I still don't know what the root cause is, but I've discovered that I can invoke akmodsbuild myself in the same way that rpm-ostreed does, and have things compile successfully. So there's something about the way that akmods-ostree-post forks out to akmodsbuild that makes this issue show up. I haven't had luck invoking akmods-ostree-post myself, because I haven't been able to figure out how to bwrap the world to make it something I can invoke successfully. My best guess, at this stage, is that something about having more than 26 logical CPU cores makes the source be extracted incorrectly before compilation, but I can't definitively verify that yet.

Investigating this is particularly slow, because I don't have the faintest idea about how to sanely debug any of this. For example, I have no idea how to invoke rpm-ostree enough to give me the filesystem tree it wants to invoke the akmod build on, without attempting to do the whole thing, and then deleting any WIP FS tree it has. Thus, all of my investigation so far involves running commands that take minutes, and trying to sneak a peek at something in the critical few seconds they're visible to the rest of the world. If anyone has handy links about "how to rpm-ostree good and do other things good too" I'd love to see them.

Quick thanks to @nelsonaloysio for their comment about akmods-ostree-post, which was the bit of the puzzle I needed to understand how Silverblue handles dynamic compilation on its static fs tree.

@travier
Member

travier commented Jun 27, 2022

You should try reproducing the akmod commands on your system directly. Maybe directly building the module from source might trigger the issue.

@jbirch-atlassian

Sorry, I don't think I was clear: I can reproduce the akmodsbuild command successfully. I haven't yet figured out how to make akmods-ostree-post work, because it makes a lot of assumptions about what it can write to, which it can't unless bwrap or whatever sets up the world for it. Sadly, it doesn't have very much in the way of documentation to help me, so I haven't been able to opportunistically look further.

I have a bit of free time next week, that I'm hoping to spend slowly chipping away at this. References on akmods-ostree-post would be extremely welcome here.

@jbirch-atlassian

jbirch-atlassian commented Jul 7, 2022

I have good news and bad news.

The bad news is, I can no longer reproduce the original problem's root cause. I made the mistake of keeping my system up to date, and that has changed the state of the world enough that I can't reproduce it exactly.

The good news is, the problem still occurs with slightly different internal symptoms — same visible outcome to users — but the fix this time is way more trivial. It's plausible that it's related to the original issue as well, which I'll explain later. Let me take you on a journey so you can check my work and see if it makes sense, because frankly I'm in too deep for my own good right now.


Historically, I had seen this problem crop up as "oh, there's too many files open". The logs were very similar to those posted here, but the small amount of information I could find implicated open files (https://bugzilla.redhat.com/show_bug.cgi?id=1901218#c1). Indeed, the first time I worked around this issue was by setting some limit of open file descriptors to something ridiculous, and then installing the NVidia drivers. However, this was still busted on updates some time later, and I put it down to me not really knowing how to configure the number of allowed open file descriptors for things in systemd.

More recently, I had seen this in the same way @samjcarter had: as a failure to compile. The open file descriptor fix wasn't working for me either. Disabling some cores was, though, and that's how we ended up here.

As I dug into it this week, I noticed I was never getting to the compilation failure stage anymore — akmodsbuild was dying during conftest due to too many open files. More specifically, something was getting make'd that was looking at a bunch of header files at the same time that the conftest-ery was happening, but it didn't seem to be conftest itself that died, as it managed to squeeze out another log line:

Jul 06 02:23:21 fart rpm-ostree(akmod-nvidia.post)[218126]: Building /usr/src/akmods/nvidia-kmod-510.68.02-2.fc36.src.rpm for kernel 5.18.9-200.fc36.x86_64
Jul 06 02:24:48 fart rpm-ostree(akmod-nvidia.post)[218134]: /tmp/akmods-post.jtL9NGx9/results/kmod-nvidia-5.18.9-200.fc36.x86_64-510.68.02-2.fc36.x86_64.rpm
Jul 06 02:25:03 fart rpm-ostree(akmod-nvidia.post)[225010]: /lib/modules/5.18.9-200.fc36.x86_64/kernel/arch/x86/crypto/twofish-x86_64.ko.xz needs "twofish_setkey": /lib/modules/5.18.9-200.fc36.x86_64/kernel/crypto/twofish_common.ko.xz

 .... [ snip ] ....
 
Jul 06 02:25:04 fart rpm-ostree(akmod-nvidia.post)[225010]: /lib/modules/5.18.9-200.fc36.x86_64/extra/nvidia/nvidia-uvm.ko.xz needs "nvUvmInterfaceDisableAccessCntr": /lib/modules/5.18.9-200.fc36.x86_64/extra/nvidia/nvidia.ko.xz
Jul 07 17:23:09 fart rpm-ostree(akmod-nvidia.post)[291356]: Building /usr/src/akmods/nvidia-kmod-515.57-1.fc36.src.rpm for kernel 5.18.9-200.fc36.x86_64
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]:  CONFTEST: migrate_vma_added_flags

 .... [ snip ] ....

Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]:  CONFTEST: drm_init_function_args
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]: /bin/sh: line 1: /bin/sh: Too many open files
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]: make[2]: *** [/tmp/akmodsbuild.mpEjcKDL/BUILD/nvidia-kmod-515.57/_kmod_build_5.18.9-200.fc36.x86_64/Kbuild:167: /tmp/akmodsbuild.mpEjcKDL/BUILD/nvidia-kmod-515.57/_kmod_build_5.18.9-200.fc36.x86_64/Kbuild:167: /tmp/akmodsbuild.mpEjcKDL/BUILD/nvidia-kmod-515.57/_kmod_build_5.18.9-200.fc36.x86_64/conftest/compile-tests/drm_driver_has_set_busid.h] Error 126
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]: make[2]: *** Deleting file '/tmp/akmodsbuild.mpEjcKDL/BUILD/nvidia-kmod-515.57/_kmod_build_5.18.9-200.fc36.x86_64/conftest/compile-tests/drm_driver_has_set_busid.h'
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]: make[2]: *** Waiting for unfinished jobs....
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]:  CONFTEST: drm_helper_mode_fill_fb_struct
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]: make[1]: *** [Makefile:1842: /tmp/akmodsbuild.mpEjcKDL/BUILD/nvidia-kmod-515.57/_kmod_build_5.18.9-200.fc36.x86_64] Error 2
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]: make[1]: Leaving directory '/usr/src/kernels/5.18.9-200.fc36.x86_64'
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]: make: *** [Makefile:82: modules] Error 2
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]: error: Bad exit status from /var/tmp/rpm-tmp.dCmLtW (%build)
Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]: RPM build errors:

 .... [ snip ] ....

Jul 07 17:23:30 fart rpm-ostree(akmod-nvidia.post)[295677]:     Bad exit status from /var/tmp/rpm-tmp.dCmLtW (%build)
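
To pull lines like these out of a long journal dump, a small filter helps. This is only a sketch: the `filter_build_errors` helper is hypothetical, and its patterns are guessed from the excerpt above rather than from any complete list of akmods failure messages. It drops the `needs "symbol"` lines first, then keeps anything that looks like a failure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show only failure-looking lines from a saved journal dump.
#   $1: path to the dump, e.g. one produced with
#       journalctl -t 'rpm-ostree(akmod-nvidia.post)' > journalctl.txt
filter_build_errors() {
    # `|| true` keeps the exit status at 0 when nothing matches.
    grep -v ' needs "' "$1" | grep -E 'Too many open files|Error [0-9]+|error:' || true
}

# Usage, assuming the dump was saved as journalctl.txt:
#   filter_build_errors journalctl.txt
```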

I had previously set the default LimitNOFILE to something absurd in system.conf and user.conf and also set something stupid for nofile in limits.conf as well, because I don't know what the hell I'm doing. Buuuut I don't recall if I had ever logged out of that machine, or otherwise did anything that would let that really take effect — and I don't remember if I had reverted any of those things in between. So I used systemctl edit to set a LimitNOFILE=65535 override for rpm-ostreed.service, and lo and behold everything compiles successfully now.


If the original issue was caused by any of:

  • Archives being improperly extracted as a result of trying to open too many files at once — but swallowing that error in a way that we never noticed, or
  • Trying to read too many files at once, and I was just terrible at reliably specifying to my system that "yes let anyone use a dumb amount of file descriptors", or
  • Trying to read too many files at once, including all of the other modules and what they want, and something has changed in my system (such as more kernel modules) that means there are more files held open and it dies before compile time instead of at compile time.

then it's conceivable that it's all the same issue, just showing up in a slightly different way after an update. If it's the first of these, then there might be a lingering loose end to tie up with particularly large akmod packages, but I somehow doubt it's that, as I was able to manually invoke akmodsbuild itself. I kind of suspect it's option 3, but I don't know how to confirm it. If it is option three, then that implies the more kernel modules we ship in Silverblue, the less make parallelism will be needed to trigger this bug, without any further changes.


So in short, I never got to a true root cause. But at least we seem to have a persistent workaround that doesn't involve murdering CPU cores for a little while:

  1. Use sudo systemctl edit rpm-ostreed.service to add an override.conf for rpm-ostreed, containing the following content:
     [Service]
     LimitNOFILE=65535
  2. sudo systemctl daemon-reload
  3. sudo systemctl restart rpm-ostreed.service
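
To check that the new limit actually reached a running process, the limits can be read back from `/proc`. A minimal sketch, assuming a Linux `/proc` filesystem; the `fd_usage` helper is hypothetical and is demonstrated on the current shell so that it runs without root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report open descriptors and open-file limits of a PID.
# Inspecting a process owned by another user (such as a root daemon) needs root.
fd_usage() {
    local pid="$1" open soft hard
    open="$(ls "/proc/$pid/fd" | wc -l)"
    # In /proc/PID/limits, column 4 is the soft limit and column 5 the hard limit.
    soft="$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")"
    hard="$(awk '/^Max open files/ {print $5}' "/proc/$pid/limits")"
    echo "pid=$pid open=$open soft=$soft hard=$hard"
}

# Demo on the current shell, which needs no extra privileges.
fd_usage "$$"
```

For the daemon itself, the PID can be obtained with `systemctl show -p MainPID --value rpm-ostreed.service`, and the helper then has to run as root.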

@jbirch-atlassian

jbirch-atlassian commented Jul 7, 2022

Additionally, it should be noted that LimitNOFILE=2048 is sufficient for a 36-thread environment, but a 128-thread environment requires LimitNOFILE=8192. This is the most obscene environment I have access to test on, unfortunately — but it does point to something about the install process scaling with the concurrency of the build. This makes me think that perhaps the makefile is just bonghits?

Presumably the default is 1024, though I don't know where this would normally be set. Perhaps it's sufficient to set LimitNOFILE=16384 going forward for rpm-ostreed.service and pray that it also holds true for like, multi-socket AMD Milan systems? 🤷

If we agree this is an adequate fix for Silverblue, I'm happy to put together the PR. However, I appreciate there might still be some open questions around truly root causing this.

@travier
Member

travier commented Jul 12, 2022

I have:

$ systemctl show rpm-ostreed.service | grep LimitNO
LimitNOFILE=524288
LimitNOFILESoft=1024

If raising the limit fixes the issue then we can safely raise that in rpm-ostree upstream.
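
The two values in that output are the hard limit (`LimitNOFILE`) and the soft limit (`LimitNOFILESoft`). The soft limit is the one that is enforced; an unprivileged process may raise it up to, but not beyond, the hard limit. A small sketch using the shell's `ulimit` builtin, with a hypothetical `raised_soft_limit` helper:

```shell
#!/usr/bin/env bash
# Print the soft and hard open-file limits of the current shell.
echo "soft: $(ulimit -Sn)  hard: $(ulimit -Hn)"

# Hypothetical helper: raise the soft limit to the hard limit inside a
# subshell and print the result. The calling shell's limits are untouched.
raised_soft_limit() {
    (
        ulimit -Sn "$(ulimit -Hn)"
        ulimit -Sn
    )
}

echo "soft after raising: $(raised_soft_limit)"
```

A build that never raises its own soft limit keeps running with the low value, however generous the hard limit is.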

@travier
Member

travier commented Jul 12, 2022

Already reported in coreos/rpm-ostree#3706. Can you open a PR to raise that limit in rpm-ostreed.service? See the logic in https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties. Thanks!

@jbirch-atlassian

Done and done! This will hopefully be resolved with coreos/rpm-ostree#3853.

Thanks for your help, @travier!

@samjcarter
Author

@jbirch-atlassian & @travier Thank you both very much for your hard work on this. It means I (and others I'm sure) can keep using Silverblue for workstations and render machines where otherwise I would have at the very least moved to regular Fedora version 34, due to the version constraints of my work software. Silverblue 36 lets me toolbox it. I have since realised that I could also install toolbox on Fedora Workstation 36 and set up a 34 toolbox. Always learning something & I'm glad I can keep using Silverblue.

@travier travier added f36 Related to Fedora 36 and removed help wanted labels Aug 19, 2022
@travier travier added the external Issue related to external project not part of Fedora label Oct 31, 2022
@travier travier changed the title Fedora 36 Silverblue rpm-ostree install of rpmfusion Nvidia driver fails with error: Running %post for akmod-nvidia NVIDIA: Driver installation fails with error: Running %post for akmod-nvidia Oct 31, 2022
@aleskandro

aleskandro commented Feb 9, 2023

(edit): I was hitting coreos/rpm-ostree#1614 and in particular coreos/rpm-ostree#4201, due to a missing link for the linker in /usr/bin. I added it manually with ln -s /usr/bin/ld.bfd /usr/bin/ld (in an ostree-native container layer defined for my systems).


I'm not sure my issue is related to this one, but I was able to successfully install the drivers without issues ~1 month ago and that is not happening now after I had to temporarily reset a kernel override and remove the layered akmod-nvidia.

  1. I tried on Fedora rawhide with kernel 6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64
  2. I rebased to Fedora 37 the whole system and tried with the kernel 6.1.9-200.fc37.x86_64
  3. I tried the solutions in this discussion

What I did, on a system that was working fine with nvidia-drivers till a few days ago (before updating), is:

  1. rpm-ostree override reset kernel kernel-core kernel-modules kernel-modules-extra --uninstall=akmod-nvidia
  2. rpm-ostree upgrade
  3. rpm-ostree install akmod-nvidia
journalctl -t 'rpm-ostree(akmod-nvidia.post)'
-- Boot 08d2b0dc656546639371299780d145c2 --
Feb 09 14:26:43 mifune rpm-ostree(akmod-nvidia.post)[2352]: Building /usr/src/akmods/nvidia-kmod-525.85.05-1.fc38.src.rpm for kernel 6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:   ./scripts/check-local-export /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-modeset.o
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:   ./scripts/check-local-export /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-fb.o
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:    { echo ; echo '/tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-format.o: $(wildcard ./tools/objtool/objtool)' ; } >> /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/.nvidia-drm-format.o.cmd
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:    { echo ; echo '/tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nv-pci-table.o: $(wildcard ./tools/objtool/objtool)' ; } >> /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/.nv-pci-table.o.cmd
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:    { echo ; echo '/tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-modeset.o: $(wildcard ./tools/objtool/objtool)' ; } >> /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/.nvidia-drm-modeset.o.cmd
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:    { echo ; echo '/tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-fb.o: $(wildcard ./tools/objtool/objtool)' ; } >> /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/.nvidia-drm-fb.o.cmd
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:   ./scripts/check-local-export /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-helper.o
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:    { echo ; echo '/tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-helper.o: $(wildcard ./tools/objtool/objtool)' ; } >> /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/.nvidia-drm-helper.o.cmd
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:   ./scripts/check-local-export /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-gem-user-memory.o
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:   ./scripts/check-local-export /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-gem-dma-buf.o
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:   ./scripts/check-local-export /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-gem-nvkms-memory.o
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:    { echo ; echo '/tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-gem-user-memory.o: $(wildcard ./tools/objtool/objtool)' ; } >> /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/.nvidia-drm-gem-user-memory.o.cmd
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:    { echo ; echo '/tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-gem-dma-buf.o: $(wildcard ./tools/objtool/objtool)' ; } >> /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/.nvidia-drm-gem-dma-buf.o.cmd
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:    { echo ; echo '/tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/nvidia-drm-gem-nvkms-memory.o: $(wildcard ./tools/objtool/objtool)' ; } >> /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-drm/.nvidia-drm-gem-nvkms-memory.o.cmd
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:   ./scripts/check-local-export /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-peermem/nvidia-peermem.o
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:    { echo ; echo '/tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-peermem/nvidia-peermem.o: $(wildcard ./tools/objtool/objtool)' ; } >> /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64/nvidia-peermem/.nvidia-peermem.o.cmd
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]: make[1]: *** [Makefile:2031: /tmp/akmodsbuild.B7K4GLJn/BUILD/nvidia-kmod-525.85.05/_kmod_build_6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64] Error 2
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]: make[1]: Leaving directory '/usr/src/kernels/6.2.0-0.rc7.20230207git05ecb680708a.51.fc38.x86_64'
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]: make: *** [Makefile:82: modules] Error 2
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]: error: Bad exit status from /var/tmp/rpm-tmp.9HQJPr (%build)
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]: RPM build warnings:
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     user mockbuild does not exist - using root
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     group mock does not exist - using root
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     user mockbuild does not exist - using root
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     group mock does not exist - using root
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]: RPM build errors:
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     Unable to open sqlite database /usr/share/rpm/rpmdb.sqlite: unable to open database file
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     cannot open Packages index using sqlite - Operation not permitted (1)
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     cannot open Packages database in /usr/share/rpm
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     Unable to open sqlite database /usr/share/rpm/rpmdb.sqlite: unable to open database file
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     cannot open Packages index using sqlite - Operation not permitted (1)
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     cannot open Packages database in /usr/share/rpm
Feb 09 14:27:35 mifune rpm-ostree(akmod-nvidia.post)[11193]:     Bad exit status from /var/tmp/rpm-tmp.9HQJPr (%build)

@travier travier added f37 Related to Fedora 37 f38 Related to Fedora 38 f39 Related to Fedora 39 and removed f36 Related to Fedora 36 labels May 15, 2023
@travier travier added f40 Related to Fedora 40 and removed f37 Related to Fedora 37 labels Nov 24, 2023
@travier travier removed the f38 Related to Fedora 38 label Apr 29, 2024

Yonnji commented Aug 13, 2024

I just got the same error on Silverblue 40. Raising the limit doesn't help.

Aug 13 09:31:03 origami rpm-ostree(akmod-nvidia-open.post)[52244]: Building /usr/src/akmods/nvidia-open-kmod-550.76-1.fc40.src.rpm for kernel 6.10.3-200.fc40.x86_64
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: /tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open/common/inc/nv-linux.h:584:37: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:   584 |             NV_MEMDBG_ADD(ptr, size); \
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:       |                                     ^
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: /tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open/nvidia/os-interface.c:1164:5: note: in expansion of macro ‘NV_KMALLOC_ATOMIC’
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:  1164 |     NV_KMALLOC_ATOMIC(oqd, sizeof(os_queue_data_t));
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:       |     ^~~~~~~~~~~~~~~~~
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: /tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open/nvidia/os-interface.c: In function ‘os_alloc_wait_queue’:
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: /tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open/common/inc/nv-linux.h:570:37: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:   570 |             NV_MEMDBG_ADD(ptr, size); \
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:       |                                     ^
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: /tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open/nvidia/os-interface.c:2018:5: note: in expansion of macro ‘NV_KMALLOC’
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:  2018 |     NV_KMALLOC(*wq, sizeof(os_wait_queue));
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:       |     ^~~~~~~~~~
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: make[4]: *** [scripts/Makefile.build:244: /tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open/nvidia/os-mlock.o] Error 1
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: make[4]: *** Waiting for unfinished jobs....
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: /tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open/nvidia/nv-kthread-q.c: In function ‘thread_create_on_node’:
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: /tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open/nvidia/nv-kthread-q.c:179:5: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:   179 |     const static unsigned attempts = 3;
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:       |     ^~~~~
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: make[3]: *** [/usr/src/kernels/6.10.3-200.fc40.x86_64/Makefile:1946: /tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open] Error 2
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: make[2]: *** [Makefile:252: __sub-make] Error 2
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: make[2]: Leaving directory '/usr/src/kernels/6.10.3-200.fc40.x86_64'
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: make[1]: *** [Makefile:85: modules] Error 2
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: make[1]: Leaving directory '/tmp/akmodsbuild.A7bQJnJB/BUILD/nvidia-open-kmod-550.76/_kmod_build_6.10.3-200.fc40.x86_64/kernel-open'
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: make: *** [Makefile:59: modules] Error 2
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: error: Bad exit status from /var/tmp/rpm-tmp.RnyzS7 (%build)
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]: RPM build errors:
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:     Unable to open sqlite database /usr/share/rpm/rpmdb.sqlite: unable to open database file
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:     cannot open Packages index using sqlite - Operation not permitted (1)
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:     cannot open Packages database in /usr/share/rpm
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:     Unable to open sqlite database /usr/share/rpm/rpmdb.sqlite: unable to open database file
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:     cannot open Packages index using sqlite - Operation not permitted (1)
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:     cannot open Packages database in /usr/share/rpm
Aug 13 09:32:26 origami rpm-ostree(akmod-nvidia-open.post)[69252]:     Bad exit status from /var/tmp/rpm-tmp.RnyzS7 (%build)
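
For anyone triaging logs like the one above: most of the output is compiler warnings and notes, and the lines that matter are the `make` failures and the `error:` lines. In this log the first of them is the `make[4]: ***` line for `os-mlock.o`. The following is a minimal sketch, not part of akmods or rpm-ostree, that keeps only those lines. The function name `failure_lines` and the pattern list are assumptions made for illustration and are not exhaustive.

```python
import re

# Hypothetical helper for illustration only; the patterns are an assumption
# based on the log lines quoted in this thread, not an exhaustive list.
# Matches make failures ("*** "), "error:" lines, and "Error <n>" exit codes.
FAILURE = re.compile(r"(\*\*\* |\berror:|\bError \d+\b)")


def failure_lines(lines):
    """Return the journal lines that report a failure, dropping warnings and notes."""
    return [line.rstrip("\n") for line in lines if FAILURE.search(line)]


# Small demo on shortened lines taken from the log above.
sample = [
    "nv-linux.h:584:37: warning: suggest braces around empty body in an 'if' statement [-Wempty-body]",
    "make[4]: *** [scripts/Makefile.build:244: kernel-open/nvidia/os-mlock.o] Error 1",
    "make[2]: Leaving directory '/usr/src/kernels/6.10.3-200.fc40.x86_64'",
    "error: Bad exit status from /var/tmp/rpm-tmp.RnyzS7 (%build)",
]

for line in failure_lines(sample):
    print(line)
```

To use it on a real log, save the `journalctl -t 'rpm-ostree(akmod-nvidia-open.post)'` output to a file and pass the opened file object to `failure_lines`. The compiler message for the first failing object usually sits a few lines above its `make[...]: ***` line in the unfiltered log.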
