nvidia driver and noroot setting #841

Closed
reinerh opened this issue Oct 6, 2016 · 29 comments
Labels
bug (Something isn't working) · graphics (Issues related to GPU acceleration and drivers: mesa, nvidia, etc.)

Comments

@reinerh
Collaborator

reinerh commented Oct 6, 2016

A Debian user reported that Steam segfaults when run with firejail after upgrading the nvidia driver to 367.44-2.
(glxgears also misbehaves with the steam profile.)

The problematic line in the steam profile is the "noroot" setting, though I don't know exactly why it causes issues.

The complete bug report is here.

@netblue30 netblue30 added the "bug" label Oct 8, 2016
@netblue30
Owner

I have a fix in 40ed53c

I don't have an Nvidia card, but I think it will fix the issue. I added video and games to the groups allowed in the user namespace. I'll close the bug for now and reopen it if necessary. Thanks.

@kevinoid

I'm seeing this same issue with Firejail 0.9.44.8-1 on Debian with an nVidia NVS 5400M. It looks like 40ed53c either did not fix the issue or it has reappeared.

I compiled from the current master (22414ad) and captured the strace output for glxgears with and without --noroot and posted it as a gist: https://gist.github.com/kevinoid/cb1c4ed6c8f073d41b1c4e1039e04e99#file-glxgears-strace-diff

Notably, when running with `--noroot` it tries and fails to chmod `/dev/nvidia-modeset`, `/dev/nvidia-modeset`, and `/dev/nvidia0`, then forks a child which also fails to chmod the device and then opens it (which succeeds, since the devices are still root:root 0666, as stat showed before the chmod). It also does not try to write anything to `~/.nv/GLCache` (although the train may have left the tracks before this point). It doesn't make much sense to me. Any ideas?

Thanks,
Kevin
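
One way to reproduce the comparison described in this comment is to capture an strace log with and without `--noroot` and diff only the lines that touch the nvidia device nodes. The following is a minimal sketch, assuming the logs were written with `strace -f -o` (so each line starts with a PID column). The helper name `nvidia_syscalls` and the log file names are made up for illustration.

```shell
# Print the syscalls in an strace log ($1) that mention /dev/nvidia*,
# with the leading PID column removed so two runs can be compared.
nvidia_syscalls() {
    grep -F '/dev/nvidia' "$1" | sed -E 's/^[0-9]+ +//'
}

# Example usage (file names are placeholders):
#   firejail --noprofile strace -f -o plain.strace glxgears
#   firejail --noprofile --noroot strace -f -o noroot.strace glxgears
#   nvidia_syscalls plain.strace  > plain.txt
#   nvidia_syscalls noroot.strace > noroot.txt
#   diff -u plain.txt noroot.txt
```

File descriptor numbers and addresses can still differ between runs, so the diff is a starting point rather than an exact comparison.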

@netblue30 netblue30 reopened this Mar 13, 2017
@chiraag-nataraj
Collaborator

I have an Nvidia card and have used noroot with the nvidia driver for a while now with no issues. Is anyone still seeing this issue?

@kevinoid

I can confirm that this is still an issue for me on Debian with firejail 0.9.54-1, nvidia 390.77-1, and Linux 4.18.0 (from kernel.org, without Debian patches). `firejail --noprofile glxgears` works fine, but `firejail --noprofile --noroot glxgears` shows a black window without the spinning gears. I updated the gist at https://gist.github.com/kevinoid/cb1c4ed6c8f073d41b1c4e1039e04e99#file-glxgears-strace-diff with the strace differences between the two.

@chiraag-nataraj
Collaborator

That's very interesting. I'm on Debian sid/experimental, with firejail 0.9.55 (from git), nvidia 396.51-1, and Linux 4.17.8 (custom-built from linux-source-4.17). Both `firejail --noprofile glxgears` and `firejail --noprofile --noroot glxgears` give me the normal window (spinning gears).

@chiraag-nataraj
Collaborator

@kevinoid I'm curious to see if you run into the issue with a Debian stock kernel. Can you test and report back?

@kevinoid

@chiraag-nataraj Sure. I can confirm that the same behavior occurs when running kernel 4.17.0-1-amd64 from the linux-image-4.17.0-1-amd64 package (version 4.17.8-1) with nVidia modules built using nvidia-kernel-dkms (version 390.77-1).

@chiraag-nataraj
Collaborator

Damn, that's weird. So I have a newer version of nvidia-driver, but I've never had this issue, so I suspect that the driver version doesn't really matter too much.

@SkewedZeppelin
Collaborator

Multiple monitors? Different outputs?

It has always been arbitrary when the driver calls its suid binary, IIRC.

@kevinoid

I agree. I don't think it is version-specific. I'm currently using the laptop screen. I can test the VGA output soon.

If you run `firejail --noprofile --noroot strace -f -o glxgears.strace glxgears` do you see any `chmod("/dev/nvidiactl", 0666)` calls in `glxgears.strace`? I'm curious whether it is making those calls and whether the call succeeds.

Also, for reference, my system has nVidia Optimus, so it has both an Intel graphics card and an nVidia card. The problem does not occur when using Mesa on the Intel card.
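
The strace check asked for in this comment can be scripted. The following is a minimal sketch, assuming the log is the `glxgears.strace` file produced by the command quoted in the comment. `failed_nvidia_chmods` is a hypothetical helper name, and the pattern only covers plain `chmod` calls in the form strace usually prints them.

```shell
# Print chmod calls on /dev/nvidia* that returned -1 in an strace log ($1).
# The exit status is non-zero when no such call is found (grep's convention).
failed_nvidia_chmods() {
    grep -E 'chmod\("/dev/nvidia[^"]*", [0-7]+\)[[:space:]]*= -1 ' "$1"
}

# Example usage:
#   failed_nvidia_chmods glxgears.strace
```

An empty result means either that the driver made no such calls or that they succeeded. Searching the log for `chmod` alone distinguishes the two cases.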

@kevinoid

I can confirm the same behavior occurs when using the VGA output with LVDS (the laptop screen) disabled.

@chiraag-nataraj
Collaborator

> I agree. I don't think it is version-specific. I'm currently using the laptop screen. I can test the VGA output soon.

I doubt the output itself would matter, since the same driver would be driving whichever screen(s) you're using.

> If you run `firejail --noprofile --noroot strace -f -o glxgears.strace glxgears` do you see any `chmod("/dev/nvidiactl", 0666)` calls in `glxgears.strace`? I'm curious whether it is making those calls and whether the call succeeds.

I actually do see those calls and they do fail with EPERM (Operation not permitted) and glxgears works just fine regardless.

> Also, for reference, my system has nVidia Optimus, so it has both an Intel graphics card and an nVidia card. The problem does not occur when using Mesa on the Intel card.

Yup, I have Optimus as well.

@chiraag-nataraj
Collaborator

Is this still an issue?

@kevinoid

Thanks for checking in. I can confirm that this is still an issue for me on Debian with firejail built from the current master branch (feae44c), nvidia 390.116-1, and Linux 5.1.2 (from kernel.org, without Debian patches). `firejail --noprofile glxgears` works fine, but `firejail --noprofile --noroot glxgears` shows a black window without the spinning gears.

@chiraag-nataraj
Collaborator

Hmm, I see. Is this an issue if you use the Debian-supplied kernel instead of the custom build?

@kevinoid

Yep. I can confirm that the same symptoms occur with firejail built from feae44c, nvidia 390.116-1, and Linux 4.19.0-5-amd64 (from the linux-image-4.19.0-5-amd64 Debian package version 4.19.37-3).

@chiraag-nataraj
Collaborator

I'm wondering if it just has to do with something they changed on their end, since I'm on 418.74...

@chiraag-nataraj
Collaborator

But looking back, that doesn't seem to matter...I'm stumped.

@chiraag-nataraj
Collaborator

If you do `firejail --noroot --noprofile glxinfo`, what is the output?

@kevinoid

I added the output of `firejail --noroot --noprofile glxinfo` to the gist with the strace outputs: https://gist.github.com/kevinoid/cb1c4ed6c8f073d41b1c4e1039e04e99#file-glxinfo-noroot-txt

@chiraag-nataraj
Collaborator

Hold on. Your glxinfo output is showing that glxinfo has no problem accessing your Nvidia card. Something's really screwy here, since you're reporting that glxgears has the problem, but your output shows that glxinfo does not.
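
For anyone repeating this check, the relevant part of a glxinfo dump is only a handful of lines. The following is a minimal sketch that keeps the direct-rendering flag and the vendor, renderer, and version strings, assuming the usual field names printed by glxinfo. `glx_summary` is a hypothetical helper name, and it reads a saved file or standard input.

```shell
# Print the lines of glxinfo output that identify the active GL driver.
# Reads the file given as $1, or standard input when no file is given.
glx_summary() {
    grep -E '^(direct rendering|OpenGL (vendor|renderer|version) string):' "${1:--}"
}

# Example usage:
#   firejail --noroot --noprofile glxinfo | glx_summary
```

A vendor string naming NVIDIA here only shows that a GL context could be created. As the thread shows, that does not guarantee that a program which actually draws frames, such as glxgears, renders correctly.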

@kevinoid

That is correct. On my system glxinfo appears to work fine with `--noroot`. glxgears does not work with `--noroot`.

@chiraag-nataraj
Collaborator

That's so odd...I would presume they'd both attempt to access the Nvidia driver the same way...

@czka

czka commented Mar 15, 2020

For the record: I see a similar thing with nvidia driver 390.132-32 on Arch Linux, trying to run Operation Flashpoint (aka Arma: Cold War Assault these days) on Steam 1.0.0.61-4 in Gnome 3.36.0, X11 mode.

With noroot disabled in /etc/firejail/steam.profile the game runs fine. When noroot is set (as per Firejail's default), the game's video freezes as long as its full-screen "window" is focused. However, when I press alt+tab to switch windows, I can see the game's video unfrozen underneath the list of windows. And so forth: once I switch back to the game's window, the video gets stuck, and when I press alt+tab I can see the game actually running fine underneath the list of my active windows.

@netblue30 Maybe add a hint regarding noroot in steam.profile for people trying to run games on Linux+Steam+Nvidia? It took me a long moment to figure this out.

rusty-snake pushed a commit that referenced this issue Mar 15, 2020
@rusty-snake
Collaborator

@czka done.

@czka

czka commented Mar 22, 2020

@rusty-snake Kewl!

@matu3ba
Contributor

matu3ba commented Sep 8, 2020

@rusty-snake Could you close this as fixed, given the lack of feedback?

@rusty-snake
Collaborator

Maybe we want to keep it open until we have a better fix than adding a note that nvidia users should add `ignore noroot` to some profiles. I'm not sure.
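
The `ignore noroot` workaround mentioned here can be applied without editing the packaged profile by putting it in a per-user override file. The following is a minimal sketch, assuming the usual per-user location `~/.config/firejail/steam.local` for overrides to the steam profile. `add_ignore_noroot` is a hypothetical helper name, and the optional path argument exists only so the function can be pointed at a different override file.

```shell
# Append "ignore noroot" to a firejail override file ($1) unless the
# line is already present. Defaults to the per-user steam.local file.
add_ignore_noroot() {
    file="${1:-$HOME/.config/firejail/steam.local}"
    mkdir -p "$(dirname "$file")"
    grep -qxF 'ignore noroot' "$file" 2>/dev/null ||
        printf 'ignore noroot\n' >> "$file"
}

# Example usage:
#   add_ignore_noroot
#   add_ignore_noroot "$HOME/.config/firejail/mpv.local"
```

Keeping the change in a `.local` file means it survives package upgrades that replace the profile itself. Note that it also gives up the protection `noroot` provides for that program.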

@netblue30
Owner

Temporary fix in for the next release: I disable nogroups if /dev/nvidiactl is detected on the system. If it works, we'll go with it in the next release and find a better way to do it later.
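
To see whether this detection would trigger on a given machine, the same condition can be tested from a shell. This is only a shell analogue of the check described in the comment, not firejail's own code. `nvidia_detected` is a hypothetical helper name, and the optional argument is an assumption added so the path can be overridden.

```shell
# Succeed (exit status 0) when the nvidia control device node exists.
# $1 overrides the path to check; the default is /dev/nvidiactl.
nvidia_detected() {
    [ -e "${1:-/dev/nvidiactl}" ]
}

# Example usage:
#   if nvidia_detected; then
#       echo "nvidia proprietary driver appears to be loaded"
#   fi
```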

@netblue30 netblue30 added the "in testing" label (a bugfix that is being tested) Oct 2, 2020
kmk3 added a commit to kmk3/firejail that referenced this issue Nov 30, 2021
Remove workaround from commit 623e682 ("temporary fix for
nvidia/nogroups/noroot issue (netblue30#3644, netblue30#841)", 2020-10-02) and from commit
cb460c3 ("more nvidia (netblue30#3644)", 2020-10-03).

The handling of the "render" and "video" groups is separate from
`nogroups` now, so disabling `nogroups` on nvidia shouldn't be necessary
anymore.  See the previous 2 commits for details.

See also the discussion on PR netblue30#4632.
kmk3 added a commit to kmk3/firejail that referenced this issue Nov 30, 2021
`nogroups` should not have been causing issues with rendering on nvidia
since commit 623e682 ("temporary fix for nvidia/nogroups/noroot issue
(netblue30#3644, netblue30#841)", 2020-10-02) and commit cb460c3 ("more nvidia (netblue30#3644)",
2020-10-03), which had made it a no-op on nvidia.  And the handling of
the "render" and "video" groups is independent of the handling of
`nogroups` now; see the previous 3 commits.

Commits which introduced the comments on each profile:

* kodi.profile: commit ce462b6 ("fix netblue30#3501", 2020-07-16)
* mpsyt.profile: commit e17b48f ("new profile mpsyt.profile",
  2018-11-28)
* mpv.profile: commit cc7c489 ("Document netblue30#1945", 2018-07-25)
* steam.profile: commit d6f8169 ("steam fixes; netblue30#841, netblue30#3267",
  2020-03-15)

Commands used to find the comments:

    git grep -i nvidia -- etc/profile-* | grep -v private-etc

Relates to netblue30#4632.
@rusty-snake rusty-snake removed the "in testing" label Jun 21, 2022
kmk3 added a commit that referenced this issue Jun 25, 2024
It has been reported in #6372 that after upgrading the nvidia
proprietary driver from version 550.78 to 550.90.07, programs using
hardware acceleration fail unless paths in `/sys/module/nvidia*` are
accessible.  Example:

    $ firejail --noprofile prime-run /bin/glxdemo
    [...]
    X Error of failed request:  BadValue (integer parameter out of range for operation)
      Major opcode of failed request:  150 (GLX)
      Minor opcode of failed request:  3 (X_GLXCreateContext)
      Value in failed request:  0x0
      Serial number of failed request:  22
      Current serial number in output stream:  23
    [...]

Meanwhile, the AMD proprietary driver (AMDGPU Pro) seems to depend on
`/sys/module/amdgpu` for OpenCL (though it is unclear how to detect that
driver).  See commit 95c8e28 ("Allow accessing /sys/module directory",
2018-05-08) and commit 9dd581d ("Allow AMD GPU usage by Blender",
2018-05-08) from PR #1932.

So whitelist `/sys/module/nvidia*` by default if the nvidia proprietary
driver is detected and `no3d` is not used.

Note: The driver check is copied from src/firejail/util.c (see #841).

To keep the current behavior (that is, block all modules), add
`blacklist /sys/module` to globals.local.

Fixes #6372.

Reported-by: @GreatBigWhiteWorld
Reported-by: @orzogc
Reported-by: @krop
Reported-by: @michelesr
Suggested-by: @glitsj16
Tested-by: @flyxyz123
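
To see which directories the `/sys/module/nvidia*` pattern from this commit message covers on a particular host, the matching module directories can be listed. The following is a minimal sketch. `list_nvidia_modules` is a hypothetical helper name, and the optional root argument is an assumption added so the function can be run against a directory other than `/sys/module`.

```shell
# List module directories matching nvidia* under a sysfs module root
# ($1, default /sys/module). Prints nothing when there are none.
list_nvidia_modules() {
    root="${1:-/sys/module}"
    for dir in "$root"/nvidia*; do
        # An unmatched glob stays literal and fails the -d test.
        [ -d "$dir" ] && printf '%s\n' "${dir##*/}"
    done
    return 0
}

# Example usage:
#   list_nvidia_modules
```

As the commit message notes, adding `blacklist /sys/module` to globals.local keeps the earlier behavior of blocking all of these paths.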
@kmk3 kmk3 added the "graphics" label Sep 14, 2024