linux: add zero-copy screen capture using KMS and EGL #1758

Closed
wants to merge 26 commits into from

Conversation

w23

@w23 w23 commented Mar 18, 2019

Note: This is still a work-in-progress change. It works reliably, but the code quality is not yet good enough to merge upstream. There are two reasons for opening this pull request now: visibility and discussion. There are things that I'd like to have feedback on. More context is here: https://obsproject.com/forum/threads/experimental-zero-copy-screen-capture-on-linux.101262/

Approach

The purpose of this change is to add zero-copy screen capture on Linux, avoiding slow XSHM copies, and also incidentally enabling Wayland screen capture.
It works by importing the KMS (DRM) display framebuffer directly as a GL texture, using the EGL_EXT_image_dma_buf_import extension. Access to that framebuffer requires either root or CAP_SYS_ADMIN, so a helper setuid/setcap binary is used to retrieve the framebuffer's fd and sendmsg() it to the main OBS process.
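
To make the import step more concrete, here is a rough sketch of the attribute list that EGL_EXT_image_dma_buf_import expects for a single-plane buffer. It is written in Python purely for illustration. In C the list would be passed to eglCreateImageKHR with the EGL_LINUX_DMA_BUF_EXT target and the resulting image bound to a texture with glEGLImageTargetTexture2DOES. The helper names `fourcc` and `dmabuf_image_attribs` and the default `XR24` (XRGB8888) format are assumptions of this sketch, not code from the PR. The real format would come from the framebuffer metadata.

```python
# Token values as defined in the Khronos EGL headers (egl.h / eglext.h).
EGL_NONE = 0x3038
EGL_HEIGHT = 0x3056
EGL_WIDTH = 0x3057
EGL_LINUX_DRM_FOURCC_EXT = 0x3271
EGL_DMA_BUF_PLANE0_FD_EXT = 0x3272
EGL_DMA_BUF_PLANE0_OFFSET_EXT = 0x3273
EGL_DMA_BUF_PLANE0_PITCH_EXT = 0x3274


def fourcc(code):
    """Pack a four-character DRM format code into a little-endian 32-bit value."""
    a, b, c, d = (ord(ch) for ch in code)
    return a | (b << 8) | (c << 16) | (d << 24)


def dmabuf_image_attribs(fd, width, height, pitch, offset=0, fmt="XR24"):
    """Build the EGL_NONE-terminated attribute list for a single-plane DMA-BUF."""
    return [
        EGL_WIDTH, width,
        EGL_HEIGHT, height,
        EGL_LINUX_DRM_FOURCC_EXT, fourcc(fmt),
        EGL_DMA_BUF_PLANE0_FD_EXT, fd,
        EGL_DMA_BUF_PLANE0_OFFSET_EXT, offset,
        EGL_DMA_BUF_PLANE0_PITCH_EXT, pitch,
        EGL_NONE,
    ]
```

Multi-planar formats would need the PLANE1/PLANE2 attributes as well, which this sketch leaves out.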

There are two main changes:

  • add EGL OpenGL context creation path alongside older GLX. EGL is newer and has necessary extensions for importing DMA-BUF objects as GL textures. GLX-vs-EGL is user-selectable in settings, and it defaults to GLX.
  • add dmabuf_source plugin that imports KMS framebuffer as GL texture. It is added as a part of linux-capture module.

dmabuf_source

Alongside the main plugin there's obs-drmsend utility that requires either setuid or setcap cap_sys_admin+ep. It works like this:

  1. dmabuf_source plugin opens a unix domain socket and runs the obs-drmsend utility.
  2. obs-drmsend enumerates the currently available framebuffers, acquires their DMA-BUF fds, sends them along with metadata to this socket (the fds are sent as ancillary sendmsg() data of type SCM_RIGHTS), and then exits.
  3. dmabuf_source remembers these fds and allows the user to pick from available framebuffers.
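
The fd hand-off in the steps above can be sketched as follows. This is a minimal Python illustration of SCM_RIGHTS passing over a unix socket pair, not the PR's C code. The metadata layout (`META`), the function names, and the use of a pipe as a stand-in for a DMA-BUF fd are all assumptions made for the example.

```python
import array
import os
import socket
import struct

# Hypothetical metadata layout for this sketch: width, height, pitch, fourcc.
META = struct.Struct("<IIII")


def send_framebuffer(sock, fd, width, height, pitch, fourcc):
    """Send metadata as the payload and the fd as SCM_RIGHTS ancillary data."""
    payload = META.pack(width, height, pitch, fourcc)
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                  array.array("i", [fd]).tobytes())]
    sock.sendmsg([payload], ancillary)


def recv_framebuffer(sock):
    """Receive one message and return (fd, metadata tuple)."""
    fds = array.array("i")
    payload, ancdata, _flags, _addr = sock.recvmsg(
        META.size, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0], META.unpack(payload)


def demo():
    """Round-trip a pipe's read end, standing in for a DMA-BUF fd."""
    helper, plugin = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        send_framebuffer(helper, read_end, 1920, 1080, 1920 * 4, 0x34325258)
        received_fd, meta = recv_framebuffer(plugin)
        os.write(write_end, b"ok")
        data = os.read(received_fd, 2)  # the received fd refers to the same pipe
        os.close(received_fd)
        return meta, data
    finally:
        helper.close()
        plugin.close()
        os.close(read_end)
        os.close(write_end)
```

The kernel duplicates the descriptor into the receiving process, so the receiver owns its copy and must close it, even after the helper has exited.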

Issues

  • Linking EGL still requires the presence of libEGL.so on the client's machine, even if it is not used. Not depending on libEGL requires splitting or programmatically loading (dlopen, dlsym) libEGL in glad and linux-capture.
  • Xcomposite source is not patched to support EGL.
  • Naming is hard. "dmabuf_source" feels weird. Common term for such functionality is "kmsgrab".
  • Running compton as the X compositor messes with framebuffers, changing them too often. This breaks this screen capture method. (It is not known whether other compositors do the same.)
  • xrandr/mode change wasn't tested. It will definitely break something (e.g. replace the framebuffer, or change its dimensions).
  • The cursor is supported using xcursor-xcb, which depends on the framebuffer and X11 coordinates being aligned.
  • It likely doesn't work on nvidia.
  • No synchronization with framebuffer updates is made.

Further possible improvements

  • follow CRTCs (display devices), not framebuffers. Necessary to properly support xrandr/mode changes. Requires running obs-drmsend in background, and more involved IPC.
  • read cursor from planes, not from xcb. Will also support wayland.

I will strive to update the information above as I make changes.

EGL is a modern replacement for GLX, and it is required for
importing/exporting DMA-BUF fds as GL textures. This is needed for
zero-copy screen grabbing and GPU encoding on Linux.

This implementation is naive and only makes a first step in the right
direction.
- It is a compile-time decision for now. I plan to make it a runtime
decision using a separate libobs-opengl-egl.so module (or something).
- It breaks all current ways to capture screen on X11, because those
assume GLX. It would be possible to fix them, but I don't have any plans
to do that ATM.
- requires github.com/w23/drmtoy/drmsend running as root or
cap_sys_admin
- drmsend socket is hardcoded as /home/steaream/tmp/drmsend.sock
- see previous commit comment about EGL
Unfortunately QFileDialog won't show unix sockets, so user has to type
in the final filename manually.

sad
- Detect libEGL at build time
- Create special libobs-opengl-egl for EGL
- Show renderer selection in settings on Linux
- Fallback on GLX if EGL failed
- check for GS_DEVICE_OPENGL_EGL in dmabuf_source

linux-capture/XSHM works under EGL context
linux-capture/Xcomposite is disabled under EGL context

fixes #1
Basically just a copy from the w23/drmtoy repo.
It is built as the `obs-drmsend` executable alongside `obs`.
Note that cmake would have to be run as superuser to be able to add
CAP_SYS_ADMIN or set the setuid bit on the executable, so this step is commented
out for now. The builder must perform that step manually, and also ensure that
this binary is run from a filesystem mounted without the nosuid option.

Fix #4
Doesn't require drmtoy/enum/drmsend anymore.

Still work-in-progress. Hardcoded paths:
- DRI card: /dev/dri/card0
- obs-drmsend executable: ./obs-drmsend
- unix socket: /tmp/drmsend.sock

Developer still needs to manually `setcap cap_sys_admin+ep obs-drmsend`
on non-nosuid mounted fs.
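
Since nosuid mounts ignore both the setuid bit and file capabilities, a quick pre-flight check can save some head-scratching. The following is a small sketch using only the Python standard library. The function names are made up for this example and nothing like it exists in the PR.

```python
import os
import stat


def mounted_nosuid(path):
    """True if the filesystem holding `path` is mounted with the nosuid option."""
    return bool(os.statvfs(path).f_flag & os.ST_NOSUID)


def has_setuid_bit(path):
    """True if the file at `path` has the setuid bit set."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)
```

For example, `mounted_nosuid("./obs-drmsend")` returning True would mean the helper cannot gain CAP_SYS_ADMIN from that location, whatever `setcap` was applied to it.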

Framebuffers are selectable from properties screen. Framebuffer id is
stored in settings. But these ids are volatile, they can change on mode
setting, xrandr updates, compositor (compton) stuff, and they certainly
don't survive reboots.
On some machines EGLContext cannot be created with
EGL_CONTEXT_OPENGL_DEBUG attribute. Retry without.
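
The retry described in the last commit message boils down to a two-attempt fallback. Below is a hedged sketch of that control flow in Python. `create_context` is a hypothetical callable standing in for eglCreateContext that returns None on failure, and `base_attribs` is an unterminated attribute list. The actual C code in the PR may be structured differently.

```python
# Token values as defined in the Khronos EGL headers.
EGL_NONE = 0x3038
EGL_CONTEXT_OPENGL_DEBUG = 0x31B0
EGL_TRUE = 1


def create_context_with_fallback(create_context, base_attribs):
    """Try a debug context first and retry without the debug attribute on failure.

    Returns (context, debug_enabled). `context` is None if both attempts fail.
    """
    debug_attribs = list(base_attribs) + [EGL_CONTEXT_OPENGL_DEBUG, EGL_TRUE, EGL_NONE]
    context = create_context(debug_attribs)
    if context is not None:
        return context, True
    # Some drivers reject the debug attribute outright, so retry without it.
    context = create_context(list(base_attribs) + [EGL_NONE])
    return context, False
```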
@jp9000
Member

jp9000 commented Mar 18, 2019

Hi, thanks for making this PR, I've been looking forward to a higher performance way to capture on Linux.

Just to warn you, we're a bit backlogged on PRs at the moment, so it might be quite some time before we'll be able to review it. Looking forward to it though.

@TheMuso
Contributor

TheMuso commented Mar 18, 2019 via email

@w23
Author

w23 commented Mar 18, 2019

Just to warn you, we're a bit backlogged on PRs at the moment, so it might be quite some time before we'll be able to review it. Looking forward to it though.

No problem. I think it will take me at least a few more weeks to clean this out of WIP status, given how little free time I have to work on this.
In the meantime I'd still appreciate high-level feedback on whether I'm doing things right and moving the system in the right direction. I'll update the description with more details a bit later to give possible reviewers an easier time understanding this.

If we really must have a suid binary to fetch the fd, then it should be authenticated via policykit IMO. That way if distros want to lock it down with a password, then they can do so, even if the default is to allow it with no need for a password.

Thanks for the suggestion! I was scratching my head on how to approach this elevated capabilities thing, but had zero knowledge of polkit. I will have to read about it and figure out whether it supports running a binary with just one required capability, instead of full-root.

@TheMuso
Contributor

TheMuso commented Mar 18, 2019 via email

@Sunderland93

There is no PipeWire support?

@w23
Author

w23 commented Mar 19, 2019

There is no PipeWire support?

Nope. While I'm fascinated by pipewire and certainly looking forward to it being stabilized, it is out of scope of this relatively conservative change.

There was a separate pipewire-source effort here: https://gitlab.com/petejohanson/obs-pipewire-screen-casting and discussed here: https://obsproject.com/mantis/view.php?id=719

@Sunderland93

Does it support capture of game's window only?

@w23
Author

w23 commented Mar 19, 2019

Does it support capture of game's window only?

It doesn't. Moreover, in its current state it breaks existing Xcomposite support :D (on EGL, GLX mode is not affected), but I will fix that in a subsequent update before marking it as ready for review.

For this particular plugin to support capture of a single game window, there should be support for that in either the game itself, or window system (X11 or Wayland). I'm not aware of any robust way around that (it's technically possible to crop a region from a full framebuffer based on X11 window rect, but that is a bit weird).
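
The cropping workaround mentioned in parentheses is mostly rectangle clamping. Here is a minimal sketch, assuming the framebuffer and X11 root coordinates are aligned (which the PR description already lists as a caveat). `crop_to_framebuffer` is a hypothetical helper, not part of the plugin.

```python
def crop_to_framebuffer(win_x, win_y, win_w, win_h, fb_w, fb_h):
    """Clamp a window rectangle to the framebuffer bounds.

    Returns (x, y, width, height) of the visible part, or None if the window
    lies completely outside the framebuffer.
    """
    left = max(win_x, 0)
    top = max(win_y, 0)
    right = min(win_x + win_w, fb_w)
    bottom = min(win_y + win_h, fb_h)
    if right <= left or bottom <= top:
        return None
    return (left, top, right - left, bottom - top)
```

This only yields the region to sample. Anything overlapping the window on screen would still end up in the capture, which is part of why the approach is "a bit weird".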

I feel that more pressing is the issue of capturing a particular CRTC/monitor regardless of framebuffer/mode/rotation changes. I haven't thought about it that much.

@Sunderland93

there should be support for that in either the game itself, or window system

Ok. Can PipeWire solve that (like Syphon on macOS)?

@kkartaltepe
Collaborator

there should be support for that in either the game itself, or window system

Ok. Can PipeWire solve that (like Syphon on macOS)?

PipeWire is just a transport; it doesn't do anything more than send audio/video streams between two programs. Syphon is an OpenGL capture solution, which is a very different beast.

This PR is a wonderful example of how to use dmabufs, which someone could use to inform an implementation of one of the Wayland screen recording protocols that sit on top of PipeWire for zero-copy capture on supporting Wayland compositors (though whether it's actually zero-copy depends on the compositor, as they may choose to serialize the textures in any format PipeWire supports).

@Sunderland93

This PR is a wonderful example of how to use dmabufs, which someone could use to inform an implementation of one of the Wayland screen recording protocols that sit on top of PipeWire for zero-copy capture on supporting Wayland compositors (though whether it's actually zero-copy depends on the compositor, as they may choose to serialize the textures in any format PipeWire supports).

What about this? https://github.com/swaywm/wlr-protocols/blob/master/unstable/wlr-export-dmabuf-unstable-v1.xml and this https://github.com/swaywm/wlr-protocols/blob/master/unstable/wlr-screencopy-unstable-v1.xml

@kkartaltepe
Collaborator

What about this?

One of the Wayland screen capture protocols not based on PipeWire. Basically all the protocols will result in passing dmabufs (with or without PipeWire) if they want to be performant. I didn't mean to derail this PR with Wayland talk, as it's not directly related to this PR.

@phaitonican
Contributor

Having much better performance with this on Wayland and Xorg. I can get it running on Xorg though (for FreeSync), but once I go into a Fullscreen game, it seems it doesn't record anymore? The Image kinda freezes, maybe it's only a bug for me? Using Xorg from arch repos... Thanks!

@w23
Author

w23 commented Mar 26, 2019

once I go into a Fullscreen game, it seems it doesn't record anymore? The Image kinda freezes, maybe it's only a bug for me? Using Xorg from arch repos... Thanks!

This is expected at this stage. Basically, when a game goes fullscreen it changes a framebuffer that is assigned to your monitor. Currently this plugin doesn't monitor for framebuffer changes, so it continues to read the old one, which becomes stale.

I plan to investigate what can be done. We'd need to keep obs-drmsend running, somehow listening for framebuffer changes, and notifying the parent obs process. There are also open questions: the framebuffer can be resized, what to do with multi-monitor configurations, how to remap the cursor if it is enabled, etc.
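
The stale-framebuffer problem comes down to noticing that the framebuffer id attached to a CRTC has changed between two enumerations. A minimal sketch of that comparison follows, assuming a hypothetical snapshot format `{crtc_id: fb_id}` that a long-running helper would produce by re-enumerating CRTCs. How the snapshot is obtained from libdrm and how the new fds are delivered is not shown.

```python
def changed_crtcs(previous, current):
    """Return {crtc_id: new_fb_id} for CRTCs whose framebuffer differs from `previous`.

    CRTCs that are new in `current` count as changed. CRTCs that disappeared
    are reported separately by removed_crtcs().
    """
    return {crtc: fb for crtc, fb in current.items() if previous.get(crtc) != fb}


def removed_crtcs(previous, current):
    """Return the sorted ids of CRTCs present in `previous` but not in `current`."""
    return sorted(set(previous) - set(current))
```

A helper polling at some interval could call `changed_crtcs` on consecutive snapshots and re-send fds only for the entries it returns.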

How does performance compare to using XComposite for capturing your game?

@jp9000
Member

jp9000 commented Feb 29, 2020

How is this looking? (By the way, watch out for those unnecessary merge commits in there)

@kkartaltepe
Collaborator

kkartaltepe commented Feb 29, 2020

It likely doesn't work on nvidia.

On my nvidia system I see (using obs-drmsend as root and setuid)

obs-drmsend: Opening card /dev/dri/card0
Cannot get drm planes: 95: Operation not supported

@w23
Author

w23 commented Feb 29, 2020

How is this looking? (By the way, watch out for those unnecessary merge commits in there)

The change itself works fine, I've been actively using it for streaming for about a year now, without any issues (related to this change) whatsoever.

However, there are things that need fixing before I'd consider it mergeable, and they seem rather involved and time-consuming to address (I've been stuck for months in this state without being able to allocate enough time to even design possible approaches). Most importantly, there's an issue with dynamic linking with EGL vs GLX vs glad. It is likely that upstream glad patches would be needed to support everything properly. Xcomposite is one extension that would need non-trivially different GLX/EGL modes.

Then there are known and unknown failure modes and other issues (arguably we could bear with these and let greater community figure out solutions):

  • I found no way to get GPU fd/device name from existing context, so there has to be /dev/dri/card%d picker, and whole diagnostics around it.
  • Making it work on nvidia requires additional research
  • Some compositors are known to break this method due to the way they juggle framebuffers
  • It doesn't like mode changes (xrandr, games, etc)
  • There is no explicit sync, so maybe some tearing is possible. Never ran into this myself even once, though, so maybe there's some implicit syncing somewhere
  • Actual performance characteristics were never profiled, except for anecdotal "capture rate was <20fps, and now it's 60".
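
For the `/dev/dri/card%d` picker mentioned in the first bullet, the enumeration part could look roughly like this. It is a sketch under the assumption that primary nodes are simply named `card<N>` inside one directory. `list_dri_cards` is a made-up name, and the diagnostics around permissions or missing planes are left out.

```python
import glob
import os
import re


def list_dri_cards(dri_dir="/dev/dri"):
    """Return paths of card<N> nodes in `dri_dir`, sorted by their number."""
    cards = []
    for path in glob.glob(os.path.join(dri_dir, "card*")):
        match = re.fullmatch(r"card(\d+)", os.path.basename(path))
        if match:
            cards.append((int(match.group(1)), path))
    return [path for _, path in sorted(cards)]
```

Sorting numerically keeps `card10` after `card2`, which a plain string sort would not.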

@w23
Author

w23 commented Feb 29, 2020

Forgot to ask, what about unnecessary merge commits? Are you suggesting rebasing and force-pushing?

@jp9000
Member

jp9000 commented Mar 1, 2020

You don't have to do anything about the merge commits for now; as long as you're aware of them and know how to deal with them, that's fine. When I see unnecessary merge commits in a branch it usually just means a bit of git inexperience, so it kind of makes me a little bit nervous. If no one else is modifying the branch right now you can rebase any time you want. You don't have to rebase now, but you do when it's ready for a final review.

@ghost

ghost commented Mar 27, 2020

Would love to be able to use this in Wayland, maybe this can be rebased on top of #2482 at some point.

@w23
Author

w23 commented Mar 27, 2020

I've been using this on Wayland for many-many months now. OBS itself has to run in X11 mode, but this mechanism works for capturing Wayland screen just fine.

In terms of rebasing: it feels weird to rebase on top of something that hasn't been merged, and this thing here doesn't really depend on Wayland in any way.
Unfortunately, I've been lacking bandwidth to give this any substantial attention, but TL;DR I feel the sequence of merge events should be like this:

  1. Get OBS to support EGL on X11 on the same functional level as GLX.
  2. After this, both Wayland and window-system-agnostic KMS grabbing are mostly orthogonal and can be done independently.
    The change here supports EGL in a not very clean way and just barely enough to get started. Many essential things are missing (e.g. Xcomposite on X11 is completely broken).
    There are many questions on how EGL should be done properly, but again, it requires attention and effort, of which there isn't much available right now unfortunately.

@zakk4223

zakk4223 commented Jul 5, 2020

Would GL_EXT_memory_object_fd work here to help remove the need to convert things to EGL? I guess you could even take the long way around by using the dma_buf vulkan extension and then using EXT_memory_object_fd to share the vulkan object with OpenGL. Not sure if all this is more of a pain than just fixing up the EGL stuff though.

@w23
Author

w23 commented Jul 5, 2020

Thanks for pointing out this extension. I didn't know it existed.
I'll try to experiment with it.

@shmerl

shmerl commented Oct 5, 2020

Will there be some alternative that works without Vulkan/WSI instead of EGL?

@ghost

ghost commented Feb 22, 2021

I think this can be rebased again now that EGL support is merged in master?

@w23
Author

w23 commented Feb 22, 2021

I think this can be rebased again now that EGL support is merged in master?

I'll try to get to do it later today or tomorrow.

Btw, I already tried newer master w/ EGL and wlr-obs (without this KMS thing), and ran into weird colorspace issues. It's probably wlr-obs/wlroots/sway issue though.
It also works a bit better than my KMS thing, given that there's proper sync and output tracking.

@w23
Author

w23 commented Feb 23, 2021

TWIMC: I've quickly and rather dirtily "rebased" this on top of master in a separate branch here: https://github.com/w23/obs-studio/tree/linux-libdrm-grab-new
Now that the EGL part is out of the way, this seems rather clean change-wise (huge thanks to everyone involved in the EGL+Wayland effort).
I now feel that this doesn't really belong in the obs repo, given its dances with root/setcap and general kludginess. I think it would be better as a standalone forever-experimental plugin.

@w23
Author

w23 commented Feb 25, 2021

Extracted the libdrm/kms portion of this into a standalone linux-kmsgrab plugin here: https://github.com/w23/obs-kmsgrab, if anyone is still interested.

@w23 w23 closed this Feb 25, 2021
@danir-de

What is the status on this?
Has the feature been implemented in some other form or was it completely dropped?

@kkartaltepe
Collaborator

kkartaltepe commented Sep 19, 2022

this functionality is provided without any elevated permissions via the xdg-desktop-portal capture that was implemented in obs to support wayland. For X11 if your compositor provides an xdg-desktop-portal implementation (on x11) you can also use that on v28 where egl is the default.

@stephematician

this functionality is provided without any elevated permissions via the xdg-desktop-portal capture that was implemented in obs to support wayland. For X11 if your compositor provides an xdg-desktop-portal implementation (on x11) you can also use that on v28 where egl is the default.

@kkartaltepe - apologies for my ignorance, but I'm having difficulty figuring out which compositor provides an xdg-desktop-portal? I'm running Ubuntu MATE 22.04, which does have xdg-desktop-portal-gtk installed, but I cannot tell if it is used/helpful.

Labels: Enhancement, Linux, Seeking Other Contributors, Work In Progress