don't copy pixmap data to ram: avoid the round-trips and stay on the GPU if we can #365

totaam · 2013-07-01T15:27:20Z

EXT_texture_from_pixmap should let us use the pixels directly on the GPU for CSC and/or encoding, without first copying them to RAM (via XShmGetImage or XGetImage, as we do now).


XShmGetImage is pretty fast, but we then have to upload the data to the graphics card again (assuming we do CSC on the GPU, which is the fastest option) and then download the results. That is quite wasteful, especially at high resolutions.
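To put a rough number on "wasteful", here is a back-of-the-envelope sketch of the bus traffic for the upload-then-download path. The function names are made up for illustration, and the figures assume 4 bytes per captured pixel (BGRX) and NV12 output from the CSC step at 1.5 bytes per pixel; real numbers depend on the pixel formats and encoder actually in use.

```python
# Rough per-frame bus traffic for the copy-to-RAM path.
# Assumptions (not measured): 4 bytes per captured pixel,
# NV12 output from CSC at 12 bits (1.5 bytes) per pixel.

def frame_bytes(width, height, bytes_per_pixel=4):
    """Size of one captured frame in bytes."""
    return width * height * bytes_per_pixel

def nv12_bytes(width, height):
    """Full-resolution Y plane plus a half-resolution interleaved UV plane."""
    return width * height * 3 // 2

def round_trip_bytes(width, height):
    """Upload the captured pixels, then download the NV12 result."""
    return frame_bytes(width, height) + nv12_bytes(width, height)

for w, h in ((1920, 1080), (3840, 2160)):
    per_frame = round_trip_bytes(w, h)
    print(f"{w}x{h}: {per_frame / 1e6:.1f} MB per frame, "
          f"{per_frame * 30 / 1e6:.0f} MB/s at 30 fps")
```

The 4K case moves four times as much data per frame as 1080p, which is why the cost grows so quickly with resolution.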


totaam · 2013-12-12T06:37:24Z

Here's how I think this can work.
Note: it might be easier to test this using "xpra shadow" and a full display copy since the "root window" never goes away, and we already have code to override behaviour for the root window (see GTKRootWindowModel).

When we reach window.get_image(x, y, w, h) in WindowSource, we can start copying the display pixels to a PBO (maybe even asynchronously!) using glReadPixels or glCopyTexImage2D, and return an image wrapper for the PBO.
When we reach driver.memcpy_htod in nvenc, we can skip that part and instead use pycuda's gl functions to access the GL buffer (maybe this can be done as part of the NV12 CSC step anyway; even if we have to copy it to a CUDA-aligned buffer, this is no big deal).

Obviously, we'll also need fallback code for dealing with non-nvenc encoders, and lots of other little details I can't foresee.
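The control flow described above, including the fallback, could be sketched roughly like this. Everything here is hypothetical: `ImageWrapper`, `plan_encode` and the step names are placeholders for illustration, not xpra's actual classes, and the real GL / CUDA calls are replaced by descriptive strings so the logic can be followed without a GPU.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageWrapper:
    """Hypothetical stand-in for what get_image() would return."""
    width: int
    height: int
    pixels: Optional[bytes] = None    # set when the pixels live in RAM
    gl_buffer: Optional[int] = None   # set when the pixels live in a PBO

    def on_gpu(self) -> bool:
        return self.gl_buffer is not None


def plan_encode(image: ImageWrapper, encoder_uses_gpu: bool) -> List[str]:
    """Return the copy / encode steps needed for this image and encoder."""
    steps = []
    if encoder_uses_gpu:
        if image.on_gpu():
            steps.append("map GL buffer into CUDA")   # no host round-trip
        else:
            steps.append("upload pixels to GPU")      # the memcpy_htod step
        steps.append("CSC and encode on GPU")
    else:
        if image.on_gpu():
            steps.append("download pixels to RAM")    # fallback path
        steps.append("encode on CPU")
    return steps


pbo_image = ImageWrapper(1920, 1080, gl_buffer=1)
print(plan_encode(pbo_image, encoder_uses_gpu=True))
print(plan_encode(pbo_image, encoder_uses_gpu=False))
```

The point of keeping the location inside the wrapper is that only the GPU-backed image with a CPU encoder pays for a download, and only the RAM-backed image with a GPU encoder pays for an upload.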


totaam · 2014-08-19T11:00:56Z

Another API we could potentially use (maybe just on win32?) is NvIFROpenGL, for which there is zero documentation...
Only this entry in the 319.49 driver changelog:
"Added the NVIDIA OpenGL-based Inband Frame Readback (NvIFROpenGL) library to the Linux driver package. This library provides a high performance, low latency interface to capture and optionally encode an individual OpenGL framebuffer. NvIFROpenGL captures pixels rendered by OpenGL only and is ideally suited to application capture and remoting."

Although DRC seems to think it's not worth it:
"I determined that the IFR stuff is not any faster than using PBOs"

totaam · 2016-09-27T10:41:16Z

For win32 we now have an API we can use #1317 "nvidia capture sdk support"

totaam · 2017-06-18T14:34:52Z

See #1552 comment 1.

totaam · 2017-07-22T14:10:22Z

Boom! Done for NVFBC (#1317) + NVENC v8 (#1552) in r16458!
Improved in r16459, made the default in r16476.

Still TODO:

could keep the gpu buffer active, just in case we decide to actually use a video encoder, even after doing the scrolling detection and downloading the pixels to the CPU side (and we could make the scrolling detection tighter too when the gpu buffer is present)
could do scrolling detection via CUDA on the GPU
could do lz4 / zlib / whatever on GPU for small changes

totaam · 2017-07-23T14:23:13Z

Will follow up in #1597. Closing at last!

totaam · 2017-07-24T14:13:40Z

Also done for Linux in r16492: needed new NVENC kernels as Linux uses a different pixel format (XRGB vs BGRX).
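To illustrate why the pixel format matters to the conversion kernels, here is a small CPU-side sketch. It assumes the format name lists the channels in memory order, and it uses full-range BT.601 luma coefficients purely as an example; neither the table nor the coefficients are taken from the actual NVENC kernels.

```python
# Channel byte offsets within one 4-byte pixel, assuming the format
# name lists the channels in memory order (an assumption, see above).
OFFSETS = {
    "BGRX": {"B": 0, "G": 1, "R": 2},
    "XRGB": {"R": 1, "G": 2, "B": 3},
}

def luma(pixel: bytes, pixel_format: str) -> int:
    """Full-range BT.601 luma of one 4-byte pixel (illustrative only)."""
    o = OFFSETS[pixel_format]
    r, g, b = pixel[o["R"]], pixel[o["G"]], pixel[o["B"]]
    return round(0.299 * r + 0.587 * g + 0.114 * b)

red_bgrx = bytes([0, 0, 255, 0])   # pure red stored as B, G, R, X
red_xrgb = bytes([0, 255, 0, 0])   # pure red stored as X, R, G, B

# Same colour, same luma, as long as the matching offsets are used.
print(luma(red_bgrx, "BGRX"), luma(red_xrgb, "XRGB"))
# Reading XRGB data with BGRX offsets treats the red byte as green.
print(luma(red_xrgb, "BGRX"))
```

A kernel with the offsets baked in for one layout silently produces wrong colours on the other, which is consistent with needing separate kernels per input format.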

totaam · 2018-02-05T03:47:38Z

Regression: #1763

totaam closed this as completed Jul 23, 2017
