Support the application of stencils defined in CPU RAM to GPU images #42

bwsw · 2022-12-19T06:20:18Z

Linked (to do first): #43

Savant currently uses the standard NVIDIA approach to draw graphics on an image held in GPU memory: the image is mapped to the CPU, the changes are drawn, and the image is unmapped to apply the changes in GPU RAM. This requires two copies (GPU to CPU and back to GPU), which is excessive.

Another approach needs only a single copy, from CPU to GPU RAM. To implement it, Savant must provide an "overlay" operation that combines two images: a background (already on the GPU) and a foreground (a stencil, defined in CPU RAM and then moved to the GPU).

The approach requires the foreground image to be defined in a colorspace that supports an alpha channel (transparency), i.e. RGBA. After the FG image is moved to GPU RAM, it is placed over the original background image according to the transparency it defines.
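The "placed over with respect to transparency" step is the usual alpha "over" operator. A minimal sketch follows, in pure Python on nested lists for readability only; a real implementation would run the same arithmetic on the GPU. The function names `blend_pixel` and `overlay` are hypothetical and not part of Savant, and 8-bit channels are an assumption.

```python
def blend_pixel(fg, bg):
    """Composite one RGBA foreground pixel over one RGB background pixel."""
    r, g, b, a = fg
    # Integer "over" operator with rounding:
    # out = (fg * alpha + bg * (255 - alpha)) / 255
    return tuple(
        (f * a + c * (255 - a) + 127) // 255 for f, c in zip((r, g, b), bg)
    )


def overlay(background, stencil, left, top):
    """Place an RGBA stencil over an RGB background at (left, top).

    Images are lists of rows of pixel tuples. The background is modified
    in place; stencil pixels that fall outside of it are skipped.
    """
    height, width = len(background), len(background[0])
    for sy, row in enumerate(stencil):
        for sx, fg in enumerate(row):
            x, y = left + sx, top + sy
            if 0 <= x < width and 0 <= y < height:
                background[y][x] = blend_pixel(fg, background[y][x])
    return background
```

A fully opaque stencil pixel (alpha 255) replaces the background pixel, and a fully transparent one (alpha 0) leaves it untouched.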

The user can provide the following parameters:

- left_X, top_Y: define the placement offset;
- right_X, bottom_Y: define the scale (the module must be able to scale the image on the GPU, so that smaller images can be transferred and scaled afterwards);
- scale method: one of those supported by the NVIDIA libraries (e.g. https://docs.nvidia.com/vpi/algo_rescale.html), applied when the stencil size differs from the placement defined (or no scaling);
- modifier method: a preprocessing modifier such as 'blur' for the area where the stencil is applied ('blur', 'pixelize', and 'grayscale' are examples); the candidates may be found here
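As an illustration, the parameters above could be carried by a small value object like the one below. This is only a sketch: `StencilPlacement`, `ScaleMethod`, and their fields are hypothetical names, and treating right_X/bottom_Y as exclusive bounds is an assumption the issue does not state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class ScaleMethod(Enum):
    LINEAR = "linear"
    CATMULL = "catmull"
    NEAREST = "nearest"
    NOSCALE = "noscale"


@dataclass(frozen=True)
class StencilPlacement:
    left_x: int
    top_y: int
    right_x: int   # assumed exclusive
    bottom_y: int  # assumed exclusive
    scale_method: ScaleMethod = ScaleMethod.NOSCALE
    modifiers: Tuple[str, ...] = ()  # e.g. ("blur",)

    def __post_init__(self):
        if self.right_x <= self.left_x or self.bottom_y <= self.top_y:
            raise ValueError("placement rectangle must have a positive size")

    @property
    def target_size(self):
        """Width and height of the placement rectangle."""
        return self.right_x - self.left_x, self.bottom_y - self.top_y

    def scale_factors(self, stencil_width, stencil_height):
        """Return (fx, fy) needed to fit the stencil into the rectangle."""
        if self.scale_method is ScaleMethod.NOSCALE:
            return 1.0, 1.0
        width, height = self.target_size
        return width / stencil_width, height / stencil_height
```

For example, a 50x25 stencil placed into a 100x50 rectangle with `ScaleMethod.LINEAR` yields scale factors of 2.0 on both axes, which is the case where transferring a smaller image and scaling it on the GPU saves bus bandwidth.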

To make it efficient, two features are required:

- applying a single stencil to a whole batch;
- mapping multiple stencils to every BG image in the batch.

FG1: [<linear|catmull|nearest|noscale>, LX1,TY1, RX1, BY1],  [BG1, BG2, BG3, ...., BGN], []
FG2: [<linear|catmull|nearest|noscale>, LX2,TY2, RX2, BY2],  [BG1], [blur]
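Reading the notation above as "stencil: [scale method and placement], [target backgrounds], [modifiers]", the mapping can be inverted into a per-background work list, so each BG image knows which stencils to apply. The sketch below assumes that reading; `plan_batch` and its argument layout are hypothetical, and `None` is used here to mean "the whole batch".

```python
from collections import defaultdict


def plan_batch(stencils, batch_size):
    """Invert a stencil -> backgrounds mapping into per-background work lists.

    `stencils` maps a stencil name to (background_indices, modifiers);
    background_indices=None means "apply to the whole batch".
    """
    plan = defaultdict(list)
    for name, (backgrounds, modifiers) in stencils.items():
        targets = range(batch_size) if backgrounds is None else backgrounds
        for index in targets:
            if not 0 <= index < batch_size:
                raise IndexError(f"{name}: no background {index} in the batch")
            plan[index].append((name, tuple(modifiers)))
    return dict(plan)


# FG1 goes to every background, FG2 (with blur) only to the first one.
plan = plan_batch({"FG1": (None, []), "FG2": ([0], ["blur"])}, batch_size=3)
```

Stencils are applied to a background in the order they appear in the mapping, so later entries are drawn on top of earlier ones.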

To optimize CPU-GPU transfers, the method may require placing all the stencils on a single image (knapsack placement) with properly defined coordinates, so that one transfer over the PCI-E bus replaces several.
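One simple way to place several stencils on a single image is shelf packing: sort the rectangles by height and fill rows from left to right. Note that this is a greedy heuristic, not an optimal knapsack solver, and it is shown only as a sketch; `pack_shelves` and the fixed atlas width are assumptions, not something the issue specifies.

```python
def pack_shelves(sizes, atlas_width):
    """Pack rectangles into rows ("shelves") of a single atlas image.

    `sizes` maps a stencil name to (width, height). Returns the placements
    {name: (x, y)} and the resulting atlas height.
    """
    placements = {}
    x = y = shelf_height = 0
    # Tallest first keeps each shelf reasonably tight.
    for name, (width, height) in sorted(
        sizes.items(), key=lambda item: -item[1][1]
    ):
        if width > atlas_width:
            raise ValueError(f"{name} is wider than the atlas")
        if x + width > atlas_width:  # no room left: start a new shelf
            x, y = 0, y + shelf_height
            shelf_height = 0
        placements[name] = (x, y)
        x += width
        shelf_height = max(shelf_height, height)
    return placements, y + shelf_height


placements, atlas_height = pack_shelves(
    {"logo": (60, 40), "badge": (50, 30), "icon": (20, 20)}, atlas_width=100
)
```

The returned coordinates are what the GPU side would need to cut each stencil back out of the atlas after the single upload.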

From the user-side perspective, there are two different use-case models:

- a user-defined function that constructs the clips to be applied (the most flexible configuration);
- static images, defined in the config, which are read once (when the pipeline starts) and are always applied at the same coordinates to the whole batch (if possible, they can be loaded into GPU RAM once as well).

In addition, the implementation should:

- support cached transfers that remain valid for X milliseconds without additional transfers (-1: never expires, 0: expires immediately, X>0: expires in X ms);
- support named shared map transfers accessible by an OS-level SHM descriptor (cached as well).
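The expiry rule for cached transfers (-1, 0, X>0) can be sketched as a small TTL cache. `TransferCache` and the `upload` callback are hypothetical names; the callback stands in for whatever performs the actual CPU-to-GPU copy. Treating every negative TTL like -1 is an assumption, since the issue only defines -1.

```python
import time


class TransferCache:
    """Remember uploaded stencils for a limited time (TTL in milliseconds)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the expiry logic can be tested
        self._entries = {}   # name -> (handle, upload time in seconds)

    def get(self, name, ttl_ms, upload):
        """Return the cached handle, calling `upload()` only when expired."""
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and ttl_ms != 0:  # 0: expires immediately
            handle, uploaded_at = entry
            # Negative TTL: never expires; positive: valid for ttl_ms.
            if ttl_ms < 0 or (now - uploaded_at) * 1000.0 < ttl_ms:
                return handle
        handle = upload()
        self._entries[name] = (handle, now)
        return handle
```

With a TTL of 100 ms, two calls 50 ms apart trigger a single upload, while a call 200 ms later uploads again.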

bwsw · 2022-12-19T08:39:28Z

Implementation approach: give OpenCV access to the image as a GpuMat.

Access to GpuMat: https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_sample_custom_gstream.html

No support is planned for graphics drawing primitives in OpenCV GpuMat: https://answers.opencv.org/question/190695/will-puttextcirclelinerectangle-for-cvcudagpumat-be-implement/

GPU Mat Supported Operations: https://docs.opencv.org/3.4/d1/d1a/namespacecv_1_1cuda.html

Useful Function: https://docs.opencv.org/3.4/db/d8c/group__cudaimgproc__color.html#ga08a698700458d9311390997b57fbf8dc

tomskikh · 2022-12-23T07:03:04Z

CuPy (a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python) could also be a useful solution: https://cupy.dev/

bwsw changed the title ~~Overlay graphics image defined in CPU RAM to GPU image~~ Apply alpha-channel stensils defined in CPU RAM to GPU image Dec 19, 2022

bwsw changed the title ~~Apply alpha-channel stensils defined in CPU RAM to GPU image~~ Apply alpha-channel stencils defined in CPU RAM to GPU image Dec 19, 2022

bwsw changed the title ~~Apply alpha-channel stencils defined in CPU RAM to GPU image~~ Apply stencils defined in CPU RAM to GPU images Dec 19, 2022

bwsw changed the title ~~Apply stencils defined in CPU RAM to GPU images~~ Support the application of stencils defined in CPU RAM to GPU images Dec 19, 2022

bwsw closed this as completed Jan 25, 2023
