Savant currently uses the standard NVIDIA approach to draw graphics on an image held in GPU memory: map the image to the CPU, draw the changes, and unmap it to apply them in GPU RAM. This requires two copies (GPU to CPU and back to GPU), which is excessive.
A single copy from CPU to GPU RAM is sufficient instead. To implement this, Savant must provide an "overlay" operation that composites two images: a background image (already in GPU RAM) and a foreground image (a stencil, defined in CPU RAM and transferred to the GPU).
The foreground image must be defined in a color space that supports an alpha channel (transparency), i.e. RGBA. Once the foreground image is in GPU RAM, it is placed over the original background image according to its transparency.
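As a point of reference for the transparency-aware placement described above, here is a minimal CPU-side sketch of the usual per-pixel alpha blend over an opaque background. The function name `blend_pixel` is hypothetical, and this is not the GPU implementation Savant would use; it only illustrates the arithmetic a GPU overlay kernel would be expected to reproduce.

```python
def blend_pixel(fg_rgba, bg_rgb):
    """Blend one RGBA foreground pixel over one opaque RGB background pixel.

    Channels are 8-bit integers (0..255). Hypothetical helper for illustration.
    """
    r, g, b, a = fg_rgba
    alpha = a / 255  # 0.0 = fully transparent, 1.0 = fully opaque
    return tuple(
        round(alpha * fg + (1 - alpha) * bg)
        for fg, bg in zip((r, g, b), bg_rgb)
    )


# An opaque stencil pixel replaces the background,
# a fully transparent one leaves it untouched.
opaque = blend_pixel((200, 100, 50, 255), (0, 0, 0))
transparent = blend_pixel((200, 100, 50, 0), (10, 20, 30))
```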
The user can provide the following parameters:
left_X, top_Y to define the placement offset;
right_X, bottom_Y to define the scale (the module must be able to scale the image on the GPU, so that smaller images can be transferred and scaled afterwards);
scale method (one of those supported by the NVIDIA libraries, e.g. https://docs.nvidia.com/vpi/algo_rescale.html), applied when the stencil size differs from the defined placement (or no scaling);
modifier method: a preprocessing modifier applied to the area where the stencil is placed ('blur', 'pixelize', and 'grayscale' are examples); the candidates may be found here
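The parameters listed above could be grouped into a single value object, roughly as sketched below. All names here (`StencilPlacement`, `ScaleMethod`, the enum members, `needs_scaling`) are hypothetical and chosen only for illustration; the actual set of scale methods would depend on what the chosen NVIDIA library supports.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class ScaleMethod(Enum):
    # Illustrative values only; the real list depends on the GPU library used.
    NONE = "none"
    NEAREST = "nearest"
    LINEAR = "linear"


@dataclass(frozen=True)
class StencilPlacement:
    left_x: int    # placement offset, left edge
    top_y: int     # placement offset, top edge
    right_x: int   # together with left_x defines the target width
    bottom_y: int  # together with top_y defines the target height
    scale_method: ScaleMethod = ScaleMethod.NONE
    modifier: Optional[str] = None  # e.g. 'blur', 'pixelize', 'grayscale'

    def __post_init__(self):
        if self.right_x <= self.left_x or self.bottom_y <= self.top_y:
            raise ValueError("placement rectangle must have a positive size")

    def target_size(self) -> Tuple[int, int]:
        """Width and height of the area the stencil should cover."""
        return self.right_x - self.left_x, self.bottom_y - self.top_y

    def needs_scaling(self, stencil_width: int, stencil_height: int) -> bool:
        """True when the stencil size differs from the placement size."""
        return self.target_size() != (stencil_width, stencil_height)


placement = StencilPlacement(10, 20, 110, 70, ScaleMethod.LINEAR, "blur")
```

How to treat a size mismatch when the scale method is "no scale" (reject, crop, or place as-is) is left open in this sketch.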
To make it efficient, two features are required:
to apply a single stencil to a whole batch;
to map multiple stencils to every background image in the batch.
To optimize CPU-GPU transfers, the method may need to place all the stencils on a single image (knapsack placement) with properly defined coordinates, so that a single transfer over the PCI-E bus replaces multiple ones.
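One simple way to lay several stencils out on a single image is shelf packing, sketched below. This is a swapped-in heuristic, not a true knapsack solver and not necessarily what Savant would implement; the function name `pack_shelves` and the fixed `atlas_width` parameter are assumptions made for the example.

```python
def pack_shelves(sizes, atlas_width):
    """Place rectangles on horizontal shelves of a fixed-width atlas.

    sizes: list of (width, height) tuples, one per stencil.
    Returns (placements, atlas_height), where placements[i] is the (x, y)
    of the top-left corner of stencil i inside the atlas.
    """
    # Tallest first, so each shelf height is set by its first rectangle.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i][1], reverse=True)
    placements = [None] * len(sizes)
    x = y = shelf_height = 0
    for i in order:
        width, height = sizes[i]
        if width > atlas_width:
            raise ValueError(f"stencil {i} is wider than the atlas")
        if x + width > atlas_width:
            # Current shelf is full: open a new one below it.
            y += shelf_height
            x = 0
            shelf_height = 0
        placements[i] = (x, y)
        x += width
        shelf_height = max(shelf_height, height)
    return placements, y + shelf_height


placements, atlas_height = pack_shelves([(4, 2), (4, 4), (4, 1)], atlas_width=8)
```

The returned coordinates are what the GPU side would need in order to cut each stencil back out of the transferred atlas.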
From the user-side perspective, there are two different use-case models:
a user-defined function that constructs clips to be applied (most flexible configuration);
static images, defined in the config, which are read once (when the pipeline starts) and are always applied at the same coordinates to the whole batch (if possible, they can also be loaded into GPU RAM once);
support cached transfers that remain valid for X milliseconds without additional transfers (-1: never expires, 0: expires immediately, X>0: expires in X ms);
support named shared-memory transfers accessible via an OS-level SHM descriptor (cached as well).
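The expiry rules for cached transfers (-1, 0, X>0) can be sketched as a small cache keyed by stencil name. The class `TransferCache`, its methods, and the injectable `clock` argument are hypothetical; the stored value stands in for whatever handle a real implementation would keep for data already resident in GPU RAM.

```python
import time


class TransferCache:
    """Cache of already transferred stencils with a time-to-live in ms.

    ttl_ms == -1: entries never expire
    ttl_ms == 0:  entries expire immediately (every lookup misses)
    ttl_ms > 0:   entries expire ttl_ms milliseconds after being stored
    """

    def __init__(self, ttl_ms, clock=time.monotonic):
        if ttl_ms < -1:
            raise ValueError("ttl_ms must be -1, 0, or a positive number")
        self._ttl_ms = ttl_ms
        self._clock = clock  # returns seconds; injectable for testing
        self._entries = {}

    def put(self, key, value):
        self._entries[key] = (value, self._clock())

    def get(self, key):
        """Return the cached value, or None when a new transfer is needed."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self._ttl_ms == -1:
            return value
        age_ms = (self._clock() - stored_at) * 1000
        if self._ttl_ms == 0 or age_ms >= self._ttl_ms:
            del self._entries[key]
            return None
        return value
```

A caller would try `get` first and perform the CPU-to-GPU transfer followed by `put` only on a miss.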
bwsw changed the title from "Overlay graphics image defined in CPU RAM to GPU image" to "Apply alpha-channel stensils defined in CPU RAM to GPU image" on Dec 19, 2022
bwsw changed the title from "Apply alpha-channel stensils defined in CPU RAM to GPU image" to "Apply alpha-channel stencils defined in CPU RAM to GPU image" on Dec 19, 2022
bwsw changed the title from "Apply alpha-channel stencils defined in CPU RAM to GPU image" to "Apply stencils defined in CPU RAM to GPU images" on Dec 19, 2022
bwsw changed the title from "Apply stencils defined in CPU RAM to GPU images" to "Support the application of stencils defined in CPU RAM to GPU images" on Dec 19, 2022
Linked (to do first): #43