Support the application of stencils defined in CPU RAM to GPU images #42

Closed
bwsw opened this issue Dec 19, 2022 · 2 comments

Comments

@bwsw
Contributor

bwsw commented Dec 19, 2022

Linked (to do first): #43

Savant currently uses the standard NVIDIA approach to draw graphics on an image that resides in GPU RAM: one must map the image to the CPU, draw the changes, and unmap it to apply the changes in GPU RAM. This approach requires two copies (GPU to CPU and back to GPU), which is excessive.

Another approach is possible with a single copy from CPU RAM to GPU RAM. To implement it, Savant must provide an "overlay" operation that composites two images: a background image (already in GPU RAM) and a foreground image (the stencil, defined in CPU RAM and then moved to the GPU).

The approach requires the foreground image to be defined in a colorspace that supports an alpha channel (transparency), i.e. RGBA. After the FG image is moved to GPU RAM, it is placed over the original background image, respecting the transparency defined.
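
The blending step can be illustrated with a small CPU reference sketch. It is not Savant code and not a GPU implementation: the names `blend_pixel` and `overlay` are hypothetical, images are plain nested lists, and the background is assumed to be opaque. It only shows the per-pixel arithmetic a GPU kernel would have to perform.

```python
from typing import List, Tuple

RGBA = Tuple[int, int, int, int]
Image = List[List[RGBA]]  # rows of pixels


def blend_pixel(bg: RGBA, fg: RGBA) -> RGBA:
    """Blend one foreground pixel over an (assumed opaque) background pixel."""
    alpha = fg[3]
    # Integer form of out = fg * a + bg * (1 - a) with a = alpha / 255, rounded.
    r, g, b = (
        (f * alpha + c * (255 - alpha) + 127) // 255
        for f, c in zip(fg[:3], bg[:3])
    )
    return (r, g, b, bg[3])


def overlay(bg: Image, fg: Image, left: int, top: int) -> None:
    """Blend fg onto bg in place at (left, top); pixels outside bg are skipped."""
    height, width = len(bg), len(bg[0])
    for y, row in enumerate(fg):
        for x, pixel in enumerate(row):
            by, bx = top + y, left + x
            if 0 <= by < height and 0 <= bx < width:
                bg[by][bx] = blend_pixel(bg[by][bx], pixel)
```

A fully transparent stencil pixel (alpha 0) leaves the background untouched, and a fully opaque one (alpha 255) replaces it.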

The user can provide the following parameters:

  • left_X, top_Y to define the placement offset;
  • right_X, bottom_Y to define the scale (the module must be able to scale the image on the GPU, so that smaller images can be transferred and scaled afterwards);
  • scale method (one of those supported by NVIDIA libraries, e.g. https://docs.nvidia.com/vpi/algo_rescale.html), or no scaling, applied when the stencil size differs from the placement defined;
  • modifier method: a preprocessing modifier applied to the area where the stencil is placed ('blur', 'pixelize', and 'grayscale' are examples); the candidates may be found here
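
A minimal sketch of how these parameters could be grouped, assuming a hypothetical `Placement` class (not part of Savant) and the scale method names used in the mapping example in this issue. It shows how the scale factors follow from the placement rectangle and the stencil size.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Placement:
    """Hypothetical parameter set for one stencil; not Savant's API."""

    left_x: int
    top_y: int
    right_x: int
    bottom_y: int
    scale_method: str = "noscale"    # "linear", "catmull", "nearest" or "noscale"
    modifiers: Tuple[str, ...] = ()  # e.g. ("blur",)

    def target_size(self) -> Tuple[int, int]:
        """Width and height of the placement rectangle."""
        return (self.right_x - self.left_x, self.bottom_y - self.top_y)

    def scale_factors(self, stencil_w: int, stencil_h: int) -> Tuple[float, float]:
        """Horizontal and vertical factors needed to fill the rectangle."""
        if self.scale_method == "noscale":
            return (1.0, 1.0)
        width, height = self.target_size()
        return (width / stencil_w, height / stencil_h)
```

For example, a 100x50 stencil placed into the rectangle (100, 50)-(300, 150) with a scaling method set would be enlarged by a factor of two in both directions on the GPU.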

To make it efficient, two features are required:

  • to apply a single stencil to a whole batch;
  • to map multiple stencils to every BG image from the batch.

For example (scale method and placement, target BG images, modifiers):

FG1: [<linear|catmull|nearest|noscale>, LX1, TY1, RX1, BY1], [BG1, BG2, BG3, ..., BGN], []
FG2: [<linear|catmull|nearest|noscale>, LX2, TY2, RX2, BY2], [BG1], [blur]
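
One way to read the FG-to-BG mapping above is as a per-background plan: for every image in the batch, the ordered list of stencils to draw on it. The sketch below assumes zero-based background indices and a hypothetical helper name, `stencils_per_background`; it is an illustration rather than a proposed API.

```python
from typing import Dict, List, Sequence, Tuple

# (stencil name, indices of the background images it applies to)
Mapping = Tuple[str, Sequence[int]]


def stencils_per_background(
    mappings: Sequence[Mapping], batch_size: int
) -> Dict[int, List[str]]:
    """Invert 'stencil -> backgrounds' into 'background -> stencils'."""
    plan: Dict[int, List[str]] = {index: [] for index in range(batch_size)}
    for name, targets in mappings:
        for index in targets:
            if not 0 <= index < batch_size:
                raise IndexError(f"{name}: background {index} is outside the batch")
            # Stencils are drawn in the order they were declared.
            plan[index].append(name)
    return plan
```

With FG1 mapped to the whole batch and FG2 mapped only to the first image, the first image receives both stencils while the remaining images receive only FG1.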

To optimize CPU-GPU transfers, the method may require packing all the stencils into a single image (knapsack-style placement) with properly defined coordinates, so that one transfer over the PCI-E bus replaces several.
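
As an illustration of such packing, the sketch below uses a simple shelf heuristic: stencils are sorted by height and placed left to right in rows. This is a swapped-in, simple technique and not an optimal knapsack solver; the function name `pack_shelves` and the fixed atlas width are assumptions.

```python
from typing import Dict, Tuple

Size = Tuple[int, int]            # (width, height)
Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def pack_shelves(sizes: Dict[str, Size], atlas_width: int) -> Tuple[Dict[str, Rect], int]:
    """Place rectangles on horizontal shelves; return positions and atlas height."""
    positions: Dict[str, Rect] = {}
    x = y = shelf_height = 0
    # Tallest first, so each shelf wastes little vertical space.
    for name, (w, h) in sorted(sizes.items(), key=lambda item: -item[1][1]):
        if w > atlas_width:
            raise ValueError(f"stencil {name!r} is wider than the atlas")
        if x + w > atlas_width:
            # The current shelf is full: start a new one below it.
            y += shelf_height
            x = shelf_height = 0
        positions[name] = (x, y, w, h)
        x += w
        shelf_height = max(shelf_height, h)
    return positions, y + shelf_height
```

The returned rectangles are the coordinates each stencil would be read from after the single atlas image has been transferred to GPU RAM.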

From the user-side perspective, there are two different use-case models:

  • a user-defined function that constructs the clips to be applied (the most flexible configuration);
  • static images, defined in the config, which are read once (when the pipeline starts) and are always applied at the same coordinates to the whole batch (if possible, they can be loaded to the GPU RAM once as well).

In addition, the implementation should:

  • support cached transfers that stay valid for X milliseconds without additional transfers (-1: never expires, 0: expires immediately, X>0: expires in X ms);
  • support named shared-memory transfers accessible by an OS-level SHM descriptor (cached as well).
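
The cache lifetime rule (-1, 0, X>0) can be sketched as follows. The `TransferCache` class is hypothetical, the cached "handle" stands in for whatever object would reference the uploaded stencil in GPU RAM, and the injectable clock exists only to make the behavior easy to check.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TransferCache:
    """Hypothetical cache of uploaded stencils, keyed by name."""

    def __init__(self, ttl_ms: int, clock: Callable[[], float] = time.monotonic):
        self.ttl_ms = ttl_ms
        self._clock = clock  # returns seconds
        self._entries: Dict[str, Tuple[object, float]] = {}

    def put(self, key: str, handle: object) -> None:
        self._entries[key] = (handle, self._clock())

    def get(self, key: str) -> Optional[object]:
        """Return the cached handle, or None if a new transfer is required."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        handle, stored_at = entry
        if self.ttl_ms < 0:
            return handle  # -1: never expires
        age_ms = (self._clock() - stored_at) * 1000.0
        if self.ttl_ms == 0 or age_ms >= self.ttl_ms:
            del self._entries[key]  # 0: expires immediately; X>0: after X ms
            return None
        return handle
```

A `None` result means the stencil has to be transferred again before it is applied.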
@bwsw
Contributor Author

bwsw commented Dec 19, 2022

@tomskikh
Collaborator

CuPy (a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python) could also be a useful solution: https://cupy.dev/

@bwsw bwsw closed this as completed Jan 25, 2023