-
Notifications
You must be signed in to change notification settings - Fork 629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add kernel-wrapper around NPP debayer calls #4486
Conversation
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
CI MESSAGE: [6649042]: BUILD STARTED |
CI MESSAGE: [6649042]: BUILD PASSED |
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
I decided to drop the idea of per-sample pattern as strings, as those become deficult if used per-frame. So the enum is now exposed in python. |
namespace kernels { | ||
namespace debayer { | ||
|
||
enum class DALIBayerAlgorithm { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enum class DALIBayerAlgorithm { | |
enum class DALIDebayerAlgorithm { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
namespace debayer { | ||
|
||
enum class DALIBayerAlgorithm { | ||
DALI_BAYER_BILINEAR_NPP = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DALI_BAYER_BILINEAR_NPP = 0 | |
DALI_DEBAYER_BILINEAR_NPP = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
dali/pipeline/data/types.h
Outdated
@@ -574,6 +578,7 @@ DALI_REGISTER_TYPE(string, DALI_STRING); | |||
DALI_REGISTER_TYPE(DALIImageType, DALI_IMAGE_TYPE); | |||
DALI_REGISTER_TYPE(DALIDataType, DALI_DATA_TYPE); | |||
DALI_REGISTER_TYPE(DALIInterpType, DALI_INTERP_TYPE); | |||
DALI_REGISTER_TYPE(DALIColorFilter, DALI_COLOR_FILTER); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really, really need this? I don't like adding stuff to types.h and, honestly, I'd very much love to remove other enums (like ImageType and InterpType) from here instead of adding more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it is best we have for now. We need the ability to specify the one of four avialable options per-sample as tensor inputs. I've tried going around it with np.bytes_("bg") in external source, but 1. it requires "parsing" strings per-sample in each iteration, 2. is unfamiliar to users, 3. has limitations, for example, I don't see a way to create a tensor of strings that could be digested by DALI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the motivation for this is to allow the user to specify the pattern per sample, then the likeliest reason is so that you can demosaic a crop. In this case, dealing with blue (or red) position within a Bayer cell would be much easier to work with:
(blue coordinates in XY order)
color_filter=(0,0) # BGGR
color_filter=(0,1) # GRBG
color_filter=(1,0) # GBRG
color_filter=(1,1) # RGGB
If you know your original layout, you can calculate the layout of the crop as:
(layout_x - crop_x)&1, (layout_y - crop_y)&1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumption I see in the requirements is more that the samples in a batch may come from different sources and thus have different patterns. I guess we could use a pair of integers to describe position within 2x2 tile if I get your idea correctly. However, both in case of opencv and npp those are enums, it seems like a more familiar way of doing so. For users who migrate from opencv, using the enum should be more straightforward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per slack disussion: Reverting the exposure of enum, because passing the enums per-sample is cumbersome (requires reinterpret op along the way),
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
!build |
CI MESSAGE: [6661813]: BUILD STARTED |
CI MESSAGE: [6661813]: BUILD FAILED |
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
!build |
CI MESSAGE: [6662792]: BUILD STARTED |
@@ -227,7 +213,7 @@ struct NppRemapKernel : public RemapKernel<Backend, T> { | |||
} | |||
|
|||
|
|||
NppStreamContext npp_ctx_{}; | |||
NppStreamContext npp_ctx_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? This is quite risky and offers little, if any, performance benefits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has hidden a problem with the call CUDA_CALL(cudaStreamGetFlags(npp_ctx_.hStream, &npp_ctx_.nStreamFlags));
accessing hStream that has not been set. So I left a version that actually reported problems that stream is not set when it is not set.
|
||
|
||
protected: | ||
NppStreamContext npp_ctx_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NppStreamContext npp_ctx_; | |
NppStreamContext npp_ctx_{}; |
or
NppStreamContext npp_ctx_; | |
NppStreamContext npp_ctx_{ cudaStream_t(-1), 0 }; |
if you want to make sure that the stream is invalid until reasonably initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
3a93bdd
to
541b29d
Compare
CI MESSAGE: [6662792]: BUILD PASSED |
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
!build |
CI MESSAGE: [6681266]: BUILD STARTED |
for (int i = 0; i < 2; i++) { | ||
for (int j = 0; j < 2; j++) { | ||
bayer_sample.data[(h + i) * width + w + j] = | ||
rgb_sample.data[(h + i) * width * num_channels + (w + j) * num_channels + | ||
pattern2channel[static_cast<int>(pattern)][i][j]]; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this is simpler? Your call.
for (int i = 0; i < 2; i++) { | |
for (int j = 0; j < 2; j++) { | |
bayer_sample.data[(h + i) * width + w + j] = | |
rgb_sample.data[(h + i) * width * num_channels + (w + j) * num_channels + | |
pattern2channel[static_cast<int>(pattern)][i][j]]; | |
} | |
} | |
int i = y & 1; | |
int j = x & 1; | |
int c = pattern2channel[static_cast<int>(pattern)][i][j]; | |
bayer_sample.data[h * width + w] = | |
rgb_sample.data[(h * width + w) * num_channels + c]; |
BTW, isn't num_channels always 3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, it is simpler. As to the number of channels, it is. I made it static constexpr int num_channels = 3;
to "track usages".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, you can apply the suggestion if CI hasn't been run yet.
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
!build |
std::array<std::array<std::array<int, 2>, 2>, 4> pattern2channel{{ | ||
{{{0, 1}, {1, 2}}}, // bggr -> rggb -> 0112 | ||
{{{1, 0}, {2, 1}}}, // gbrg -> grbg -> 1021 | ||
{{{1, 2}, {0, 1}}}, // grbg -> gbrg -> 1201 | ||
{{{2, 1}, {1, 0}}} // rggb -> bggr -> 2110 | ||
}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the braces are mismatched. Plain C array would be easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::array<std::array<std::array<int, 2>, 2>, 4> pattern2channel{{ | |
{{{0, 1}, {1, 2}}}, // bggr -> rggb -> 0112 | |
{{{1, 0}, {2, 1}}}, // gbrg -> grbg -> 1021 | |
{{{1, 2}, {0, 1}}}, // grbg -> gbrg -> 1201 | |
{{{2, 1}, {1, 0}}} // rggb -> bggr -> 2110 | |
}}; | |
int pattern2channel[4][2][2] = { | |
{{0, 1}, {1, 2}}, // bggr -> rggb -> 0112 | |
{{1, 0}, {2, 1}}, // gbrg -> grbg -> 1021 | |
{{1, 2}, {0, 1}}, // grbg -> gbrg -> 1201 | |
{{2, 1}, {1, 0}} // rggb -> bggr -> 2110 | |
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
CI MESSAGE: [6681553]: BUILD STARTED |
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
!build |
CI MESSAGE: [6681606]: BUILD STARTED |
CI MESSAGE: [6681606]: BUILD PASSED |
Signed-off-by: Kamil Tokarski ktokarski@nvidia.com
Category:
New feature (non-breaking change which adds functionality)
Description:
Adds a simple kernel that wraps calls to NPP
bayer -> RGB
functions together with debayer-specific enums.Exposes the enum with bayer pattern in python so that it can be specified per-sample/per-frame easily.Additional information:
This PR does not contain cpp tests - the NPP like bilinear debayer is different form opencv's so it requires a separate implemenation for the baseline. I've done it in Python and it will be tested with the operator. I don't see much value in implementing the baseline twice for, essentialy, testing the NPP.The cpp tests check is done on simple gradient samples - one that should yield similar results irrespecitve of the concrete debayer algorithm used.
Affected modules and functionalities:
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: DALI-3149