-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RSDK-1109] Add custom Depth MIMEType #1666
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the way this preserves all of our old formats too! There is essentially one change I would like you to do, which is change the image.Config{} function to just read the first bytes of the new file type, rather than the whole file, so that users can get the config info quick
@@ -331,3 +332,36 @@ func TestDepthColorModel(t *testing.T) { | |||
test.That(t, convGray, test.ShouldHaveSameTypeAs, gray) | |||
test.That(t, convGray.(color.Gray16).Y, test.ShouldEqual, 6168) | |||
} | |||
|
|||
func TestDepthMapEncoding(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great test! Glad to see it all works as expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Can you remind me of the context for this change? Is it that sending raw depth maps to app/sdks is faster than sending pngs? And are app/sdks all able to understand raw depth maps?
rimage/depth_map_raw.go
Outdated
if rawWidth == 6363110499870197078 { // magic number for VERSIONX | ||
return readDepthMapFormat2(f) | ||
switch firstBytes { | ||
case 6363110499870197078: // magic number for VERSIONX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use constants for the magic numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah on it
Context for the change in general... Right now we use PNGs for depth which is slower than raw but the main thing is that it's less correct. Like, PNG does compression (not a lot but still) and that's not the vibe for depth images. You probably want every pixel of information you're supposed to get |
4ba7fb9
to
edb30df
Compare
Thank you! Are app/sdks all able to understand raw depth maps, or will there need to be changes on that end? |
Uhhh... they kinda don't understand depth maps in general rn. On the app side, we normally use something called "depth_to_pretty" which is a transform that turns the depth image into a color one so ppl can kinda visualize it. On the SDK side rn we'd just spit back out the raw bytes. So probably some changes needed eventually. |
|
Thanks, that makes sense. I guess I'm still wondering: if we return a custom format, what do we expect people to do with it? Will they be able to use it? |
Adding to this comment, given that python PIL does not support 16bit grayscale images (or else you could use this Image Register function to decode the images) is it instead possible to return a uint16 numpy 2D array, or even just a standard 2D list of unint16 values? |
Actually, https://pillow.readthedocs.io/en/stable/reference/ImageFile.html you may be able to work around PIL's limitations if the ImageFile methods are flexible enough |
To more directly answer this question - the hope is that the end user would not have to interact with the bytes directly, but be given the 2D array of depth info. The custom mime type is mostly that - it just adds the three extra parts at the beggining of the bytes stream, the "DEPTHMAP" name, the height, and the width. But asking a user to decode it themselves is undesireable. In the Go SDK, it gets decoded to the DepthMap struct, in App in gets converted into a 16bit grayscale image, and in Python they get the raw bytes. Ideally, Python would instead return the 2D array of the depth info. |
Thank you for the explanation! Is there a ticket for the python SDK work? |
No, I can make it soon |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one fix with creating a descriptive error for if the first bytes are not the magic number.
Once that is fixed, LGTM!
if err != nil || firstBytes != MagicNumIntViamType { | ||
return image.Config{}, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the first bytes != the Magic Number, create an error message saying that, and return that error. Though- I think that error should never happen, as the RegisterFormat function won't run the function if the magic number doesn't match.
Maybe create a test for that, to make sure the behavior is as expected.
No description provided.