[RSDK-9132] Add GetImage to camera interface and make builtin resources use it #4487
Warning your change may break code samples. If your change modifies any of the following functions please contact @viamrobotics/fleet-management. Thanks!
I'm gonna be out next week but this is ready for a first pass review. Also lmk if we should add Nick and/or Dan to this review. Perhaps for future reviews that are a bit more invasive?
Since we're breaking Go modules by adding another method to the interface, we should make that method follow 1:1 the inputs and returns of the proto API.
This will mean bookkeeping the release of each image stream, should you choose to use ReadImage. This will definitely go away once we get the image from the underlying comms connection like we do in the Intel RealSense and OAK-D cameras; I don't know how intense that is.
If changing the returns from GetImage to be 1:1 with the API type causes more problems in changing code outside of camera, we can focus on replacements in the camera packages, and let the datamanager and vision packages use Stream and Next for this PR, but I think you're nearly there.
components/camera/camera.go
Outdated
@@ -119,6 +119,9 @@ type VideoSource interface {
	// that may have a MIME type hint dictated in the context via gostream.WithMIMETypeHint.
	Stream(ctx context.Context, errHandlers ...gostream.ErrorHandler) (gostream.VideoStream, error)

	// GetImage returns a single image that may have a MIME type hint dictated in the context via gostream.WithMIMETypeHint.
	GetImage(ctx context.Context) (image.Image, func(), error)
For idiomatic Go, we remove the 'Get' prefix from our getter APIs. I don't know why Go does it this way, but this is the pattern we follow for all our Go wrappers.
GetImage(ctx context.Context) (image.Image, func(), error)
Image(ctx context.Context) (image.Image, func(), error)
Also, please check if the API includes an extra, so there are no further breaks to the interface.
Hm, don't we want to keep the Get prefix to match the proto API?
No, look at all our other APIs and how they're wrapped in Go. We drop the Get prefix because there is no getter/setter idiom in Go, unlike our other SDKs. So we drop the prefix in rdk because that's the style that people who are opinionated about this have followed.
We could have not done this, but we should keep the style now that we're in deep.
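For reference, Effective Go says that putting Get into a getter's name is neither idiomatic nor necessary, while setters keep a Set prefix. A minimal sketch of that convention follows; the camera type and its name field are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// camera is a hypothetical type used only to illustrate the naming convention.
type camera struct {
	name string
}

// Name is the getter: the field is name, so the exported accessor is Name, not GetName.
func (c *camera) Name() string { return c.name }

// SetName keeps its Set prefix; only the getter drops the Get prefix.
func (c *camera) SetName(name string) { c.name = name }

func main() {
	c := &camera{}
	c.SetName("front")
	fmt.Println(c.Name())
}
```

The same reasoning is why the Go wrapper would be Image while the gRPC method stays GetImage.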
Oh lord there is an extra field
But the request msg struct also contains camera name and mime type
Dug around and found the camera package's Extra type for extras. So prob gonna use that: Image(ctx context.Context, mimeType string, extra Extra) ([]byte, string, error)
It takes a lot of effort to go through and change the Image signature, so I want to verify that Image(ctx context.Context, mimeType string, extra *structpb.Struct) ([]byte, string, error) looks good before I do it
I like this signature since it's 1:1 to our APIs and follows the other sdk's signature, would you like it to go through a scope though?
A scope is only required for breaking proto changes right? This is a go interface not proto, so maybe not? Unless we feel like as a team it's worth getting more eyes on it.
Also what you quoted is the extra *structpb.Struct and not the extra Extra using the Extra type alias for map[string]interface{} in the camera package. Was this intentional? Just wanted to clarify since in Python we use extra: Optional[Dict[str, Any]], which is a native type, not a pb type.
^ I do not think anything about stream and next are intentional, is that type alias used anywhere else in the camera package?
It's not seen in the camera.go signature since it's tucked away inside ctx, but it is what's currently being used to handle extra:
rdk/components/camera/extra.go
Line 19 in c097ff1
func FromContext(ctx context.Context) (Extra, bool) {
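A rough sketch of how an Extra can ride along inside ctx, to make the "tucked away" point concrete. Only the FromContext signature and the map[string]interface{} alias come from this thread; the key type, NewContext, and roundTrip are assumptions for illustration and may not match what extra.go actually does.

```go
package main

import (
	"context"
	"fmt"
)

// Extra mirrors the map[string]interface{} alias mentioned in the thread.
type Extra map[string]interface{}

// extraKey is an unexported key type (an assumption) so other packages cannot collide with it.
type extraKey struct{}

// NewContext is a hypothetical helper that stores extra in the context.
func NewContext(ctx context.Context, e Extra) context.Context {
	return context.WithValue(ctx, extraKey{}, e)
}

// FromContext has the signature quoted above; the body is a guess at the usual pattern.
func FromContext(ctx context.Context) (Extra, bool) {
	e, ok := ctx.Value(extraKey{}).(Extra)
	return e, ok
}

// roundTrip is a hypothetical helper that stores and then reads back an Extra.
func roundTrip(e Extra) (Extra, bool) {
	return FromContext(NewContext(context.Background(), e))
}

func main() {
	e, ok := roundTrip(Extra{"exposure": 10})
	fmt.Println(e["exposure"], ok)
}
```

Passing extra as an explicit parameter, as proposed in the new Image signature, makes this dependency visible in the interface instead of hiding it in ctx.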
components/camera/camera_test.go
Outdated
img, _, err := camera.ReadImage(
	gostream.WithMIMETypeHint(context.Background(), rutils.WithLazyMIMEType(rutils.MimeTypePNG)),
	noProj2)
img, _, err := noProj2.GetImage(gostream.WithMIMETypeHint(context.Background(), rutils.WithLazyMIMEType(rutils.MimeTypePNG)))
Looking at these two tests I think we could delete them; they seem to be testing cameras with ProjectorProvided, which is removed now.
func (replay *pcdCamera) GetImage(ctx context.Context) (image.Image, func(), error) {
	stream, err := replay.Stream(ctx)
	if err != nil {
		return nil, func() {}, err
	}
	defer func() {
		if err := stream.Close(ctx); err != nil {
			replay.logger.Errorf("stream failed to close: %w", err)
		}
	}()
	return stream.Next(ctx)
}
Stream is actually unimplemented in replaypcd, you can just return an empty image and an error.
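Something like the following would do, as a sketch against the signature in the diff at this point of the review. The errNoImage sentinel and the tryGetImage helper are hypothetical names, not anything in replaypcd.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"image"
)

// errNoImage is a hypothetical sentinel error for cameras that cannot produce 2D images.
var errNoImage = errors.New("image retrieval is unimplemented for this camera")

// pcdCamera stands in for the replay camera; the real struct has more fields.
type pcdCamera struct{}

// GetImage skips Stream entirely and returns a nil image, a no-op release, and an error.
func (replay *pcdCamera) GetImage(ctx context.Context) (image.Image, func(), error) {
	return nil, func() {}, errNoImage
}

// tryGetImage is a hypothetical caller showing that release is always safe to call.
func tryGetImage() (image.Image, error) {
	img, release, err := (&pcdCamera{}).GetImage(context.Background())
	defer release()
	return img, err
}

func main() {
	_, err := tryGetImage()
	fmt.Println(errors.Is(err, errNoImage))
}
```

Returning a non-nil no-op release keeps callers that unconditionally defer release() from panicking.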
components/camera/client.go
Outdated
@@ -228,6 +228,10 @@ func (c *client) Stream(
	return stream, nil
}

func (c *client) GetImage(ctx context.Context) (image.Image, func(), error) {
	return c.Read(ctx)
Could we Read and handle release so we do not need to include the release func in the returns? (Perhaps that would make sense in the videosourcewrapper layer.)
As mentioned, this would be a larger departure from Stream.Next, which I assume will cause major problems.
I'm gonna try to get rid of release in camera components and see what happens
Looks like we just need to move the rimage.EncodeImage step one level in, i.e. outBytes should come out of the Image call instead of being handled in the data collector or camera server/client. Currently release is called in the collector/server/client, so we should just move it into the Image implementation. Same logic, just handled more nested in our abstraction.
I guess future module writers should get used to using rimage to encode their output, i.e. read from the source bytes and output a newly constructed []byte result? Does that sound okay to everyone?
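A minimal sketch of that shape, assuming a gostream-style Read that hands back a frame plus a release callback. It uses the standard library's image/jpeg in place of rimage.EncodeImage, and frameSource, its released counter, and encodeOnce are hypothetical stand-ins rather than rdk types.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"image"
	"image/jpeg"
)

// frameSource is a hypothetical camera that counts how often its frames are released.
type frameSource struct {
	released int
}

// Read returns a frame and a release callback, like the Read discussed above.
func (s *frameSource) Read(ctx context.Context) (image.Image, func(), error) {
	img := image.NewRGBA(image.Rect(0, 0, 4, 4))
	return img, func() { s.released++ }, nil
}

// Image reads one frame, encodes it, and releases the frame before returning,
// so callers only ever see a freshly allocated []byte and a MIME type.
func (s *frameSource) Image(ctx context.Context, mimeType string) ([]byte, string, error) {
	img, release, err := s.Read(ctx)
	if err != nil {
		return nil, "", err
	}
	if release != nil {
		defer release()
	}
	if mimeType != "" && mimeType != "image/jpeg" {
		return nil, "", fmt.Errorf("do not know how to encode %q", mimeType)
	}
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, img, nil); err != nil {
		return nil, "", err
	}
	return buf.Bytes(), "image/jpeg", nil
}

// encodeOnce is a hypothetical helper that reports byte count, MIME type, and release count.
func encodeOnce() (int, string, int, error) {
	src := &frameSource{}
	b, mime, err := src.Image(context.Background(), "")
	return len(b), mime, src.released, err
}

func main() {
	n, mime, released, err := encodeOnce()
	fmt.Println(n > 0, mime, released, err)
}
```

Because the encoded bytes are copied into a new buffer before release runs, nothing the caller holds points back into the source frame.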
"Looks like we just need to move the rimage.EncodeImage step one level in i.e. outBytes should come out of the Image call instead of being handled in the data collector or camera server/client."
In the client, I think it makes sense to return rimage to match the Python SDK functionality -- avoid unnecessary decodes if you just want bytes.
"I guess future module writers should get used to using rimage to encode their output i.e. read from the source bytes and output a newly constructed []byte result? Does that sound okay to everyone?"
Would this be similar to viamimage in the Python SDK? I think that works pretty well.
Looks like the server is already handling release and encoding for the caller.
Are you also suggesting removing the encoding step in server GetImage?
This is my WIP in server.go
switch castedCam := cam.(type) {
case ReadImager:
	// RSDK-8663: If available, call a method that reads exactly one image. The default
	// `ReadImage` implementation will otherwise create a gostream `Stream`, call `Next` and
	// `Close` the stream. However, between `Next` and `Close`, the stream may have pulled a
	// second image from the underlying camera. This is particularly noticeable on camera
	// clients. Where a second `GetImage` request can be processed/returned over the
	// network. Just to be discarded.
	// RSDK-9132(sean yu): In addition to what Dan said above, ReadImager is important
	// for camera components that rely on the `release` functionality provided by gostream's `Read`
	// such as viamrtsp.
	// (check that this comment is 100% true before code review then delete this parenthetical statement)
	img, release, err := castedCam.Read(ctx)
	if err != nil {
		return nil, err
	}
	defer func() {
		if release != nil {
			release()
		}
	}()
	actualMIME, _ := utils.CheckLazyMIMEType(req.MimeType)
	resp.MimeType = actualMIME
	outBytes, err := rimage.EncodeImage(ctx, img, req.MimeType)
	if err != nil {
		return nil, err
	}
	resp.Image = outBytes
default:
	imgBytes, mimeType, err := cam.Image(ctx, req.MimeType, ext)
	if err != nil {
		return nil, err
	}
	actualMIME, _ := utils.CheckLazyMIMEType(mimeType)
	resp.MimeType = actualMIME
	resp.Image = imgBytes
}
So I think yes, in the default case we don't encode anymore since the return type is now bytes
Ok cool I like that.
I am a little confused on why we still need the ReadImager path here. Shouldn't the camera interface now always have Image defined so we can just use that?
My thinking here is since viamrtsp uses the VideoReaderFunc and Read's release functionality to keep track of pool frames in the frame allocation optimization flow, for the server.go that serves viamrtsp, we need to be able to call release()
I think we could refactor viamrtsp though to just copy out the bytes on return early and use the new .Image pathway... I'm down to remove ReadImager entirely as long as we make it a high priority to refactor viamrtsp to use .Image and []byte
I see, interesting. I think it makes sense to leave in for now for viamrtsp compatibility.
"I think we could refactor viamrtsp though to just copy out the bytes on return early and use the new .Image pathway... I'm down to remove ReadImager entirely as long as we make it a high priority to refactor viamrtsp to use .Image and []byte"
Yep, the whole point of having a memory manager was the issue with passing in a pointer to the avframe in the VideoReaderFunc return causing segfaults with the encoding in the server.
Since we are removing encoding from the server and relying on the user to handle this, we should eventually be able to use Image pathway and simplify things quite a bit : )
All our go cameras will need a refactor for the full fix of this interface, I do not anticipate that we will have the best version of this code and go modules after one pr. Eventually we will get rid of Stream and Next completely. The problem is those videosourcewrapper helpers are all over the place in go modules since they're exported convenience functions.
Always export code conscientiously. But we're breaking this interface anyway, so we'll deal with the following breaks.
Okay 👍 will remove ReadImager in this PR
I am wondering if it makes sense to do a full removal of Read from client & server, ReadImage, and ReadImager here instead of just aliasing in those cases.
Yes.
Should we replace the camera service GetImage call in RenderFrame with Image?
func (s *serviceServer) RenderFrame(
	ctx context.Context,
	req *pb.RenderFrameRequest,
) (*httpbody.HttpBody, error) {
	ctx, span := trace.StartSpan(ctx, "camera::server::RenderFrame")
	defer span.End()
	if req.MimeType == "" {
		req.MimeType = utils.MimeTypeJPEG // default rendering
	}
	resp, err := s.GetImage(ctx, (*pb.GetImageRequest)(req))
serviceServers use the gRPC named version, so GetImage is right
Uh-oh, should I be using rebase from now on? Edit: something else is weird with my version control. Gonna try to fix, sigh.
services/vision/vision.go
Outdated
@@ -351,11 +357,14 @@ func (vm *vizModel) ClassificationsFromCamera(
	if err != nil {
		return nil, errors.Wrapf(err, "could not find camera named %s", cameraName)
	}
	img, release, err := camera.ReadImage(ctx, cam)
	imgBytes, mimeType, err := cam.Image(ctx, gostream.MIMETypeHint(ctx, utils.MimeTypeJPEG), extra)
@bhaney wanted to pick your brain rq on if it's okay to default to JPEG like this
You definitely want to test out on depth cameras, which never have jpeg available. That is the most important edge case
If anything, why not just leave it blank?
Also, you repeat the camera.Image call + rimage.Decode call a lot -- why not make a helper function in vision.go and use it everywhere instead?
Ideas on what to call it? WIP name is GetGoImage
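For discussion, a rough sketch of what such a helper could look like. It assumes the ([]byte, string, error) signature floated earlier in this thread and uses the standard library's image.Decode in place of rimage; the imager interface, decodeFromCamera, fakeCam, and fakeBounds are hypothetical names.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"image"
	"image/png" // importing this also registers the PNG decoder with image.Decode
)

// imager is the slice of the camera interface the helper needs (assumed signature).
type imager interface {
	Image(ctx context.Context, mimeType string, extra map[string]interface{}) ([]byte, string, error)
}

// decodeFromCamera fetches encoded bytes from the camera and decodes them into a Go image.
// An empty mimeType leaves the choice of format to the camera.
func decodeFromCamera(ctx context.Context, cam imager, mimeType string) (image.Image, error) {
	imgBytes, gotMime, err := cam.Image(ctx, mimeType, nil)
	if err != nil {
		return nil, fmt.Errorf("could not get image bytes from camera: %w", err)
	}
	img, _, err := image.Decode(bytes.NewReader(imgBytes))
	if err != nil {
		return nil, fmt.Errorf("could not decode %q image: %w", gotMime, err)
	}
	return img, nil
}

// fakeCam is a hypothetical camera that always returns a 3x2 grayscale PNG.
type fakeCam struct{}

func (fakeCam) Image(ctx context.Context, mimeType string, extra map[string]interface{}) ([]byte, string, error) {
	var buf bytes.Buffer
	if err := png.Encode(&buf, image.NewGray(image.Rect(0, 0, 3, 2))); err != nil {
		return nil, "", err
	}
	return buf.Bytes(), "image/png", nil
}

// fakeBounds is a hypothetical helper returning the decoded width and height.
func fakeBounds() (int, int, error) {
	img, err := decodeFromCamera(context.Background(), fakeCam{}, "")
	if err != nil {
		return 0, 0, err
	}
	return img.Bounds().Dx(), img.Bounds().Dy(), nil
}

func main() {
	fmt.Println(fakeBounds())
}
```

Whatever the name ends up being, dropping the Get prefix would keep it in line with the naming discussion above.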
@@ -149,11 +152,14 @@ func (o *obsDepth) obsDepthWithIntrinsics(ctx context.Context, src camera.VideoS
	if o.intrinsics == nil {
		return nil, errors.New("tried to build obstacles depth with intrinsics but no instrinsics found")
	}
	pic, release, err := camera.ReadImage(ctx, src)
	imgBytes, mimeType, err := src.Image(ctx, utils.MimeTypeRawDepth, nil)
@bhaney similar thing here... should we be checking the context first for a mimetype? Also is raw depth the right mimetype to use here?
@@ -117,12 +117,15 @@ func (o *obsDepth) buildObsDepth(logger logging.Logger) func(

// buildObsDepthNoIntrinsics will return the median depth in the depth map as a Geometry point.
func (o *obsDepth) obsDepthNoIntrinsics(ctx context.Context, src camera.VideoSource) ([]*vision.Object, error) {
	pic, release, err := camera.ReadImage(ctx, src)
	imgBytes, mimeType, err := src.Image(ctx, utils.MimeTypeRawDepth, nil)
I think you can just leave it blank here as well for the MimeType request
@@ -149,11 +152,14 @@ func (o *obsDepth) obsDepthWithIntrinsics(ctx context.Context, src camera.VideoS
	if o.intrinsics == nil {
		return nil, errors.New("tried to build obstacles depth with intrinsics but no instrinsics found")
	}
	pic, release, err := camera.ReadImage(ctx, src)
	imgBytes, mimeType, err := src.Image(ctx, utils.MimeTypeRawDepth, nil)
same here
I think you should make a helper function in services/vision/vision.go that does the camera.Image + rimage.Decode step so you don't end up re-writing a lot of code
@hexbabe okay to do in another PR, given that we are adding more changes to the GetImage API method, and I'd rather focus on profiling the data collector performance next, if you haven't started that.
…here we convert to go image; Change default mimetypes for classifier
…ng default mimetypes for vision since we are failing unit tests with 'do not know how to encode' errors
@@ -70,15 +72,32 @@ type NamedImage struct {
	SourceName string
}

// ImageMetadata contains useful information about returned image bytes such as its mimetype.
type ImageMetadata struct {
Stubbing this struct out now so that width, height, and other data can be included here?
Media: frame,
Release: release,
Media: img,
Release: func() {},
So release is now a no-op if using client stream?
if err != nil {
	for _, handler := range errHandlers {
		handler(streamCtx, err)
	}
}

img, err := rimage.DecodeImage(ctx, resBytes, resMetadata.MimeType)
We are not using GetGoImage here because of the specific error handler?
	resp.MimeType = mimeType
}

resp.MimeType = utils.WithLazyMIMEType(resp.MimeType)
Mime type handling seems to be making sense but just want to make sure I am understanding.
- Use CheckLazyMIMEType to trim off lazy suffix if present in request.
- Handle requested and response mimetype diff.
- Add lazy suffix back in.
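To check that reading of the three steps, here is a standalone sketch. The "+lazy" suffix value, the helper names, and the choice to return an empty string when no MIME type is known are all assumptions made for illustration; the real logic lives in CheckLazyMIMEType and WithLazyMIMEType.

```go
package main

import (
	"fmt"
	"strings"
)

// lazySuffix is an assumed value for the lazy marker appended to MIME types.
const lazySuffix = "+lazy"

// checkLazy trims the lazy suffix if present and reports whether it was there.
func checkLazy(mimeType string) (string, bool) {
	if strings.HasSuffix(mimeType, lazySuffix) {
		return strings.TrimSuffix(mimeType, lazySuffix), true
	}
	return mimeType, false
}

// withLazy appends the lazy suffix unless it is already present.
func withLazy(mimeType string) string {
	if strings.HasSuffix(mimeType, lazySuffix) {
		return mimeType
	}
	return mimeType + lazySuffix
}

// resolveResponseMIME walks the three steps listed above.
func resolveResponseMIME(requested, returned string) string {
	// Step 1: trim the lazy suffix off the request.
	actual, _ := checkLazy(requested)
	// Step 2: if the camera reported a type, it wins over the requested one.
	if got, _ := checkLazy(returned); got != "" {
		actual = got
	}
	// Assumption: with no known type, report nothing rather than a bare suffix.
	if actual == "" {
		return ""
	}
	// Step 3: add the lazy suffix back in.
	return withLazy(actual)
}

func main() {
	fmt.Println(resolveResponseMIME("image/jpeg+lazy", "image/jpeg"))
	fmt.Println(resolveResponseMIME("image/jpeg", "image/png"))
}
```

The empty-type branch is the open question raised in the comments that follow, so treat that part as a placeholder.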
	return nil, ImageMetadata{}, err
}

if expectedType != "" && resp.MimeType != expectedType {
A little confused about the handling here.
What happens here if expectedType is empty? Does this mean we fill with the empty mime type to signal we do not know?
if expectedType != "" && resp.MimeType != expectedType {
	c.logger.CDebugw(ctx, "got different MIME type than what was asked for", "sent", expectedType, "received", resp.MimeType)
} else {
	resp.MimeType = mimeType
Is this redundant? Shouldn't we always fill with the response mime type? What should we do if the response mimeType is empty?
Overview
https://viam.atlassian.net/browse/RSDK-9132
This PR does not remove Stream from the Camera interface or any gostream logic. It just wraps it all in better English that corresponds with the gRPC method GetImage. Mostly just adding syntactic sugar over stream.Next/ReadImage using a new GetImage method (for existing builtins that use gostream), and modifying the camera server and client appropriately to utilize GetImage.
My hopes are that this encourages people to never be tempted to use Stream in modules. We will eventually remove Stream from the base Camera interface.
Summary of changes
- GetImage to camera interface
- StreamFunc test injection and replace with GetImageFunc
- GetImage in camera/client.go, camera/server.go and data collector
- GetImage in builtins (webcam, replaypcd, ffmpeg, entire transform pipeline, fake etc.) instead of Stream wrappers in preparation for removing Stream entirely from the camera interface
Tests
- viamrtsp h264 with and without passthrough