[RSDK-9132] Add GetImage to camera interface and make builtin resources use it #4487

hexbabe · 2024-10-24T19:00:39Z

Overview

https://viam.atlassian.net/browse/RSDK-9132

This PR does not remove Stream from the Camera interface or any gostream logic. It just wraps it all in better English that corresponds with the gRPC method GetImage. Mostly just adding syntactic sugar over stream.Next/ReadImage using a new GetImage method (for existing builtins that use gostream), and modifying the camera server and client appropriately to utilize GetImage.

My hope is that this encourages people to never be tempted to use Stream in modules. We will eventually remove Stream from the base Camera interface.

Summary of changes

Add GetImage to camera interface
Remove StreamFunc test injection and replace with GetImageFunc
Use GetImage in camera/client.go, camera/server.go and data collector
Use GetImage in builtins (webcam, replaypcd, ffmpeg, entire transform pipeline, fake etc.) instead of Stream wrappers in preparation for removing Stream entirely from the camera interface

Tests

viamrtsp h264 with and without passthrough
Webcam
Webcam hot swap (reconfigure/rebuild due to switching sources)
Transform pipeline on webcam (resize and rotate)
ffmpeg behavior with video0 is on par with the main branch (broken?? yes made ticket to track)
ffmpeg works for a fake mediamtx+ffmpeg rtsp stream though! No hang and stream looks great
Fake camera looks good with and without rtp passthrough toggled
OAK-FFC-3P

github-actions · 2024-10-24T19:00:54Z

Warning your change may break code samples. If your change modifies any of the following functions please contact @viamrobotics/fleet-management. Thanks!

component	function
base	IsMoving
board	GPIOPinByName
camera	Properties
encoder	Properties
motor	IsMoving
sensor	Readings
servo	Position
arm	EndPosition
audio	MediaProperties
gantry	Lengths
gripper	IsMoving
input_controller	Controls
movement_sensor	LinearAcceleration
power_sensor	Power
pose_tracker	Poses
motion	GetPose
vision	GetProperties

hexbabe · 2024-10-25T18:59:35Z

I'm gonna be out next week but this is ready for a first pass review. Also lmk if we should add Nick or/(and?) Dan to this review. Perhaps for future reviews that are a bit more invasive?

randhid

Since we're breaking go modules by adding another method to the interface, we should make that method follow 1:1 the inputs and returns of the proto api.

This will mean bookkeeping the release of each image stream, should you choose to use ReadImage. This will definitely go away once we get the image from the underlying comms connection like we do in the intel realsense and oak-d camera; I don't know how intense that is.

If changing the returns from GetImage to be 1:1 with the API type causes more problems in changing code outside of camera, we can focus on replacements in the camera packages, and let the datamanager and vision packages use Stream and Next for this PR, but I think you're nearly there.

components/camera/camera.go

randhid · 2024-10-28T13:47:54Z

components/camera/camera.go

@@ -119,6 +119,9 @@ type VideoSource interface {
 	// that may have a MIME type hint dictated in the context via gostream.WithMIMETypeHint.
 	Stream(ctx context.Context, errHandlers ...gostream.ErrorHandler) (gostream.VideoStream, error)

+	// GetImage returns a single image that may have a MIME type hint dictated in the context via gostream.WithMIMETypeHint.
+	GetImage(ctx context.Context) (image.Image, func(), error)


For idiomatic Go, we remove the 'Get' prefix from our getter APIs. I don't know why Go does it this way, but this is the pattern we follow for all our Go wrappers.

Suggested change

GetImage(ctx context.Context) (image.Image, func(), error)

Image(ctx context.Context) (image.Image, func(), error)

Also, please check if the API includes an extra, so there are no further breaks to the interface.

Hm, don't we want to keep the Get prefix to match the proto API?

No, look at all our other APIs and how they're wrapped in Go. We drop the Get prefix because Go has no getter/setter naming idiom, unlike our other SDKs. So we drop the prefix in rdk because that's the style that people who are opinionated about this have followed.

We could have not done this, but we should keep the style now that we're in deep.
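For reference, the naming convention described above (drop "Get" on getters, keep "Set" on setters) looks like this in plain Go. The `frameSource` type and its field are hypothetical and exist only to illustrate the style; they are not part of rdk.

```go
package main

import "fmt"

// frameSource is a hypothetical type used only to illustrate the naming convention.
type frameSource struct {
	mimeType string
}

// MIMEType is the Go-style getter name: the "Get" prefix is dropped.
func (f *frameSource) MIMEType() string { return f.mimeType }

// SetMIMEType keeps the "Set" prefix, which Go style retains for setters.
func (f *frameSource) SetMIMEType(m string) { f.mimeType = m }

func main() {
	src := &frameSource{}
	src.SetMIMEType("image/jpeg")
	fmt.Println(src.MIMEType())
}
```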

Oh lord there is an extra field

But the request msg struct also contains camera name and mime type

https://github.com/viamrobotics/api/blob/23fec5b989bfe3d95fbc4271ca42b10a7e71ed0c/proto/viam/component/camera/v1/camera.proto#L67

Dug around and found the camera package's Extra type for extras. So prob gonna use that Image(ctx context.Context, mimeType string, extra Extra) ([]byte, string, error)

It takes a lot of effort to go through and change the Image signature, so I want to verify that Image(ctx context.Context, mimeType string, extra *structpb.Struct) ([]byte, string, error) looks good before I do it

I like this signature since it's 1:1 to our APIs and follows the other sdk's signature, would you like it to go through a scope though?

A scope is only required for breaking proto changes right? This is a go interface not proto, so maybe not? Unless we feel like as a team it's worth getting more eyes on it.

Also what you quoted is the extra *structpb.Struct and not the extra Extra using the Extra type alias for map[string]interface{} in the camera package. Was this intentional? Just wanted to clarify since in Python we use extra: Optional[Dict[str, Any]] which is a native type not a pb type.

^ I do not think anything about Stream and Next is intentional. Is that type alias used anywhere else in the camera package?

It's not seen in the camera.go signature since it's tucked away inside ctx, but it is what's currently being used to handle extra

rdk/components/camera/extra.go

Line 19 in c097ff1

func FromContext(ctx context.Context) (Extra, bool) {

randhid · 2024-10-31T13:04:58Z

components/camera/camera_test.go

-	img, _, err := camera.ReadImage(
-		gostream.WithMIMETypeHint(context.Background(), rutils.WithLazyMIMEType(rutils.MimeTypePNG)),
-		noProj2)
+	img, _, err := noProj2.GetImage(gostream.WithMIMETypeHint(context.Background(), rutils.WithLazyMIMEType(rutils.MimeTypePNG)))


Looking at these two tests I think we could delete them, they seem to be testing cameras with ProjectorProvided, which is removed now.

components/camera/client.go

randhid · 2024-10-31T13:07:38Z

components/camera/replaypcd/replaypcd.go

+func (replay *pcdCamera) GetImage(ctx context.Context) (image.Image, func(), error) {
+	stream, err := replay.Stream(ctx)
+	if err != nil {
+		return nil, func() {}, err
+	}
+	defer func() {
+		if err := stream.Close(ctx); err != nil {
+			replay.logger.Errorf("stream failed to close: %w", err)
+		}
+	}()
+	return stream.Next(ctx)
+}
+


Stream is actually unimplemented in replaypcd, you can just return an empty image and an error.

components/camera/server.go

components/camera/transformpipeline/mods.go

components/camera/videosource/webcam.go

seanavery · 2024-11-01T19:44:52Z

components/camera/camera.go

@@ -119,6 +119,9 @@ type VideoSource interface {
 	// that may have a MIME type hint dictated in the context via gostream.WithMIMETypeHint.
 	Stream(ctx context.Context, errHandlers ...gostream.ErrorHandler) (gostream.VideoStream, error)

+	// GetImage returns a single image that may have a MIME type hint dictated in the context via gostream.WithMIMETypeHint.
+	GetImage(ctx context.Context) (image.Image, func(), error)


Hm, don't we want to keep the Get prefix to match the proto API?

seanavery · 2024-11-01T19:48:45Z

components/camera/client.go

@@ -228,6 +228,10 @@ func (c *client) Stream(
 	return stream, nil
 }

+func (c *client) GetImage(ctx context.Context) (image.Image, func(), error) {
+	return c.Read(ctx)


Could we Read and handle release ourselves, so we do not need to include the release func in the returns? (Perhaps that would make sense in the videosourcewrapper layer.)
As mentioned, this would be a larger departure from Stream.Next, which I'm assuming will cause major problems.

I'm gonna try to get rid of release in camera components and see what happens

Looks like we just need to move the rimage.EncodeImage step one level in i.e. outBytes should come out of the Image call instead of being handled in the data collector or camera server/client. Currently release is called in the collector/server/client, so we should just move it into the Image implementation. Same logic, just handled more nested in our abstraction.

I guess future module writers should get used to using rimage to encode their output i.e. read from the source bytes and output a newly constructed []byte result? Does that sound okay to everyone?

Looks like we just need to move the rimage.EncodeImage step one level in i.e. outBytes should come out of the Image call instead of being handled in the data collector or camera server/client.

In the client, I think it makes sense to return rimage to match the Python SDK functionality -- avoid unnecessary decodes if you just want bytes.

I guess future module writers should get used to using rimage to encode their output i.e. read from the source bytes and output a newly constructed []byte result? Does that sound okay to everyone?

Would this be similar to viamimage in the Python SDK? I think that works pretty well.

Looks like the server is already handling release and encoding for the caller.
Are you also suggesting removing the encoding step in the server's GetImage?

This is my WIP in server.go

switch castedCam := cam.(type) {
case ReadImager:
	// RSDK-8663: If available, call a method that reads exactly one image. The default
	// `ReadImage` implementation will otherwise create a gostream `Stream`, call `Next` and
	// `Close` the stream. However, between `Next` and `Close`, the stream may have pulled a
	// second image from the underlying camera. This is particularly noticeable on camera
	// clients. Where a second `GetImage` request can be processed/returned over the
	// network. Just to be discarded.
	// RSDK-9132(sean yu): In addition to what Dan said above, ReadImager is important
	// for camera components that rely on the `release` functionality provided by gostream's `Read`
	// such as viamrtsp.
	// (check that this comment is 100% true before code review then delete this parenthetical statement)
	img, release, err := castedCam.Read(ctx)
	if err != nil {
		return nil, err
	}
	defer func() {
		if release != nil {
			release()
		}
	}()
	actualMIME, _ := utils.CheckLazyMIMEType(req.MimeType)
	resp.MimeType = actualMIME
	outBytes, err := rimage.EncodeImage(ctx, img, req.MimeType)
	if err != nil {
		return nil, err
	}
	resp.Image = outBytes
default:
	imgBytes, mimeType, err := cam.Image(ctx, req.MimeType, ext)
	if err != nil {
		return nil, err
	}
	actualMIME, _ := utils.CheckLazyMIMEType(mimeType)
	resp.MimeType = actualMIME
	resp.Image = imgBytes
}

So I think yes, in the default case we don't encode anymore since the return type is now bytes

Ok cool I like that.

I am a little confused about why we still need the ReadImager path here. Shouldn't the camera interface now always have Image defined, so we can just use that?

My thinking here is since viamrtsp uses the VideoReaderFunc and Read's release functionality to keep track of pool frames in the frame allocation optimization flow, for the server.go that serves viamrtsp, we need to be able to call release()

I think we could refactor viamrtsp though to just copy out the bytes on return early and use the new .Image pathway... I'm down to remove ReadImager entirely as long as we make it a high priority to refactor viamrtsp to use .Image and []byte

I see, interesting. I think it makes sense to leave in for now for viamrtsp compatibility.

I think we could refactor viamrtsp though to just copy out the bytes on return early and use the new .Image pathway... I'm down to remove ReadImager entirely as long as we make it a high priority to refactor viamrtsp to use .Image and []byte

Yep the whole point to having a memory manager was the issue with passing in a pointer to the avframe in VideoReaderFunc return causing segfaults with the encoding in the server.

Since we are removing encoding from the server and relying on the user to handle this, we should eventually be able to use Image pathway and simplify things quite a bit : )
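
To illustrate the "copy out the bytes on return" idea in isolation: once the frame is copied into caller-owned memory, the pooled buffer can be released immediately and later reuse cannot corrupt what the caller holds. The `framePool` type below is a hypothetical toy, not the viamrtsp frame allocator.

```go
package main

import "fmt"

// framePool is a hypothetical single-buffer pool that reuses the same backing memory for every frame.
type framePool struct {
	buf   []byte
	inUse bool
}

// acquire overwrites the shared buffer with a new frame and lends it to the caller.
func (p *framePool) acquire(fill byte) ([]byte, func()) {
	p.inUse = true
	for i := range p.buf {
		p.buf[i] = fill
	}
	return p.buf, func() { p.inUse = false }
}

// copyFrame copies the frame into caller-owned memory and releases the pooled buffer on return.
func copyFrame(p *framePool, fill byte) []byte {
	frame, release := p.acquire(fill)
	defer release()
	out := make([]byte, len(frame))
	copy(out, frame)
	return out
}

func main() {
	pool := &framePool{buf: make([]byte, 4)}
	first := copyFrame(pool, 1)
	second := copyFrame(pool, 2) // reuses the pooled buffer; first is unaffected
	fmt.Println(first, second, pool.inUse)
}
```

The trade-off is one extra copy per frame in exchange for not exposing release to callers.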

All our go cameras will need a refactor for the full fix of this interface, I do not anticipate that we will have the best version of this code and go modules after one pr. Eventually we will get rid of Stream and Next completely. The problem is those videosourcewrapper helpers are all over the place in go modules since they're exported convenience functions.

Always export code conscientiously. But we're breaking this interface anyway, so we'll deal with the following breaks.

Okay 👍 will remove ReadImager in this PR

components/camera/server.go

components/camera/transformpipeline/classifier.go

seanavery

I am wondering if it makes sense to do a full removal of Read from client & server, ReadImage, and ReadImager here instead of just aliasing in those cases.

randhid · 2024-11-01T20:50:12Z

I am wondering if it makes sense to do a full removal of Read from client & server, ReadImage, and ReadImager here instead of just aliasing in those cases.

Yes.

seanavery

Should we replace the camera service GetImage call in RenderFrame with Image?

func (s *serviceServer) RenderFrame(
	ctx context.Context,
	req *pb.RenderFrameRequest,
) (*httpbody.HttpBody, error) {
	ctx, span := trace.StartSpan(ctx, "camera::server::RenderFrame")
	defer span.End()
	if req.MimeType == "" {
		req.MimeType = utils.MimeTypeJPEG // default rendering
	}
	resp, err := s.GetImage(ctx, (*pb.GetImageRequest)(req))

hexbabe · 2024-11-06T17:22:50Z

Should we replace the camera service GetImage call in RenderFrame with Image?

func (s *serviceServer) RenderFrame(
	ctx context.Context,
	req *pb.RenderFrameRequest,
) (*httpbody.HttpBody, error) {
	ctx, span := trace.StartSpan(ctx, "camera::server::RenderFrame")
	defer span.End()
	if req.MimeType == "" {
		req.MimeType = utils.MimeTypeJPEG // default rendering
	}
	resp, err := s.GetImage(ctx, (*pb.GetImageRequest)(req))

serviceServers use the gRPC named version, so GetImage is right

hexbabe · 2024-11-06T18:01:03Z

Uhoh, should I be using rebase from now on?

Edit: something else is weird with my version control. gonna try to fix sigh

hexbabe · 2024-11-07T19:17:27Z

services/vision/vision.go

@@ -351,11 +357,14 @@ func (vm *vizModel) ClassificationsFromCamera(
 	if err != nil {
 		return nil, errors.Wrapf(err, "could not find camera named %s", cameraName)
 	}
-	img, release, err := camera.ReadImage(ctx, cam)
+	imgBytes, mimeType, err := cam.Image(ctx, gostream.MIMETypeHint(ctx, utils.MimeTypeJPEG), extra)


@bhaney wanted to pick your brain rq on if it's okay to default to JPEG like this

You definitely want to test out on depth cameras, which never have jpeg available. That is the most important edge case

If anything, why not just leave it blank?

Also, you repeat the camera.Image call + rimage.Decode call a lot -- why not make a helper function in vision.go and use it everywhere instead?

Ideas on what to call it? WIP name is GetGoImage

hexbabe · 2024-11-07T19:19:34Z

services/vision/obstaclesdepth/obstacles_depth.go

@@ -149,11 +152,14 @@ func (o *obsDepth) obsDepthWithIntrinsics(ctx context.Context, src camera.VideoS
 	if o.intrinsics == nil {
 		return nil, errors.New("tried to build obstacles depth with intrinsics but no instrinsics found")
 	}
-	pic, release, err := camera.ReadImage(ctx, src)
+	imgBytes, mimeType, err := src.Image(ctx, utils.MimeTypeRawDepth, nil)


@bhaney similar thing here... should we be checking the context first for a mimetype? Also is raw depth the right mimetype to use here?

bhaney · 2024-11-07T19:39:36Z

services/vision/obstaclesdepth/obstacles_depth.go

@@ -117,12 +117,15 @@ func (o *obsDepth) buildObsDepth(logger logging.Logger) func(

 // buildObsDepthNoIntrinsics will return the median depth in the depth map as a Geometry point.
 func (o *obsDepth) obsDepthNoIntrinsics(ctx context.Context, src camera.VideoSource) ([]*vision.Object, error) {
-	pic, release, err := camera.ReadImage(ctx, src)
+	imgBytes, mimeType, err := src.Image(ctx, utils.MimeTypeRawDepth, nil)


I think you can just leave it blank here as well for the MimeType request

bhaney · 2024-11-07T19:39:44Z

services/vision/obstaclesdepth/obstacles_depth.go

@@ -149,11 +152,14 @@ func (o *obsDepth) obsDepthWithIntrinsics(ctx context.Context, src camera.VideoS
 	if o.intrinsics == nil {
 		return nil, errors.New("tried to build obstacles depth with intrinsics but no instrinsics found")
 	}
-	pic, release, err := camera.ReadImage(ctx, src)
+	imgBytes, mimeType, err := src.Image(ctx, utils.MimeTypeRawDepth, nil)


bhaney

I think you should make a helper function in services/vision/vision.go that does the camera.Image + rimage.Decode step so you don't end up re-writing a lot of code

bhaney · 2024-11-07T19:44:44Z

services/vision/vision.go

@@ -351,11 +357,14 @@ func (vm *vizModel) ClassificationsFromCamera(
 	if err != nil {
 		return nil, errors.Wrapf(err, "could not find camera named %s", cameraName)
 	}
-	img, release, err := camera.ReadImage(ctx, cam)
+	imgBytes, mimeType, err := cam.Image(ctx, gostream.MIMETypeHint(ctx, utils.MimeTypeJPEG), extra)


Also, you repeat the camera.Image call + rimage.Decode call a lot -- why not make a helper function in vision.go and use it everywhere instead?

randhid · 2024-11-07T20:00:00Z

I think you should make a helper function in services/vision/vision.go that does the camera.Image + rimage.Decode step so you don't end up re-writing a lot of code

@hexbabe okay to do in another pr given that we are adding more changes to the GetImage API method, and I'd rather focus on profiling the data collector performance if you haven't started that next.

…here we convert to go image; Change default mimetypes for classifier

… encode "")

…ng default mimetypes for vision since we are failing unit tests with 'do not know how to encode' errors

seanavery · 2024-11-08T15:58:12Z

components/camera/camera.go

@@ -70,15 +72,32 @@ type NamedImage struct {
 	SourceName string
 }

+// ImageMetadata contains useful information about returned image bytes such as its mimetype.
+type ImageMetadata struct {


Stubbing this struct out now so that width, height, and other data can be included here?

seanavery · 2024-11-08T16:14:04Z

components/camera/client.go

-				Media:   frame,
-				Release: release,
+				Media:   img,
+				Release: func() {},


So release is now a no-op if using client stream?

seanavery · 2024-11-08T16:16:01Z

components/camera/client.go

 			if err != nil {
 				for _, handler := range errHandlers {
 					handler(streamCtx, err)
 				}
 			}

+			img, err := rimage.DecodeImage(ctx, resBytes, resMetadata.MimeType)


We are not using GetGoImage here because of the specific error handler?

seanavery · 2024-11-08T16:43:40Z

components/camera/client.go

+		resp.MimeType = mimeType
+	}
+
+	resp.MimeType = utils.WithLazyMIMEType(resp.MimeType)


Mime type handling seems to be making sense but just want to make sure I am understanding.

Use CheckLazyMIMEType to trim off lazy suffix if present in request.

Handle requested and response mimetype diff.

Add lazy suffix back in.
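
The three steps can be sketched as below. The helpers are hypothetical local stand-ins for utils.CheckLazyMIMEType and utils.WithLazyMIMEType, the "+lazy" suffix is an assumption for illustration, and the fallback when the response type is empty is an assumed policy rather than what the PR does.

```go
package main

import (
	"fmt"
	"strings"
)

// lazySuffix is an assumed marker; the thread does not show the suffix rdk actually uses.
const lazySuffix = "+lazy"

// checkLazy strips the lazy suffix if present and reports whether it was there.
func checkLazy(mimeType string) (string, bool) {
	if strings.HasSuffix(mimeType, lazySuffix) {
		return strings.TrimSuffix(mimeType, lazySuffix), true
	}
	return mimeType, false
}

// withLazy appends the lazy suffix unless it is already present.
func withLazy(mimeType string) string {
	if strings.HasSuffix(mimeType, lazySuffix) {
		return mimeType
	}
	return mimeType + lazySuffix
}

// responseMIME follows the three steps: strip, reconcile, re-add the suffix.
func responseMIME(requested, received string) string {
	expected, _ := checkLazy(requested) // step 1: trim the lazy suffix from the request
	actual := received                  // step 2: prefer what the server actually returned
	if actual == "" {
		actual = expected // assumed fallback when the response carries no MIME type
	}
	return withLazy(actual) // step 3: add the lazy suffix back in
}

func main() {
	fmt.Println(responseMIME("image/jpeg+lazy", "image/png"))
	fmt.Println(responseMIME("image/jpeg", ""))
}
```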

seanavery · 2024-11-08T16:48:40Z

components/camera/client.go

+		return nil, ImageMetadata{}, err
+	}
+
+	if expectedType != "" && resp.MimeType != expectedType {


A little confused about the handling here.

What happens here if expectedType is empty? Does this mean we fill with the empty mime type to signal we do not know?

seanavery · 2024-11-08T16:49:58Z

components/camera/client.go

+	if expectedType != "" && resp.MimeType != expectedType {
+		c.logger.CDebugw(ctx, "got different MIME type than what was asked for", "sent", expectedType, "received", resp.MimeType)
+	} else {
+		resp.MimeType = mimeType


Is this redundant? Shouldn't we always fill with the response mime type? What should we do if the response mimeType is empty?

Init craziness

a643d22

viambot added the "safe to test" label Oct 24, 2024

Use camera pkg scoped ReadImage in webcam

41cb592

hexbabe marked this pull request as ready for review October 25, 2024 18:56

hexbabe requested review from randhid and seanavery October 25, 2024 18:58

randhid requested changes Oct 31, 2024

seanavery reviewed Nov 1, 2024

Merge branch 'main' into RSDK-9132

f6e3d69

seanavery reviewed Nov 5, 2024

hexbabe requested review from ethanlookpotts, micheal-parks, DTCurrie and zaporter-work as code owners November 6, 2024 17:59

Use agreed upon Image signature

d6439dd

hexbabe force-pushed the RSDK-9132 branch from f90689b to d6439dd Compare November 6, 2024 18:10

Delete ReadImager and fix mimetype formatting in data collector

16079fa

Fix up obstacle depth; Delete custom extra type;

9084264

hexbabe force-pushed the RSDK-9132 branch from 82bc87a to 9084264 Compare November 7, 2024 15:28