-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RSDK-1271 - fix nil distortion interface type propagation #1765
RSDK-1271 - fix nil distortion interface type propagation #1765
Conversation
brownConrady, ok := props.DistortionParams.(*transform.BrownConrady) | ||
if !ok || brownConrady == nil { | ||
_, ok = props.DistortionParams.(*transform.BrownConrady) | ||
if !ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removal of the un necessary parts of the original fix.
@@ -135,7 +135,13 @@ func NewFromReader( | |||
if err != nil { | |||
return nil, NewPropertiesError("source camera") | |||
} | |||
actualSystem = &transform.PinholeCameraModel{props.IntrinsicParams, props.DistortionParams} | |||
|
|||
var cameraModel transform.PinholeCameraModel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixed the issue in slam. The rest of the changes are to ensure that the issue doesn't surface anywhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but may want to DRY It up
@@ -146,10 +146,15 @@ func newColorDepthExtrinsics(ctx context.Context, color, depth camera.Camera, at | |||
debug: attrs.Debug, | |||
logger: logger, | |||
} | |||
var cameraModel transform.PinholeCameraModel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can probably be a refactored function if attrs is the same type across each of these camera instances
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, turn this into a helper function in camera.go and you can use it in all of the camera models.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately they are not always the same. attrs
or params
are one of these types at these locations:
*extrinsicsAttrs: rdk/components/camera/align/extrinsics.go:118
*joinAttrs: rdk/components/camera/align/join.go:102
*homographyAttrs: rdk/components/camera/align/homography.go:104
camera.Properties: components/camera/camera.go:134 & in the `transformpipeline` package & in slam service
*transformConfig: rdk/components/camera/transformpipeline/pipeline.go:77
*dualServerAttrs: rdk/components/camera/videosource/server.go:111
*ServerAttrs: rdk/components/camera/videosource/server.go:246
*fileSourceAttrs: rdk/components/camera/videosource/static.go:28
*WebcamAttrs: rdk/components/camera/videosource/webcam.go
As a result I've made the helper function take 2 params, one for each PinholdCameraModel
struct member. Let me know if that works or if I'm missing something / if you would prefer I add a function per type attrs
/ params
type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with approving this change, since it fixes the issue, and I'm not a go style expert, but I do find this pattern surprising.
In other languages, it would be very common to first check for null, then check the underlying type of the pointer. I don't know why we wouldn't expect developers to do this in go.
Also, does this pattern really save us from future headaches? Yes, it saves a developer from remembering to perform a nil check in addition to a type assertion. But it doesn't save a developer from remembering not to assign a nil value to Distortion
. And in case another developer accidentally assigned a nil value to Distortion
, maybe you're safer performing the nil check anyway.
But as I said, I'm not a go style expert, so I'm not going to hold back this PR.
brownConrady, ok := props.DistortionParams.(*transform.BrownConrady) | ||
if !ok || brownConrady == nil { | ||
_, ok = props.DistortionParams.(*transform.BrownConrady) | ||
if !ok { | ||
return "", nil, errors.New("error getting distortion_parameters for slam service, only BrownConrady distortion parameters are supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add back the test case for this error in slam/builtin_test.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, but unless I'm missing something, I don't think I removed any test cases as part of this change. What did I remove which you would like me to add back?
So a good read for understanding this style (which we should ideally lint for) is https://go.dev/doc/faq#nil_error. Assigning a Given:
Ultimately, this is all due to a very bad design decision in go linked above. I find this the most confusing aspect of go. So in response to "Also, does this pattern really save us from future headaches?", I would say that it absolutely does. The TLDR is to only assign an actual |
Thank you for sending that reference, that was very helpful! I understand now. Yes, that is a bad design decision in go :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one suggestion for refactoring/wrapping the reused code. Also thank you for adding parameter names (PinholeCameraIntrinsics) to some of this functions. Makes my VSCode complain a lot less :D
@@ -146,10 +146,15 @@ func newColorDepthExtrinsics(ctx context.Context, color, depth camera.Camera, at | |||
debug: attrs.Debug, | |||
logger: logger, | |||
} | |||
var cameraModel transform.PinholeCameraModel | |||
cameraModel.PinholeCameraIntrinsics = attrs.CameraParameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the reuse of this code, do we want to consider creating a function that will give cameraModel
from a given attrs? (This might be what eric was already referring to)
@@ -254,7 +265,11 @@ func (vs *videoSource) Properties(ctx context.Context) (Properties, error) { | |||
} | |||
result.ImageType = vs.imageType | |||
result.IntrinsicParams = vs.system.PinholeCameraIntrinsics | |||
result.DistortionParams = vs.system.Distortion | |||
|
|||
if vs.system.Distortion != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, no longer being set to nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this fix! I think the only thing to do is turn the repeated code into a helper function, so that it can be reused
expected4 := transform.PinholeCameraModel{} | ||
pinholeCameraModel4 := camera.NewPinholeCameraModel(nil, nil) | ||
test.That(t, pinholeCameraModel4, test.ShouldResemble, expected4) | ||
test.That(t, pinholeCameraModel4.Distortion, test.ShouldBeNil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these tests still pass if I change https://github.com/viamrobotics/rdk/pull/1765/files#diff-4eab9b93978c09650defc23c4274a4c77d87cfdea7775d86dcf8c838ec207a01R156 to have an implementation which will allow https://go.dev/doc/faq#nil_error. to occur, aka:
func NewPinholeCameraModel(pinholeCameraIntrinsics *transform.PinholeCameraIntrinsics,
distortion transform.Distorter,
) transform.PinholeCameraModel {
var cameraModel transform.PinholeCameraModel
cameraModel.PinholeCameraIntrinsics = pinholeCameraIntrinsics
cameraModel.Distortion = distortion
return cameraModel
}
I'm not certain what to test for to determine whether or not (*T, nil)
is occurring to prove that this is fixing what I think its fixing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore, the current implementation is not resilient to intentionally trying to pass in a (*T, nil)
value. Does it need to be resilient to the caller passing in such a value?
var nilPointer *transform.NoDistortion
pinholeCameraModel2 := camera.NewPinholeCameraModel(intrinsics, nilPointer)
test.That(t, pinholeCameraModel2, test.ShouldResemble, expected2)
falis with:
=== RUN TestNewPinholeCameraModel
camera_test.go:335: Expected: 'transform.PinholeCameraModel{PinholeCameraIntrinsics:(*transform.PinholeCameraIntrinsics){Width:10, Height:10, Fx:1, Fy:2, Ppx:3, Ppy:4}, Distortion:transform.Distorter(nil)}'
Actual: 'transform.PinholeCameraModel{PinholeCameraIntrinsics:(*transform.PinholeCameraIntrinsics){Width:10, Height:10, Fx:1, Fy:2, Ppx:3, Ppy:4}, Distortion:(*transform.NoDistortion)(nil)}'
(Should resemble)!
Diff: 'transform.PinholeCameraModel{PinholeCameraIntrinsics:(*transform.PinholeCameraIntrinsics){Width:10, Height:10, Fx:1, Fy:2, Ppx:3, Ppy:4}, Distortion:(*transform.NoDistorterion)(nil)}'
--- FAIL: TestNewPinholeCameraModel (0.00s)
FAIL
exit status 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, after manually testing I've confirmed that this code change doesn't fix the bug. I still get a nil pointer exception here:
https://github.com/viamrobotics/rdk/pull/1765/files#diff-36268adc1863431c2fa801a317c85bbda50faed827c5a40b37c7888f3c3e1076L59
I'm going to add additional checking in NewPinholeCameraModel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Furthermore when it panics, RDK still hangs & doesn't terminate the entire os process:
2023-01-18T11:08:30.179-0500 DEBUG robot_server videosource/webcam.go:324 camera connected {"label": "video0"}
2023/01/18 11:08:31 slam::slamService::New 2419 ms
2023/01/18 11:08:31 slam::builtIn::getAndSaveDataSparse 94 ms
2023/01/18 11:08:31 slam::builtIn::getAndSaveDataSparse 0 ms
2023/01/18 11:08:31 slam::builtIn::getAndSaveDataSparse 323 ms
2023/01/18 11:08:31 rimage::EncodeImage::image/png 309 ms
2023-01-18T11:08:31.232-0500 INFO robot_server.process.0_intelrealgrpcserver pexec/managed_process.go:323 stopping process 8947 with signal terminated
2023-01-18T11:08:31.234-0500 DEBUG robot_server.process.0_intelrealgrpcserver pexec/managed_process.go:380 unable to check exit status
When I send an exit signal:
^C2023-01-18T11:09:33.454-0500 INFO robot_server server/entrypoint.go:248 failed to check restart {"error": "rpc error: code = Canceled desc = context canceled"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1809134]
goroutine 1 [running]:
go.viam.com/rdk/services/slam/builtin.(*builtIn).orbCamMaker(0x4000a8d040, 0x4000cec880)
/home/nickpi/rdk/services/slam/builtin/orbslam_yaml.go:59 +0x134
go.viam.com/rdk/services/slam/builtin.(*builtIn).orbGenYAML(0x4000a8d040, {0x6891838?, 0x4000ba1100?}, {0x689abe0?, 0x4000d26b40?})
/home/nickpi/rdk/services/slam/builtin/orbslam_yaml.go:143 +0xc0
go.viam.com/rdk/services/slam/builtin.runtimeServiceValidation({0x6891838, 0x4000ba1100}, {0x4000bacd70, 0x1, 0x1}, {0x4000bace50, 0x1, 0x1}, 0x4000a8d040)
/home/nickpi/rdk/services/slam/builtin/builtin.go:240 +0x3e0
go.viam.com/rdk/services/slam/builtin.NewBuiltIn({0x68918e0?, 0x4000d1c9f0?}, 0x7?, {{0x4000233080, 0x8}, {0x4000233088, 0x3}, {0x400023308c, 0x4}, {{{0x40002330a8, ...}, ...}, ...}, ...}, ...)
/home/nickpi/rdk/services/slam/builtin/builtin.go:663 +0x760
go.viam.com/rdk/services/slam/builtin.init.0.func1({0x68918e0?, 0x4000d1c9f0?}, 0x40?, {{0x4000233080, 0x8}, {0x4000233088, 0x3}, {0x400023308c, 0x4}, {{{0x40002330a8, ...}, ...}, ...}, ...}, ...)
/home/nickpi/rdk/services/slam/builtin/builtin.go:83 +0x50
go.viam.com/rdk/robot/impl.(*localRobot).newService(0x4000d3e140, {0x68918e0, 0x4000d1c9f0}, {{0x4000233080, 0x8}, {0x4000233088, 0x3}, {0x400023308c, 0x4}, {{{0x40002330a8, ...}, ...}, ...}, ...})
/home/nickpi/rdk/robot/impl/local_robot.go:605 +0x238
go.viam.com/rdk/robot/impl.(*resourceManager).processService(0x0?, {0x68918e0?, 0x4000d1c9f0?}, {{0x4000233080, 0x8}, {0x4000233088, 0x3}, {0x400023308c, 0x4}, {{{0x40002330a8, ...}, ...}, ...}, ...}, ...)
/home/nickpi/rdk/robot/impl/resource_manager.go:565 +0x484
go.viam.com/rdk/robot/impl.(*resourceManager).completeConfig(0x4000414340, {0x68918e0, 0x4000d1c9f0}, 0x4000d3e140)
/home/nickpi/rdk/robot/impl/resource_manager.go:413 +0x404
go.viam.com/rdk/robot/impl.(*localRobot).Reconfigure(0x4000d3e140, {0x68918e0, 0x4000d1c9f0}, 0x1f8afec?)
/home/nickpi/rdk/robot/impl/local_robot.go:906 +0x35c
go.viam.com/rdk/robot/impl.newWithResources({0x68918e0?, 0x4000d1c9f0}, 0x4000d40280, 0x0, 0x4000490fc8, {0x4000cef390, 0x1, 0x45f864?})
/home/nickpi/rdk/robot/impl/local_robot.go:557 +0xc28
go.viam.com/rdk/robot/impl.New(...)
/home/nickpi/rdk/robot/impl/local_robot.go:578
go.viam.com/rdk/web/server.(*robotServer).serveWeb(0x40007e0780, {0x68918e0?, 0x40005c2e40?}, 0x4000c76000)
/home/nickpi/rdk/web/server/entrypoint.go:271 +0x57c
go.viam.com/rdk/web/server.(*robotServer).runServer(0x40007e0780, {0x68918e0, 0x40005c2e40})
/home/nickpi/rdk/web/server/entrypoint.go:160 +0x8c
go.viam.com/rdk/web/server.RunServer({0x68918e0, 0x40005c2e40}, {0x40000ba040, 0x4, 0x4}, 0x483994?)
/home/nickpi/rdk/web/server/entrypoint.go:141 +0x82c
go.viam.com/utils.contextualMain(0x6478430, 0x0, 0x4000076768?)
/home/nickpi/go/pkg/mod/go.viam.com/utils@v0.1.6-0.20221221182443-100057239d4e/runtime.go:47 +0x1b0
go.viam.com/utils.ContextualMain(0x77a7c20?, 0x40000021a0?)
/home/nickpi/go/pkg/mod/go.viam.com/utils@v0.1.6-0.20221221182443-100057239d4e/runtime.go:28 +0x40
main.main()
/home/nickpi/rdk/web/cmd/server/main.go:19 +0x30
exit status 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edaniels or @bhaney Whenever y'all have time, I'm blocked & I'd like to get y'alls take to move forward.
TL;DR:
Problem:
The problem is that the DRYed up version doesn't work, but the manual version does work. This is b/c DistortionParams
comes from so many different sources in the codebase that sometimes it is a concrete type & sometimes it is an interface type & the helper function which operates on interface values returns (*T, nil)
values i.e. defeats the entire purpose of this exercise.
Questions:
- Is there a way to DRY up this code & not emit non-nil type but nil values for distortion?
- If not, should I just revert the code back to the version which doesn't have the helper function?
Details:
As an example camera.Properties()
returns a struct that has distortion as a Distortion
interface type but videosource.WebcamAttrs
only provides the distortion field as a concrete *transform.BrownConrady
.
When it is a concrete type, camera.NewPinholeCameraModel
(the helper function I created to DRY up the code) does the wrong thing.
Here is an example:
func makeCameraFromSource(ctx context.Context,
source gostream.MediaSource[image.Image],
attrs *WebcamAttrs,
) (camera.Camera, error) {
if source == nil {
return nil, errors.New("media source not found")
}
var cameraModel1 transform.PinholeCameraModel
cameraModel1.PinholeCameraIntrinsics = attrs.CameraParameters
if attrs.DistortionParameters != nil {
cameraModel1.Distortion = attrs.DistortionParameters
}
cameraModel2 := camera.NewPinholeCameraModel(attrs.CameraParameters, attrs.DistortionParameters)
fmt.Printf("CAMERAMODEL1 %T, %#v\n", cameraModel1, cameraModel1)
fmt.Printf("CAMERAMODEL2 %T, %#v\n", cameraModel2, cameraModel2)
return camera.NewFromSource(
ctx,
source,
&cameraModel2,
camera.ColorStream,
)
}
prints:
CAMERAMODEL1 transform.PinholeCameraModel, transform.PinholeCameraModel{PinholeCameraIntrinsics:(*transform.PinholeCameraIntrinsics)(0x4000628180), Distortion:transform.Distorter(nil)}
CAMERAMODEL2 transform.PinholeCameraModel, transform.PinholeCameraModel{PinholeCameraIntrinsics:(*transform.PinholeCameraIntrinsics)(0x4000628180), Distortion:(*transform.BrownConrady)(nil)}
As you can see, when the helper function is used, PinholeCameraModel.Distortion
gets set to (*transform.BrownConrady)(nil)
, non nil dynamic type but nil value for an interface value, exactly the situation we were trying to avoid in the first place.
I don't think I can define the helper function with this signature:
func NewPinholeCameraModel(pinholeCameraIntrinsics *transform.PinholeCameraIntrinsics,
distortion *transform.BrownConrady,
)
as I'll need any callers to assert the Distortion
can be turned into a *transform.BrownConrady
, which I imagine we don't want to do (otherwise why is there an interface at all) and even if we did, I'd need to add error handling for the case when the assertion fails, which I think is way beyond the scope of fixing this nil pointer de reference in the slam service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed offline, tag me when you've made the changes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback @edaniels!
This change now uses the helper function when Distortion
is being set using a *transform.BrownConrady
& does not use the helper function when setting it to a transform.Distortion
interface type, as otherwise the helper function creates the (*T, nil)
issue we were trying to avoid in the first place.
As a side note, while working on this PR, I noticed that if the slam service panics, it does not terminate the RDK OS process. Instead logs just stop & RDK hangs.
Here is a 1 line PR which reproduces that behavior.
Should I file a new bug on this issue or is this expected behavior?
@@ -23,6 +23,18 @@ type depthToPretty struct { | |||
cameraModel *transform.PinholeCameraModel | |||
} | |||
|
|||
func propsFromVideoSource(ctx context.Context, source gostream.VideoSource) (camera.Properties, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another helper function created for a pattern that was common in transformationpipeline
components/camera/camera_test.go
Outdated
@@ -313,6 +313,50 @@ func (s *simpleSourceWithPCD) NextPointCloud(ctx context.Context) (pointcloud.Po | |||
return nil, nil | |||
} | |||
|
|||
func TestNewPinholeCameraModel(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test for the new helper function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// If *transform.BrownConrady is `nil`, transform.PinholeCameraModel.Distortion | ||
// is not set & remains nil, to prevent https://go.dev/doc/faq#nil_error. | ||
func NewPinholeCameraModel(pinholeCameraIntrinsics *transform.PinholeCameraIntrinsics, | ||
distortion *transform.BrownConrady, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about when we add support for other distortion models that are not Brown Conrady? Can you maybe rename the helper function to specify its with a BrownConrady type distortion?
Ie NewPinholeModelWithBrownConradyDistortion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
Ticket: https://viam.atlassian.net/browse/RSDK-1271
Fixes issue with original fix which @edaniels raised to me was only band-aiding over a deeper issue which turned out to occur multiple places in RDK.
The problem is that
transform.PinholeCameraModel
'sDistortion
field is an interface type, which means that if it is set to anil
value it results in the type being*transform.BrownConrady)(nil)
but the value being*transform.BrownConrady
, of whichnil
is a valid value. This means that for any users oftransform.PinholeCameraModel.Distortion
, if this was left unaddressed, users would need to write:which is not reasonable to expect from users.
The solution is NOT for users of
transform.PinholeCameraModel
values to need to constantly check whether or not this confusing condition has occurred, but rather for creators of instances of that type to guarantee that they are not setting the interface type struct member to a nil value (as it causes this confusing behavior). Simply allowingtransform.PinholeCameraModel.Distortion
to be the zero value whenever the proposedDistortion
value isnil
solves the problem.This PR:
transform.PinholeCameraModel
to check the proposedDistortion
value is not nil before setting it.transform.PinholeCameraModel
instantiation where the struct was created from untagged struct members.BrownConrady
type assertion is not nilDurring review: