[Data-565] Fix for flaky join pointcloud naive test #1505

Merged
merged 27 commits into from
Oct 19, 2022

Conversation

bhaney
Member

@bhaney bhaney commented Oct 17, 2022

The bug was an edge case in which the loop that reads batches of points from a channel into the final point cloud could exit without reading all the points in the channel.

The loop would continue reading data into the final point cloud if dataLastTime || atomic.LoadInt32(&activeReaders) > 0

If there was no data in the channel, then the loop would wait 5 ms and then would set dataLastTime = false.

There was a situation where dataLastTime was false, but atomic.LoadInt32(&activeReaders) had dropped to 0 only during the 5 ms wait before dataLastTime was set to false, meaning there could still be some last-minute data in the channel when the loop exited. The solution is to do one last read after the loop to make sure all of the data is out of the pipeline.
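
The final read described above can be sketched in isolation as follows. This is a minimal illustration, not the code from this PR: the `point` type and the `drainBuffered` helper are hypothetical names, and the sketch assumes every writer has already finished, so that `len(ch)` is a stable count of the batches still buffered.

```go
package main

import "fmt"

// point is a hypothetical stand-in for the point type carried on the channel.
type point struct{ X, Y, Z float64 }

// drainBuffered (hypothetical helper) reads every batch that is already
// buffered in ch and appends its points to dst. len(ch) is only a snapshot,
// so this assumes all writers have finished before it is called.
func drainBuffered(ch chan []point, dst []point) []point {
	remaining := len(ch)
	for i := 0; i < remaining; i++ {
		batch := <-ch
		dst = append(dst, batch...)
	}
	return dst
}

func main() {
	// Simulate two batches that arrived after the main read loop exited.
	ch := make(chan []point, 4)
	ch <- []point{{1, 0, 0}, {2, 0, 0}}
	ch <- []point{{3, 0, 0}}

	merged := drainBuffered(ch, nil)
	fmt.Println(len(merged)) // prints 3
}
```

Because the count is taken once up front, the helper never blocks on an empty channel, which is why it only works as a last step once no writer is still active.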

@bhaney bhaney marked this pull request as ready for review October 17, 2022 20:29
@bhaney bhaney requested review from stevebriskin and removed request for stevebriskin October 17, 2022 20:33
@bhaney bhaney marked this pull request as draft October 17, 2022 21:04
@bhaney bhaney force-pushed the data-565-flaky-join-pointcloud branch from 0066921 to 1605092 Compare October 18, 2022 18:35
@bhaney bhaney marked this pull request as ready for review October 18, 2022 18:37
Comment on lines 104 to 117
	// one last read to flush out any potential last data
	lastBatches := len(finalPoints)
	logger.Debugf("number of last batches: %d", lastBatches)
	for i := 0; i < lastBatches; i++ {
		lastPoints := <-finalPoints
		logger.Debugf("number of points in batch %d: %d", i, len(lastPoints))
		for _, p := range lastPoints {
			myErr := pcTo.Set(p.P, p.D)
			if myErr != nil {
				err = myErr
			}
		}
	}
Member Author

@bhaney bhaney Oct 18, 2022

This is the fix, essentially

@bhaney bhaney force-pushed the data-565-flaky-join-pointcloud branch from a81421d to 3679fd6 Compare October 18, 2022 20:53
@bhaney bhaney changed the title [Data-565] flaky join pointcloud naive test [Data-565] Fix for flaky join pointcloud naive test Oct 18, 2022
@github-actions
Contributor

Code Coverage

Package Line Rate Health
go.viam.com/rdk/components/arm 59%
go.viam.com/rdk/components/arm/universalrobots 12%
go.viam.com/rdk/components/arm/xarm 2%
go.viam.com/rdk/components/arm/yahboom 7%
go.viam.com/rdk/components/audioinput 55%
go.viam.com/rdk/components/base 68%
go.viam.com/rdk/components/base/agilex 62%
go.viam.com/rdk/components/base/boat 41%
go.viam.com/rdk/components/base/wheeled 76%
go.viam.com/rdk/components/board 69%
go.viam.com/rdk/components/board/arduino 10%
go.viam.com/rdk/components/board/commonsysfs 47%
go.viam.com/rdk/components/board/fake 39%
go.viam.com/rdk/components/board/numato 19%
go.viam.com/rdk/components/board/pi 50%
go.viam.com/rdk/components/camera 66%
go.viam.com/rdk/components/camera/fake 67%
go.viam.com/rdk/components/camera/ffmpeg 72%
go.viam.com/rdk/components/camera/transformpipeline 80%
go.viam.com/rdk/components/camera/videosource 55%
go.viam.com/rdk/components/encoder/fake 77%
go.viam.com/rdk/components/gantry 68%
go.viam.com/rdk/components/gantry/multiaxis 84%
go.viam.com/rdk/components/gantry/oneaxis 86%
go.viam.com/rdk/components/generic 85%
go.viam.com/rdk/components/gripper 82%
go.viam.com/rdk/components/input 86%
go.viam.com/rdk/components/input/gpio 87%
go.viam.com/rdk/components/motor 82%
go.viam.com/rdk/components/motor/dmc4000 69%
go.viam.com/rdk/components/motor/fake 60%
go.viam.com/rdk/components/motor/gpio 62%
go.viam.com/rdk/components/motor/gpiostepper 55%
go.viam.com/rdk/components/motor/tmcstepper 66%
go.viam.com/rdk/components/movementsensor 67%
go.viam.com/rdk/components/movementsensor/cameramono 39%
go.viam.com/rdk/components/movementsensor/gpsnmea 37%
go.viam.com/rdk/components/movementsensor/gpsrtk 28%
go.viam.com/rdk/components/posetracker 88%
go.viam.com/rdk/components/sensor 88%
go.viam.com/rdk/components/sensor/ultrasonic 31%
go.viam.com/rdk/components/servo 77%
go.viam.com/rdk/config 77%
go.viam.com/rdk/control 57%
go.viam.com/rdk/data 78%
go.viam.com/rdk/grpc 25%
go.viam.com/rdk/ml 67%
go.viam.com/rdk/ml/inference 70%
go.viam.com/rdk/motionplan 71%
go.viam.com/rdk/operation 93%
go.viam.com/rdk/pointcloud 71%
go.viam.com/rdk/protoutils 62%
go.viam.com/rdk/referenceframe 78%
go.viam.com/rdk/registry 88%
go.viam.com/rdk/resource 85%
go.viam.com/rdk/rimage 78%
go.viam.com/rdk/rimage/depthadapter 94%
go.viam.com/rdk/rimage/transform 73%
go.viam.com/rdk/rimage/transform/cmd/extrinsic_calibration 67%
go.viam.com/rdk/robot 93%
go.viam.com/rdk/robot/client 79%
go.viam.com/rdk/robot/framesystem 68%
go.viam.com/rdk/robot/impl 80%
go.viam.com/rdk/robot/server 58%
go.viam.com/rdk/robot/web 61%
go.viam.com/rdk/robot/web/stream 87%
go.viam.com/rdk/services/armremotecontrol 75%
go.viam.com/rdk/services/armremotecontrol/builtin 25%
go.viam.com/rdk/services/baseremotecontrol 75%
go.viam.com/rdk/services/baseremotecontrol/builtin 71%
go.viam.com/rdk/services/datamanager 62%
go.viam.com/rdk/services/datamanager/builtin 81%
go.viam.com/rdk/services/datamanager/datacapture 34%
go.viam.com/rdk/services/datamanager/datasync 70%
go.viam.com/rdk/services/motion 68%
go.viam.com/rdk/services/motion/builtin 89%
go.viam.com/rdk/services/navigation 54%
go.viam.com/rdk/services/sensors 78%
go.viam.com/rdk/services/sensors/builtin 97%
go.viam.com/rdk/services/shell 15%
go.viam.com/rdk/services/slam 86%
go.viam.com/rdk/services/slam/builtin 73%
go.viam.com/rdk/services/vision 82%
go.viam.com/rdk/services/vision/builtin 74%
go.viam.com/rdk/spatialmath 85%
go.viam.com/rdk/subtype 96%
go.viam.com/rdk/utils 71%
go.viam.com/rdk/vision 26%
go.viam.com/rdk/vision/chess 80%
go.viam.com/rdk/vision/delaunay 87%
go.viam.com/rdk/vision/keypoints 92%
go.viam.com/rdk/vision/objectdetection 83%
go.viam.com/rdk/vision/odometry 60%
go.viam.com/rdk/vision/odometry/cmd 0%
go.viam.com/rdk/vision/segmentation 49%
go.viam.com/rdk/web/server 26%
Summary 66% (19038 / 28721)

@bhaney bhaney requested review from edaniels and removed request for stevebriskin October 18, 2022 21:33
Contributor

@edaniels edaniels left a comment

I'm okay with this fixing the flake but if this code is used a lot, I'd love if we can benchmark it and possibly simplify it greatly.

var pcTo PointCloud
var err error

dataLastTime := false // if there was data in the channel in the previous loop, continue reading.
Contributor

This may be a naive question and can be discussed later, but how fast is this code compared to a simpler version? This feels very complicated.

Member Author

I will make a ticket for simplifying this merging function.

var err error

dataLastTime := false // if there was data in the channel in the previous loop, continue reading.
for dataLastTime || atomic.LoadInt32(&activeReaders) > 0 {
Contributor

why is this looping like this versus waiting on a channel and/or waitgroup?

Member Author

From what I understand, this code spawns a thread for each pointcloud source, splits each pointcloud into 8 subset pointclouds, puts each subset into a separate thread, and then does the necessary transforms before dumping all the points into a final point cloud.

So there are nCamera*8 threads all feeding into one channel. Then there is this loop, which reads from that channel into the final point cloud. It waits for all threads to finish pushing data into the channel before it exits the for loop.
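
For comparison, the channel-plus-WaitGroup alternative raised in the question above could look roughly like the sketch below. It is not the code in this PR: `mergeBatches` is a hypothetical function, plain `int` slices stand in for batches of points, and the per-source transforms are omitted. Each producer signals completion through a `sync.WaitGroup`, a separate goroutine closes the channel once all producers are done, and the consumer simply ranges over the channel until it is closed.

```go
package main

import (
	"fmt"
	"sync"
)

// mergeBatches (hypothetical) starts one goroutine per source batch, sends
// each batch into a shared channel, and collects everything into one slice.
func mergeBatches(sources [][]int) []int {
	batches := make(chan []int)
	var wg sync.WaitGroup

	for _, src := range sources {
		wg.Add(1)
		go func(batch []int) {
			defer wg.Done()
			batches <- batch // any per-source transform would happen before this send
		}(src)
	}

	// Close the channel only after every producer has returned, so the
	// range loop below ends exactly when no more data can arrive.
	go func() {
		wg.Wait()
		close(batches)
	}()

	var merged []int
	for batch := range batches {
		merged = append(merged, batch...)
	}
	return merged
}

func main() {
	merged := mergeBatches([][]int{{1, 2}, {3}, {4, 5, 6}})
	fmt.Println(len(merged)) // prints 6
}
```

With this shape there is no polling interval and no reader counter to race against, since the range loop cannot finish before the channel is closed. Whether it is faster than the current loop would still need the benchmark mentioned earlier in the review.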

for _, p := range ps {
	if pcTo == nil {
		if p.D == nil {
			pcTo = NewAppendOnlyOnlyPointsPointCloud(len(cloudFuncs) * 640 * 800)
Contributor

what's the 640*800 for? should it be a const?

Member Author

@bhaney bhaney Oct 19, 2022

To answer your question, this code was written in a rush to fulfill fast pointcloud merging for a demo. 640*800 is the resolution of the cubeEye cameras, so this was a pre-allocation for the number of points generated by the system of cubeEye cameras
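
If the magic number were pulled out into a constant as suggested, it might look something like this sketch. The names `cubeEyePointsPerFrame` and `preallocSize` are hypothetical and do not appear in the PR; only the 640*800 figure and its meaning come from the comment above.

```go
package main

import "fmt"

// cubeEyePointsPerFrame is a hypothetical name for the 640*800 resolution of
// the cubeEye cameras, i.e. the number of points one camera frame produces.
const cubeEyePointsPerFrame = 640 * 800

// preallocSize (hypothetical) returns the capacity to reserve for a merged
// point cloud built from numSources cubeEye-sized sources.
func preallocSize(numSources int) int {
	return numSources * cubeEyePointsPerFrame
}

func main() {
	fmt.Println(preallocSize(2)) // prints 1024000
}
```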

@bhaney bhaney merged commit 73b2f3d into viamrobotics:main Oct 19, 2022
@bhaney bhaney deleted the data-565-flaky-join-pointcloud branch October 19, 2022 18:26

2 participants