Semantic Segmentation messages #71
Changes from 4 commits
7edefb5
bb9dadc
87f2eae
f26fb65
ccddf62
a5d7352
ad6e80e
4ecebca
46f6a78
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
# A key value pair that maps an integer class_id to a string class label. | ||
# The class_id should be interpreted as the pixel value corresponding to a | ||
# given class name in a segmentation mask | ||
|
||
# Integer value corresponding to the value of pixels belonging | ||
# to a given class in a segmentation mask | ||
uint16 class_id | ||
Why is this a uint16 when the [...] Limiting the number of classes to 255 is probably bad, but doubling the message size for the segmentation image by making it a uint16 is equally bad. Anyhow, the two should be consistent.

On second thought, it's probably better to leave it as uint16. If we use a [...]

Can't speak to that exactly without his comment, but there can easily be more than 255 class values in segmentation algorithms.

My bad, they must be consistent. I will use [...]
||
|
||
# Label corresponding to the class_id | ||
string class_name |
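The consistency concern raised in the review (a `class_id` that the mask's element type cannot represent) can be sketched as a range check. This is a minimal illustration only: the `SemanticClass` dataclass below is a hypothetical stand-in for the generated message class, and `build_lookup` and the example labels are made up for this sketch.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the SemanticClass message (not the generated ROS class).
@dataclass(frozen=True)
class SemanticClass:
    class_id: int    # pixel value used for this class in the mask
    class_name: str  # label corresponding to class_id

UINT8_MAX = 255
UINT16_MAX = 65535

def build_lookup(class_map, max_id):
    """Return {class_id: class_name}, rejecting ids the mask element type cannot hold."""
    lookup = {}
    for entry in class_map:
        if not 0 <= entry.class_id <= max_id:
            raise ValueError(f"class_id {entry.class_id} is outside 0..{max_id}")
        if entry.class_id in lookup:
            raise ValueError(f"duplicate class_id {entry.class_id}")
        lookup[entry.class_id] = entry.class_name
    return lookup

# Illustrative labels only; id 300 fits a uint16 mask but not a uint8 one.
class_map = [
    SemanticClass(0, "background"),
    SemanticClass(1, "road"),
    SemanticClass(300, "sign"),
]
lookup16 = build_lookup(class_map, UINT16_MAX)
```

Calling `build_lookup(class_map, UINT8_MAX)` on the same list raises `ValueError`, which is the mismatch the reviewers want to rule out by keeping the two types consistent.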
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
# A message to contain the output of a Semantic Segmentation process | ||
|
||
std_msgs/Header header | ||
|
||
uint32 height # mask image height, that is, number of rows | ||
uint32 width # mask image width, that is, number of columns | ||
|
||
# the bytes of the single-channel image made by | ||
# the segmentation mask | ||
uint8[] data # | ||
|
||
# the confidence of the inference of each pixel | ||
# between 0-100% | ||
uint8[] confidence | ||
Why not use the full range 0-255 here?

That's not standard for the output of AI libraries, they usually give out a probability value. It would be a bit unnatural to convert it to 0-255 for use (e.g. I want to use predictions that are at least 80% confident). I'd argue this should actually be a float (?) from 0-1.

It could be a float, but does it really make a difference to have a confidence of, say, 60% (uint8) or 60.5% (float), knowing that this would increase the space used by this array by a factor of 4?

Especially on an array field that is expected to contain a lot of elements. On a 640x360 mask, changing this to float32 would mean an increase of 640x360x3 bytes, almost 700 kB per message, relative to uint8.

You should make sure to explain these are uints not floats, otherwise people will just put their [...]
||
|
||
# an array of SemanticClasses specifying which integer value in the | ||
# segmentation mask corresponds to each semantic class | ||
vision_msgs/SemanticClass[] class_map | ||
|
||
# the threshold value used in the segmentation model | ||
float32 threshold |
There's trailing whitespace in this line which makes the tests fail. Also both files are missing a newline at the end of the file.
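For the confidence thread on `uint8[] confidence`, the size arithmetic and the probability-to-percentage conversion can be sketched as follows. This assumes the field stores an integer percentage in 0-100, as the message comment states; the helper names `to_percent_uint8` and `confidence_bytes` are hypothetical and not part of vision_msgs.

```python
def to_percent_uint8(probability):
    """Quantize a probability in [0, 1] to an integer percentage in [0, 100]."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability {probability} is outside [0, 1]")
    return int(round(probability * 100))

def confidence_bytes(width, height, bytes_per_element):
    """Size in bytes of a per-pixel confidence array."""
    return width * height * bytes_per_element

WIDTH, HEIGHT = 640, 360  # mask size used in the review discussion

uint8_size = confidence_bytes(WIDTH, HEIGHT, 1)    # one byte per pixel
float32_size = confidence_bytes(WIDTH, HEIGHT, 4)  # four bytes per pixel
extra_bytes = float32_size - uint8_size            # 640 * 360 * 3
```

With these numbers the float32 variant costs 691200 extra bytes per message, which matches the "almost 700 kB" estimate in the thread, and a consumer that wants predictions at least 80% confident would compare against `to_percent_uint8(0.8)`.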