Skip to content
This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Node name in user data #276

Merged
merged 2 commits into from
Mar 30, 2018
Merged

Node name in user data #276

merged 2 commits into from
Mar 30, 2018

Conversation

dirk-thomas
Copy link
Member

Connect to ros2/ros2#438.

@dirk-thomas dirk-thomas added the in review Waiting for review (Kanban column) label Jan 19, 2018
@dirk-thomas dirk-thomas self-assigned this Jan 19, 2018
@dirk-thomas dirk-thomas force-pushed the node_name_in_user_data branch 4 times, most recently from 88b6595 to 321cfcd Compare January 19, 2018 21:34
Copy link
Member

@mikaelarguedas mikaelarguedas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm except for one comment about hardcoded values

RMW_SET_ERROR_MSG("failed to resize participant user_data");
return NULL;
}
const char * prefix = "name=";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would reduce the number of hardcoded values if we defined this prefix above and use it's length everywhere below (rather than hardcoding 5s and 6s)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 221769e.

if (!name || dds_ret != DDS_RETCODE_OK) {
name = "(no name)";
std::string name;
if (dds_ret == DDS_RETCODE_OK) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: enums and litterals on left side of operators

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 1164cea.

@dirk-thomas dirk-thomas force-pushed the node_name_in_user_data branch 2 times, most recently from 93e4877 to 1164cea Compare January 26, 2018 20:25
@@ -53,6 +53,23 @@ create_node(
// This String_dup is not matched with a String_free because DDS appears to
// free this automatically.
participant_qos.participant_name.name = DDS::String_dup(name);
// since the participant name is not part of the DDS spec
// the node name is also set in the user_data
DDS_Long name_length = strlen(name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generates a warning on Windows

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@dirk-thomas dirk-thomas force-pushed the node_name_in_user_data branch 2 times, most recently from e523a82 to c800708 Compare January 31, 2018 16:44
Copy link
Member

@dhood dhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you testing this with opensplice 6.7 or a different version? When I tested the behaviour with ros2 node list cross-vendor (to verify ros2/ros2cli#72 is resolved), it didn't work. A connext daemon will detect an opensplice talker node now, but still has issues with whatever other participants opensplice spawns.

}
if (name.empty()) {
// use participant name if no name was found in the user data
name = pbtd.participant_name.name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The participant name might be null (that seems to be the case for whatever additional participants opensplice is spawning). If we just store the participant name without checking, this causes a segfault at the if (name.empty()) check on the std::string below

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. While the participant name is a std::string in other implementations it is a char * in Connext. I checked for NULL in ae30090 before assigning it to a std::string.

if (name.empty()) {
// ignore discovered participants without a name
node_names->data[i] = nullptr;
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think continueing here is a good idea, higher layers are probably assuming that the node_names map will return valid strings (nothing in the rcl_get_node_names documentation suggests otherwise). FWICT rclpy, for example, is assuming null-terminated strings are stored at each index: https://github.com/ros2/rclpy/blob/fbd80a27342d56b32047ecb7f2ad2e44e9d027b3/rclpy/src/rclpy/_rclpy.c#L2323

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line you refenced should be fine if NULL is passed in as far as I understand the API docs:

If u is NULL, this function behaves like PyUnicode_FromUnicode() with the buffer set to NULL. This usage is deprecated in favor of PyUnicode_New().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you're referencing the documentation for PyUnicode_FromStringAndSize, PyUnicode_FromString says:

Create a Unicode object from a UTF-8 encoded null-terminated char buffer u.

The error (when it doesn't segfault) is: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 0: invalid continuation byte

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I stopped at the first match and didn't check the exact function name 😞

@dirk-thomas dirk-thomas force-pushed the node_name_in_user_data branch from c800708 to ae30090 Compare February 2, 2018 17:01
@dirk-thomas
Copy link
Member Author

Are you testing this with opensplice 6.7 or a different version?

Yes, with 6.7.

When I tested the behaviour with ros2 node list cross-vendor (to verify ros2/ros2cli#72 is resolved), it didn't work. A connext daemon will detect an opensplice talker node now, but still has issues with whatever other participants opensplice spawns.

Can you describe what exact issues you are seeing. Maybe with a sequence of commands you are running in order to reproduce it.

@dhood
Copy link
Member

dhood commented Feb 2, 2018

The specific issue is the segfault that I mentioned in the inline comments.

The sequence is the same as in the initial issue:

RMW_IMPLEMENTATION=rmw_opensplice_cpp ros2 run demo_nodes_cpp talker
$ RMW_IMPLEMENTATION=rmw_connext_cpp ros2 node list
RTI Data Distribution Service EVAL License issued to OSRF dhood@osrfoundation.org For non-production use only.
Expires on 05-Mar-2018 See www.rti.com for more information.
Segmentation fault (core dumped)

@dhood
Copy link
Member

dhood commented Feb 2, 2018

What output are you seeing/how are you testing the cross-vendor behaviour? Does ros2 node list have blank lines still? Maybe it's using an older version of the daemon instead of a "fresh" one?

@dirk-thomas
Copy link
Member Author

I was able to reproduce the problem with the OpenSplice talker and the Connext node list. I created PRs to document the behavior of the rcl API as well as fix the usage in rclcpp and rclpy.

Two considered alternatives:

  • Each rmw implementation could implement rmw_get_node_names() with a two-pass: the first iteration only checks how many non-null names exist, then it allocates the array with the exact size, in the second iteration it populates the array with all non-null names.
  • If the rcutils_string_array_t struct would have a capacity member storing the number of array items which actually contain data (beside the existing allocated size) null node names could be skipped in each rmw implementation already in a single pass without the need for reallocating the array.

@dirk-thomas dirk-thomas force-pushed the node_name_in_user_data branch from ae30090 to 74e28fb Compare March 1, 2018 17:09
@dirk-thomas dirk-thomas merged commit a50c9b0 into master Mar 30, 2018
@dirk-thomas dirk-thomas deleted the node_name_in_user_data branch March 30, 2018 17:55
@dirk-thomas dirk-thomas removed the in review Waiting for review (Kanban column) label Mar 30, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants