Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add get_non_local_subscription_count to PublisherBase #2338

Open
wants to merge 2 commits into
base: rolling
Choose a base branch
from

Conversation

MiguelCompany
Copy link
Contributor

Adds a method in PublisherBase that returns the number of matched subscriptions that are not in the same context as the publisher.

It also makes inter-process publishing depend on this new method.

Depends on ros2/rcl#1111

Part of #2202

full repos file here

Signed-off-by: JLBuenoLopez-eProsima <joseluisbueno@eprosima.com>
Signed-off-by: JLBuenoLopez-eProsima <joseluisbueno@eprosima.com>
@MiguelCompany MiguelCompany changed the title Feature/count non local subscriptions Add get_non_local_subscription_count to PublisherBase Oct 20, 2023
@alsora
Copy link
Collaborator

alsora commented Oct 23, 2023

Can we add a couple of unit-tests?

@@ -133,6 +133,12 @@ class PublisherBase : public std::enable_shared_from_this<PublisherBase>
size_t
get_subscription_count() const;

/// Get non local subscription count
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this means that Not in the same context for the ROS 2 user (precisely for DDS-RTPS, GUID prefix is not the same), that would be nicer to add the description explains?

if (RCL_RET_OK != status) {
rclcpp::exceptions::throw_from_rcl_error(
status,
"failed to get get non local subscription count");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"failed to get get non local subscription count");
"failed to get non local subscription count");

@sloretz
Copy link
Contributor

sloretz commented Nov 2, 2023

@MiguelCompany Friendly ping. Would you be willing to address @fujitatomoya's feedback?

@mauropasse
Copy link
Collaborator

mauropasse commented Nov 23, 2023

I've been testing the new feature enabled by:

bool inter_process_publish_needed = get_non_local_subscription_count() > 0;

This not only fixes the "double delivery" issue (local subscription receiving intra & inter process message), but also reduces CPU at startup and destruction of subscriptions.

When creating a subscription, there is a short time gap in which the intra-process subscription has not yet been created, so messages come through inter-process (serialization/deserialization/etc) causing high CPU utillization.
The same happens when the subscription is destroyed, since first the intra-process subscription is destroyed, it'll receive messages through the DDS until the whole subscription is destroyed.

This is particularly noticeable for /parameter_events built in subscriptions: in a big system with many nodes using parameters, these subscriptions might receive thousands of inter-process messages at startup.

@mauropasse
Copy link
Collaborator

mauropasse commented Nov 24, 2023

Correction about my statement above:

This not only fixes the "double delivery" issue..

This PR doesn't fix the double delivery issue. It does avoid getting inter-process messages at creation/destruction of subscriptions.

Is there anything missing to avoid the double delivery issue @MiguelCompany ? Or is that a separate effort?

  • Process1: Sub_1 --> /topic <-- Pub
  • Process2: Sub_2 --> /topic
  • Issue: Sub_1 receives both inter and intra-process messages

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants