Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roscpp: Cannot have multiple latched publishers on the same topic (ros ticket #2356) #146

Closed
tfoote opened this issue Jan 31, 2013 · 18 comments
Labels
Milestone

Comments

@tfoote
Copy link
Member

tfoote commented Jan 31, 2013

In rosplay, to actually mirror the appearance of multiple nodes, I need to create multiple latched publishers on the same topic.

The expected behavior would be for every advertiser to publish its own most recent message when a new subscriber shows up, which is what independent nodes would do. However, only a single message gets published.

trac data:

@tfoote
Copy link
Member Author

tfoote commented Jan 31, 2013

[jfaust] This isn't how we spec'ed out latched publishers to work, and it's not completely trivial to change. Not going to happen for 1.0.

@tfoote
Copy link
Member Author

tfoote commented Jan 31, 2013

[jfaust] Ken, I assume this wouldn't work in rospy either?

@tfoote
Copy link
Member Author

tfoote commented Jan 31, 2013

[kwc] Right, the latch is stored with the publisher singleton, not the publisher instance

@tfoote
Copy link
Member Author

tfoote commented Jan 31, 2013

[leibs] Pushing is fine. I don't think it's a very common use-case given that latched recording wasn't working at all previously and nobody was complaining.

Is there a reason that latched publishers were spec'ed to be per-topic rather than per-advertiser?

If I have multiple advertisers from within the same node how do their subscription callbacks interact? Does each get its own subscription callback at least?

@tfoote
Copy link
Member Author

tfoote commented Jan 31, 2013

[jfaust] subscription callbacks work per advertiser, but latched topics are really meant more for topics that only a single publisher is publishing on (like the map).

@tfoote
Copy link
Member Author

tfoote commented Jan 31, 2013

[tfoote] With people using nodelets more this can start to produce unexpected errors, and it would enable libraries from using latched topics. In particular we are looking at libraries which used latched topics for bulk updates with a parallel channel with delta updates.

@tfoote
Copy link
Member Author

tfoote commented Jan 31, 2013

[jbohren] I also encountered this "feature" when I was trying to use latched publishers in smach last year.

@dirk-thomas dirk-thomas removed the major label Mar 3, 2016
@wjwwood
Copy link
Member

wjwwood commented May 17, 2016

This came up again in the context of recording /tf_static.

One solution would be to have rosbag advertise as a non-latched topic but keep the latest message for each publisher, watch for new subscribers to come up and when they do publish the latest message from each publisher. The upside is that this approach would ensure that all subscribers would receive the latest message from each publisher, but the downside is that existing subscribers will get repeated messages when new subscribers come up.

Short of refactoring how publishers and latching works I don't think we can fix this...

@mkhansenbot
Copy link

Any chance this will be fixed in an upcoming release?

@dirk-thomas
Copy link
Member

dirk-thomas commented Nov 17, 2016

This ticket has the milestone "untargeted" assigned which says the following in its description:

It is unlikely that the maintainers will have time to address these issues. Please provide pull requests if you want these issues to be addressed.

mdhorn added a commit to mdhorn/realsense that referenced this issue Dec 17, 2016
Due to a ROS bug which prevents publishing more than one static
transform in separate processes, enable the use of dynamic
camera transforms when needed for multiple camera support.

Static transforms are still the default.

See:
  ros/ros_comm#146
  ros/geometry2#181
mdhorn added a commit to IntelRealSense/realsense-ros that referenced this issue Dec 17, 2016
Due to a ROS bug which prevents publishing more than one static
transform in separate processes, enable the use of dynamic
camera transforms when needed for multiple camera support.

Static transforms are still the default.

See:
  ros/ros_comm#146
  ros/geometry2#181
racko added a commit to racko/ros_comm that referenced this issue Feb 26, 2017
- Starts one child-process per latched topic callerid to publish the
  messages of this sender.
- Shutdown behavior is strange: First Ctrl+C kills child-processes,
  second kills rosbag play.
- Does not solve the issue of latched topics not being present in split
  bags after the first one.
@racko
Copy link
Contributor

racko commented Feb 26, 2017

I implemented a prototypical solution for this problem in the rosbag play context: #1004
What do you think about it? Is it worth pursuing?

@dirk-thomas
Copy link
Member

Please see my comment on the PR: #1004 (comment)

jonpol01 added a commit to jonpol01/realsense that referenced this issue Apr 17, 2017
* Don't ignore linker flags set by user

Yocto's bitbake sets global LDFLAGS that include the option
--hash-style=gnu. If this option is ignored then bitbake's
QA checks report that

QA Issue: No GNU_HASH in the elf binary: '/[...]/opt/ros/indigo/lib/librealsense_camera_nodelet.so' [ldflags]

The patch includes user defined linker flags to
CMAKE_SHARED_LINKER_FLAGS defined in the project's CMakeList.txt.

Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>

* Make system wrapper function generic

Error message was assume only use was dynamic reconfigure

Added boost as ROS dependency -- should have already been required.

* Added imu_start_ts for imu sync

* Added frame callbacks instead of rs_wait_for_frames

* Added imu_mutex for data sync to callback

* Refactored setStreams

* Removed prepareTopics thread; replaced with setDepthEnable

* Changed depth_scale_meters to local var

* Renamed getStreamData to setImageData

* Change color stream default to 30fps

Improve reliability of the camera by reducing the color stream
to 30fps by default. This will be the highest frame rate which will
be validate with VGA or higher resolutions for use with ROS.

Default files will no longer use the librealsense preset modes
as they will be depreciated in the future.

* Fixed boost mutex error when destroying ZR300 nodelet

Replaced boost::mutex with std::mutex

* Added destructor to ZR300 nodelet to join thread

* Fix F200/SR300 tests for new default 30fps

The F200 and SR300 tests were attempting too much code reuse
by include the r200_nodelet_default.launch file.
When the Default launch files were changed to set 30fps as the new
default for the color stream, F200/SR300 also had to set new defaults
for the Depth stream to prevent using the invalid R200 depth default.
The R200 default file does NOT pass the depth_width and depth_height
to the nodelet so the override values were lost.

* Add Dynamic Transforms support -- multi-cam

Due to a ROS bug which prevents publishing more than one static
transform in separate processes, enable the use of dynamic
camera transforms when needed for multiple camera support.

Static transforms are still the default.

See:
  ros/ros_comm#146
  ros/geometry2#181

* Code Review Fixes

* Enable ROS Lint

Many minor fixes to be compliant with ROS Lint standards.

* C++11 Flagged ROS Lint items

Suppressing these for now, but should be investigated as
they are not compliant with current ROS standards.

* Update Change log for release

* 1.7.0

* Update RGBD launch files.

Modify the r200/sr300 RGBD-style launch files
to no longer use the RGBD depth/ir/rgb_processing
flags to control whether the corresponding stream
is also enabled. For example, now turning off
RGBD depth *processing* will no longer disable
the depth stream itself.

* Do not register IR callback if stream disabled

Work around bug in librealsense where the callback for
IR is called even if the stream is disabled if IR2 stream
is enabled.
See IntelRealSense/librealsense#393

* Start IMU in startCamera

* Fix Multi-cam example launch file

Leading '/' required to ensure both cameras use the same
nodelet manager due to group name space.

* Added try-catch for start and stop device

* Fixed spacing before comment - roslint failure

* Disable Fisheye and IMU on ZR300 for RGBD launch

* Fixed roslint tabs errors

* Allow Disabling of IMU after PR#175

Refactoring PR IntelRealSense#175 broke the logic for disabling IMU

* Updated Change Log for 1.7.0

* Added retry in tests to avoid random failures

* Also exclude swap files from Git

* Use shared timestamp for SR300/F200 cameras

Due to camera hardware issues on the SR300 and F200, reverted back
to using a common timestamp updated by the fastest stream. This was the
old behaviour when rs_wait_for_frames was used prior to release 1.7.0.

* Generate Warning for non-validated camera firmware

Compare the camera's firmware version to the validated firmware version. Log
a warn message if the two firmware versions do not match. If the camera is
ZR300, also check the adapter firmware version and the motion module firmware
version.

* Updated Change Log for 1.7.1

* 1.7.1

* Git ignore QT Creator *.workspace files

* Upload 18 bat tests

Upload first version of stable bat tests, there are 2 common tests
to check librealsense and realsense_camera installed, and there are
4 specific tests of each camera (R200/F200/SR300/ZR300) to check
camera info matrix, enabled tf, set power off/on and force power
off/on, totally 18 tests.

All the tests are based on rospy, using python script as test type.

* Changed fisheye_strobe and fisheye_external_trigger to static params

* Change tf to using setRPY for consistency

* Create tool to get debug info

Create a node tool that, when run, will display the user's
operating system, kernel, ROS version, RealSense version, librealsense
version, and the type and firmware of any detected cameras.

* Fix minor spelling errors

* Updated the maintainers

* Update change log

* 1.7.2

* Added SyncNodelet class

* Updated F200 to be based on SyncNodelet

* Updated SR300 to be based on SyncNodelet

* Fixed roslint errors

* Removed getTimestamp override from F200 and SR300

* Updated r200 to be based on SyncNodelet

* Added loop checking of start/stop camera

* Update Debug Tool

Update debug tool to show motion module and adapter firmwares for the
ZR300 camera.

Update the validated firmware versions.

* Added check for depth_enable dynamic change

* Dynamic reconfigure of SR300 exposure controls

* Updated Copyright to 2017

* Added SyncNodelet destructor

* Added blank lines for readability after closing braces

* Fixed LR auto exposure issue IntelRealSense#131

LR gain should be set only when LR auto exposure is disabled

* Add option to link against non-catkin librealsense

In OpenEmbedded setups where both meta-ros and
meta-intel-realsense layers are used it's problematic
to avoid two copies of librealsense installed onto
an embedded target: one catkit-based installed under
/opt/ros and one other installed to a standard system
location.

This patch adds an option making realsense_camera link
against librealsense provided by the host system.

By default the currently existing behavior is preserved.

Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>

* Enable roslint when CATKIN_ENABLE_TESTING is True

roslint depends on catkin_run_tests_target() which is defined only
when CATKIN_ENABLE_TESTING is True.

Thus run roslint functions in CMakeLists.txt only when testing
is enabled.

Signed-off-by: Dmitry Rozhkov <dmitry.rozhkov@linux.intel.com>

* Exposed autoexposure toggle and exposure control for F200, R200. Updated copyright data on nodelets

* Fix build timing issue for catkin

* Make ZR300 consistent with other cameras

Recent color auto exposure code changes are slightly different
for SR300/F200/R200 from ZR300. Mainly the ZR300 needed the same
if condition added for only setting the manual value when auto
exposure is disabled.

* Updated changelog

* 1.8.0

* Enable configuration of the TF publication rate when using tf_dynamic

The parameter is called tf_publication_rate

* 1.8.0 -- Updated

* librealsense: Fix link when the system library is used

Otherwise librealsense won't be listed as a dependency for the nodelet
that will fail to load due to undefined symbols.

Signed-off-by: Murilo Belluzzo <murilo.belluzzo@intel.com>

* Add RGDB launch file for the ZR300
@BenjaminPelletier
Copy link

For other people running into this issue, there's a pretty easy work-around with rospy to the poor behavior in this bug. Instead of using a rospy.Publisher, use a LatchPublisher as shown below. One gotcha is that, because of the underlying Publisher singleton, the queue_size must be large enough for all Publishers on the same topic, and each instantiated Publisher/LatchPublisher overwrites the queue_size specified for the previously-instantiated Publisher/LatchPublisher.

class LatchPublisher(rospy.Publisher, rospy.SubscribeListener):
    def __init__(self, name, data_class, tcp_nodelay=False, headers=None, queue_size=None):
        super(LatchPublisher, self).__init__(name, data_class=data_class, tcp_nodelay=tcp_nodelay, headers=headers, queue_size=queue_size, subscriber_listener=self, latch=False)
        self.message = None

    def publish(self, msg):
        self.message = msg
        super(LatchPublisher, self).publish(msg)

    def peer_subscribe(self, resolved_name, publish, publish_single):
        if self.message is not None:
            publish_single(self.message)

racko added a commit to racko/ros_comm that referenced this issue Aug 13, 2017
- Starts one child-process per latched topic callerid to publish the
  messages of this sender.
- Shutdown behavior is strange: First Ctrl+C kills child-processes,
  second kills rosbag play.
- Does not solve the issue of latched topics not being present in split
  bags after the first one.
@NikolausDemmel
Copy link
Contributor

This came up again in the context of recording /tf_static.

One solution would be to have rosbag advertise as a non-latched topic but keep the latest message for each publisher, watch for new subscribers to come up and when they do publish the latest message from each publisher. The upside is that this approach would ensure that all subscribers would receive the latest message from each publisher, but the downside is that existing subscribers will get repeated messages when new subscribers come up.

Coming here (again) from https://discourse.ros.org/t/request-for-comments-static-transform-mux/5907. I guess for all practical purposes the mux would actually work well enough. However, this solution suggested by @wjwwood to modify rosbag play for special handling of latchted topics also seems relatively simple and it sounds like it would also work well enough and maybe more seamlessly than with the mux. The mentioned "downside" could maybe be avoided by only sending out the cached messages to the new subscriber, not all. Not sure how easy the is to do with the existing roscpp API.

One thing to keep in mind is that AFAIR most, if not all, the tools that modify rosbags (rosbag filter, fix, ...) currently loose the connection info and also the "latched" flag.

@meyerj
Copy link
Contributor

meyerj commented Oct 26, 2018

Missing messages when using tf_static or replaying bag files with latched topics are only symptoms of the underlying problem, namely that the current design of latching publishers is broken whenever modules using roscpp are linked together and run in the same process. This might not have been the case yet when this issue was opened in 2013, but nodelets, Gazebo, move_base planners or any other kind of plugins are examples where the author cannot know in advance in which process his code will finally run and using latched publishers or static transform publishers in those contexts must be considered as undefined behavior. Are there examples where it is actually desirable that only the message published last is stored and forwarded to new subscribers? If the two publishers would live in separate processes, subscribers also receive both messages.

Would a patch to roscpp be acceptable that moves the storage of the latched SerializedMessage instance from the Publication singleton to the individual Publisher instances? The publishers could handle the case when a new subscriber connects in a wrapped subscriber connect callback that first enqueues the cached message and then calls the actual user-provided callback. Or a separate callback that the publisher can register and that is called by the publication for each new subscriber link without using a callback queue. As a result, if there are multiple publishers of a latched topic in a single process, connecting subscribers will receive one message per publisher instead of only the last one - given that their queue size is large enough. Unfortunately such a patch would probably break the roscpp ABI, but because of the significant change in behavior it can only be introduced in a future release anyway.

A limitation of this solution is that either all or no topics in a process must be advertised as latching. In other cases the creation of an additional publisher should fail explicitly.

Once this design flaw is solved, it should be straight-forward to fix rosbag play by instantiating one publisher per connection. I did not check, maybe this is already the case. However, when recording and replaying topics that are published by a single node with multiple latching publishers, rosbag play can still not know which message originated from which publisher and hence not fully mimic the situation at the time of recording.

I did not think about the solution in all details and maybe I am missing something else here. I just would like to confirm that such a behavioral change would be desirable or not, before spending more time.

@dirk-thomas
Copy link
Member

Would a patch to roscpp be acceptable ...

In general the answer is yes.

because of the significant change in behavior it can only be introduced in a future release anyway.

Agreed.

A limitation of this solution is that either all or no topics in a process must be advertised as latching. In other cases the creation of an additional publisher should fail explicitly.

I don't think that is a viable solution. It kind of "fails" in a similar but different way as currently multiple latched publishers are not doing what you would expect.

@mgrrx
Copy link
Contributor

mgrrx commented Oct 29, 2018

I recently started to work on allowing multiple latched publishers in Python. Due to a lack of time I stopped working on it but maybe it helps to continue the discussion here: https://github.com/magazino/ros_comm/commits/multiple-latched-publishers . This of course completely changes the behavior of existing code.
I checked the code and a similar approach seemed feasible for roscpp.

@meyerj
Copy link
Contributor

meyerj commented Nov 22, 2018

I submitted a pull request with a patch following the proposal in #146 (comment) along the same line of what @mgrrx proposed for Python: #1544

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

9 participants