
Enable image_tools tests with synthetic image #83

Merged (5 commits, Mar 24, 2017)

Conversation


@dhood dhood commented Oct 26, 2016

Connects to #43.
TODO: test different quality of service configurations.

@dhood dhood added the in progress Actively being worked on (Kanban column) label Oct 26, 2016
@dhood dhood self-assigned this Oct 26, 2016

dhood commented Oct 29, 2016

Update: different reliability QoS settings are now tested.

TODO: tests are not running properly on Windows: http://ci.ros2.org/job/ci_windows/1740/testReport/(root)/test_showimage_cam2image__rmw_fastrtps_cpp_Release/test_reliable_qos/


dhood commented Feb 21, 2017

Update:

  1. Runs on Windows now. All that was needed was to add the OpenCV bin directory to the PATH of the buildfarm machines.
  2. There's still some bad performance with FastRTPS.

It takes a long time for an image to be received when using best effort publishing and subscribing with FastRTPS; see this test output. This does not happen for reliable pub+sub. Interestingly, it also does not happen for a reliable publisher and a best effort subscriber, where the resulting connection should be best effort.

It seems that the best effort publisher might be what's causing the performance problems. There might be something in the reliable publisher that makes it easier for the subscriber (best effort or reliable) to get the message; I am not sure why at the moment.
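The request-offered matching behind these reliability combinations can be sketched in a few lines (a simplification of the DDS reliability rules, not the rmw API):

```python
# Simplified sketch of DDS request-offered reliability matching (not the
# rmw API): a publisher's offered reliability must be at least what the
# subscriber requests, and delivery is only as strong as the weaker side.
BEST_EFFORT, RELIABLE = 0, 1

def match(pub_reliability, sub_reliability):
    """Return the effective delivery reliability, or None if incompatible."""
    if pub_reliability < sub_reliability:
        return None  # best effort publisher cannot satisfy a reliable subscriber
    return min(pub_reliability, sub_reliability)

# The combination discussed above: reliable publisher + best effort
# subscriber matches, and the resulting delivery is best effort.
assert match(RELIABLE, BEST_EFFORT) == BEST_EFFORT
assert match(RELIABLE, RELIABLE) == RELIABLE
assert match(BEST_EFFORT, BEST_EFFORT) == BEST_EFFORT
assert match(BEST_EFFORT, RELIABLE) is None
```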

If others agree, I will focus on getting this merged just to confirm that these demos are runnable, and we can investigate the behaviour issues as a separate task. I'll reduce the publishing rate of the images so that the timeout doesn't have to be so high to accommodate FastRTPS (it's currently 60s).


dhood commented Feb 21, 2017

Another update: only reliable is working on OS X. With a best effort publisher and/or subscriber, no message is shown as received. I don't remember this being an issue when I first started working on this PR.

The changes referenced in ros2/rmw_fastrtps#81 (comment) might be responsible (I don't know that we have any other tests for best effort communication + fragmentation, so that might be why we didn't notice this before.)

@mikaelarguedas

That sounds good to me. +1 for getting this merged without fixing the FastRTPS throughput issue in best effort.

Rationale:
The best effort throughput issue on FastRTPS may be out of the scope of this PR, and I think several issues and discussions are reporting the same problem. Edit: like the one referenced in the previous comment.

(I don't know that we have any other tests for best effort communication + fragmentation, so that might be why we didn't notice this before.)

+1. I think this is the only demo actually testing best effort on big messages. To my knowledge, the only other tests with big messages are in test_communication, where we established only reliable communication.


dhood commented Feb 28, 2017

So, the reason it wasn't working on OS X with best effort is packet collisions. Enabling the throughput controller for FastRTPS means that the fragments of the image aren't all sent at the same time, so the collisions are less likely.

With a FastRTPS throughput controller of 30,000 bytes/10ms and a heartbeat period of 10ms, all platforms and all reliability combinations can receive an image using the default settings of 320x240 @ 30fps in less than a second:
http://ci.ros2.org/job/ci_osx/1740/testReport/(root)/test_showimage_cam2image__rmw_fastrtps_cpp_/
http://ci.ros2.org/job/ci_windows/2266/testReport/(root)/test_showimage_cam2image__rmw_fastrtps_cpp_Release/
http://ci.ros2.org/job/ci_linux/2267/testReport/(root)/test_showimage_cam2image__rmw_fastrtps_cpp_/
(these jobs were run with a branch of rmw_fastrtps enabling the throughput controller and a 10ms heartbeat period)
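As a rough sanity check on those numbers, assuming a raw 320x240 BGR8 frame (230,400 bytes, an assumption) and ignoring RTPS/UDP overhead:

```python
import math

image_bytes = 320 * 240 * 3        # raw BGR8 frame: 230,400 bytes (assumption)
window_bytes = 30_000              # throughput controller budget...
window_ms = 10                     # ...per 10 ms window

windows = math.ceil(image_bytes / window_bytes)  # send windows per frame
drain_ms = windows * window_ms                   # time to pace out one frame
print(windows, drain_ms)  # 8 80

# The controller caps throughput at 3 MB/s, below the ~6.9 MB/s a raw
# 30 fps stream would need. That is fine for these tests, which only need
# one image to arrive, and it can do so well inside a second.
budget_Bps = window_bytes * 1000 // window_ms    # 3,000,000 B/s
demand_Bps = image_bytes * 30                    # 6,912,000 B/s
```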

I have opened ros2/rmw_fastrtps#92 to track the discussion on what to do with this moving forward. For now, now that the reason these tests were having different behaviour on different platforms is clearer, I will only enable the reliable tests for this demo.


dhood commented Feb 28, 2017

now that the reason these tests were having different behaviour on different platforms is clearer

As @mikaelarguedas pointed out, this isn't 100% accurate: it's still not clear why the behaviour depends on the operating system, just that the throughput controller is a fix.

To be explicit, without any changes to FastRTPS (i.e. no throughput controller, default heartbeat period), if the best effort tests are enabled:
Performance is fine on Linux: 0.6s to receive a best effort image.
Performance is poor on Windows: 28s to receive a best effort image.
Performance is bad on OS X: no best effort image received in 60s.


CI for just this branch, with best effort tests disabled so as not to be dependent on any changes to FastRTPS:
Windows: Build Status
OS X: Build Status (existing test failure)
Linux: Build Status


wjwwood commented Feb 28, 2017

How are you measuring that? Just the time the tests take to complete?


dhood commented Feb 28, 2017

Yes, with the test completion time. The tests will pass once any image has been received.

@mikaelarguedas

Yeah still wondering how there can be collision with a single sender o_O. maybe a magic ability to send simultaneously unicast and multicast packets 🦄 that end up colliding...

I think you'll want to rebase this before merging (or use the "rebase and merge" option) otherwise the tests marked as skipped won't be skipped anymore.


dhood commented Feb 28, 2017

I have edited one of my previous comments to clarify that I was linking to two different sets of CI jobs: one with the best effort tests enabled (top), and one with them disabled (bottom), which is the current state of this PR.

maybe a magic ability to send simultaneously unicast and multicast packets 🦄 that end up colliding...

Yeah, I was just speaking with @wjwwood about that. At least on my Linux machine, I can see both unicast and multicast packets being sent.

I think you'll want to rebase this before merging (or use the "rebase and merge" option) otherwise the tests marked as skipped won't be skipped anymore.

I am not sure that I understand. I do not want the "showimage_cam2image" tests to be skipped anymore. Are you speaking about another?


wjwwood commented Feb 28, 2017

It could also be that the kernel's udp buffer is getting overwhelmed by the bursty nature without the flow controller, but that's just conjecture.
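A toy illustration of that conjecture (the numbers are made up; this is not a measurement of FastRTPS or the kernel):

```python
# Toy model of the buffer-overwhelm conjecture above: fragments of one image
# arrive at a fixed-size receive buffer. Sent as one burst, the buffer
# overflows before the reader drains anything; paced by a flow controller,
# the reader keeps up between send windows. All sizes are invented.
def deliver(num_fragments, buffer_capacity, paced):
    buffer = []
    received = 0
    for _ in range(num_fragments):
        if len(buffer) >= buffer_capacity:
            continue  # buffer full: the datagram is silently dropped
        buffer.append(1)
        if paced and len(buffer) == buffer_capacity:
            received += len(buffer)  # reader drains between send windows
            buffer.clear()
    return received + len(buffer)

print(deliver(64, 16, paced=False))  # 16: the rest of the burst is dropped
print(deliver(64, 16, paced=True))   # 64: pacing lets the reader keep up
```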

What's the evidence for the packet collision as described above? I didn't see anything in @dhood's post where she mentions that.


dhood commented Mar 1, 2017

What's the evidence for the packet collision as described above? I didn't see anything in @dhood's post where she mentions that.

I don't have evidence that it's packet collision; I shouldn't have jumped to conclusions on that one.

As an additional data point, I observed that running the demo on OS X at higher frequency (e.g. 100 Hz instead of 30 Hz) allows messages to be received by the best effort subscriber, even without any throughput controller....

@mikaelarguedas

I think you'll want to rebase this before merging (or use the "rebase and merge" option) otherwise the tests marked as skipped won't be skipped anymore.

I am not sure that I understand. I do not want the "showimage_cam2image" tests to be skipped anymore. Are you speaking about another?

Sorry, I should have phrased it differently; I just wanted to be sure you were aware that this was re-enabling all of them. Haven't tried any connext_dynamic code for a while so I just assumed it was still broken.


wjwwood commented Mar 1, 2017

As an additional data point, I observed that running the demo on OS X at higher frequency (e.g. 100 Hz instead of 30 Hz) allows messages to be received by the best effort subscriber, even without any throughput controller....

lol wut


dhood commented Mar 1, 2017

Haven't tried any connext_dynamic code for a while so I just assumed it was still broken.

I think that the decision we made once upon a time was that we only disable tests for "on pause" rmw implementations (e.g. https://github.com/ros2/system_tests/blob/3005b6df5261098bc6a51567b57f1022f7ef8830/test_communication/CMakeLists.txt#L60) if they are being skipped because of known implementation-specific bugs. I don't think we skip tests for them otherwise, so enabling for all rmw implementations is OK as I see it.


dhood commented Mar 1, 2017

Haven't tried any connext_dynamic code for a while so I just assumed it was still broken.

OK, you're referring to the "COMMENDSrWriterService_write:!write. Reliable large data requires asynchronous writer." comment that's in the code. We use an asynchronous writer now in the shared Connext functions, so it should be fine to re-enable this even for connext_dynamic.

@mikaelarguedas

Sounds good to me 👍 thanks for clarifying.


dhood commented Mar 1, 2017

As an additional data point, I observed that running the demo on OS X at higher frequency (e.g. 100 Hz instead of 30 Hz) allows messages to be received by the best effort subscriber, even without any throughput controller....

lol wut

I've checked on the three OS X buildfarm machines and they all behave the same way: they won't receive images at 30 Hz, but will receive images at 100 Hz (eyeballing it, it seems to actually be at 100 Hz, too). It even improves further if you increase the frequency more...

Yeah still wondering how there can be collision with a single sender o_O. maybe a magic ability to send simultaneously unicast and multicast packets 🦄 that end up colliding...

Implementing the changes suggested here to only use multicast for subscribers seems to improve performance a bit (an image can be received every so often; it varies between runs), but there still seems to be something preventing images from being received in whole.

I've opened ros2/rmw_fastrtps#93 to ask the eProsima folks if they have any ideas.

@dhood dhood added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Mar 7, 2017

dhood commented Mar 7, 2017

I didn't realise that this was labelled as in progress and not in review. Could I get an explicit approval from someone? As discussed above, this PR is just to check that the executables are runnable; while it was useful for evaluating FastRTPS performance, its scope has been narrowed so as not to depend on any changes to rmw_fastrtps.

get_available_rmw_implementations(rmw_implementations)
foreach(rmw_implementation ${rmw_implementations})
  find_package("${rmw_implementation}" REQUIRED)
endforeach()
Member:

Is this really necessary?

Member Author:

The reason I thought it was necessary is that, based on your comment here, these lines were not removed from the tests mentioned for a reason that also applies here (we are making test executables for each rmw implementation).

Will the call_for_each_rmw_implementation macro work fine without doing find_package on each of the rmw implementations? I thought it must be necessary, but since that macro calls get_all_rmw_implementations itself, perhaps this block can be removed again.

Member:

I don't see where this is true (if we consider "executable" to be a compiled executable rather than an executable Python script):

we are making test executables for each rmw implementation

As far as I can see, the testing_targets macro doesn't contain any compiled targets. It only expands a Python template and registers that Python file as a nosetest, so there is nothing that uses the find_package information. For the referenced tests in system_tests that is different: the function there actually builds executables for each rmw implementation.
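The expand-and-register behaviour described here can be sketched roughly in Python (the template contents and the `launch_nodes` name are illustrative, not the actual demo code):

```python
from string import Template

# Illustrative stand-in for the CMake macro described above: it expands a
# Python test template once per rmw implementation and registers the result
# as a nosetest; no compiled target is involved, so the find_package data
# is never used. Template contents and names here are made up.
TEST_TEMPLATE = Template(
    "def test_showimage_cam2image():\n"
    "    launch_nodes(rmw_implementation='$rmw_implementation')\n"
)

def expand_tests(rmw_implementations):
    """Return {generated file name: expanded test source}."""
    return {
        "test_showimage_cam2image__%s.py" % rmw:
            TEST_TEMPLATE.substitute(rmw_implementation=rmw)
        for rmw in rmw_implementations
    }

tests = expand_tests(["rmw_fastrtps_cpp", "rmw_connext_cpp"])
print(sorted(tests))
```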

Member Author:

OK, I see now that the only info that's needed is the list of rmw implementations from get_all_rmw_implementations, not the packages themselves. I will remove this block, thanks.

@dhood dhood force-pushed the enable-burger-tests branch from 0e42da7 to 86c3d36 Compare March 8, 2017 16:49
@dhood
Copy link
Member Author

dhood commented Mar 8, 2017

Addressed @dirk-thomas' comment in 11cd2e3

Rebased and adapted for #117 in 86c3d36

Linux: Build Status
OS X: Build Status (standard test failure these days)
Windows: Build Status

@@ -0,0 +1,2 @@
Publishing data on topic 'image'
Publishing image #1
Member:

Can you give more context about why renaming this file was necessary? Asking because I thought that launch testing was explicitly looking for .regex files.

Member:

Sorry, missing a pointer here: based on this block, I would expect this cam2image.txt file never to be taken into account during the tests.

Member Author:

Launch testing can work with either .txt or .regex files; a .txt file has to match literally. cam2image.txt will be matched as a literal file because of the particular block that you referenced.
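The distinction can be illustrated with a small sketch (the function name is made up; this is not the launch_testing API):

```python
import re

# Sketch of the convention discussed above (function name invented, not the
# launch_testing API): an expected-output line from a .txt file must match
# the actual output exactly, while a line from a .regex file is matched as
# a regular expression.
def line_matches(expected_line, actual_line, suffix):
    if suffix == ".txt":
        return expected_line == actual_line  # literal comparison
    if suffix == ".regex":
        return re.match(expected_line, actual_line) is not None
    raise ValueError("unknown expected-output suffix: %s" % suffix)

print(line_matches("Publishing image #1", "Publishing image #1", ".txt"))       # True
print(line_matches(r"Publishing image #\d+", "Publishing image #7", ".regex"))  # True
print(line_matches(r"Publishing image #\d+", "Publishing image #1", ".txt"))    # False
```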

Member:

Oops, messed up my search for .txt 😕 sorry for the noise.


dhood commented Mar 24, 2017

I've run CI again on this since it's been a few weeks, and I'm going to merge this.
Linux: Build Status (linter failures on another package)
Windows: Build Status
OS X: Build Status

@dhood dhood merged commit 5e84afc into master Mar 24, 2017
@dhood dhood deleted the enable-burger-tests branch March 24, 2017 17:21
@dhood dhood removed the in review Waiting for review (Kanban column) label Mar 24, 2017