Tasks left running after DDS exits. #379
Comments
Attaching tmp dirs for both runs.
When running the above test repeatedly (
There is also one run of this test where all of the devices, and also some agents, keep running: The error in the log gives a bit more detail:
Attaching its logs: (it is not the one with |
Affected DDS versions: 3.5.10, 3.5.14
One of the FairMQ tests that is executed with DDS frequently leaves running devices behind. Possibly several tests are affected, but so far this has only been reproduced with this one.
The test is:
https://github.com/FairRootGroup/FairMQ/blob/master/test/sdk/_topology.cxx#L264-L277
It runs this topology:
https://github.com/FairRootGroup/FairMQ/blob/master/test/sdk/test_topo.xml
(one task + group of 5 tasks)
The test queues a state change operation via custom commands with a timeout of 1 ms (the value is intentionally set low to test the timeout function) and, when that expires, proceeds to exit.
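To illustrate the pattern only, here is a minimal Python sketch of an operation that is given a very short timeout and reports a timeout instead of a result. It is not FairMQ or DDS code: `slow_state_change`, `change_state_with_timeout`, and the state names are hypothetical stand-ins for the real state change operation.

```python
import concurrent.futures
import time


def slow_state_change(duration_s: float) -> str:
    """Hypothetical stand-in for a state change that takes duration_s seconds."""
    time.sleep(duration_s)
    return "READY"


def change_state_with_timeout(duration_s: float, timeout_s: float) -> str:
    """Return the new state, or "TIMEOUT" if it does not arrive within timeout_s."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_state_change, duration_s)
        try:
            return future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            # The caller gives up here, but the work is still in flight.
            # Leaving the "with" block waits for the worker to finish.
            return "TIMEOUT"
```

With a 50 ms operation and a 1 ms timeout, `change_state_with_timeout(0.05, 0.001)` takes the timeout branch while the operation itself is still running in the background, which is the situation the test exits from.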
Frequently, in about 50% of cases, one of the tasks is left over. Specifically, it seems to be the single task and not one of the group.
Examining the device logs and comparing them to a successful run, the leftover device keeps running at the point where it would normally receive the device shutdown request (signal 15). Sending SIGINT or SIGTERM to the leftover device leads to an immediate and proper exit, so nothing hangs on the device side.
Devices from the task group do the right thing in both failed and successful runs. (EDIT: I've now also seen devices from the group left over.)
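The manual SIGTERM check described above can be sketched in Python as follows. This is a generic POSIX example, not part of FairMQ or DDS; `terminate_and_wait` is a hypothetical helper name, and a plain sleeping child process stands in for a leftover device.

```python
import signal
import subprocess
import sys


def terminate_and_wait(proc: subprocess.Popen, timeout_s: float = 5.0) -> int:
    """Send SIGTERM and return the exit status.

    Raises subprocess.TimeoutExpired if the process does not exit in time,
    which would indicate a hang on the process side.
    """
    proc.send_signal(signal.SIGTERM)
    return proc.wait(timeout=timeout_s)


if __name__ == "__main__":
    # Stand-in for a leftover device: a child that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(terminate_and_wait(child))
```

On POSIX, `Popen` reports a process killed by a signal as the negative signal number. For a real leftover device that was not started from the script, the same check would have to use its PID, for example with `os.kill(pid, signal.SIGTERM)`.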
Examining DDS logs, one line stands out in the unsuccessful run: