Connection failure is not handled in the case of empty http_error #122
Comments
Thanks for pointing out the issue. I can take a look early next week. |
Hello, are there any updates on this? Thanks |
Sorry for the delay. I will take a look at it today and get back. |
@sarguez the main problem here seems to be that the IP address is wrong. Have you made sure the IP address is correctly set in the YAML file? Also,
The driver should not try to re-establish the connection if the HTTP error is empty. It should try to re-establish it only for the error strings checked in pf_lidar_ros_driver/src/pf_driver/src/pf/pfsdp_base.cpp, lines 51 to 52 in 16dd98c.
|
Hello,
In our case, for some reason we got an http_error that was not OK but also didn't include any of the strings you mentioned. Instead, it was just an empty string. Would you like to handle this case in that code block?
and then the device got stuck in this state after I restarted the node. Do you see any way to handle this case? |
I am not sure how re-connection will help in this case. Re-connection was meant for cases where the device is unplugged (but still on) or powered off. For re-connection to happen, the connection must have been established in the first place. In your case, the device did not even initialize, so I don't see the point of trying to reconnect. Maybe @ptruka has some input as to why the HTTP error is empty? |
Hello @sarguez,
|
Hello, Here is the info I got from the sensor.
|
Hello and thanks for the information. Everything is looking fine there.
|
Hello,
|
Thanks for the information. I will try to replicate the issue with the driver version you mentioned.
Edit: I just finished some tests on Ubuntu 20, ROS Noetic with the driver version you are using (commit 682a1fb) to see which HTTP error messages I can produce.
A) IP address not available
The IP address was set to an unused address in the YAML file. Result:
B) Wrong IP address
The IP address was set to the address of a random network device in the YAML file (in this case, my router). Result:
@sarguez If you have any advice on how I could replicate your setup, I would try to do so. Otherwise I'm a bit clueless as to how we could find the root cause here. |
Hello, Unfortunately I don't know a way to reliably reproduce the issue yet, but we encountered it again on one of our robots. Some background again:
This morning we realized that the top scanner was not publishing on one of our robots. I got a coredump of the r2000_node by sending SIGABRT to it. This is what I see in the coredumpctl gdb.
For help, type "help".
Thread 13 (Thread 0x7f881ffff700 (LWP 1857652)):
Thread 12 (Thread 0x7f882dd6c700 (LWP 1857412)):
Thread 11 (Thread 0x7f8816ffd700 (LWP 1857654)):
Thread 10 (Thread 0x7f88177fe700 (LWP 1858165)):
Thread 9 (Thread 0x7f882e56d700 (LWP 1857410)):
Thread 8 (Thread 0x7f881effd700 (LWP 1857655)):
Thread 7 (Thread 0x7f881d7fa700 (LWP 1857658)):
Thread 6 (Thread 0x7f881e7fc700 (LWP 1857656)):
Thread 5 (Thread 0x7f881f7fe700 (LWP 1857653)):
Thread 4 (Thread 0x7f882cd6a700 (LWP 1857651)):
Thread 3 (Thread 0x7f882ed6e700 (LWP 1857406)):
Thread 2 (Thread 0x7f882d56b700 (LWP 1857497)):
Thread 1 (Thread 0x7f882f183ac0 (LWP 1855320)):
|
Thanks for sharing the coredump. @ipa-vsp and I are trying to find some hints in there. 2 more points:
|
Hello,
I brought more data :D
State 1: Out of nowhere, we start receiving this. This seems to be a state on the lidar side, and it is probably the root cause of all of this. I don't know why the connection may be reset like this. Once we get into this state, this log is spammed endlessly.
During this state, we occasionally get the 'empty reply from server' as well.
State 2: The error changes to protocol error : 120 for no apparent reason.
State 3: The R2000 node starts restarting due to network failure. This goes on for a long time as well.
State 4: Then comes the weird empty http_error state.
|
Hello @sarguez, Could you please provide us with some additional information to help us better understand the issue you're facing? Specifically, could you share the launch file you're using and the r2000_params.yaml files for all the devices involved? Additionally, it would be helpful if you could record a rosbag while the issue occurs. Thank you! |
Hi @sarguez,
Other points
|
Yes, I don't think the network load is an issue. Sure, let's do more collaboration. Ideally, we can maybe provide you with access to the robot directly for debugging. Unfortunately I don't think I can share things like the actual launch files, bag files, logs, etc. here on GitHub. |
Describe the bug
Hello,
We encountered this during start-up for some unknown reason.
After the bringup restart, the node started spamming:
If I checked it correctly, it seems that the node is not trying to re-establish the connection when it receives an empty http_error.
handle_connection_failure terminates the connection and tries to restart it.
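To make the idea concrete, here is a minimal sketch of the kind of check I have in mind. It is not the driver's actual code: `needs_reconnect` is a hypothetical helper, and it assumes that a successful request is reported as the literal string "OK". Any other value, including an empty string, is treated as a failure.

```cpp
#include <string>

// Hypothetical helper, not part of the driver.
// Returns true when the connection should be torn down and re-established.
// Assumption: a successful request is reported as the literal string "OK".
inline bool needs_reconnect(const std::string& http_error)
{
  // Every non-"OK" status counts as a failure. This includes the empty
  // string, so an empty http_error no longer falls through unhandled.
  return http_error != "OK";
}
```

The caller would then run handle_connection_failure whenever `needs_reconnect` returns true, instead of matching only a fixed set of error strings.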
I guess restarting the bringup without running handle_connection_failure left the device stuck in some weird state. I speculate that this may be the cause of this one:
Environment (please complete the following information):
Sensor