Socket connect() function has no timeout? #13056
Comments
Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers. |
Are you using lwIP? I suspect so. The lwIP implementation has this behaviour as a backwards-compatibility fudge. There was application code that relied on You should be able to get it to obey timeouts by simply removing the pair of https://github.com/ARMmbed/mbed-os/blob/master/features/lwipstack/LWIPStack.cpp#L353 That will make the underlying lwIP implementation of connect be non-blocking (like all the other socket calls), which lets There should probably be a config option to allow you to remove that behaviour. If you're using an off-board stack, like an ESP8266 module, it's not terribly uncommon for them to have limited nonblocking support on things like connect. They might be harder to fix. |
Note that non-blocking or timing-out connects are a tricky thing. If the Mbed OS connect call returns after your specified time, that doesn't necessarily mean the stack has given up - the connect attempt is in progress and will keep going. The timeout is purely on "how long should this call wait for completion before handing back control", not "before giving up". Another call to If you want to try a new connection attempt, you must close and re-open the socket before you do so. You can't make two different connection attempts on one socket. None of this behaviour is unique to Mbed OS - it's standard POSIX/BSD TCP socket behaviour, respelt into a C++ API. You should be able to find assorted sources. |
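The distinction above between "stop waiting" and "give up" can be sketched with plain POSIX sockets, which the comment names as the underlying model. This is a minimal sketch, not Mbed OS or lwIP code. The names `wait_for_connect` and `open_connection` are hypothetical helpers invented for illustration, and it assumes an IPv4 address and a POSIX system.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Outcome of one bounded wait on a non-blocking connect.
enum class ConnectResult { Connected, StillInProgress, Failed };

// Hypothetical helper: start a non-blocking connect on `fd` and wait at most
// `timeout_ms` for it to complete. It never closes the socket, so on
// StillInProgress the stack keeps trying in the background.
ConnectResult wait_for_connect(int fd, const sockaddr_in &addr, int timeout_ms)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        return ConnectResult::Failed;
    }
    if (connect(fd, reinterpret_cast<const sockaddr *>(&addr), sizeof(addr)) == 0) {
        return ConnectResult::Connected;      // completed immediately
    }
    if (errno != EINPROGRESS) {
        return ConnectResult::Failed;         // rejected straight away
    }

    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLOUT;                     // writable means the attempt finished
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready == 0) {
        return ConnectResult::StillInProgress; // we stopped waiting; the stack did not stop
    }
    if (ready < 0) {
        return ConnectResult::Failed;         // simplification: EINTR is treated as failure
    }

    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0) {
        return ConnectResult::Failed;         // attempt finished with an error
    }
    return ConnectResult::Connected;
}

// Hypothetical helper: one fresh socket per attempt. Returns a connected
// descriptor, or -1 after closing the socket.
int open_connection(const char *ip, uint16_t port, int timeout_ms)
{
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        return -1;                            // not a valid IPv4 address
    }
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    if (wait_for_connect(fd, addr, timeout_ms) == ConnectResult::Connected) {
        return fd;
    }
    // Closing abandons any attempt still in progress; a retry must use a new socket.
    close(fd);
    return -1;
}
```

Calling `open_connection` again after a failure creates a new socket each time, which matches the rule that one socket cannot carry two different connection attempts.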
Thanks Kevin, I'm using Mbed OS-6 with offline Studio so I assume that would be lwIP. That https://github.com/ARMmbed/mbed-os/blob/master/features/lwipstack/LWIPStack.cpp#L353 I have tried a few fixes: setting that to various values from 50 ms to 5000 ms does actually work okay, but throws other errors like:
The ESP8266 does not have that 30 second socket connect delay, although the Mbed driver is a bit sluggish and you have to wait a while before another connect attempt after a null IP address. Could there be an override available similar here for https://github.com/ARMmbed/mbed-os/blob/master/features/lwipstack/mbed_lib.json#L113 Does anyone know where that 30 second blocking/time-out value is defined? |
That would then be lwIP for IPv4. (You could use Nanostack instead for IPv6, but it doesn't support IPv4. Good chance its connect works better though. That would mean flipping The nonblocking change I suggested should have stopped it going through that code. It should be taking the I believe sticking a timeout there will break it because the connect will still be trying to report back to someone who has given up waiting. It's signalling a destroyed semaphore. I don't see an explicit 30 seconds, but there is a config setting for how many retry attempts to make on connect that could lower the total attempt time:
(Not having checked the code...) It won't be the driver being sluggish, it will be us having to wait for the device to finish the last connect attempt. By setting the timeout we're giving up waiting for a response after 10 ms, but not changing the ESP8266's own retry time (not sure if you can on the ESP8266). In lwIP you should be able to do a |
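The two points made above, that a caller can give up waiting while the attempt carries on, and that a late completion signal must not hit a destroyed synchronisation object, can be illustrated in standard C++. This is only a sketch of the general pattern and not how lwIP or Mbed OS is implemented. `Completion` and `run_with_deadline` are hypothetical names, and a sleeping thread stands in for the connect attempt.

```cpp
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Shared completion state. It is owned jointly by the waiter and the worker,
// so it stays alive until the last of them is finished with it.
struct Completion {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
};

// Hypothetical helper: start "work" lasting `work_time` on a background thread
// and wait at most `wait_time` for it. Returns true if the work finished in
// time, false if the caller gave up waiting. The work itself is not cancelled.
bool run_with_deadline(std::chrono::milliseconds work_time,
                       std::chrono::milliseconds wait_time)
{
    auto state = std::make_shared<Completion>();

    // The worker holds its own shared_ptr copy, so signalling after the waiter
    // has returned still touches a live object.
    std::thread worker([state, work_time] {
        std::this_thread::sleep_for(work_time);   // stand-in for the connect attempt
        {
            std::lock_guard<std::mutex> lock(state->m);
            state->done = true;
        }
        state->cv.notify_one();
    });
    worker.detach();

    std::unique_lock<std::mutex> lock(state->m);
    return state->cv.wait_for(lock, wait_time, [&] { return state->done; });
}
```

If `Completion` lived on the waiter's stack instead, a worker that finished after the deadline would signal an object that no longer exists, which is the failure mode described for adding a timeout to the blocking path.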
@star297 please do not remove sections from the issue template! All fields are required to be filled in. |
@adbridge Will do, but I think this can be closed as an 'issue' in any case and added somewhere as an 'enhancement' if and when someone who knows this library has time to add a socket connect timeout. A fixed 30 second timeout is difficult to work with. I did have a look but the code is vast. However, I have attacked this from a different angle, where the project can wait 30 seconds on a null IP connect attempt and periodically attempt a re-connection, but that is far from ideal. |
Did my suggestion of adjusting |
Yes it does Kevin 👍 I've added this to my mbed_app.json file
gives a stable timeout of 5 seconds. Apparently the range is 1 to 12 with, I assume, 5 second multiples (I tried setting 2 which results in a 10 second timeout). |
I think it might be the Changing that will speed it up, but will make the whole TCP implementation a bit more aggressive. |
Tried again, just to be sure, but no change. Using Mbed OS version 5.15.3 on Mbed Studio. Commented out those non_blocking calls in LWIPStack.cpp
This is a snip of the test program I'm using here.
Changing that 3000 to 1000 as you suggested...
Timeout changed to around 1.5 seconds but then gradually increased back to 5 seconds. |
Okay, I think I was mistaken about that nonblocking snippet. I thought it was temporarily setting a nonblocking socket to blocking then putting it back. Not unreasonable, the way it looks. In fact, I think it wasn't nonblocking to start with. That is the point at which it is set nonblocking for the first time. Change the code to
|
Works perfect!!
timeout() now functions with socket connect(). Could this change be applied to the library please? |
You're welcome to make a patch yourself. Main concern will be backwards compatibility, as I stated before. I think it may have to be behind a config option, leaving default behaviour as it is now :/ I regard this change as a bug fix, but |
I'm thinking it's a relatively easy change to make locally should anyone need to. I'll put something on the forum in case anyone else has this requirement. Would you close this issue off unless you need to add anything else? |
Closing as requested. Thanks! |
Description of defect
Huge problem with this. I have several remote devices connected on my local network. Some may be 'off-line' at times, so I need the connect() function to back out within a few milliseconds. However, whatever I try, the function locks up for 30 seconds if the remote device IP address has dropped out.
When the remote device is active, the connection time is in the order of 2-3 milliseconds; I would be aiming for a 10 millisecond timeout.
I'm using TCPSocket, however this probably affects other sockets as well.
The 30 seconds is constant so perhaps there's a global value somewhere that I could 'tweak' for a temporary fix?
I think it's a similar issue to the one mentioned here, however it needs to time out before the possibility of starting a new connect():
https://forums.mbed.com/t/socket-connect-implementation-in-mbed-os/8055
The example below can demonstrate it, however I think it's academic as the timeout function is not employed with the connect() function (unless I've missed something).
Target(s) affected by this defect ?
Any
Toolchain(s) (name and version) displaying this defect ?
OS 5.15
OS 6 alpha-3
What version(s) of tools are you using. List all that apply (E.g. mbed-cli)
Mbed-online.
Studio-online