WiFiClientSecure gets stuck with bad connection #3675

Closed
aignatiev opened this issue Oct 3, 2017 · 15 comments

@aignatiev

Basic Infos

Hardware

Hardware: ESP-12
Core Version: 2.3.0

Description

The problem is that WiFiClientSecure gets stuck when the WiFi connection is weak. It stays stuck for a random amount of time, usually less than a minute or two, but at least once for 20 minutes. It does not restart and it does not print any debug messages; it is just stuck. The 20-minute case happened while using a phone as a hotspot and slowly walking away from the ESP. The ESP stayed stuck even when the phone was brought closer, and resumed normal operation only after the hotspot was switched off. I also tried to apply the fix listed in #3537, but it didn't seem to help.

Additional info: I'm making a sensor that logs data to flash memory whenever there is no internet and uploads this data to an Amazon AWS Lambda function whenever there is internet. Samples should be taken around every 16 seconds, so these random delays are a problem. It would also be nice to control the timeouts, because the data can always be resent from memory. Please help me fix this problem or give any advice on how to work around it.
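For readers following along, the "data can always be resent from memory" idea might be sketched roughly as below. This is a host-side illustration, not the poster's code: `SampleQueue`, its capacity policy (drop the oldest sample when full) and the `upload` callback are all hypothetical, and the queue lives in RAM here rather than in flash. The callback stands in for whatever network call is used and is assumed to return true only on a confirmed upload.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>

// Hypothetical store-and-forward buffer: samples stay queued until an
// upload attempt reports success, so a failed or timed-out send loses nothing.
class SampleQueue {
public:
    explicit SampleQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Add a sample; when the queue is full, drop the oldest entry to make room.
    void push(const std::string& sample) {
        if (pending_.size() == capacity_) {
            pending_.pop_front();
        }
        pending_.push_back(sample);
    }

    // Try to upload queued samples in order. `upload` is a stand-in for the
    // real network call and must return true only on confirmed success.
    // Stops at the first failure and returns how many samples were sent.
    std::size_t flush(const std::function<bool(const std::string&)>& upload) {
        std::size_t sent = 0;
        while (!pending_.empty() && upload(pending_.front())) {
            pending_.pop_front();
            ++sent;
        }
        return sent;
    }

    std::size_t size() const { return pending_.size(); }

private:
    std::size_t capacity_;
    std::deque<std::string> pending_;
};
```

A sample is removed only after its upload succeeds, so an upload that fails or is abandoned leaves it queued for the next attempt. This does not by itself stop a single blocking write from stalling the sampling loop.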

Sketch

WiFiClientSecure client;
...
char buffer[1000] = "Long JSON here";  // a char array, not an array of char pointers
client.write((const uint8_t *)buffer, strlen(buffer));

Debug Messages

none
@igrr
Member

igrr commented Oct 6, 2017

Could you please enable logging and post the log for the situation when this happens? In Arduino IDE: Tools > Debug Level = Core + SSL, and add Serial.setDebugOutput(true); to the setup function.

The reason I'm asking is that I'm having trouble reproducing this issue. The logs will likely contain at least some information that can provide a clue. Thanks.

@aignatiev
Author

I'm using PlatformIO instead of Arduino IDE, so I don't know how to enable other debug prints than Serial.setDebugOutput(true);. And here's what it gives after being stuck for 35 seconds (trying to upload around 200 bytes of data):

bcn_timout,ap_probe_send_start
ap_probe_send over, rest wifi status to disassoc
state: 5 -> 0 (1)
rm 0
pm close 7
f r0, 

Additional info: the connection is kept open, because the upload happens every minute or so and reopening a secure connection every time uses lots of data. The status of the connection is checked just before an upload with if(!client.connected()) return false;.

@igrr
Copy link
Member

igrr commented Oct 7, 2017

Can you try adding extra flags to CPPFLAGS? (or CFLAGS and CXXFLAGS)

-DDEBUG_ESP_CORE -DDEBUG_ESP_SSL

Once you do that successfully, you should see lots of debug messages from WiFiClient (starting with a colon) as well as messages from axTLS.

@aignatiev
Author

aignatiev commented Oct 9, 2017

I suppose build_flags = -DDEBUG_ESP_CORE -DDEBUG_ESP_SSL did the trick, and I now see more debug prints. This is difficult to reproduce, but this time it was stuck for around two minutes and the pattern is very easy to see:

:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:sent 261
:ww
[1575443] HTTP header sent
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:ww
:wr
:er -9 0 1
:ww
Alert: close notify

and slightly later

:ur 1
:del

The code is supposed to first upload the HTTP header and then the JSON data. It was stuck for 51 seconds uploading the header and 65 more seconds on the JSON, which eventually failed.

@Rob58329

Rob58329 commented Oct 12, 2017

Update: Thanks to [aignatiev] for pointing out that version [2.4.0-rc2] is now available! (I hadn't spotted this!) I can also confirm that this new [2.4.0-rc2] version fixes my below issue. (ie. with this [-rc2], a loss of SSL-internet connection does not block, but instead times-out and returns.) 20Nov17.

PS. Interestingly [2.4.0-rc2] seems to produce code which is approx 5% larger than the [2.4.0-rc1], BUT which uses approx 5% LESS dynamic-memory (which is what I am short of!).

PostScript: I note that "WiFiClientSecure::write" requires a significant amount of RAM ("ESP.getFreeHeap()") to work reliably. The documentation (e.g. #2499) suggests it needs at least enough FreeHeap for a contiguous 16kB block in order to initiate an SSL connection. In addition, when sending large chunks of data, you also need to maintain sufficient FreeHeap. In my case, my code sent about 3kB of data (in 1kB chunks) successfully over an SSL link when using the UK-4G-mobile network. However, when using the UK-ADSL-Broadband network instead, the SSL link failed about 90% of the time (and hung the ESP8266). I believe this was due to the apparently slower ADSL network (at busy times) causing the SSL-Send-Buffer in the ESP8266 to grow too large, which made the FreeHeap drop so low that it hung the ESP8266. My solution was to check the amount of FreeHeap before sending any large "WiFiClientSecure::write" chunk and, if the FreeHeap was below approx 10kB (for my code), to wait until it rose back above 10kB before performing the "WiFiClientSecure::write". (I also do this check when using the non-secure "WiFiClient::write", but this seems to need FreeHeap levels of only approx 3kB to work reliably!) 8Dec17.
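The heap check described above might look something like the following host-side sketch. The function name `waitForHeap` and its callback parameters are hypothetical: `freeHeap` stands in for ESP.getFreeHeap(), `nowMs` for millis(), and `idle` for whatever the sketch does while waiting (for example a short delay). The timeout is an addition for illustration; the post does not say whether the original wait was bounded.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Wait until the reported free heap reaches `minFreeBytes`, or give up after
// `timeoutMs`. All three callbacks are stand-ins for the device functions:
// freeHeap ~ ESP.getFreeHeap(), nowMs ~ millis(), idle ~ a short delay/yield.
bool waitForHeap(const std::function<uint32_t()>& freeHeap,
                 const std::function<uint32_t()>& nowMs,
                 const std::function<void()>& idle,
                 uint32_t minFreeBytes,
                 uint32_t timeoutMs) {
    const uint32_t start = nowMs();
    while (freeHeap() < minFreeBytes) {
        // Unsigned subtraction stays correct across a millisecond-counter rollover.
        if (nowMs() - start >= timeoutMs) {
            return false;  // heap never recovered; caller should skip or retry later
        }
        idle();
    }
    return true;  // enough heap is available; safe to attempt the write
}
```

A caller would run this before each large chunk, with a threshold such as the roughly 10 kB figure reported above for the secure client, and skip the write when it returns false.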

                                    --- 000 ---

Hardware: Wemos D1 mini
Core Version: 2.4.0-rc1 (via ArduinoIDE Boards Manager on a Windows-PC)

I am also having what is probably the same issue with WiFiClientSecure::print("....."), where the ESP8266 just hangs/blocks if the internet connection disappears (and sometimes, after a few minutes, generates an exception and reboots). Ie. there is no 5-second timeout! It is most easily caused by simply disconnecting the internet connection (but NOT the wifi): my ESP8266 is connected to the hotspot of my mobile phone, which is connected to the internet over 4G. If you just turn off the 4G while the ESP8266 is in the middle of doing a sequence of WiFiClientSecure::print("....."), the ESP8266 hangs/blocks... (NB. the ESP8266 is still connected via wifi to the hotspot, but just no longer to the internet.)

Note1: Interestingly, if you re-connect the 4G, say 20 seconds after the ESP8266 has hung/blocked, the ESP8266 will often (although not always) start responding again. It then appears to complete the remaining WiFiClientSecure::print's and, at least some of the time, sends all the data (both before and after the hang/block) successfully (ie. the complete data is received at the remote end).

Note2: Also note that during the above ESP8266-hang/block, the "os_timer_setfn/os_timer_setfn" ISR routines still work, so it is possible to detect the above ESP8266-hang using a watchdog and then perform an "ESP.restart()"; however, this is far from ideal.
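A watchdog of this kind could be sketched on the host roughly as follows. `Heartbeat` is a hypothetical helper, not part of the ESP8266 core: the main loop would call kick() with the current millis() value whenever it makes progress, and a periodic timer callback would call expired() and decide whether to restart. How the timer is armed on the device is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical heartbeat tracker. The main loop calls kick() whenever it makes
// progress; a periodic timer callback calls expired() and may restart the chip
// when the loop has been silent for longer than the limit.
class Heartbeat {
public:
    Heartbeat(uint32_t nowMs, uint32_t limitMs)
        : lastKickMs_(nowMs), limitMs_(limitMs) {}

    // Record that the main loop is still alive.
    void kick(uint32_t nowMs) { lastKickMs_ = nowMs; }

    // True when more than limitMs_ has passed since the last kick.
    // Unsigned subtraction keeps this correct across a counter rollover.
    bool expired(uint32_t nowMs) const { return nowMs - lastKickMs_ > limitMs_; }

private:
    uint32_t lastKickMs_;
    uint32_t limitMs_;
};
```

The limit would need to be longer than the slowest legitimate upload, otherwise the watchdog would restart the chip in the middle of a working transfer.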

Update: Note3: Finally, note that there are two different ways in which this "WiFiClientSecure::print" causes issues: (3a) the first is as described above, where the "WiFiClientSecure::print" just never returns (ie. blocks, presumably when its buffer gets full), and (3b) the second is that the final "WiFiClientSecure::print" does return, but because the internet connection has failed its internal buffer is still full of data, which is never cleared (even after "WiFi_Client_secure.stop()"). In the latter case, the "ESP.getFreeHeap()" is often so small that the ESP8266 generates an exception whenever you next try to do anything with the WiFi. The only way to free up the heap in case (3b) also appears to be an "ESP.restart()"!?

PS1: Note that I do not believe that the above (Note 3a) hanging/blocking issue is due to running out of heap: (a) because of the above Note1 occasional successful restart and complete data transfer; and (b) as during this print process, the "ESP.getFreeHeap()" starts at 14176-bytes, drops immediately to 10976-bytes where it stays for 26 "WiFiClientSecure::print"s, then drops to 9456-bytes where it stays for another 31 "WiFiClientSecure::print"s, at which point the ESP8266 hangs/blocks.

PS2. Also note that if I switch to using "WiFiClient::print"s (ie. non-secure), then the loss of internet connection does not cause the ESP8266 to hang/block. Instead, the heap starts off at 28640-bytes, immediately drops to 25440-bytes, and then stays at this level for the rest of the "WiFiClient::print"s (ie. continues to return from the "WiFiClient::print"s even after the internet has been disconnected), enabling me to subsequently check if the remote-server is still responding.

A solution would be appreciated!!

@machadoroger

I'm facing the same issue. I've tried almost (I hope) everything to fix it with no success.

ESP8266 just hangs for no good reason while trying to connect the server (client.connect(HOST, PORT)). It can take place after 20 min, 1h, 4h, and so on... The same happens with the time the ESP8266 remains hanging. I mean... sometimes it can take 10 min and sometimes it can last 2 hours.

For some reason, as @Rob58329 has mentioned, in the meantime, if I turn off/on the internet and re-connect, the ESP8266 starts responding.

Please @igrr , let me know if you need more information. Thank you in advance.

@igrr
Member

igrr commented Oct 25, 2017

@machadoroger from your description it doesn't look like the same issue. The issue described in the original post is related to writes not being handled correctly. In your case it is the connection phase. Are you using WiFiClientSecure or WiFiClient?

@machadoroger

@igrr thanks for your quick reply. I'm using WiFiClientSecure. Just like @Rob58329, when I'm using WiFiClient everything works fine.

@aignatiev
Author

In the meantime I tried to use this code with 2.4.0-rc2 and it seems that I cannot reproduce this issue with it. The current code has been updated and has a payload of around 1k, but I still managed to get it stuck with 2.3.0 and with 2.4.0-rc2 it just seems to work.

P.S.
I've found an easy way to use Arduino 2.4.0-rc2 with PlatformIO and for those who are interested, use this in platformio.ini

platform = https://github.com/platformio/platform-espressif8266.git#feature/2.4.0-rc2

It should work as long as the branch exists.

@zipiju

zipiju commented Nov 24, 2017

@igrr I seem to have a similar problem which can be reproduced (and fixed?) easily.

I use WiFiClientSecure to connect to a web server and download a file.
If I break the connection during the download (in a way that the IP is still assigned and the ESP is still associated to the WLAN, e.g. by unplugging the LAN cable from the AP), the ESP gets stuck in such a manner that the WDT won't fire.

It seems that any calls to the client (at least client->connected() and client->available()) block indefinitely.

I was able to fix this blocking behavior by removing SSL_READ_BLOCKING from the ssl_ctx_new() call in the SSLContext() class constructor in WiFiClientSecure.cpp.
I'm not sure what other implications this has, but if there are none, maybe it should be changed.
Or can this be changed so that the user can choose a blocking or non-blocking socket?

I also noticed that WiFiClientSecure.h is missing a timeout parameter in the connect() function, so I guess there is no way to specify a connect timeout.

Thank you.

Also created Pull Request #3872.

@pjunni

pjunni commented Feb 13, 2018

My question is: how can I make WiFiClientSecure client.connect() not hang on a bad connection? I get timeouts much longer than the one defined (WIFI_CLIENT_TIMEOUT in the code). I tested with values of 1, 2000 and 5500, and each time the timeout is around 6600 to 7000 milliseconds. My version is 2.4.0. My code is:

client = new WiFiClientSecure;
client->setTimeout(WIFI_CLIENT_TIMEOUT);
unsigned long startTime = millis();  // millis() returns unsigned long
client->connect(URL, PORT);
Serial.println(String("total time: ") + String(millis() - startTime));

@devyte
Collaborator

devyte commented May 29, 2018

BearSSL is merged in #4273 , with alternate BearSSL::WiFi* classes. Although axtls-based classes are still available and even the default, they are planned for deprecation and then retirement, hence won't be fixed. Any issues with BearSSL-based classes should be reported in new issues.
Closing.

@devyte devyte closed this as completed May 29, 2018
@d-a-v
Collaborator

d-a-v commented Jan 28, 2019

deleted

You'd better use an esp8266 :-)

@earlephilhower
Collaborator

@macedolfm , what @d-a-v was trying to say is you're posting about ESP32 problems on the ESP8266 Arduino site. Pop to https://github.com/espressif/arduino-esp32 if you want help, because we can't do anything for you here unless you get that ESP8266. :)

@vishwasvogga

Content-Length header shall be equal to the body length.
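As a small illustration of that rule, a request builder can derive the header from the body instead of hard-coding a number. This is a host-side sketch: `buildPostRequest`, the JSON content type and the host/path arguments are assumptions chosen for the example, not taken from any code in this thread.

```cpp
#include <cassert>
#include <string>

// Build a minimal HTTP/1.1 POST request whose Content-Length is computed from
// the body, so the header and the payload cannot drift apart.
std::string buildPostRequest(const std::string& host,
                             const std::string& path,
                             const std::string& body) {
    std::string req;
    req += "POST " + path + " HTTP/1.1\r\n";
    req += "Host: " + host + "\r\n";
    req += "Content-Type: application/json\r\n";
    // Content-Length counts bytes of the body only, not the headers.
    req += "Content-Length: " + std::to_string(body.size()) + "\r\n";
    req += "\r\n";  // blank line separates headers from body
    req += body;
    return req;
}
```

If the declared length is larger than the bytes actually sent, a server will typically keep waiting for the rest of the body, which can look like a stalled connection from the client side.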
