[WiFi] Fix issue where WiFi could not scan to reconnect #3650
Apparently this wasn't enough: letscontrolit#3650 Fix intended for [this report on the forum](https://www.letscontrolit.com/forum/viewtopic.php?p=53062#p53062)
With the latest Vagrant custom build I have encountered a reconnect issue: after a firmware upgrade one node could not reconnect and had to be powered off and on, after which it connected quickly.
I have just found an option "Use Last Connected AP from RTC:".
That RTC option used to be the default, to speed up reconnects.
Can you connect it to the serial port to get some idea of why it is crashing?
Unfortunately not, neither node has a serial port attached. BTW, the primary AP has BSSID broadcast disabled. When I enabled it (the AP then rebooted), the node that had been trying unsuccessfully to reconnect (even after power off/on) finally connected (I don't know whether it was due to the AP reboot or the BSSID broadcast change). Both nodes of course have the option "Include Hidden SSID:" enabled.
Are those APs set to only allow 802.11n? Can you set the number of WiFi scans to 2?
No, the Primary AP is b/g, the Backup AP is in mixed mode 802.11 b,g,n.
There is a checkbox for it to disable AP mode.
Yep, this does perform multiple scans, increasing the chance it will find the best AP.
Ahh, really, "Do Not Start AP:"
Yep it does also help on hidden SSIDs.
As an extra, if a connection is considered stable, the list of candidates is cleared and the active one is left as the only one in the list.
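The pruning step described above can be sketched roughly as follows. This is only an illustration, not ESPEasy's actual code: the `WiFiCandidate` struct, its fields, and the `keepOnlyActive` helper are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical candidate record; field names are illustrative only.
struct WiFiCandidate {
  std::string ssid;     // empty for a hidden network
  int         channel;  // WiFi channel the AP was seen on
  int         rssi;     // signal strength in dBm, closer to 0 is stronger
};

// Hypothetical helper: once the connection is considered stable, clear the
// candidate list and leave the active AP as its only entry.
void keepOnlyActive(std::vector<WiFiCandidate>& candidates, std::size_t activeIndex) {
  assert(activeIndex < candidates.size());
  const WiFiCandidate active = candidates[activeIndex];  // copy before clearing
  candidates.clear();
  candidates.push_back(active);
}
```

Under this assumption, a later reconnect starts from a single known-good entry and only falls back to a fresh scan when that entry fails.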
Thanks for the explanation. Although I don't fully understand this complexity (from my perspective), I believe it's the best algorithm, except for trying hidden SSIDs with all password combinations. Is that really necessary? I thought connecting to a hidden AP is the same process as with visible APs.
For a hidden SSID you only know the BSSID and channel. If I try to use the standard
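To illustrate why a hidden network leads to several connection attempts: a scan result for a hidden AP carries only the BSSID and channel, so each configured credential set has to be paired with it in turn. The sketch below is an assumption about the general idea, not the firmware's implementation; `Credentials`, `HiddenAP`, `Attempt`, and `expandHidden` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types; none of these names come from ESPEasy.
struct Credentials { std::string ssid; std::string passphrase; };
struct HiddenAP    { std::array<std::uint8_t, 6> bssid; int channel; };
struct Attempt {
  std::array<std::uint8_t, 6> bssid;
  int                         channel;
  std::string                 ssid;
  std::string                 passphrase;
};

// Pair one hidden scan result with every configured credential set,
// because the scan itself does not reveal which SSID the AP uses.
std::vector<Attempt> expandHidden(const HiddenAP& ap,
                                  const std::vector<Credentials>& known) {
  assert(ap.channel > 0);
  std::vector<Attempt> attempts;
  for (const auto& c : known) {
    if (c.ssid.empty()) continue;  // skip unconfigured credential slots
    attempts.push_back(Attempt{ap.bssid, ap.channel, c.ssid, c.passphrase});
  }
  return attempts;
}
```

With a primary and a backup credential set configured, one hidden AP therefore produces two candidate attempts, which is where the extra combinations come from.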
OK, thanks again!
"Soon" or "immediately"?
Not immediately but, as far as I can remember, within several minutes. Then the crash repeated; when the nodes could not reconnect, I turned them off and on. I was still experiencing a connectivity issue, so I enabled BSSID broadcast on the AP and disabled it again after a successful connection. So currently the nodes have been cold booted and are connected to the hidden primary AP.
OK, will perform some tests here with hidden SSID to see how it behaves.
Yes, ESP8266. Please note that in my case the AP signal is quite low, which can also cause trouble; the issue is hard to reproduce when the signal is strong (then the ESP nodes are quite stable). Currently I have RSSI -91 dBm and -87 dBm and both ESP nodes are working fine; the web GUI is very quick.
So far one node (with more plugins) crashed twice, I suppose it was due to exhausted RAM for some reason:
Second node is still running without crash:
Can you test that specific node with a build from this PR: #3680
I am not sure. I can build from the PR, but in general I need the 'custom_IR_ESP8266_4M1M' binary, which I have in the pio_envlist.txt file. So the question is how to specify the beta build exactly (perhaps put custom_beta_ESP8266_4M1M?). Edit: I tried to compile with 'custom_beta_ESP8266_4M1M' in pio_envlist.txt but it did not create the .bin file:
I was first trying to describe what you needed to change and then I realized it is way easier to just add another PIO env:
Thank you, I had not found the proper string 'custom_beta_IR_ESP8266_4M1M' anywhere. It would be great if the pio_envlist.txt.sample file could contain all valid build names. Compiled and one node upgraded; I'll post feedback here in a couple of days.
Nope, I think it has to do with the "broadcast SSID".
By the way, does the unit now show a "2nd heap" in the system overview?
The node crashed again after 114 minutes and does not reconnect to the hidden AP. So unfortunately it's less stable than the earlier firmware in my configuration.
The current status (the node is now reconnecting to the hidden AP successfully):
The second upgraded node with only DS18b20 and TSOP4838 plugins and MQTT controller:
Hmm, watchdog timer resets are still happening, so it seems.
Yeah, it could be the reason... Even with the previous firmware release this node sometimes crashed when I rebooted the AP. But it looks like the 2nd heap is in fact not used in the beta firmware.
The 2nd heap is mainly used for 'temporary' stuff. To see that it is working, you can send messages to some controller at a 1 sec interval and set the controller to only allow sending messages with a minimum interval > 1 sec.
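The effect of that test can be sketched with a small simulation: when messages are produced faster than the minimum send interval allows, the backlog grows, and that backlog is the kind of temporary data that needs memory. This is a coarse model under stated assumptions (send attempts happen only when a message is produced); `queueDepthAfter` and its parameters are hypothetical and do not model ESPEasy's real controller queue or the 2nd heap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical simulation: one message is queued every produceEveryMs, and a
// queued message may only be sent once minSendIntervalMs has passed since the
// previous send. Returns how many messages are still queued at durationMs.
std::size_t queueDepthAfter(std::uint32_t durationMs,
                            std::uint32_t produceEveryMs,
                            std::uint32_t minSendIntervalMs) {
  assert(produceEveryMs > 0);  // a zero interval would never advance time
  std::size_t   depth    = 0;
  std::uint32_t lastSend = 0;
  bool          sentAny  = false;
  for (std::uint32_t now = produceEveryMs; now <= durationMs; now += produceEveryMs) {
    ++depth;  // a new message is queued
    if (!sentAny || now - lastSend >= minSendIntervalMs) {
      --depth;  // the oldest queued message is sent
      lastSend = now;
      sentAny  = true;
    }
  }
  return depth;
}
```

For example, `queueDepthAfter(10000, 1000, 1500)` leaves a backlog in this model, while `queueDepthAfter(10000, 1000, 1000)` drains every message as it arrives.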
OK, I understand. I have minimized the queues to save memory in the past, so I can try to increase them again.
An HTTP connection is not kept open. The reason why this may lead to crashes is not 100% clear to me, but it seems like the set timeout is not used in the core libraries when either doing a DNS resolve (if needed) or during the connection phase.
Well, after the last crash the node again does not reconnect to the hidden WiFi AP. This is the worst case that can happen and it is unacceptable for me, so I'll have to downgrade this node to an earlier FW... After the downgrade to the custom FW built on 20210531 there's no reconnect issue, and it's reconnecting much more quickly and reliably.
OK, I found some issue related to WiFi reconnect. ESPEasy/src/src/ESPEasyCore/ESPEasyWifi.cpp Lines 876 to 896 in 2c6a742
N.B. I made a new PR for it: #3702. This is not yet included in the "core 3.0.0" PR, but is just meant to fix the WiFi connection issues, since they are not specific to core 3.0.0. So you can also try to build one for that PR to test if it solves your connection issues.
Thanks a lot, but as there's a stability issue as well, not just a WiFi reconnect issue, I am not going to upgrade the primary ESP node yet.
Yep, there is no "beta" for that custom build ... I think... outside the "core 3.0.0" PR.
Thanks, but unfortunately the compilation stopped with an error:
Environment: custom_IR_ESP8266_4M1M, Status: FAILED, Duration: 00:06:01.495
Ah, yep, that's my fault, as I developed the commits first on another branch and then cherry-picked them to this PR branch. Will fix it.
Done...
A great job, thank you! I'll give it a try again.
Firmware Build: 20114 - Mega
BTW, today I upgraded another ESP node with the firmware build created 3 days ago with PR 3702. So far this is the status:
FYI, late yesterday evening I recompiled the custom firmware again from the latest sources, where PR3702 is already merged. I upgraded 3 ESP nodes; it's early (uptime 9 - 10 hours) but so far no crash and no WiFi connection issue.
Great!
Well, the primary node crashed but at least it reconnected successfully (it took some time, about 4 minutes, but the WiFi signal level is quite low there).
That setup should benefit from the 2nd heap in the core 3.0.0 PR. You really have low free memory.
I just looked through your posted config.
Yeah, I'll try to compile the beta firmware again in a couple of days and give it another try. The current FW needs to be tested on other nodes as well, but I see an improvement.
Just tried to compile the beta firmware but it did not work:
The |
Unfortunately now even custom_IR_ESP8266_4M1M can't be compiled:
Ah, looks like an issue introduced by the last PR I merged.
No problem, please just let me know when it is fixed so I can try compiling again. It's not urgent / important at all.
OK, can you try again to build it?
Yes, compiled without issue and updated 2 nodes. Thanks a lot.
As reported by a number of users since the latest mega-20210503 build.