WiFi based ESPHome BT Proxies error message - "Connection error occurred: EOF received" #87158
Comments
Hey there @OttoWinter, @jesserockz, mind taking a look at this issue as it has been labeled with an integration (esphome)? (message by CodeOwnersMention) |
There is a chance this is related to #85432 |
Did the problem go away once you reflashed the device? |
If not, try connecting the device with a cable and collect logs from it when the connection drops. It's possible it's running out of RAM and resetting over and over again |
@bdraco Maybe, for sure! I'm going to re-flash tonight (MST) with your suggestions from the other issue. Should have an answer for you in 6 to 8 hrs. With regards to uptime and RAM: I do have uptime and memory sensors on each BT proxy. Here are some overlays for the last week. Uptime: Memory for the same time period: Why are my core logs filling up with this EOF message? I'm not sure what has happened; the only thing that changed was 2023.2..., unless it's referring to this: If you noticed in the logbook entries here, it becomes unavailable often, but this has been happening since day 1 of the BT integration, so perhaps the log level detail has changed so that it now throws it into the core logs? |
We changed the logging a bit in this version, so I'm guessing based on your comment above that it's been going on for a while but we didn't have a good log message about why it was disconnecting. I fixed a problem with sending empty packets via esphome/esphome#4284 that could result in the EOF, but it's only in the dev version of esphome. Are you using noise encryption? |
The EOF errors are continuing to log, 5 minutes in since re-flashing as per the suggestion here. I first heard of noise encryption at the last release party, so basically no, I'm not using that encryption, just whatever is default. There was a fellow who mentioned in the ESPHome Discord channel (not Jesse) that the constant available/unavailable is "normal" because the ESP32 cannot share the radio between BT and WiFi, so it's constantly switching between BT and WiFi, hence the disconnects. This was months ago; I don't know if things have changed or if there is any truth to this. Does that make sense to you? |
I just realized I didn't catch logs while on serial as you asked above.... I'll get the logs to you tomorrow after the inkbird test....cheers |
With regards to the logs, I was monitoring the logs via serial connection and, in another window, watching the logbook entries here: In the 45 minutes that I was monitoring it, it became unavailable a few times, briefly, as shown in the logbook entries, but no logs were shown (just a black window from the ESPHome website interface). Not sure why that is happening. Hopefully the new ESPHome update will solve the EOF issue; it's counting up the core log entries. |
Just wondering, should I try an update to ESPHome dev? The EOF messages are approaching the thousands now in my core logs. If you recommend a dev update, do I delete this entry in yaml
press save, then install and THEN update to dev? Thanks |
Keep the yaml regardless of version. That change isn't in dev yet |
Trying out dev is a good idea though |
Awesome, I will update tonight! |
Hey, just wanted to update you. Last night I updated to Dev, 2023.2.0-dev Feb 4 2023, 18:28:40 (MST), and it was successful. However, I'm still seeing the logging.
and
I do want to say that my BT proxies are operating just fine. I have 10 switchbots, 4 temp sensors, 6 ibeacons and I'm using my phone as a room presence detector (and it updates decently fast). The system works really well (and has for months). The logs would suggest otherwise, but it is all good! Is it possible that the logging is just too sensitive? |
Hi, just wanted to capture this comment here from another fellow who is experiencing the same issue as me. Also, ESPHome 2023.2.0b3 is available. Is this an update to my current Dev version of 2023.2.0-dev (Feb 5 2023) and if so, should I update? Thank you! |
Just wanted to provide an update that ESPHome 2023.2.0b3 still gives the error. However, I'm starting to wonder if the error messages above are related to another issue that I'm also affected by since updating to 2023.2 here: which has also caused similar issues with my ZHA system. It seems I'm breaking the 10 second rule left, right and center! |
The profiler will only show what's running in the event loop. To see everything you need |
Ok, I'm going to try and install py-spy. I actually tried to do this before but found the instructions a bit too technical. I'll figure this out, or I'd be happy to set up a Chrome Remote Desktop session so you could connect directly to my computer while I do the physical parts (USB insert/removal etc.). Please let me know |
Ok, I got the key on my USB named CONFIG plugged into my HAOS instance: I'm uncertain about the next: " Do I simply type |
|
ok thanks! When I type
I get the message Protection mode is off for the SSH and Web terminal addon. What did I do wrong? |
I tried this instead:
got me a bit further, but then it says
my key is not good i guess? |
OK, still hitting a roadblock here. I'm using Notepad++ and puttygen, generating the key by moving my mouse randomly in puttygen. What I did: copied the generated key from the "Public key for pasting into OpenSSH authorized_keys file" text box and pasted it into Notepad++ (it's an RSA key with 2048 bits); checked that the file is ANSI (check); saved the file as authorized_keys (no extension); the USB is NTFS, 8 GB; placed the USB into the RPi 4 port. In the terminal add-on (from the community repo) I type
It says it imports successfully, then I type
all of which leads to root@[ as X option above]: permission denied (publickey). EDIT: it says "Use the CLI (eg. SSH to the SSH add-on on port 22)". Does this just mean to use the terminal add-on in the HA UI? |
Made some progress! I followed this video, didn't understand 1 word he was saying, but I was able to log in as root. OK, I believe py-spy is now installed (or at least copied into the correct directory). The instructions then say "There should be a py-spy binary once you call unzip in one of the directories it creates". How do I call unzip? And for the PID, is it 61 in my case? But there is a previous directory, i.e. the one before this command was sent:
baby steps! |
OK, some more progress here. I think I unzipped correctly here:
I basically said yes to all. Then, I typed this:
while in I typed a few commands:
The instructions say to use PID 60, but I think mine is 61, so I tried both. They both lead to errors above. Do you know what the issue is here? |
.12 is an older version. Try .14 https://github.com/benfred/py-spy/releases/tag/v0.3.14 |
@Anto79-ops assuming the defaults fix it, can you try the values in this PR as well? |
Some good news to report! with these settings:
There has been a significant improvement. In fact, since changing the parameters, I've seen only 1 EOF warning message in the logs. Typically, I would have seen 40 to 60 occurrences by now. Trying the PR suggested values now, as below:
I'll post an update in a couple of hours. May I ask, would this change affect the performance (good or bad) of my proxy network? Thanks |
OK, I think it's fair to say that the PR suggested values are not as good as the default ones in terms of improving the EOF issue. In 7 minutes, I've already got 7 EOF occurrences since changing the interval and window times. |
Thanks for the report. Glad the defaults are working. We probably need to do a bit more experimenting to get values that work best for the Wi-Fi ones. The downside is that it means less time listening for Bluetooth and more time preferring Wi-Fi, but that likely isn't an issue unless you have devices that broadcast infrequently |
Anytime. BTW, the generic yaml has them at 1100 ms...so this is where it came from in the first place. https://github.com/esphome/bluetooth-proxies/blob/main/esp32-generic.yaml |
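To put the trade-off described above in numbers, here is a rough sketch. It is only an illustration, not anything taken from ESPHome: it treats the share of each scan interval covered by the scan window as the time the radio spends listening for Bluetooth. The helper name `scan_duty_cycle` is made up for this example, and the 320 ms / 30 ms pair is assumed to be the ESPHome default for the scan parameters (worth checking against the docs). The 1100 ms / 1100 ms pair is the value from the generic proxy yaml mentioned above.

```python
def scan_duty_cycle(interval_ms: float, window_ms: float) -> float:
    """Return the fraction of each scan interval spent listening for BLE.

    Hypothetical helper for illustration only; not part of ESPHome.
    """
    if interval_ms <= 0 or not 0 < window_ms <= interval_ms:
        raise ValueError("window must be positive and no longer than interval")
    return window_ms / interval_ms


# (label, interval in ms, window in ms)
settings = [
    ("generic proxy yaml (from the thread)", 1100, 1100),
    ("ESPHome defaults (ASSUMED, check the docs)", 320, 30),
]

for label, interval_ms, window_ms in settings:
    duty = scan_duty_cycle(interval_ms, window_ms)
    # Whatever is not spent scanning is left over for Wi-Fi traffic.
    print(f"{label}: {duty:.1%} Bluetooth scanning, {1 - duty:.1%} left for Wi-Fi")
```

If the assumption about the defaults holds, a window equal to the interval leaves the radio scanning for Bluetooth essentially all the time, which would fit with the API connection over Wi-Fi dropping more often.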
I went back to default and things are going well. In about 10 hrs, I only received 12 EOF messages and about 7 api ping messages. A massive reduction, by almost 97%. One other thing I noticed since changing the intervals is a reduction in CPU usage, by almost 5%. Does this make sense to you? You can see here, the green diamond was when the change to default was made... Maybe because it's logging less, it helps the CPU a bit. |
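A quick way to sanity-check a reduction figure like the one above is to compare hourly rates. In this sketch the 12 EOF messages in 10 hours come from the comment above, but the thread never states an exact "before" rate, so the 40 per hour used here is a purely hypothetical baseline picked for illustration. The function name `percent_reduction` is also made up.

```python
def percent_reduction(before_per_hour: float, after_per_hour: float) -> float:
    """Percentage drop from the old hourly rate to the new one."""
    if before_per_hour <= 0:
        raise ValueError("the baseline rate must be positive")
    return 100.0 * (before_per_hour - after_per_hour) / before_per_hour


after_rate = 12 / 10   # 12 EOF messages in about 10 hours (from the thread)
before_rate = 40.0     # HYPOTHETICAL baseline, not a measured value

reduction = percent_reduction(before_rate, after_rate)
print(f"{reduction:.1f}% fewer EOF messages per hour")
```

With a different baseline the percentage changes accordingly, so the result is only as good as the "before" count taken from the logs.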
Rebuilding the connection isn't cheap but I wouldn't expect it to be that expensive. Also wouldn't be incredibly surprised if it was that expensive though either. |
@Anto79-ops where did you fail with
|
@nagyrobi thanks, it fails when I try to start it.
bdraco mentioned updating Rust. What version do you have? And how do I check my version and/or update if necessary? Thanks |
Hey! |
@marsp88 thanks for sharing. I actually do have an OpenWRT router (Linksys WRT3200 ACM), but it's running the default firmware from Linksys (what it had out of the box), so unless WMM is disabled by default, I haven't changed it, because I have no idea where that setting even is! I also have the router WiFi disabled because I'm using 4 access points scattered throughout my home, which are hardwired to the router via ethernet. All my WiFi devices connect to these access points, which then communicate with the router that way. As a result, the router thinks I have only ethernet wired devices. These WiFi access points are marketed as WiFi boosters by my ISP (they are not mesh networks). Would this issue still apply? |
If that's the case, that doesn't qualify as an OpenWRT router. You can only name it like that if the firmware running on it is a real OpenWRT build. Otherwise, it's just a simple Linksys WRT... Besides, that wouldn't matter anyway, especially when addressing WMM, because you just said its WiFi is disabled, and you use 4 other, separate APs. In that case, is WMM enabled on those? That would matter. What kind of WiFi boosters do you use? Are they broadcasting the same SSID? |
It's mostly a mystery what these boosters are, other than that they are provided by my ISP and are called "Telus Wifi Boosters". Everything is dumbed down by the ISP, but it's the only thing I use from them. However, if you look a little deeper, they have a model number on them: Arcadyan Technology WE410443 Wi-Fi Repeater. They are all broadcasting the same SSID. EDIT: checking the booster settings, there is no such setting for WMM. |
Is it this one: There's something fishy imho according to pages 9-10:
These are not ESP-friendly functions... How's your Home Assistant connected to the network? Wired or WiFi? |
Yeah, I wonder myself; these are great points you mention. But I'm not going to lie, Wi-Fi in my home is actually pretty good. Some of our laptops can hit 500 Mbps on WiFi, but usually it's 80 to 300 Mbps. Wired devices hit up to 1.2 Gbps (fiber optic internet). Of the about 96 devices on my WiFi/ethernet network, about 30 are on ethernet, including HAOS and my Ubuntu MQTT broker. It's clear, though, that ethernet has the advantage. I saw the option to turn off band steering on the boosters, but honestly, if it's just the ESP devices that are causing issues, I might as well just upgrade them to Olimex ones. These are my actual booster devices. They don't look exactly the same as the photo, but the insides are probably the same because it's the same model. |
To isolate if it's just this magic AP system being the culprit, how about trying:
And yes, you have a lot of WiFi devices sharing the airtime of only 4 APs, it may simply be the issue of overloading them. Strangely the manufacturer doesn't give in the specs the max supported clients number... Given your environment, I'd be sure to only use Ethernet based ESP devices (with ext antenna) for Bluetooth proxying. It's all maxed out... |
awesome information here, thanks! Let me look into these and see what I find. I should've disabled band steering long time ago! Thankfully, in the advanced options of these APs, I can see which and how many devices are connected to them. There seems to be something choking my event loops (as I'm getting many messages in my logs, that...xx is taking more than 10 seconds even long after HA has restarted, and even things not related to ESPHome are saying this). I'm hoping the issue will be obvious once I get |
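For anyone unsure what "choking the event loop" means in practice, here is a generic asyncio sketch. It is not how Home Assistant detects slow code and it is not a replacement for py-spy; it only illustrates how a single blocking call delays everything else scheduled on the same loop. All names in it (`measure_loop_lag`, `main`) are made up for this example.

```python
import asyncio
import time


async def measure_loop_lag(interval: float = 0.05, samples: int = 5) -> float:
    """Sleep repeatedly and report the worst overshoot in seconds.

    If something blocks the event loop, this coroutine wakes up late,
    and the overshoot shows roughly how long the loop was stalled.
    """
    worst = 0.0
    for _ in range(samples):
        start = time.monotonic()
        await asyncio.sleep(interval)
        worst = max(worst, time.monotonic() - start - interval)
    return worst


async def main() -> float:
    monitor = asyncio.create_task(measure_loop_lag())
    await asyncio.sleep(0)   # yield once so the monitor starts its first sleep
    time.sleep(0.3)          # deliberately blocking call: stalls the whole loop
    return await monitor


if __name__ == "__main__":
    print(f"worst event loop lag: {asyncio.run(main()):.3f} s")
```

Replacing `time.sleep(0.3)` with `await asyncio.sleep(0.3)` lets the monitor wake on time, which is the difference between code that cooperates with the loop and code that blocks it.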
ok, I went in to the settings for these boosters and I can answer some questions:
These APs actually put out some detailed logs! So I downloaded the logs and searched for 1 BT proxy to see its behavior at a given time. This proxy, 40:22:d8:4c:a0:c4, is connected to one of my APs (it does not hop, because this MAC address does not show up on the other APs). This is only a sample of what I see in the Family Room log. Does this say anything? I'm not sure what IF or OWL mean, but notice it says "connected" a lot. Signal for this BT proxy is between -40 and -60 dBm, so it's not a signal strength issue.
|
Looking into my logs more, I really think this is not a WiFi issue. Here's what I found. Here's an example of a disconnect that occurred in HA, as reported in my HA Core logs
this particular ESP32 has a mac address of In my booster AP logs, this particular device only shows up on one of the 4 booster APs in my home (the one that it is nearest to), and shown below is 1 hr before and 1 hr after that EOF event as reported by HA. I've placed asterisks (****) every time an event was logged for this particular BT proxy's MAC address:
If the booster APs log a connect event (I don't know if they do), it didn't happen during the time that HA reported an EOF event. Which may suggest that the device is not disconnecting from WiFi, but disconnecting from HA instead... maybe? |
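The manual comparison described above (AP log entries for one MAC address within an hour either side of an EOF event) could be scripted along these lines. This is only a sketch: the booster log format is not shown in the thread, so it assumes the entries have already been parsed into `(timestamp, mac, message)` tuples, and every timestamp, MAC address and message below is a hypothetical placeholder rather than real log data.

```python
from datetime import datetime, timedelta


def events_near(events, mac, when, window=timedelta(hours=1)):
    """Return (timestamp, message) pairs for `mac` within `window` of `when`."""
    mac = mac.lower()
    return [
        (ts, msg)
        for ts, event_mac, msg in events
        if event_mac.lower() == mac and abs(ts - when) <= window
    ]


# HYPOTHETICAL sample data; replace with entries parsed from the real AP log.
proxy_mac = "aa:bb:cc:dd:ee:ff"
eof_time = datetime(2023, 2, 10, 14, 0)  # when HA reported the EOF (made up)
ap_events = [
    (datetime(2023, 2, 10, 13, 10), "aa:bb:cc:dd:ee:ff", "placeholder event A"),
    (datetime(2023, 2, 10, 15, 30), "aa:bb:cc:dd:ee:ff", "placeholder event B"),
    (datetime(2023, 2, 10, 14, 20), "11:22:33:44:55:66", "other device"),
    (datetime(2023, 2, 10, 14, 59), "AA:BB:CC:DD:EE:FF", "placeholder event C"),
]

for ts, msg in events_near(ap_events, proxy_mac, eof_time):
    print(ts.isoformat(), msg)
```

If no connect or disconnect entries for the proxy fall inside the window, that is consistent with the device staying on WiFi while only the connection to HA drops, though it does not prove it, since it is unknown whether the boosters log every such event.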
I have the same problem with one of my two BT proxies - every few minutes this error appears in my HA log. But the connection is only disconnected to HA, not to the WLAN. |
Thanks @MikeDeltaHH for reporting. At the moment, we're stuck at getting py-spy installed on my system to check for event loop blocking. Py-spy unfortunately still has a bug of some sort that prevents me from running it. Hopefully soon it will be fixed. You may have better luck getting it installed if required. I still have the EOF messages in my logs, hundreds of them over days, but my BT Proxies and BT network seem to be working well. |
esphome/esphome#4924 should reduce the load on the esphome device |
Core needs #94138 as well to be able to use the new functionality. Currently scheduled for 2023.7.x |
Thanks bdraco! I will follow these. |
Closing per discussion in #beta. Considered solved in 2023.7.x. Along with esphome 2023.6.x |
The problem
Hi!
Since updating to 2023.2, I have been receiving these messages in my log file for each of my BT Proxies running on ESPHome 2022.12.8
there is a second message that may be related, but it is not as often as the first above:
What version of Home Assistant Core has the issue?
2023.2.0
What was the last working version of Home Assistant Core?
2023.1.7
What type of installation are you running?
Home Assistant OS
Integration causing the issue
ESPHome
Link to integration documentation on our website
https://www.home-assistant.io/integrations/esphome/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response