"Adapter disconnected, stopping" (Skyconnect + Multiprotocol + EZSP) #21198
To continue the conversation started here @Nerivec : #21140 (comment): I have provided a herdsman debug log but it is very difficult to get it just before a crash which can happen at any time. I have no other It is therefore not certain that the problem is due to the suspension of the USB ports. However, the In any case, the I hope I've given you enough information to help me. |
You are using the Silabs multiprotocol addon, right? If so, your firmware version shouldn't be 7.x.x; multiprotocol RCP firmwares are currently all 4.x.x. https://github.com/NabuCasa/silabs-firmware/tree/main/RCPMultiPAN Also make sure the settings in the Silabs addon match whatever firmware you pick (baudrate). You'll probably want to go with a 460800 one (but older Nabu Casa ones are 115200). |
You're right because I did some manipulations in this direction just before your message and I think there's a clue, but I don't understand it any more: I've just used https://skyconnect.home-assistant.io/firmware-update/ to flash this firmware: Next:
to:
So I put 115200 back into SLM and Z2M and everything starts fine, but in Z2M/settings/About I read "Coordinator revision": 7.3.1.0 build 0. And in this configuration I always get "Adapter disconnected, stopping" error messages (obviously...). How can I see a version 7.3.1.0 in Z2M after flashing a 4.4.0 RCP version? |
Did you make sure to disable anything that could be using the adapter before starting the flashing procedure? That means ZHA (which you shouldn't have at all in your case), the Silabs multiprotocol addon and Z2M. PS: If you are not using the multiprotocol (other radios than Zigbee), I'd recommend you use an NCP firmware, with just Z2M. SLM/Zigbeed adds a layer of complexity and no major benefit if you don't have a huge network of devices. |
I flashed the adapter from my PC, using the website mentioned above. So it wasn't connected to my Khadas with Home Assistant. Yes, I was thinking of going back to the NCP firmware to test, but we agree that if I do that, I'll have to re-pair my 25 devices one by one (bearing in mind that some of my Zigbee relays are in wall switch boxes, in an outdoor electric gate post, etc...) ? :(( Could there be another explanation for the fact that Z2M "sees" a 7.3.1 version when I'm on RCP 4.4.0 (an issue with a file in the /zigbee2mqtt directory of the Z2M addon in HA, or something else...)? |
If the flasher is telling you 4.4.0 after flashing, then you're good. I looked, and apparently Home Assistant reverted the version of the multiprotocol addon SDK back to 4.3.1 (again), which would report 7.3.1 EmberZNet (what you're seeing). That might explain a lot of your troubles. They created "special firmware" too... https://github.com/home-assistant/addons/tree/master/silabs-multiprotocol/rootfs/root On the "re-pair" question, when I tested the addon a few months back, I had to re-pair everything when I switched from NCP to RCP (which also upgraded the EZSP version), but not when I switched back to NCP at the same version level as the RCP (the equivalent, as described in first link...), Z2M kept all my devices talking without issue... But I heard mixed reports from people who also tried, so... 😢 |
Can you try to catch the logs like you did before? There was a lot of "noise" in your previous one because of decryption failures (I think?). Hopefully these will now be gone. PS: The Z2M baudrate won't matter in your current setup, only the SLM one will be used (since Z2M is connecting through socket to SLM). |
Here's an updated file: About the Z2M baudrate, I believe you but if I don't put these parameters in the Z2M add-on configuration, it won't start:
|
I am facing a similar issue, more details here : Koenkk/zigbee-herdsman#910 |
Here is some log that lead to shutdown of zigbee2mqtt : https://github.com/Koenkk/zigbee-herdsman/files/14170047/lixee-2.log |
@Nerivec feel free to ask more logs. |
Ah, finally a possible lead :) |
@merlinpimpim indeed, a tiny lead. Is Lixee the cause or the consequence? That is the question. @Nerivec looking at my logs, we see a kind of pattern, always with the same sequence: first, we have this log line: then, we have then, a few milliseconds later, we have some error log: followed by Hope it helps. |
I can confirm this behaviour with my HA Yellow, Multiprotocol firmware and Z2M 1.35.2; with 1.35.1 it works. No Lixee device in my network. But I would have to get more precise log messages to find out where exactly the error happens; the only error I have for now is "Adapter disconnected" |
@jhbruhn could you share your HW and exact FW revision, and some logs with zigbee-herdsman debugging enabled? by running |
@merlinpimpim Could you also list the router/endpoint devices running in your Zigbee network? I highly suspect that intensive polling of non-Reportable attributes is part of the root cause. |
@Nerivec an even shorter log: Also, among the several test cases, most "Unexpected packet sequence" occurrences seen are related to ACK |
Disabling polling on Linky_TIC / Lixee by setting param tic_command_whitelist to an empty string is a dirty but working workaround that stops the unexpected shutdowns of Z2M.
But by doing so we lose the non-Reportable attribute values, as described here: EDIT: an empty tic_command_whitelist also disables Reportable attributes, so it is not evidence for any root cause. |
At first, data is sent: But Z2M never received this ACK (2). @Koenkk @Nerivec is this expected? Full log here: |
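For anyone else reading these logs: a small helper to classify the ASH control byte of each frame can make the DATA/ACK/NAK sequence easier to follow. This is only a sketch based on my reading of the frame layout in Silicon Labs UG101 (DATA = `0 frmNum reTx ackNum`, ACK = `100 res nRdy ackNum`, NAK = `101 res nRdy ackNum`). The names `AshControl` and `decodeAshControl` are hypothetical and are not zigbee-herdsman APIs, and it assumes the byte has already been un-stuffed.

```typescript
// Sketch only: decodes an ASH control byte following the layout in UG101.
// AshControl and decodeAshControl are hypothetical names, not herdsman code.
interface AshControl {
  type: "DATA" | "ACK" | "NAK" | "RST" | "RSTACK" | "ERROR" | "UNKNOWN";
  frmNum?: number; // DATA only: sequence number of this frame (0..7)
  reTx?: boolean; // DATA only: set when the frame is a retransmission
  nRdy?: boolean; // ACK/NAK only: sender is not ready to receive callbacks
  ackNum?: number; // DATA/ACK/NAK: next frame number the sender expects (0..7)
}

function decodeAshControl(control: number): AshControl {
  const ackNum = control & 0x07;
  const bit3 = (control & 0x08) !== 0;
  if ((control & 0x80) === 0x00) {
    // Top bit clear: DATA frame
    return { type: "DATA", frmNum: (control >> 4) & 0x07, reTx: bit3, ackNum };
  }
  if ((control & 0xe0) === 0x80) {
    return { type: "ACK", nRdy: bit3, ackNum };
  }
  if ((control & 0xe0) === 0xa0) {
    return { type: "NAK", nRdy: bit3, ackNum };
  }
  if (control === 0xc0) return { type: "RST" };
  if (control === 0xc1) return { type: "RSTACK" };
  if (control === 0xc2) return { type: "ERROR" };
  return { type: "UNKNOWN" };
}
```

With this, a control byte of `0x25` reads as a DATA frame with frmNum 2 and ackNum 5, and `0x81` as an ACK with ackNum 1, which helps when checking whether an "ACK (2)" seen in a log ever matches a DATA frame that was actually sent.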
The Coordinator Type is
to be specific a HA Yellow running the most recent (Gecko SDK 4.3.1) Coordinator revision is
Logs are attached here: (Unfortunately it seems like HA truncated the logs at the start) The devices in my network are:
|
Hello, Here are (all) my devices:
If it helps... |
After checking, the common link between my devices and those of @jhbruhn (excluding Lixee) are SonOff SNZB-xxx devices, including SNZB-04s in particular. |
I'm still looking into this, but my list of tasks is growing faster than my two eyes can follow 😅 I've fixed a couple of problems raised from your original log @merlinpimpim, but not particularly linked to LiXee... @jhbruhn By any chance do you have any older logs from 1.35.1 or before? Older versions were doing internal restarts (without reaching Z2M level "Adapter disconnected"), that you might not have noticed, but they could still have been happening. |
Absolutely, I am running 1.35.1 right now and also noticed that devices sometimes do not react in that version. I will try to collect some logs about that behaviour the next time I notice it. But the problems with 1.35.2 occur right after starting Z2M, making it unusable, that does not happen with 1.35.1. Maybe the problem is different than that? |
Then it is likely different, yes, and (without seeing the error I can't say for sure) there's a good chance that it is fixed already. I'm just looking for a couple of other buggers before I create a PR. |
@Nerivec : I think @romarysonrier is on to something because since I did what he said, Z2M hasn't rebooted for 2 and a half hours, which is a record since the 1.35.2 update (Z2M currently reboots between 1 and 8 times an hour for me...). Parameter The issue with this workaround is that no more attributes are exposed. So it's as if the device doesn't exist and/or serves no purpose: However, this shows that the issue is indeed with Lixee for our cases. I'm going to leave it until tomorrow morning to confirm that it didn't restart overnight. |
Glancing at the github for that device, tomorrow (if leaving it like that worked) you might want to try enabling only the values that interest you in that (long) list. |
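To illustrate the idea of enabling only the values of interest: the sketch below shows how a comma-separated whitelist could narrow the set of TIC commands that get polled. It is not the code of the lixee converter. `selectTicCommands` is a hypothetical helper, the command names are only illustrative, and treating `"all"` as "keep everything" and an empty string as "keep nothing" is an assumption based on what was observed in this thread.

```typescript
// Sketch only: hypothetical helper, not the lixee converter's implementation.
// Assumption: "all" keeps every command, an empty string keeps none.
function selectTicCommands(available: string[], whitelist: string): string[] {
  const trimmed = whitelist.trim();
  if (trimmed.toLowerCase() === "all") {
    return available.slice();
  }
  // Normalise the user's list: split on commas, trim, upper-case, drop blanks
  const wanted = trimmed
    .split(",")
    .map((item) => item.trim().toUpperCase())
    .filter((item) => item.length > 0);
  // Keep the original order of the available commands
  return available.filter((cmd) => wanted.indexOf(cmd.toUpperCase()) !== -1);
}

// Illustrative command names only
const exampleCommands = ["BASE", "PAPP", "IINST"];
console.log(selectTicCommands(exampleCommands, "papp, base"));
```

Fewer selected commands means fewer reads per polling cycle and fewer MQTT updates, which is the effect being tested here.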
On my side, I tried yesterday:
And then I ran these test cases:
Thus, I will retry test case B (all attributes belonging to the same cluster = 0x702) for a longer time frame, to gather more logs and clues. But the Lixee device still sends updates for all Reportable attributes from all clusters (haElectricalMeasurement+haMeterIdentification+lXeePrivate+seMetering). @Nerivec: So it means that the abnormal behaviour is also triggered when using only Reportable attribute updates |
no reboot for 3 hours :) |
@Nerivec I found out that the zigbee-herdsman EZSP UART driver doesn't support NAK messages properly. Here is a proposal for handleDATA()
PS: inspired by https://www.silabs.com/documents/public/user-guides/ug101-uart-gateway-protocol-reference.pdf But zigbee-herdsman doesn't support NAK itself: handleACK doesn't do retransmission. I don't know yet if it is really necessary. @merlinpimpim Regarding the root cause: my best assumption is that on my slow CPU (Raspberry Pi Zero), all the MQTT updates required by the huge number of attributes sent or polled by the Lixee device take so long for NodeJS to handle that the EZSP device triggers retransmissions, which the current implementation doesn't handle properly. |
@merlinpimpim what CPU is running NodeJS? Any source of latency between NodeJS and the EZSP device? |
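For readers unfamiliar with the receive-side rules being discussed, here is a simplified sketch of how a receiver could decide between ACK and NAK for incoming DATA frames. It is not the handleDATA() proposal above and not zigbee-herdsman's code. `AshReceiverSketch` is a hypothetical name, and the rules (NAK once for an out-of-sequence frame, re-ACK and discard an out-of-sequence retransmission, stay quiet until the expected frame arrives) are my simplified reading of UG101, so treat them as assumptions. CRC checks, timers and the not-ready flag are left out.

```typescript
// Sketch only: simplified receive-side sequencing, assumptions noted above.
type AshReply = "ACK" | "NAK" | "NONE";

interface AshDecision {
  accepted: boolean; // true when the frame payload should be processed
  reply: AshReply; // what the receiver sends back
  ackNum: number; // next frame number the receiver expects (0..7)
}

class AshReceiverSketch {
  private expected = 0; // next in-sequence frmNum
  private rejecting = false; // true after a NAK, until the expected frame arrives

  onData(frmNum: number, reTx: boolean): AshDecision {
    if (frmNum === this.expected) {
      // In sequence: accept, advance modulo 8, acknowledge
      this.rejecting = false;
      this.expected = (this.expected + 1) & 0x07;
      return { accepted: true, reply: "ACK", ackNum: this.expected };
    }
    if (reTx) {
      // Out-of-sequence retransmission: discard it, repeat the ACK
      return { accepted: false, reply: "ACK", ackNum: this.expected };
    }
    if (!this.rejecting) {
      // First out-of-sequence frame: ask the sender to retransmit
      this.rejecting = true;
      return { accepted: false, reply: "NAK", ackNum: this.expected };
    }
    // Already waiting for a retransmission: discard silently
    return { accepted: false, reply: "NONE", ackNum: this.expected };
  }
}
```

The point of the sketch is that a slow host which misses or mishandles a frame needs both halves to recover: the receiver must ask for the missing frame, and the sender must actually retransmit when it gets that request.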
I'm using HA + Z2M + Skyconnect on a Khadas Vim1S (https://www.khadas.com/vim1s). |
@romarysonrier I'm looking into a more permanent solution to fix these sequence & related issues (that's not the only one). We'll see what I can come up with, needs some automated tests and all... 😉 |
@Nerivec I made this PR proposal https://github.com/Koenkk/zigbee-herdsman/pull/911/files , which is far from perfect, but still improves the situation with NAK messages on the RX side. |
In my case (Z2M running on a single-core Raspberry Pi Zero), I think the root cause is now clear, at least for me:
2./ all those updates trigger a huge amount of MQTT updates (quite a big dataset because of the large attribute count) + log writes (if log_level=info), which generates a peak of CPU usage. So @merlinpimpim if you do this:
You should see some improvement, as you lower the amount of CPU required to handle the Lixee device. For me, it has been running with those settings (+ frontend disabled) for 8 hours without a single Z2M shutdown. |
@romarysonrier : I've made the changes in the Lixee.js file and it seems to work after resetting the Reportable attributes to "All". No crashes for 2 hours 😁but I get this type of message in the logs from time to time:
|
@merlinpimpim "Failed to read zigbee attributes" is only generated by polling / reads from Z2M in lixee.ts. BTW, my setup is still up and running after 21 hours |
Hi guys, Since the 1.35.3-1 update of z2m, it would crash only a few seconds after start. I want to know if switching the Linky from Historic to Standard mode would change anything? (Normally it should just be a call to the power provider) Edit: in the meantime I simply unplugged the Lixee and z2m held for 24h max. But then failed again with the same issue. It is not only the Lixee then? |
@remidebette: switching the Linky from Historic to Standard mode => more attributes => more MQTT updates => more CPU load => more issues |
Not sure I understand the question. I have a Sonoff USB Dongle Plus EFR32MG21 plugged into the Raspberry Pi 3+ that hosts my Home Assistant. I did not configure the dongle firmware to do multipan RCP (more precisely, I had tested it at some point but I then rolled back the firmware to NCP 7.3.1) |
Yes, your dongle does serial over USB communication. Could you list your Zigbee devices?
This setup has been running for 4 days on my side (but I lost a few non-reportable attributes, the ones tagged as read only in https://github.com/fairecasoimeme/Zlinky_TIC/tree/master?tab=readme-ov-file#subscription-optarif-values) |
It's not that simple because to do it I have to connect to the Z2M container and make the change to the lixee.js file. However, I've already had Z2M restart because of Lixee even though I think I've made the change in the current Z2M container... Anyway, by limiting the reportable attributes to the 6 I'm interested in and setting the top parameter to 2, I haven't had a single reboot in 24 hours. |
It seems that @mildis is the most active contributor to lixee.ts; it would be great to have some help... @Koenkk: it would be great to implement a new feature to lower CPU usage + merge Koenkk/zigbee-herdsman#911:
The main benefits would be to decrease the number of attributes updated by polling (Reportable attributes don't need polling), and then to reduce the number of MQTT updates sent to HA. |
Hi @romarysonrier, |
Just in case, without any changes to the converter or other files, if I delete the Lixee from Z2M 1.35.3 and start again from scratch, I get this error message (4th attempt, but same message every time):
|
Hey @merlinpimpim, sorry I've been a little buried in code lately 😉 Did you try to force pairing to a specific device (instead of just "all" when you "permit join")? Try the coordinator first, if it's in range, otherwise the nearby routers one by one. You may have a router that's picking it up and making troubles... Since it's a timeout, seems a message is getting lost somewhere (or not being sent to the right place...). @romarysonrier In case you have a test environment where you could put the LiXee up against ember, that'd be great (though don't go breaking your house 😄). This device clearly is a big spammer, which is one of the items missing from my "TODO tests" list! |
@Nerivec: |
This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 30 days |
What happened?
I regularly get this type of message with my adapter: "Adapter disconnected, stopping".
In this case, Z2M restarts.
(Continuation of the conversation started here: #21140 (comment))
What did you expect to happen?
Correct the problem by finding out why it happens
How to reproduce it (minimal and precise)
Nothing, just wait and read the logs
Zigbee2MQTT version
1.35.2
Adapter firmware version
7.3.1.0 build 0
Adapter
Skyconnect (Multiprotocol)
Setup
HA supervised, Khadas Vim1S (Docker)
Debug log
logs.txt