zigbee2mqtt crash after a while running using tcp slzb-06m, sometimes even with restart it does not start. #23151
Comments
I am using an SLZB-06, and mine is also unstable.
Add
SLZB-06M with ember and rtscts set to false: same here. If I switch back to ezsp, it works like a charm.
I have two SLZB-06M adapters, each connected to a separate LXC container running Z2M. The first works every time, after every reboot; the second crashes every time I restart the container. It randomly connects after some time, or not at all. FW: 7.4.3.
@maciekdnd Can you confirm the behavior still occurs with the latest dev? Since you have both scenarios, it would be great if you could share clean startup logs for both, and one from when it crashes (both the Z2M logs and the logs from the SMLight ESP interface).
@Nerivec I have two instances with the SLZB-06M. I can do some testing tomorrow. Are you referring to version 1.39.2?
@mirkochip88 Latest dev branch. I was mostly interested in the fact that maciekdnd had one that worked and one that didn't, with seemingly the same setup. Do you see that same behavior?
Yes, I have two instances with exactly the same setup, but unfortunately I'm unable to update to dev, as this is a production environment 460 km from my place at the moment. In a couple of days I will visit that site, and maybe I'll try to do something about this.
I have two instances, both with an SLZB-06M over Ethernet. The main one, at home, I have never migrated to ember; it runs with the official component. The secondary one is in the garage, so it has few devices and re-pairing should not be too much trouble if necessary; it runs on docker in a VM lxc on proxmox. New firmware versions have also come out since the last time I tried ember. Tomorrow I will migrate again and update you.
@mirkochip88 thanks, that will hopefully help.
@maciekdnd The same fixes currently in dev will be available in the September 1st release, so pretty soon, if that's easier to update in your case. 😉
@Nerivec, that's a bummer. I will be back the day before the 1st, so I have to move that update to my next visit. At least I can update core for my controllers, but great news anyway, thanks!
Here we go again:
I've enabled the watchdog for now, and it seems to restart correctly (the error seems to occur exactly every hour, which is very strange). Any suggestions?
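One way to check the "exactly every hour" impression is to pull the timestamps of the error lines out of the Z2M log and look at the gaps between them. The snippet below is only a rough sketch: it assumes the log line format seen in the issue description (`[YYYY-MM-DD HH:MM:SS] level: namespace: message`), and the default `needle` string is an assumption taken from those logs, since the log for this particular crash is not shown here.

```python
import re
from datetime import datetime

# Prefix of a Zigbee2MQTT log line as it appears in this issue, e.g.
# "[2024-06-24 03:37:24] error: zh:ember: !!! NCP FATAL ERROR ..."
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (\w+): (.*)$")


def error_intervals(lines, needle="NCP FATAL ERROR"):
    """Return the seconds between consecutive error lines containing `needle`.

    `needle` is a hypothetical default; change it to whatever message
    marks the crash in your own log.
    """
    stamps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(2) == "error" and needle in match.group(3):
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    # Pair each timestamp with the next one and take the difference.
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    # The path is an assumption; point it at your own log file.
    with open("log.log", encoding="utf-8") as handle:
        print(error_intervals(handle))
```

If the gaps cluster around 3600 seconds, something periodic (a scheduled job, a lease renewal, a timer in the adapter) is a more likely suspect than random radio or network noise.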
Can you set log level to
Can you see what the logs in the ESP interface (SMLight) say around the time of the crash? It's acting like the adapter stopped responding to Z2M (but without failing the TCP connection).
@darkxst Can you take a look at this issue?
Do you have automatic zigbee updates enabled?
No
What is the IP lease time on your DHCP server?
It is set to 10 days, but obviously I have the IP reserved for the ZigBee routers. With EZSP I have no problems in the same network configuration.
EZSP/EMBER only affects zigbee2mqtt; nothing changes in the coordinator's operation, so I think it's an EMBER bug and not the coordinator firmware.
Since the same adapter model works in one setup and not in another (per maciekdnd's feedback), even though both are roughly the same, it seems like we're missing a parameter in the equation.
The only 3 differences are the ZigBee Firmware Router version (I'm not sure the latest version is compatible with ezsp), rtscts set to false, and obviously Ember.
Both controllers were configured with ember right from the beginning. The first thing I did was update the firmware to the latest version (z2m and core) before configuring z2m. I have 3 controllers right now, all with exactly the same settings except for a static IP per controller. All of them are in the same VLAN. All settings are exactly the same for each LXC container, and all of them were configured through the configuration.yaml file before starting z2m. The containers were set up using this script: https://proxmox-helper-scripts.vercel.app/scripts?id=Zigbee2MQTT I remember that at the beginning, when I didn't know how to restart the controller, I waited and blindly changed various things (for example, sudo systemctl start zigbee2mqtt without deleting the file); then sometimes one controller would restart, but the second one would also start automatically, even though I hadn't changed anything there. After that I wasn't able to run them at all until I found a way to remove them. I don't know if it's related, but even when I have permit join set to false, some devices can still join the network: both devices that were removed (force remove) and new ones (like IKEA light bulbs).
Can you check if this is reproducible (does it always fail when there is a See this on how to enable debug logging.
Yes, for me it happens every single time in the environment I mentioned previously. With this file present in the data folder, z2m (1.38) will fail to start. Unfortunately this site is nearly 500 km from my place right now, and I should not restart or update anything there (especially remotely): if something goes wrong, I won't be able to go there to fix it (this is a production environment). I can do it as soon as I visit the place, but not before the end of the month. If something goes wrong in the meantime, or a restart is needed, I can do it then.
After months I finally managed to get to grips with it. The problem was on my Proxmox node, which uses two NICs; one of them carried both the Home Assistant VM and the VM with docker and z2m. After moving the VMs to the other NIC I no longer had any problems. I noticed this because I also had problems on other VMs. It seems as if the NIC stalls, especially under peak CPU load.
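For anyone who suspects a similar NIC or host-side stall, a crude check is to time TCP connects from the Z2M host to the adapter over a while and look for outliers or failures. This is only a sketch under assumptions: the host and port in the usage example are hypothetical placeholders, and it is probably safer to probe a port other than the one Z2M itself is connected to (for example the adapter's web interface, assuming it listens on port 80), so that the probe does not compete with Z2M for the Zigbee socket.

```python
import socket
import time


def connect_latency(host, port, timeout=3.0):
    """Return the TCP connect time in seconds, or None if the attempt failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None


def probe(host, port, attempts=5, pause=1.0):
    """Collect several connect latencies; None entries mark failed attempts."""
    results = []
    for _ in range(attempts):
        results.append(connect_latency(host, port))
        time.sleep(pause)
    return results


if __name__ == "__main__":
    # Hypothetical address and port; replace with your adapter's values.
    for latency in probe("192.168.1.50", 80, attempts=10):
        print("failed" if latency is None else f"{latency * 1000:.1f} ms")
```

Connect times on a healthy LAN are normally a few milliseconds, so repeated multi-second values or failures during CPU peaks would point at the host or network path rather than at Z2M or the adapter firmware.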
@mirkochip88 that's useful info! Would you mind making a PR for this page?
@mirkochip88 I'm trying to understand the core of your problem with two NICs.
Any update on this please?
It looks like a hardware problem with that network card. Now I have moved all the VMs to the other NIC and everything works great.
What happened?
Even with a full reboot, sometimes it does not start correctly. It keeps crashing randomly.
What did you expect to happen?
Works correctly, reboots when needed!
How to reproduce it (minimal and precise)
Using SLZB-06M with darkxst's NCP firmware: https://github.com/darkxst/silabs-firmware-builder/blob/main/firmware_builds/slzb-06m/ncp-uart-hw-v7.4.2.0-slzb-06m-115200.gbl
Zigbee2MQTT version
1.38
Adapter firmware version
7.4.2
Adapter
SMLIGHT SLZB-06M
Setup
Z2M LXC, MQTT LXC, HAOS VM
Debug log
(STARTED USING NPM START)
root@zigbee2mqtt:/opt/zigbee2mqtt/data/log/2024-06-24.03-37-22# cat log.log
[2024-06-24 03:37:22] info: z2m: Logging to console, file (filename: log.log)
[2024-06-24 03:37:22] info: z2m: Starting Zigbee2MQTT version 1.38.0 (commit #f1847301)
[2024-06-24 03:37:22] info: z2m: Starting zigbee-herdsman (0.49.2)
[2024-06-24 03:37:22] info: zh:ember: Using default stack config.
[2024-06-24 03:37:22] info: zh:ember: ======== Ember Adapter Starting ========
[2024-06-24 03:37:22] info: zh:ember:ezsp: ======== EZSP starting ========
[2024-06-24 03:37:22] info: zh:ember:uart:ash: ======== ASH NCP reset ========
[2024-06-24 03:37:22] info: zh:ember:uart:ash: Socket ready
[2024-06-24 03:37:22] info: zh:ember:uart:ash: ======== ASH starting ========
[2024-06-24 03:37:22] error: zh:ember:uart:ash: Received ERROR from NCP while connecting, with code=ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT.
[2024-06-24 03:37:22] error: zh:ember:uart:ash: ASH disconnected | NCP status: ASH_NCP_FATAL_ERROR
[2024-06-24 03:37:22] error: zh:ember:uart:ash: Error while parsing received frame, status=ASH_NCP_FATAL_ERROR.
[2024-06-24 03:37:22] info: zh:ember:uart:ash: ======== ASH NCP reset ========
[2024-06-24 03:37:22] info: zh:ember:uart:ash: ======== ASH starting ========
[2024-06-24 03:37:23] info: zh:ember:uart:ash: ======== ASH connected ========
[2024-06-24 03:37:23] info: zh:ember:uart:ash: ======== ASH started ========
[2024-06-24 03:37:23] info: zh:ember:ezsp: ======== EZSP started ========
[2024-06-24 03:37:24] warning: zh:ember:uart:ash: Frame(s) in progress cancelled in [1ac1020b0a527e]
[2024-06-24 03:37:24] error: zh:ember:uart:ash: Received unexpected reset from NCP, with reason=RESET_SOFTWARE.
[2024-06-24 03:37:24] error: zh:ember:uart:ash: ASH disconnected: ASH_ERROR_NCP_RESET | NCP status: ASH_NCP_FATAL_ERROR
[2024-06-24 03:37:24] error: zh:ember:uart:ash: Error while parsing received frame, status=HOST_FATAL_ERROR.
[2024-06-24 03:37:24] error: zh:ember: !!! NCP FATAL ERROR reason=HOST_FATAL_ERROR. ATTEMPTING RESET... !!!
[2024-06-24 03:37:24] info: zh:ember:queue: Request dispatching stopped; queue=0 priorityQueue=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: ASH COUNTERS since last clear:
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Total frames: RX=2, TX=3
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Cancelled : RX=1, TX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: DATA frames : RX=0, TX=1
[2024-06-24 03:37:24] info: zh:ember:uart:ash: DATA bytes : RX=0, TX=4
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Retry frames: RX=0, TX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: ACK frames : RX=0, TX=1
[2024-06-24 03:37:24] info: zh:ember:uart:ash: NAK frames : RX=0, TX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: nRdy frames : RX=0, TX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: CRC errors : RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Comm errors : RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Length < minimum: RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Length > maximum: RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Bad controls : RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Bad lengths : RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Bad ACK numbers : RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Out of buffers : RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Retry dupes : RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: Out of sequence : RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: ACK timeouts : RX=0
[2024-06-24 03:37:24] info: zh:ember:uart:ash: ======== ASH stopped ========
[2024-06-24 03:37:24] info: zh:ember:ezsp: ======== EZSP stopped ========
[2024-06-24 03:37:24] info: zh:ember: ======== Ember Adapter Stopped ========
[2024-06-24 03:37:25] info: zh:ember: ======== Ember Adapter Starting ========
[2024-06-24 03:37:25] info: zh:ember:ezsp: ======== EZSP starting ========
[2024-06-24 03:37:25] info: zh:ember:uart:ash: ======== ASH NCP reset ========
[2024-06-24 03:37:25] info: zh:ember:uart:ash: Socket ready
[2024-06-24 03:37:25] info: zh:ember:uart:ash: ======== ASH starting ========
[2024-06-24 03:37:26] info: zh:ember:uart:ash: ======== ASH connected ========
[2024-06-24 03:37:26] info: zh:ember:uart:ash: ======== ASH started ========
[2024-06-24 03:37:26] info: zh:ember:ezsp: ======== EZSP started ========
[2024-06-24 03:37:26] warning: zh:ember: [EzspConfigId] Failed to SET "APS_UNICAST_MESSAGE_COUNT" TO "32" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-06-24 03:37:26] info: zh:ember: [STACK STATUS] Network up.
[2024-06-24 03:37:26] info: zh:ember: [INIT TC] NCP network matches config.
[2024-06-24 03:37:26] info: zh:ember: [CONCENTRATOR] Started source route discovery. 1246ms until next broadcast.
[2024-06-24 03:37:26] info: zh:ember:queue: Request dispatching started.
[2024-06-24 03:37:26] info: zh:ember:ezsp: Received network/route error ROUTE_ERROR_MANY_TO_ONE_ROUTE_FAILURE for "52765".
[2024-06-24 03:38:05] error: z2m:mqtt: Not connected to MQTT server!
[2024-06-24 03:38:05] error: z2m:mqtt: Cannot send message: topic: 'zigbee2mqtt/bridge/state', payload: '{"state":"offline"}
[2024-06-24 03:38:05] info: z2m:mqtt: Disconnecting from MQTT server
[2024-06-24 03:38:05] info: z2m: Stopping zigbee-herdsman...
[2024-06-24 03:38:05] info: z2m: Stopped zigbee-herdsman
[2024-06-24 03:38:05] info: z2m: Stopped Zigbee2MQTT
root@zigbee2mqtt:/opt/zigbee2mqtt/data/log/2024-06-24.03-37-22#
(FAILED TO START)
root@zigbee2mqtt:/opt/zigbee2mqtt# npm start
[2024-06-24 03:49:20] info: z2m: Logging to console, file (filename: log.log)
[2024-06-24 03:49:20] info: z2m: Starting Zigbee2MQTT version 1.38.0 (commit #6c7d52a3)
[2024-06-24 03:49:20] info: z2m: Starting zigbee-herdsman (0.49.2)
[2024-06-24 03:49:20] info: zh:ember: Using default stack config.
[2024-06-24 03:49:20] info: zh:ember: ======== Ember Adapter Starting ========
[2024-06-24 03:49:20] info: zh:ember:ezsp: ======== EZSP starting ========
[2024-06-24 03:49:20] info: zh:ember:uart:ash: ======== ASH NCP reset ========
[2024-06-24 03:49:20] info: zh:ember:uart:ash: Socket ready
[2024-06-24 03:49:20] info: zh:ember:uart:ash: ======== ASH starting ========
[2024-06-24 03:49:20] error: zh:ember:uart:ash: Received ERROR from NCP while connecting, with code=ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT.
[2024-06-24 03:49:20] error: zh:ember:uart:ash: ASH disconnected | NCP status: ASH_NCP_FATAL_ERROR
[2024-06-24 03:49:20] error: zh:ember:uart:ash: Error while parsing received frame, status=ASH_NCP_FATAL_ERROR.
[2024-06-24 03:49:20] error: zh:ember:uart:ash: Error while parsing received frame, status=ASH_NCP_FATAL_ERROR.
[2024-06-24 03:49:20] error: zh:ember:uart:ash: Error while parsing received frame, status=ASH_NCP_FATAL_ERROR.
[2024-06-24 03:49:20] error: zh:ember:uart:ash: Error while parsing received frame, status=ASH_NCP_FATAL_ERROR.
[2024-06-24 03:49:20] info: zh:ember:uart:ash: ======== ASH NCP reset ========
[2024-06-24 03:49:20] info: zh:ember:uart:ash: ======== ASH starting ========
[2024-06-24 03:49:21] info: zh:ember:uart:ash: ======== ASH connected ========
[2024-06-24 03:49:21] info: zh:ember:uart:ash: ======== ASH started ========
[2024-06-24 03:49:21] info: zh:ember:ezsp: ======== EZSP started ========
[2024-06-24 03:49:22] warning: zh:ember:uart:ash: Frame(s) in progress cancelled in [1ac1020b0a527e]
[2024-06-24 03:49:22] error: zh:ember:uart:ash: Received unexpected reset from NCP, with reason=RESET_SOFTWARE.
[2024-06-24 03:49:22] error: zh:ember:uart:ash: ASH disconnected: ASH_ERROR_NCP_RESET | NCP status: ASH_NCP_FATAL_ERROR
[2024-06-24 03:49:22] error: zh:ember:uart:ash: Error while parsing received frame, status=HOST_FATAL_ERROR.
[2024-06-24 03:49:22] error: zh:ember: !!! NCP FATAL ERROR reason=HOST_FATAL_ERROR. ATTEMPTING RESET... !!!
[2024-06-24 03:49:22] info: zh:ember:queue: Request dispatching stopped; queue=0 priorityQueue=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: ASH COUNTERS since last clear:
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Total frames: RX=2, TX=3
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Cancelled : RX=1, TX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: DATA frames : RX=0, TX=1
[2024-06-24 03:49:22] info: zh:ember:uart:ash: DATA bytes : RX=0, TX=4
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Retry frames: RX=0, TX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: ACK frames : RX=0, TX=1
[2024-06-24 03:49:22] info: zh:ember:uart:ash: NAK frames : RX=0, TX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: nRdy frames : RX=0, TX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: CRC errors : RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Comm errors : RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Length < minimum: RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Length > maximum: RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Bad controls : RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Bad lengths : RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Bad ACK numbers : RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Out of buffers : RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Retry dupes : RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: Out of sequence : RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: ACK timeouts : RX=0
[2024-06-24 03:49:22] info: zh:ember:uart:ash: ======== ASH stopped ========
[2024-06-24 03:49:22] info: zh:ember:ezsp: ======== EZSP stopped ========
[2024-06-24 03:49:22] info: zh:ember: ======== Ember Adapter Stopped ========
[2024-06-24 03:49:23] info: zh:ember: ======== Ember Adapter Starting ========
[2024-06-24 03:49:23] info: zh:ember:ezsp: ======== EZSP starting ========
[2024-06-24 03:49:23] info: zh:ember:uart:ash: ======== ASH NCP reset ========
[2024-06-24 03:49:23] info: zh:ember:uart:ash: Socket ready
[2024-06-24 03:49:23] info: zh:ember:uart:ash: ======== ASH starting ========
[2024-06-24 03:49:24] info: zh:ember:uart:ash: ======== ASH connected ========
[2024-06-24 03:49:24] info: zh:ember:uart:ash: ======== ASH started ========
[2024-06-24 03:49:24] info: zh:ember:ezsp: ======== EZSP started ========
[2024-06-24 03:49:24] warning: zh:ember: [EzspConfigId] Failed to SET "APS_UNICAST_MESSAGE_COUNT" TO "32" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-06-24 03:49:24] info: zh:ember: [STACK STATUS] Network up.
[2024-06-24 03:49:24] info: zh:ember: [INIT TC] NCP network matches config.
[2024-06-24 03:49:24] info: zh:ember: [CONCENTRATOR] Started source route discovery. 1248ms until next broadcast.
[2024-06-24 03:49:24] info: zh:ember:queue: Request dispatching started.
[2024-06-24 03:49:29] info: zh:ember:ezsp: <=== [ZDO clusterId=32824 sender=52815] Support not implemented upstream.
(FAILED TO START)
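Logs like the two above are easier to compare when reduced to a short summary. The following is a rough sketch, not an official tool: it counts how many times the Ember adapter was (re)started and tallies the `code=`, `reason=` and `status=` values found on error and warning lines. It assumes only the line format shown in these logs.

```python
import re
from collections import Counter

# Captures values such as code=ERROR_EXCEEDED_MAXIMUM_ACK_TIMEOUT_COUNT,
# reason=RESET_SOFTWARE or status=ASH_NCP_FATAL_ERROR.
KV_RE = re.compile(r"\b(code|reason|status)=([A-Z_]+)")


def summarize(lines):
    """Return (adapter_start_count, Counter of error/warning codes)."""
    starts = 0
    codes = Counter()
    for line in lines:
        if "Ember Adapter Starting" in line:
            starts += 1
        # Only error and warning lines are tallied; info lines are skipped.
        if "] error:" in line or "] warning:" in line:
            for _key, value in KV_RE.findall(line):
                codes[value] += 1
    return starts, codes


if __name__ == "__main__":
    # The path is an assumption; point it at your own log file.
    with open("log.log", encoding="utf-8") as handle:
        starts, codes = summarize(handle)
    print(f"adapter starts: {starts}")
    for code, count in codes.most_common():
        print(f"{count:4d}  {code}")
```

More than one adapter start within a single run means the driver went through its own reset cycle, as it does in both logs above.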