Upgrading to 2023.11.0 Breaks Publishing #760
Comments
I'll need more info on what "breaks publishing" means. I'm running 2023.11.0 and my gateway continues to publish. |
@bachya Not sure what info I can provide. HA marked the MQTT device as unavailable. I'm assuming all MQTT communication stopped (based on the log showing that MQTT calls were starting to queue up). My MQTT broker, Emqx (running in high availability mode), didn't note any issues. |
Okay—I'll dig in and see what I can find. |
Yeah, sorry, I'm notoriously bad at debugging MQTT communication. Looking at the broker though, I do see ecowitt2mqtt connected to a broker node - just nothing appears to be published. The client stays connected though. I double checked HA, no new devices appear (was looking to see if ids changed or something). So the only thing that I can see out of whack is that warning |
No worries! 👍🏻 Are you using the Docker image? |
Yep!
|
Having trouble reproducing this locally. I spun up a new instance of the 2023.11.0 image and published a 40-sensor payload 10 times in rapid succession and never saw this warning. I can see where this arises in ...but nothing about that has changed from 2023.08.0 to 2023.11.0. |
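For anyone who wants to try a similar burst test, here is a minimal sketch of posting a synthetic 40-sensor payload 10 times in a row. It is not the script used above. The URL, the `PASSKEY` value, the `stationtype` value, and the `temp{n}f` field names are all assumptions made for illustration; adjust them to match your own port, endpoint, and gateway payload.

```python
import urllib.parse
import urllib.request


def build_payload(sensor_count: int = 40) -> dict[str, str]:
    """Build a synthetic form payload with `sensor_count` made-up sensor fields."""
    payload = {
        "PASSKEY": "0" * 32,  # placeholder, not a real gateway key
        "stationtype": "TEST",  # hypothetical value
        "dateutc": "2023-11-06 17:09:21",
    }
    for index in range(1, sensor_count + 1):
        # Hypothetical field names; a real gateway sends its own set of keys.
        payload[f"temp{index}f"] = f"{60 + index * 0.1:.1f}"
    return payload


def post_payload(url: str, payload: dict[str, str]) -> int:
    """POST the payload as form data and return the HTTP status code."""
    data = urllib.parse.urlencode(payload).encode()
    request = urllib.request.Request(url, data=data, method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def main(url: str = "http://localhost:8080/data/report", repeats: int = 10) -> None:
    """Send the same payload `repeats` times in rapid succession."""
    for attempt in range(repeats):
        print(attempt, post_payload(url, build_payload()))
```

Calling `main()` against a running instance with verbose logging on should show whether the pending-publish warning appears under load.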
Silvenga might not be the only one. I experienced the same (or something similar) after updating to 2023.11.0 via a pip upgrade. I use Mosquitto as my MQTT broker and am also not experienced with MQTT communication. Downgrading to 2023.08.0 restored normal behavior. I use Domoticz. Since I quickly went back to the old version, I can't provide logs with more detail right now; all I know is that the sensors in Domoticz stopped updating after the upgrade. |
I upgraded to 2023.11.0 and have the same issue: after a while sensors go to unavailable state and never recover. I'll dig deeper, but reverting to previous version for now. I think it's the changes to MQTT Discovery that had a negative impact. Will provide more info... |
First thing I noticed: sensors in autodiscovery topic are missing the
I restarted the container and finally I see the config topic:
And sensors in HA are finally available again:
After startup, I have these in the log:
2023-11-06 17:09:21,502 | DEBUG | Sending PUBLISH (d0, q0, r0, m479), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/pm25batt1/attributes'', ... (2 bytes)
2023-11-06 17:09:21,503 | WARNING | There are 198 pending publish calls.
2023-11-06 17:09:21,503 | DEBUG | Sending PUBLISH (d0, q0, r0, m480), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/pm25batt1/state'', ... (5 bytes)
2023-11-06 17:09:21,503 | WARNING | There are 199 pending publish calls.
2023-11-06 17:09:21,503 | DEBUG | Sending PUBLISH (d0, q0, r0, m481), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/wh57batt/availability'', ... (6 bytes)
2023-11-06 17:09:21,503 | WARNING | There are 200 pending publish calls.
2023-11-06 17:09:21,503 | DEBUG | Sending PUBLISH (d0, q0, r0, m482), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/wh57batt/attributes'', ... (2 bytes)
2023-11-06 17:09:21,503 | WARNING | There are 201 pending publish calls.
2023-11-06 17:09:21,504 | DEBUG | Sending PUBLISH (d0, q0, r0, m483), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/wh57batt/state'', ... (4 bytes)
2023-11-06 17:09:21,504 | WARNING | There are 202 pending publish calls.
2023-11-06 17:09:21,504 | DEBUG | Sending PUBLISH (d0, q0, r0, m484), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/co2_batt/availability'', ... (6 bytes)
2023-11-06 17:09:21,504 | WARNING | There are 203 pending publish calls.
2023-11-06 17:09:21,504 | DEBUG | Sending PUBLISH (d0, q0, r0, m485), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/co2_batt/attributes'', ... (2 bytes)
2023-11-06 17:09:21,504 | WARNING | There are 204 pending publish calls.
2023-11-06 17:09:21,505 | DEBUG | Sending PUBLISH (d0, q0, r0, m486), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/co2_batt/state'', ... (5 bytes)
2023-11-06 17:09:21,505 | WARNING | There are 205 pending publish calls.
2023-11-06 17:09:21,505 | DEBUG | Sending PUBLISH (d0, q0, r0, m487), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/wh90batt/availability'', ... (6 bytes)
2023-11-06 17:09:21,505 | WARNING | There are 206 pending publish calls.
2023-11-06 17:09:21,505 | DEBUG | Sending PUBLISH (d0, q0, r0, m488), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/wh90batt/attributes'', ... (2 bytes)
2023-11-06 17:09:21,505 | WARNING | There are 207 pending publish calls.
2023-11-06 17:09:21,505 | DEBUG | Sending PUBLISH (d0, q0, r0, m489), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/wh90batt/state'', ... (4 bytes)
2023-11-06 17:09:21,506 | WARNING | There are 208 pending publish calls.
2023-11-06 17:09:21,506 | DEBUG | Sending PUBLISH (d0, q0, r0, m490), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/interval/availability'', ... (6 bytes)
2023-11-06 17:09:21,506 | WARNING | There are 209 pending publish calls.
2023-11-06 17:09:21,506 | DEBUG | Sending PUBLISH (d0, q0, r0, m491), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/interval/attributes'', ... (2 bytes)
2023-11-06 17:09:21,506 | WARNING | There are 210 pending publish calls.
2023-11-06 17:09:21,506 | DEBUG | Sending PUBLISH (d0, q0, r0, m492), 'b'homeassistant/sensor/C9E3FFE68DC4E12DEB7C96994872057B/interval/state'', ... (4 bytes)
2023-11-06 17:09:21,507 | WARNING | There are 211 pending publish calls.
2023-11-06 17:09:21,511 | INFO | Published to ecowitt2mqtt/GW2000A
2023-11-06 17:09:21,511 | DEBUG | Published data: {'runtime': 144213.0, 'tempin': 22.5, 'humidityin': 58.0, 'baromrel': 1016.49, 'baromabs': 1004.3, 'temp': 17.3, 'humidity': 73.0, 'winddir': 355.0, 'windspeed': 2.53, 'windgust': 3.6, 'maxdailygust': 25.93, 'solarradiation': 0.55, 'uv': 0.0, 'rrain_piezo': 0.0, 'erain_piezo': 0.0, 'hrain_piezo': 0.0, 'drain_piezo': 0.79, 'wrain_piezo': 0.79, 'mrain_piezo': 67.21, 'yrain_piezo': 1283.41, 'ws90cap_volt': 5.3, 'ws90_ver':115.0, 'temp1': 22.3, 'humidity1': 62.0, 'temp2': 22.4, 'humidity2': 62.0, 'temp3': 21.9, 'humidity3': 64.0, 'temp4': 23.0, 'humidity4': 60.0, 'temp5': 22.3, 'humidity5': 65.0, 'temp6': 22.6, 'humidity6': 61.0, 'temp7': 19.3, 'humidity7': 68.0, 'temp8': 18.8, 'humidity8': 73.0, 'soilmoisture1': 63.0, 'soilad1': 275.0, 'soilmoisture3': 62.0, 'soilad3': 273.0, 'pm25_ch1': 8.0, 'pm25_avg_24h_ch1': 12.5, 'tf_co2': 18.4, 'humi_co2': 66.0, 'pm25_co2': 4.2, 'pm25_24h_co2': 5.3, 'pm10_co2': 4.9, 'pm10_24h_co2': 7.1, 'co2': 460.0, 'co2_24h': 435.0, 'lightning_num': 0.0, 'lightning': 1.0, 'lightning_time': datetime.datetime(2023, 11, 1, 22, 30, 18, tzinfo=datetime.timezone.utc), 'batt1': <BooleanBatteryState.OFF: 'OFF'>, 'batt2': <BooleanBatteryState.OFF: 'OFF'>, 'batt3': <BooleanBatteryState.OFF: 'OFF'>, 'batt4': <BooleanBatteryState.OFF: 'OFF'>, 'batt5': <BooleanBatteryState.OFF: 'OFF'>, 'batt6': <BooleanBatteryState.OFF: 'OFF'>, 'batt7': <BooleanBatteryState.OFF: 'OFF'>, 'batt8': <BooleanBatteryState.OFF: 'OFF'>, 'soilbatt1': 1.4, 'soilbatt3': 1.3, 'pm25batt1': 100.0, 'wh57batt': 40.0, 'co2_batt': 120.0, 'wh90batt': 3.06, 'interval': 60.0}
2023-11-06 17:09:21,512 | INFO | Published to Home Assistant MQTT Discovery |
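To see how the "pending publish calls" warning develops over a longer log, a small helper like the one below can summarize it. This is a sketch that assumes the log format shown above (timestamp, level, and message separated by ` | `); the function name `summarize_pending` is made up for illustration.

```python
import re

# Matches lines like:
# 2023-11-06 17:09:21,503 | WARNING | There are 198 pending publish calls.
PENDING_RE = re.compile(
    r"^(?P<ts>\S+ \S+) \| WARNING \| There are (?P<count>\d+) pending publish calls\."
)


def summarize_pending(lines):
    """Return counts of pending-publish warnings, or None if there are none."""
    counts = []
    for line in lines:
        match = PENDING_RE.match(line)
        if match:
            counts.append((match["ts"], int(match["count"])))
    if not counts:
        return None
    return {
        "warnings": len(counts),
        "first": counts[0],
        "peak": max(counts, key=lambda item: item[1]),
        # True when every warning reports a higher count than the previous one.
        "growing": all(b[1] > a[1] for a, b in zip(counts, counts[1:])),
    }
```

Passing it the lines of a saved log file (for example `summarize_pending(open("ecowitt2mqtt.log"))`) shows whether the queue keeps growing or drains between payloads.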
@alexdelprete Curious. To make sure I understand:
Is that correct? |
Yesterday I upgraded, was doing maintenance on HA components/integrations, I restarted HA and after 1h I found out all entities were unavailable. I checked MQTT and found that This morning (15h after) I found that entities were unavailable again, I might have restarted HA because of an upgrade to a component in that time-frame. So I checked the repo for issues and I wrote about my experience. When entities go unavailable, the only way to recover is to restart the container, but I don't know why it happens, I'm sure it's due to the missing Plus, in the log I'm seeing those pending publish calls...I don't remember seeing those before. I was thinking to enable |
@bachya Restarted HA 15m ago for upgrades, and it happened again: entities unavailable and missing config topic in mqtt. |
Thanks, @alexdelprete. That definitely is a bug (and may be the real issue, vs. the |
@bachya Happened again last night; I only discovered it when I came back home. It stopped around 3am. When I checked MQTT, not only were the config topics missing again, but even the "root" topic (the one with the long serial number of the device) was gone. Obviously I'm on 11.1. I guess there's something else that causes this. I was thinking of enabling the retain flag option; what do you suggest? |
Interesting. @Silvenga Are you experiencing this after upgrading, too? |
I enabled retain now. Let's see what happens... |
Sorry @bachya, been away. I just deployed I'll continue to monitor for behavior that @alexdelprete reported. |
@bachya I can confirm the behavior @alexdelprete reported on my cluster (it occurred about 2 hours ago). The topic in MQTT is no longer present and HA considers the device offline. No errors are reported by the container. Restarting the container (same image) restores the MQTT topic. |
Thanks for the confirmation, @Silvenga. |
Retain flag option seems to help here. But I see it as a workaround for this specific issue. |
I've done some digging and have found one scenario in which I can consistently replicate this: when Here's what happens:
So, although |
Okay, I think #777 should fix this. @alexdelprete and @Silvenga, if you are Docker users, I'd appreciate your testing once the images build (you'll be able to pull |
@alexdelprete Appreciate that. I'd recommend letting this new restart run, and as best you can, try to tell when it fails (if it fails again). Ideally, you'd have verbose logging on and we could look at the logs during that time. |
If you check the screenshot of my docker compose file, verbose is on. Does the old log get overwritten when restarting the container or can I check yesterday's events if possible? |
That's up to your Docker configuration: https://sematext.com/blog/docker-logs-location/ |
it's journald. unfortunately it rotated, last event was 5-6h AFTER the issue that caused the unavailability. Meanwhile I'm very tempted to turn retain on again... BTW: sematext is interesting...will check it tomorrow. Thanks. |
@alexdelprete Checking in: any news? |
Yes, bad news: happened 11h ago. But I got the log this time (from journald), it stopped at |
Tagging on to this. Just upgraded to image pr-777 and have ECOWITT2MQTT_DIAGNOSTICS enabled. |
@JvdMaat Let me know whether |
@alexdelprete Can you help me with where in your logs you notice the issue? Looking at
|
I didn't analyze the log, I only extracted it because you asked for it. But now that you asked, I dug into it and a few other things. Let me clarify:
So I checked other possible causes and found that I restarted HA for maintenance at 14:16, and I don't think it's a coincidence. To validate my theory, I checked with MQTT Explorer (which, from the broker's perspective, is an MQTT client just like HA), and it doesn't see the In the image you can see HA sensor available and topics in the broker updated but without My theory: since HA is an MQTT client/subscriber, when it restarts, it won't 'see' the In previous versions of ecowitt2mqtt you were updating But that is not enough, because if an MQTT client/subscriber (like HA or the MQTT Explorer I used just now) connects to the broker, it won't even see the So, my recommendation is to change the code to have the Let me know your feedback. :) |
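The retained-message behavior behind this theory can be illustrated without a real broker. The sketch below is a toy in-memory model, not ecowitt2mqtt or broker code: `ToyBroker` is a hypothetical class, it matches exact topics only (no wildcards), and the topic name is a made-up example. It shows that a subscriber connecting after a non-retained publish sees nothing, while a retained message is replayed to late subscribers.

```python
from collections import defaultdict
from typing import Callable

Callback = Callable[[str, bytes], None]


class ToyBroker:
    """Toy model of MQTT retain semantics (exact-topic matching only)."""

    def __init__(self) -> None:
        self._retained: dict[str, bytes] = {}
        self._subscribers: dict[str, list[Callback]] = defaultdict(list)

    def publish(self, topic: str, payload: bytes, retain: bool = False) -> None:
        if retain:
            if payload:
                self._retained[topic] = payload
            else:
                # An empty retained payload clears the stored message.
                self._retained.pop(topic, None)
        for callback in list(self._subscribers[topic]):
            callback(topic, payload)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)
        # New subscribers immediately receive the retained message, if any.
        if topic in self._retained:
            callback(topic, self._retained[topic])


CONFIG_TOPIC = "homeassistant/sensor/EXAMPLE/interval/config"  # hypothetical

broker = ToyBroker()
broker.publish(CONFIG_TOPIC, b'{"name": "interval"}')  # not retained
after_restart: list[bytes] = []
broker.subscribe(CONFIG_TOPIC, lambda _topic, payload: after_restart.append(payload))
# after_restart is still empty: the "restarted" client never saw the config.
```

Publishing the same config with `retain=True` before the subscriber connects would deliver it on subscribe, which is the effect the retain option relies on.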
Excellent troubleshooting, @alexdelprete! Much appreciated, and agreed: that sounds like the right approach. Glad to see validation from weatherflow2mqtt. I'll update #777 and once a new image builds, would love for you (and any other Docker users monitoring this thread) to test. EDIT: images building. |
sorry for not doing the analysis before, it only took 30m actually to understand the problem, but last time I had other things going on and I simply said there was an issue, but didn't do proper troubleshooting. We could have saved a couple of weeks. :)
so I'll pull again with same tag |
No apologies! We got there in the end; that's all that matters. 👍🏻
That's correct. |
Looks like the images are done building: https://github.com/bachya/ecowitt2mqtt/pkgs/container/ecowitt2mqtt/150063299?tag=pr-777 @alexdelprete, @Silvenga, @JvdMaat, and any other Docker users: appreciate you testing. |
Already pulled the new one 2 minutes ago. Will keep you posted |
Deployed |
Pulled it now. Will test and report back. |
This comment was marked as off-topic.
Good find, @alexdelprete! Birth and Will Messages are definitely the right way to tackle this. What I have now is only half complete by the above reckoning (I'm not subsequently publishing retained state messages, nor do I always want to do that). I'll start tinkering. |
I wouldn't say LWT is the right way, because retained config topics work fine too, numerous integrations use that method, and they work fine. But the LWT way is far more elegant and efficient, because it relies on a tighter integration with HA MQTT integration through the subscription to the HA MQTT LWT topic (default is The only cons of the LWT approach are:
So basically the con of the LWT approach is that it involves the user configuring things, while retain is pretty much transparent. Anyway, once those 2 pre-requisites are satisfied, on startup ecowitt2mqtt should subscribe to HA LWT topic, send the config topic, then when it receives an The only other event that triggers config republishing is when config changes, obviously. Another note: HA recommends that every time you publish config topic, state topic has to be updated too, for consistency obviously. My $0.02 for your tinkering. :) |
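The flow described above (publish on startup, then republish when HA announces itself) could look roughly like the sketch below. It is a simplified model, not the ecowitt2mqtt implementation: `DiscoveryPublisher` is a hypothetical class, the `sent` list stands in for real MQTT publish calls, and the status topic and payload are assumed to be Home Assistant's defaults (`homeassistant/status` and `online`), which users can change.

```python
from dataclasses import dataclass, field

# Assumed Home Assistant defaults; both are configurable in HA.
HA_STATUS_TOPIC = "homeassistant/status"
HA_ONLINE_PAYLOAD = "online"


@dataclass
class DiscoveryPublisher:
    """Republish config and state topics when HA reports that it is online."""

    configs: dict[str, str]  # config topic -> JSON payload
    states: dict[str, str]  # state topic -> current value
    sent: list[tuple[str, str]] = field(default_factory=list)

    def publish_all(self) -> None:
        # Config first, then state, so entities have a value once they exist.
        for topic, payload in self.configs.items():
            self.sent.append((topic, payload))
        for topic, payload in self.states.items():
            self.sent.append((topic, payload))

    def on_message(self, topic: str, payload: str) -> bool:
        """Handle an incoming message; return True if a republish happened."""
        if topic == HA_STATUS_TOPIC and payload == HA_ONLINE_PAYLOAD:
            self.publish_all()
            return True
        return False
```

In a real client, `publish_all()` would also run once at startup and again whenever the discovery config changes.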
Good point. We already have a lot of config options, and I'm reluctant to add a more elegant approach only to still need an alternate if users turn off LWT. Out of curiosity, has your current |
Yes, but I didn't yet restart HA. :) I saw the retain flag on config topics, I think you can release it. |
Describe the bug
It appears that after upgrading from 2023.08.0 to 2023.11.0, MQTT publishing stopped working. Downgrading to 2023.08.0 restores previous behavior.
To Reproduce
Environment variables:
Expected behavior
MQTT messages continue to publish.
Additional context
Logs similar to the following can be found:
ecowitt2mqtt.log