MQTT OTA csum error in reboot after "OTA success" #204
Comments
I am experiencing something similar. I can get the OTA to upload to the device, and it apparently succeeds, but the device fails to reboot. If I reset the device, I get only the off-baud garbage string and the device doesn't come up. If I flash the same image via serial, it works. I am building with PlatformIO, if that makes a difference. I have the latest code as of today from the develop branch and the latest versions of the dependencies from their respective repositories. Here's my output, for reference. [...snip...] is where I took out long sequences of similar messages:
|
Oh, I'm using the latest homie-ota python code, and I'm using mosquitto as my mqtt broker, if any of those things matter. |
Just to confirm the issue - I experienced the same, but with some minor differences / additions:
|
Since we responded with The one commonality between #204 and #208 is that in both cases we do reboot. According to esp8266/Arduino#1017, even the latest arduino/esp8266 still has issues with restart after OTA. Could this be our problem here? @clough42, @Gulaschcowboy can you please check if it helps to reset or power cycle your device after serial flashing but before trying OTA update? |
@marvinroger I wonder about the Homie output in between Also, varying reset causes in the logs above (1=normal boot, 4=watchdog, 2=reset pin, 3=software reset). See https://github.com/esp8266/Arduino/blob/master/doc/boards.md#boot-messages-and-modes |
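To make the varying reset causes easier to compare across the logs in this thread, here is a minimal sketch that pulls the `rst cause` value out of each boot line. The mapping contains only the four causes listed in the comment above; the function and constant names are hypothetical and not part of Homie or esp8266/Arduino.

```python
import re

# Reset causes exactly as listed in the comment above.
RESET_CAUSES = {
    1: "normal boot",
    2: "reset pin",
    3: "software reset",
    4: "watchdog reset",
}

# Matches boot lines such as "ets Jan 8 2013,rst cause:2, boot mode:(3,6)".
BOOT_LINE = re.compile(r"rst cause:(\d+), boot mode:\((\d+),(\d+)\)")


def reset_causes(log_text):
    """Return (cause_number, description) for every boot line found in the log."""
    causes = []
    for match in BOOT_LINE.finditer(log_text):
        cause = int(match.group(1))
        causes.append((cause, RESET_CAUSES.get(cause, "unknown")))
    return causes
```

Run over a captured serial log, this gives the sequence of resets at a glance, for example a reset-pin boot followed by a watchdog reset.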
@mrpace2 the serial output after the They both use
So the flashed firmware is definitely wrong. Which, along with #208, makes me think the b64 decoding is broken. |
@clough42 @Gulaschcowboy can you provide your firmwares binaries? |
Can you test again with the latest git rev? |
Hi Marvin, tested right now with latest homie-ota and latest homie rev. Result is similar:
✴ OTA available (version 1.0.2)
ets Jan 8 2013,rst cause:2, boot mode:(3,6)
load 0x4010f000, len 1384, room 16
ets Jan 8 2013,rst cause:4, boot mode:(3,6)
wdt reset |
indicates that
It crashed afterwards:
is a watchdog reset. It seems to have happened before Homie's first log output. Maybe the firmware was published and saved correctly, but the firmware binary itself was broken? Does the same BIN file work when you upload it over the serial port? If you share your BIN and BASE64 files, I could try them out on one of my setups. I don't have an ESP-07, though. Another thing you could try is using MD5 checksums. Basically, you checksum the BIN file and Homie verifies the checksum right before committing the received OTA update to flash. This would help rule out any issues with
|
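For the MD5 suggestion above, a minimal sketch of computing the checksum of a BIN file on the host side. It only shows the hashing step; how the checksum is handed to homie-ota or to the device is not covered here, and the function name is hypothetical.

```python
import hashlib


def firmware_md5(path, chunk_size=8192):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as firmware:
        # Read fixed-size chunks so large images are not loaded at once.
        for chunk in iter(lambda: firmware.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is a 32-character hex string, which can be compared against the checksum the device logs when it announces an available OTA update.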
Hi @mrpace2
Binary attached. Please note, it's actually an uncompressed .bin just renamed to .zip because of GH limitations and my limitation of being "half offline" in a hotel. So please just rename. |
Thanks for sending the files. I will try them out in a bit. I can test on NodeMCU. Homie automatically detects if a firmware blob is binary or base64, based on the first firmware byte, which is always 0xE9. It did detect base64 in your case. Recent |
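The detection rule described above can be sketched on the host side as follows. This is an illustration of the idea only, not Homie's actual implementation: it assumes a raw image starts with the byte 0xE9 and that a base64 blob decodes to data starting with that same byte. The function and constant names are hypothetical.

```python
import base64

# First byte of a firmware image, as stated in the comment above.
ESP_IMAGE_MAGIC = 0xE9


def firmware_encoding(blob):
    """Classify a firmware blob (bytes) as 'binary', 'base64' or 'unknown'."""
    if not blob:
        raise ValueError("empty firmware blob")
    if blob[0] == ESP_IMAGE_MAGIC:
        return "binary"
    try:
        # The first 4 base64 characters decode to the first 3 image bytes.
        head = base64.b64decode(blob[:4], validate=True)
    except ValueError:
        return "unknown"
    if head and head[0] == ESP_IMAGE_MAGIC:
        return "base64"
    return "unknown"
```

Checking a file this way before publishing it can catch the case where the wrong file, or an already mangled one, is about to be sent.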
Here's the last binary I attempted to use: This is a zip file, containing the .bin file. It's compiled for a WeMos D1 mini, which has a 4M flash, with the default PlatformIO configuration. I'm actually using an ESP12F module, but the specs are the same. I am attempting to flash with Homie-OTA. The source code is here: https://github.com/clough42/homie-firelight/blob/master/src/main.cpp |
@clough42, thanks, will test shortly. @Gulaschcowboy, can you also provide mqtt and serial log files and the base64 file you were uploading? |
@clough42 Running the latest Homie code, I uploaded the bin file you provided (a) using my little script (attached) and (b) using the latest homie-ota. I can upload your bin file without problems. The board restarts and launches your firmware. See the attached logs. Please try again:
Please report results. If it still fails, please provide logs. |
@Gulaschcowboy Same thing here. It works fine on my setup. See the attached logs, this time generated using your firmware blob. When you're back home, please try again (same as above). Have a safe trip! |
Hi @mrpace2
Result: Triggering MQTT_PACKET_ACKNOWLEDGED event (packetId 32)... receiving OTA firmware (350880/350880)... Exception (3): ctx: sys
ets Jan 8 2013,rst cause:2, boot mode:(3,6) load 0x4010f000, len 1384, room 16 Triggering NORMAL_MODE event... Second attempt causes same result. Question: What should be the setting for homie-ota.ini: What else should I provide to help? |
Hi @mrpace2, tried both ways that you described in your attached zip file. Both ways (homie-ota and your script) end in: Receiving OTA firmware (350845/350880)... Exception (3): ctx: sys
ets Jan 8 2013,rst cause:2, boot mode:(3,6) load 0x4010f000, len 1384, room 16 After that, the ESP restarts normally (at least this is better than before, as it doesn't produce a csum error anymore). Attached you find a log taken with.. ...using your shell script. |
base64 (=default, if not set):
binary:
I'm afraid I can't help much with your exception. You could try locating the exact code position where it crashes by looking up addresses contained in the exception dump from a map file or a listing generated using xtensa-objdump. I have never done that myself for ESP, but I know that other people have, so it is possible. Or add extra debug logs around BootNormal.cpp:L413-L424 to see how far it gets before it dies. Just guessing... Are you sure your power supply is clean? Maybe there's another board you could try? Or, start over from a simple Homie sketch. Something like:
|
Hi @mrpace2 I guess you misinterpreted the logs. After your last commits it doesn't crash anymore after the firmware is copied to the "hot" flash area; instead it crashes before. So at least the ESP is not rendered unusable anymore. It crashes, resets and starts the old firmware. That was different before today's rev. About OTA in general: OTA worked for me before the base64-related changes in homie-esp8266 about 3 weeks ago. About power supply/ESP type: I have tested with 2 NodeMCUs and an ESP07 that I have with me right now. (yes, I always travel with ESPs :-) ) Also, 2 Sonoff relays at home have had the same problem since the above-mentioned b64 changes. (Couldn't test them with today's rev). The sketch is almost bare minimal; actually it is the example provided by Marvin (Sonoff with button). All ESPs run stable in normal operation. About xtensa-objdump: I don't know if I'm capable of doing it or of interpreting the results... About adding code in BootNormal.cpp: That's also beyond my skills I guess :-( Thanks a lot |
@Gulaschcowboy I did notice that the behavior changed for you and I did notice in your log that your app rebooted with the old md5. The location where I suggested to add debug output is before the reboot, for a reason ;-) How can I duplicate your problem here? As I said, it worked fine when I tried your previous firmware a few hours ago. Do you have a repo I could fork? Would you be open to sharing your latest binary again so I can try that one out? What toolchain are you using? Thanks for your help. |
I think I got over the problem now: a) updated Arduino to 1.6.12 b) loaded ESP via Arduino and USB once c) switched back to OTA_FIRMWARE_BASE64 True (BASE64)
|
@ckrey Thanks for your help. So, still something wrong with base64... I'm just reading up on ESP8266 exceptions. Exception 3 is a Can you please generate a map file matching your firmware and look up addresses
to your LDFLAGS. I don't know how this is done in Arduino. In PlatformIO, you would add
to your |
@mrpace2 Yes, Arduino IDE 1.6.12. The sketch doesn't seem to matter. As said, I even have the problem with Marvin's Sonoff button sketch. |
To decode the stack trace easily:
Please paste the decoded stack trace, not the one with the addresses as we cannot do anything without the binaries. 😉 |
Chiming in, got here from #224, so getting the exception 3 after:
Everything is on a breadboard here, powered by an MB-102, itself powered by a pretty huge 12V power supply (able to deliver 5A @ 12V). I've got a 10µF capacitor across Vcc/GND too. The last change I made was moving from 1.5.0 to the head of the develop branch. I tried again with 1.5.0 and it works fine, so I would agree with @Gulaschcowboy that it probably isn't related to the power supply. Not right now, but I could give a shot at some kind of git bisect on Homie to see if it can give us any pointers as to when it appeared. |
Sorry, I didn't read everything, so my suggestion doesn't make much sense. Still, I'm pretty sure it doesn't come from the power supply :) |
I just tested with the latest homie-ota, and it worked! Note, I got the same error as @bleader when running on Windows 10 with Python 2.7.12:
I tried it again on Linux (Ubuntu 14.04.4 LTS with Python 2.7.6) and it worked on the first try. |
I just discovered that my Windows git repo was a few days out of date. After updating, it's still not working for me. Now it's starting the update, but only sending about 7K instead of the 300K+ of the image. Not sure what's happening, but it is working on Ubuntu, so I'm happy. Clearly this is a homie-ota issue, and not a homie-esp8266 issue, specifically. |
You made me realize my repo wasn't up to date, the last update being to actually really read the OTA_FIRMWARE_BASE64... which may explain why it didn't change a thing when I tried to disable it. So, now that I did update, OTA works with OTA_FIRMWARE_BASE64 disabled, but not if I enable it. Which makes it less likely to be due to homie-ota only. @clough42 Are you sure both your configurations have it disabled? |
@bleader Ahh...good catch. It's explicitly disabled on the Ubuntu machine (which works) and the line in the ini file is commented out on the Windows machine, which doesn't. If it defaults to enabled, then that's the difference. I'll toggle that and test when I get a chance. |
Finally getting around to testing this. Environment: platformio on macOS 10.12.1, Homie 2.0, latest commit 49eca38 and homie-ota (also latest :-)
(Base64-encoded firmware here) When OTA is triggered (B64-enabled), the result looks promising until the stack trace:
If I configure homie-ota to not use base64, I see the following expected messages, and the update (sometimes) works:
|
What happens when it does not work (which is often) without base64? Does it crash? Does it report something? |
By the way, is anybody using NodeMCU v2s experiencing these issues? I only have these, and I don't have any problems with the latest Git, b64 or not. |
Hi, I have that issue with NodeMCU boards (clones) as well as with ESP-07s |
yes, i guess so. Clones from LoLin. |
Well, I don't experience a single issue on official NodeMCU boards, so I guess the problem is that you are using clones. I am not sure if the issue actually comes from the hardware or from esp8266/Arduino, but I am pretty sure it does not come from homie-esp8266. Everything is rock stable on official NodeMCUs... |
I understand your point, but my point is: Same issue on NodeMCU clone as well as on ESP07 modules, always with base64. Would it be helpful to give you access to my environment/modules? |
I understand your point too, but I would need to test on real hardware. I'll try to buy some. |
I'm in the process of trying OTA on D1 Mini clones of this kind https://www.aliexpress.com/item/D1-mini-Mini-NodeMcu-4M-bytes-Lua-WIFI-Internet-of-Things-development-board-based-ESP8266-by/32635160765.html As this is the first time ever I'm doing such a thing, don't expect a quick reply... :)
|
Your firmware file is bad, it just cannot be 4k in size, it is usually
around 300k.
On 18 Feb 2017 at 1:01 AM, "Amayii" <notifications@github.com> wrote:
I'm in the process of trying OTA on D1 Mini clones of this kind
https://www.aliexpress.com/item/D1-mini-Mini-NodeMcu-4M-bytes-Lua-WIFI-Internet-of-Things-development-board-based-ESP8266-by/32635160765.html
I'm using this Python "server" https://github.com/jpmens/homie-ota
My code can be accessed from https://github.com/amayii0/Homie-DS18B20
As this is the first time ever i'm doing such thing, don't expect a quick
reply... :)
(spoiler : it currently just crashes after OTA / reboot)
Triggering MQTT_PACKET_ACKNOWLEDGED event (packetId 27)...
Temperature: 19.00 °C
Triggering MQTT_PACKET_ACKNOWLEDGED event (packetId 28)...
✴ OTA available (checksum 1c47b3d8886fb2730d7e53b24acdd43c)
Triggering MQTT_PACKET_ACKNOWLEDGED event (packetId 29)...
↕ OTA started
Triggering OTA_STARTED event...
Firmware is base64-encoded
Receiving OTA firmware (1056/4167)...
Receiving OTA firmware (2151/4167)...
Receiving OTA firmware (3246/4167)...
Receiving OTA firmware (4166/4166)...
✔ OTA succeeded
Triggering OTA_SUCCESSFUL event...
Triggering MQTT_PACKET_ACKNOWLEDGED event (packetId 30)...
Device is idle
↻ Rebooting...
Triggering MQTT_PACKET_ACKNOWLEDGED event (packetId 31)...
✖ MQTT disconnected
Triggering MQTT_DISCONNECTED event...
↕ Attempting to connect to MQTT...
✖ Wi-Fi disconnected
Triggering WIFI_DISCONNECTED event...
↕ Attempting to connect to Wi-Fi...
ets Jan 8 2013,rst cause:2, boot mode:(3,6)
load 0x4010f000, len 1384, room 16
tail 8
chksum 0x2d
csum 0x2d
v3ffe84d0
@cp:0
ld
[...snip: off-baud garbage output...]
|
Does it happen all the time? Can you post a |
@amayii0 what mqtt server are you using? I believe I ran into similar issue when max payload sized on my mqtt was limited. I had more success with iot.eclipse.org |
@marvinroger I'll check that later. Maybe also switch to mosquitto for debugging |
@marvinroger it seems the payload gets limited to 5,556 bytes. I'm afraid this is related to the broker. pi@garagepi ~ $ mosquitto_sub -h 192.168.0.26 -p 1883 -t homie/5ccf7fd3b7d5/# I've dumped that to a file using MQTTFX and it's the same story. homie-ota is running from a W7 host; it looks like I can run it from the rpi instead. |
@amayii0 that's indeed related to the broker... We cannot do anything on our side, unfortunately. |
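A quick way to spot this kind of broker-side truncation is to compare the size of the payload that actually arrives with the size you expect from the BIN file. The sketch below assumes standard padded base64, where every 3 input bytes become 4 output characters; the function names are hypothetical and this is not part of homie-ota.

```python
def base64_length(binary_size):
    """Length of the padded base64 encoding of `binary_size` bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((binary_size + 2) // 3)


def payload_looks_truncated(firmware_size, received_size, as_base64=True):
    """Return True if fewer bytes arrived than the firmware should produce."""
    expected = base64_length(firmware_size) if as_base64 else firmware_size
    return received_size < expected
```

For a firmware of roughly 300 KB, a received payload of only a few kilobytes fails this check immediately, which points at the transport (for example a broker payload limit) rather than at the device.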
Nice. I recommend PlatformIO with VS Code; this will fix your lib dependency issues in the future. But it does have its annoyances .... |
Hi,
with latest commits of homie-esp8266 I get the following result when updating my ESP8266-07s using homie-ota:
✴ OTA available (version 1.0.11)
OTA started
Triggering OTA_STARTED event...
Firmware is base64-encoded
Receiving OTA firmware (353600/353600)...
✔ OTA success
Triggering OTA_SUCCESSFUL event...
Triggering MQTT_PACKET_ACKNOWLEDGED event (packetId 51)...
Device is idle
↻ Rebooting...
Triggering MQTT_PACKET_ACKNOWLEDGED event (packetId 375)...
✖ MQTT disconnected
Triggering MQTT_DISCONNECTED event...
↕ Attempting to connect to MQTT...
✖ Wi-Fi disconnected
Triggering WIFI_DISCONNECTED event...
↕ Attempting to connect to Wi-Fi...
ets Jan 8 2013,rst cause:1, boot mode:(3,6)
load 0x4010f000, len 1384, room 16
tail 8
chksum 0x2d
csum 0x2d
v3de0c112
@cp:0
ld
ets Jan 8 2013,rst cause:4, boot mode:(3,6)
wdt reset
load 0x4010f000, len 1384, room 16
tail 8
chksum 0x1d
csum 0x1d
csum err
Flashing same binary using serial works.
Any ideas?
Thanks,
Alex