OTA update fails with XMC Flash chip #7267

Closed
6 tasks done
Jason2866 opened this issue May 4, 2020 · 18 comments
@Jason2866
Contributor

Jason2866 commented May 4, 2020

OTA update fails (using gzipped Tasmota firmware) when an XMC flash chip is on the ESP board.
Module Model: ESP-12F Vendor: DOITING
Flash Chip Id 0x164020
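
For readers unfamiliar with the ID format, here is a minimal sketch of how the chip ID above can be split into its fields. It assumes the usual layout of the value reported by the ESP8266 core (lowest byte is the manufacturer, highest byte is a capacity code with size = 2**code) and that 0x20 is the XMC manufacturer byte; `decode_flash_id` is a hypothetical helper, not part of any library.

```python
# Assumption: 0x20 is the manufacturer byte that identifies XMC.
XMC_VENDOR_ID = 0x20


def decode_flash_id(chip_id: int) -> dict:
    """Split a 24-bit flash chip ID into vendor, device type and size."""
    vendor = chip_id & 0xFF                 # lowest byte: manufacturer
    device_type = (chip_id >> 8) & 0xFF     # middle byte: device type
    capacity_code = (chip_id >> 16) & 0xFF  # highest byte: capacity code
    return {
        "vendor": vendor,
        "device_type": device_type,
        "size_bytes": 1 << capacity_code,   # assumed convention: 2**code bytes
        "is_xmc": vendor == XMC_VENDOR_ID,
    }


info = decode_flash_id(0x164020)
print(info)
```

Under these assumptions the ID decodes to an XMC part of 4 MiB, which agrees with the 4MB flash size listed in the settings.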

To get it working again, the firmware has to be flashed over serial.

 ets Jan  8 2013,rst cause:2, boot mode:(3,6)

load 0x4010f000, len 3656, room 16
tail 8
chksum 0x0c
csum 0x0c
v9c56ed1f

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it.
  • I have filled out all fields below.

Platform

  • Hardware: [ESP-12F]
  • Core Version: [2.7.0]
  • Development Env: [Platformio]
  • Operating System: [Windows|Ubuntu]

Settings in IDE

  • Module: [Generic ESP8266]
  • Flash Mode: [DOUT]
  • Flash Size: [4MB]
  • lwip Variant: [LWIP2_HIGHER_BANDWIDTH_LOW_FLASH]
  • Reset Method: [nodemcu]
  • Flash Frequency: [40Mhz]
  • CPU Frequency: [80Mhz]
  • Upload Using: [OTA]
  • Upload Speed: [115200] (serial upload only)

Problem Description

Flashing via serial does work and device works as expected.
Trying an OTA update results in this:

Debug Messages

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x00000000, len 201326592, room 16
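
A quick way to see that this header is not a valid image is to look at the reported length in hex and compare it with the flash size. This is only a sketch of the arithmetic; the variable names are made up for illustration.

```python
FLASH_SIZE_BYTES = 4 * 1024 * 1024  # 4MB flash, as listed in the settings above

healthy_len = 3656        # "len" from the working boot log
failing_len = 201326592   # "len" from the log after the failed OTA

print(hex(healthy_len), hex(failing_len))
print("failing len in MiB:", failing_len // (1024 * 1024))
print("fits in flash:", failing_len <= FLASH_SIZE_BYTES)
```

The failing length is 0x0c000000, far larger than the whole flash chip, and the load address is 0x00000000 instead of 0x4010f000, so the ROM loader is apparently not reading a sane image header after the update.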


@Jason2866
Contributor Author

Did some tests: #6725 causes the issue. Using an eboot.elf version from before this commit solves the problem.

@d-a-v
Collaborator

d-a-v commented May 4, 2020

Thanks @Jason2866 for the bisect!
@ChocolateFrogsNuts Does it ring a bell?

@crashcoq

crashcoq commented May 4, 2020

Same problem for me, on OTA only.

Start OK:

21:38:58.882 -> ets Jan 8 2013,rst cause:2, boot mode:(3,6)
21:38:58.882 ->
21:38:58.882 -> load 0x4010f000, len 3656, room 16
21:38:58.882 -> tail 8
21:38:58.882 -> chksum 0x0c
21:38:58.882 -> csum 0x0c
21:38:58.882 -> v9c56ed1f
21:38:58.882 -> ~ld

Doesn't start after update:

21:33:21.235 -> ets Jan 8 2013,rst cause:2, boot mode:(3,6)
21:33:21.235 ->
21:33:21.235 -> load 0x4010f000, len 3656, room 16
21:33:21.235 -> tail 8
21:33:21.235 -> chksum 0x0c
21:33:21.235 -> csum 0x0c
21:33:21.235 -> v9c56ed1f
21:33:21.235 -> @cp:0
21:33:25.407 -> ld
21:33:25.407 -> e:
21:33:25.407 -> ets Jan 8 2013,rst cause:3, boot mode:(3,6)
21:33:25.407 ->
21:33:25.407 -> ets_main.c n l ���o{⸮⸮⸮g���⸮��s⸮⸮o �⸮�⸮�쒌⸮⸮��⸮⸮�d⸮⸮o⸮�2

On 2.7.0 I can upgrade the firmware only over serial.

On version 2.6.3 everything is OK, both serial and OTA.

@ChocolateFrogsNuts
Contributor

Hmm, OK, testing of OTA updates was, shall we say, "severely limited".
The easiest way to verify whether it's the XMC bootloader code is to comment out the #define XMC_SUPPORT at line 193 of eboot.c and see what that does.

There are two possible factors at play here:
The implementation of spi_flash_get_id() in eboot may not work (i.e. I'm pretty sure it's dodgy), so the XMC chip is not detected. This should only be a factor if power is cycled when an update is not complete, or if you do an OTA upgrade from a version of the core that has no XMC support at all, because the XMC chip won't be set to full output drive.
Failing to detect the XMC chip should make no difference to how it operated on the previous version, so I think it's something else happening.

OR it is working and the code that tries to slow down the access is messing up.

Either way, trying a build with XMC support disabled in the bootloader will give some insight.

@devyte
Collaborator

devyte commented May 5, 2020

To whoever will test, the above means:

  • modify eboot.c as explained
  • rebuild eboot.elf
  • rebuild your test sketch

@Jason2866
Contributor Author

Jason2866 commented May 5, 2020

Using an eboot.elf from before the XMC PR works for my module with XMC flash (OTA with gzip and uncompressed too).
I just replaced the eboot.elf file. So IMHO the PR should be reverted as a hotfix.
For building Tasmota we are using just this change: arendst/Tasmota#8342

@ChocolateFrogsNuts The OTA fails only if the XMC eboot.elf is used.
This is not the case:

if you do an OTA upgrade from a version of the core that has no XMC support at all, because the XMC chip won't be set to full output drive.

I am not using Linux, so I can't use the provided Makefile to build eboot.elf...
If you provide a modified version, I will test it.

@Tech-TX
Contributor

Tech-TX commented May 5, 2020

All of my D1 Mini boards have XMC flash chips, if you need another test bunny. I'm not on Linux, either. 😄

I know I can OTA to the IP address directly, but I haven't been able to OTA via the GUI for some time; Bonjour Browser doesn't see it, nor does Service Browser on the Android phone.

I'm pretty sure I updated to 2.7.0 dev on Saturday.

edit: yep, I'm even with master, and it dies quietly after the OTA completes when going from the command line direct to the IP address with espota.py. It uploads, but the code doesn't run after the reset.

@Jason2866
Contributor Author

Tested. The XMC hotfix solves the issue. OTA works again.

@Tech-TX
Contributor

Tech-TX commented May 8, 2020

PR #7277 didn't fix it for me, even after doing "Erase all flash contents" with a fresh compile via serial upload with the new binary. I can still upload via serial, but OTA to the IP address hangs quietly on boot afterwards, and all I get is

ets Jan 8 2013,rst cause:2, boot mode:(3,0)
load 0x00000000, len 201326592, room 16

I'm purely guessing that doing git fetch upstream and git rebase upstream/master should have pulled the replacement eboot.elf file from #7277 after it was merged. It has a time stamp from 30 minutes ago, so it ought to be the right file. Here's the MD5 I got on the file from the merged PR:
d8708dea12bce1eb9ddf058020699fae *./eboot.elf
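
For anyone who wants to compare their own copy against that checksum without an md5sum tool (for example on Windows), a minimal sketch using Python's standard library follows. The path `eboot.elf` is only an assumption about where the file sits relative to the working directory; `md5_of_file` is a hypothetical helper name.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    candidate = "eboot.elf"  # assumed location; adjust to your checkout
    if os.path.exists(candidate):
        print(md5_of_file(candidate), "*" + candidate)
```

If the printed digest differs from the one quoted above, the local eboot.elf is not the file from the merged PR.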

If I have to delete my local files and fork and start from scratch, I'm OK with that. ;-) Right now it doesn't appear like it'll help. I copied an older eboot.elf from my desktop PC with a date of 4/18/2020 on top of the one on my laptop and it uploads and works via OTA now.

edit: Never mind. After deleting everything and re-forking I was getting the older file from a different PR once I switched to it.

@ChocolateFrogsNuts
Contributor

Giving this issue some CPU cycles now... I can replicate it, so I will be working on a fix over the next few days, I hope.

@ChocolateFrogsNuts
Contributor

OK, so it appears that working out appropriate flash speed register values for eboot by printing them from within the sketch at various flash speed settings was the wrong approach. I needed to have eboot print them.
If I had done this initially, I would have realised that eboot runs at 20 MHz flash speed every time: the registers never change until some time later, during SDK init.
There are also some differences in the register values since the last time I was looking at this, so that is probably why things are crashing.

Anyway, the fact that eboot already runs at 20 MHz flash access, no matter what speed is selected once the SDK/core starts, means there is no need to have XMC support in eboot at all! The XMC chips only need special treatment (boosting the drive level) when at 40 MHz and above, which never happens in eboot.
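
The reasoning above can be sketched as a simple predicate. This is an illustration only, not the code in eboot or the core: the function name and constants are hypothetical, and 0x20 as the XMC manufacturer byte is an assumption based on the chip ID reported at the top of this issue.

```python
XMC_VENDOR_ID = 0x20      # assumption: low byte of the chip ID 0x164020
BOOST_THRESHOLD_MHZ = 40  # from the comment: special treatment at 40 MHz and above
EBOOT_FLASH_MHZ = 20      # from the comment: eboot always runs at 20 MHz


def needs_drive_boost(chip_id: int, flash_mhz: int) -> bool:
    """True if this chip would need its output drive boosted at this speed."""
    is_xmc = (chip_id & 0xFF) == XMC_VENDOR_ID
    return is_xmc and flash_mhz >= BOOST_THRESHOLD_MHZ


# In eboot the flash speed is fixed, so the answer is the same for any chip.
print(needs_drive_boost(0x164020, EBOOT_FLASH_MHZ))  # inside eboot
print(needs_drive_boost(0x164020, 40))               # after SDK init at 40 MHz
```

Because the speed inside eboot never reaches the threshold, the predicate is false there for every chip, which is why the XMC handling can live entirely in the core.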

I'll do a bit more testing and try breaking things with a power cycle during the eboot copy, but I can't see why that would be a problem now.

I can also move spi_vendors.h back out of eboot into the core.
Should get that pushed in the next day or so.

@ChocolateFrogsNuts
Contributor

I added a couple of 2 second delays to eboot so I could pull the USB cable at a known point.
Power fail testing results:

  • Power failure in eboot before the copy begins: boots with old firmware.
  • Power failure after the copy completes but before command_clear(): boots with new firmware.
  • Power failure during copy: board consistently fails to boot with assorted errors/resets.

It would seem the RTC is losing the copy command when the power fails, thus resulting in no attempt to re-start the failed copy at power up!
This could be an issue for all boards, not just those with XMC flash... I will investigate further when I track down one of my non-XMC boards...
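
The three outcomes above can be reproduced with a toy model. It assumes only what is described in this comment: the copy command lives in RTC memory, RTC memory is lost when power fails, and the command is cleared after the copy completes. The `Board` class, its fields and the four-block "firmware" strings are all hypothetical.

```python
class Board:
    def __init__(self, old_fw: str, staged_fw: str):
        self.app = list(old_fw)        # firmware area the board boots from
        self.staged = list(staged_fw)  # downloaded update waiting to be copied
        self.rtc = {}                  # RTC memory: lost on power failure

    def request_update(self):
        self.rtc["cmd"] = "copy"

    def power_fail(self):
        self.rtc = {}                  # flash contents survive, RTC does not

    def eboot(self, power_fails_after_blocks=None):
        """Run the bootloader; return the image that boots, or None if power was lost."""
        if self.rtc.get("cmd") == "copy":
            for i in range(len(self.staged)):
                if power_fails_after_blocks == i:
                    self.power_fail()
                    return None
                self.app[i] = self.staged[i]
            if power_fails_after_blocks == len(self.staged):
                self.power_fail()      # copy done, command not yet cleared
                return None
            del self.rtc["cmd"]        # stands in for command_clear()
        return "".join(self.app)


def scenario(fail_point: int) -> str:
    board = Board("AAAA", "BBBB")
    board.request_update()
    board.eboot(fail_point)            # power is lost during this run
    return board.eboot()               # what the next power-up boots


print(scenario(0))  # before the copy begins
print(scenario(4))  # after the copy, before the command is cleared
print(scenario(2))  # in the middle of the copy
```

In this model the first case boots the old image, the second boots the new one, and the third is left with a mixed image that no later boot tries to repair, because the copy command is gone.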

@earlephilhower
Collaborator

RTC memory is SRAM, not NVRAM, so that's expected, @ChocolateFrogsNuts. See #6538

@ChocolateFrogsNuts
Contributor

Ahh, excellent... I can ignore that and just temporarily hack eboot to always copy while I test that the copy works OK from a cold boot.

@devyte
Collaborator

devyte commented May 19, 2020

@ChocolateFrogsNuts #7307 has some additional info.

@ChocolateFrogsNuts
Contributor

ChocolateFrogsNuts commented May 19, 2020

PR #7317 should wrap this up.

@Jason2866
Contributor Author

@ChocolateFrogsNuts Tried PR #7317; OTA is working with it.

@devyte
Collaborator

devyte commented Jun 15, 2020

The relevant PRs are merged, so closing as already resolved.
