Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bootutil: Fix upgrade might fail after power-failure if swap-scratch is used #2109

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

taltenbach
Copy link
Contributor

Since the introduction of 4cab08e, an interrupted upgrade might fail at next boot when swap-scratch is used. This is due to the fact the image header are not properly reloaded from flash memory after the upgrade completion, leading to erroneous behaviors such an image validation failure if the images in the primary and secondary slots haven't exactly the same size. The issue is only temporary and disappear at next boot, but would cause a revert process to be triggered if BOOT_SWAP_TYPE_TEST is used.

The problem is that after a partial upgrade has been resumed and completed, the image headers are reloaded using boot_read_image_headers(state, false, bs); (see this line). The idea is to update the image headers in boot_data (state argument) to properly reflect the new state of the slots. However, since a boot status is provided (bs), boot_read_image_headers considers we're in the middle of an upgrade/revert process and will use the boot status to look for the header of the former primary and secondary images in their new locations. So, since the upgrade process is complete, the primary image header will be read from the secondary slot and the secondary image header from the primary slot. This is not what we want here since we need the image headers in boot_data to reflect the new state of the slots and this will cause e.g. the new secondary image header to be used instead of the new primary image header to validate the content of the primary slot if MCUBOOT_VALIDATE_PRIMARY_SLOT is enabled. If the two images haven't exactly the same size, the TLV area of the new primary image won't be found, causing a validation failure. To avoid this issue, no boot status should be provided to boot_read_image_headers.

Note that this issue only happens with swap-scratch and not with swap-move since boot_read_image_header in swap_move.c has been implemented in such a way the header of the primary and secondary images is read respectively from the primary and secondary slot if the two images have fully been swapped. I personally don't think this is what should be done because:

  • My understanding is that providing a boot status indicates that we consider the "primary image" and "secondary image" to refer respectively to the former primary and secondary image, i.e. the primary and secondary image before the upgrade/revert has started. Not providing boot status indicates that we consider the "primary image" and "secondary image" to refer respectively to the image currently in the primary and secondary slot.
  • The current implementation of boot_read_image_header for swap-move means that if an upgrade is interrupted after having fully swapped the two images but before having written the copy-done flag to fully complete the swap process, the former primary and secondary image headers will be read at next boot from respectively the primary and secondary slot before resuming the upgrade process (at this line). In practice, this won't cause any issue since the headers are in this case not used before being reloaded after completing the upgrade (at this line), but this doesn't look to be good thing to do and could potentially cause issues in the future.

Also note that the issue hasn't been detected by the automatic tests since they unfortunately seem to only perform swap-scratch upgrades with images having exactly the same size. I will create a GitHub issue about that point.

After a partial swap has been resumed and completed, the image headers
are reloaded. The idea is that we want to update the bootloader state
(boot_data) to properly reflect the new state of the slots: the image
headers in the primary and secondary slots are now respectively the
headers of the new and previous active image.

However, the implementation was doing the exact opposite when
swap-scratch was used, which could lead to erroneous behaviors such as a
failure when validating the primary slot.

Signed-off-by: Thomas Altenbach <thomas.altenbach@legrand.com>
@utzig
Copy link
Member

utzig commented Nov 11, 2024

Also note that the issue hasn't been detected by the automatic tests since they unfortunately seem to only perform swap-scratch upgrades with images having exactly the same size. I will create a GitHub issue about that point

I don't remember why maximal was created but testing locally with different sizes did fail as you mention. If I get some time this week I'll review that change and submit a fix to the simulator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: core Affects core functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants