Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an image access abstraction to RoT update_server #1813

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

lzrd
Copy link
Contributor

@lzrd lzrd commented Jun 12, 2024

The upcoming rollback protection feature needs to access partial or complete
images in flash, ram, or a mix.
The ImageAccess abstraction makes these operations more readable.

@lzrd lzrd force-pushed the rot-image-access branch 4 times, most recently from 26ccf19 to 8eec9a6 Compare June 30, 2024 04:30
Copy link
Contributor

@aapoalas aapoalas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments from the peanut gallery. Looks like a very nice abstraction layer in general.

drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
let initial_pc_start = image_start.saturating_add(IMAGE_HEADER_OFFSET);
let initial_pc_end =
image_start.saturating_add(self.nxp_offset_to_specific_header);
let pc = exec.start.saturating_add(self.normalized_initial_pc());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Here and also on line 153 were using exec.start while other lines use the image_start alias. Could maybe unify to image_start to avoid confusion on the offsetting.

drv/lpc55-update-server/src/images.rs Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
@lzrd lzrd force-pushed the rot-image-access branch from 8eec9a6 to 624fdfa Compare July 8, 2024 19:56
@lzrd lzrd marked this pull request as ready for review July 8, 2024 19:57
@lzrd lzrd requested review from cbiffle, flihp and labbott as code owners July 8, 2024 19:57
Copy link
Collaborator

@labbott labbott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have stats on how much this changes flash size?

drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Show resolved Hide resolved
@lzrd
Copy link
Contributor Author

lzrd commented Jul 9, 2024

This change adds 768 bytes to the image flash size

          master/xtask-oxide-rot-1_app-dev.log.ok:  flash:   0x39c80 (90%)
rot-image-access/xtask-oxide-rot-1_app-dev.log.ok:  flash:   0x39f80 (90%)
echo $(( 0x39f80 - 0x39c80 ))
768

              master/xtask-rot-carrier_app.log.ok:  flash:   0x35aa0 (83%)
    rot-image-access/xtask-rot-carrier_app.log.ok:  flash:   0x35da0 (84%)

echo $(( 0x35da0 - 0x35aa0 ))
768

New sizes for oxide-rot-1

flash   = 0x00050000..0x00090000
ram     = 0x20004000..0x2001c000
sram2   = 0x20020000..0x20030000
usbsram = 0x40102000..0x40103000
dice_certs = 0x40100000..0x40100a00
dice_alias = 0x40100a00..0x40101200
Used:
  flash:   0x39f80 (90%)
  ram:     0x13d20 (82%)
  sram2:   0x0 (0%)
  usbsram: 0x1000 (100%)
  dice_certs: extern region (attest)
  dice_alias: extern region (attest)
warning: memory allocation is sub-optimal
Suggested improvements:
kernel:
  ram:    3040  (currently 4096)
warning: memory allocation is sub-optimal
Suggested improvements:
kernel:
  ram:    3040  (currently 4096)
APP=app/oxide-rot-1/app-dev.toml

@lzrd
Copy link
Contributor Author

lzrd commented Jul 9, 2024

There is work to do to clear fmt and panics out of the the update_server:

On master:

$ arm-none-eabi-objdump -C -d target/oxide-rot-1-dev/dist/a/update_server  | tee update_server.s
$ grep '   b.*panic' update_server.s  | wc -l
50

On this PR:

$ arm-none-eabi-objdump -C -d target/oxide-rot-1-dev/dist/a/update_server  | tee update_server.s
$ grep '   b.*panic' update_server.s  | wc -l
59

@labbott
Copy link
Collaborator

labbott commented Jul 10, 2024

This change adds 768 bytes to the image flash size

I consider this reasonable

Copy link
Collaborator

@cbiffle cbiffle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about the image validation logic, would it be possible to pull this into a no_std crate so we can have unit tests?

drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
},
Ram {
buffer: &'a [u8],
span: FlashRange,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, a number of places in the access code below appear to rely on buffer.len() being equal to the length of this span. How is this invariant maintained? Would it make more sense to have span be a base address instead, so that the len isn't duplicated (and thus doesn't have to be kept in sync)?

) -> ImageAccess<'_> {
let span = flash_range(component, slot);
ImageAccess {
accessor: Accessor::Ram { buffer, span },
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Here is a place where the length-matches invariant is definitely not maintained, fwiw.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless all of a slot is in use, it is expected that buffer.len() would always be less than the length of one of the spans.

Perhaps I should change span to slot_range (since Slot is already in use).
Suggestions welcome.

Span is a FlashRange which describes both the stored and the at_runtime address ranges of the slot that the image can occupy.
buffer here is a portion of the image that currently resides in RAM. The rest of the image may not be present.
In the case of a Accessor::Ram variant, the entire image resides in ram. It's destination slot is described by span.stored and its runtime slot is described by span.at_runtime (new names).

drv/lpc55-update-server/src/images.rs Outdated Show resolved Hide resolved
@lzrd
Copy link
Contributor Author

lzrd commented Aug 6, 2024

I'm concerned about the image validation logic, would it be possible to pull this into a no_std crate so we can have unit tests?

I would like to have a no_std library in the hubtools repo that can be used by any code concerned with Oxide LPC55 images. That would include Hubris lib/lpc55-rot-startup/src/images.rs if practical. Tests would be included there.

I'd rather not refactor that now and the sprot-e2e tests are successful for upgrade and rollback to and from the current master branch and the rot-image-access branch.

Created oxidecomputer/hubtools#34

@cbiffle
Copy link
Collaborator

cbiffle commented Aug 6, 2024

Looks like all the saturating arithmetic is still in place. I need convincing that this handles bounds cases correctly. Tests would be the fastest way, short of that I'd like to see a laaaaarge rationale comment explaining why the arithmetic isn't checked instead.

Copy link
Contributor

@aapoalas aapoalas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sort of accidentally came to read this PR anew and found some questions that I thought were interesting. Feel free to ignore these if need or want be; i ended up sitting on these questions for several days but figured eh, let's ask. They're just curiosity, basically.

/// does match FWIDs against the boot-time-info in some cases. But, because
/// flash areas are mutated during update_server operations and stuff can
/// happen, any data in the non-active Hubris image needs to be treated as
/// untrusted (update_server does not alter its own image).

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The active image could be modified by the update_server task if there were a code path to do that.
There is a check testing to see if the slot to be updated is the currently executing Hubris slot. If so, the request is rejected.

pub const LENGTH_OFFSET: usize = 0x20;
pub const HEADER_OFFSET: u32 = 0x130;
const MAGIC_OFFSET: usize = HEADER_OFFSET as usize;
// An image may have an ImageHeader located after the

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The early versions of the bootloader did not have the header and we have had discussions in the past about removing the need for the header. update_server tolerates a missing ImageHeader.

drv/lpc55-update-server/src/images.rs Show resolved Hide resolved
lzrd added 3 commits October 11, 2024 15:50
Remove last of the saturated math calls.
Rename FlashRange::{store,exec} to {stored,at_runtime} for clarity.
Remove unnecessary image validation logic in fn padded_image_len.
@lzrd lzrd force-pushed the rot-image-access branch from d7ff24a to f8cf87c Compare October 11, 2024 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants