Skip to content
This repository has been archived by the owner on Jan 30, 2024. It is now read-only.

add some bounds checking to unwinding #184

Merged
merged 1 commit into from
Apr 29, 2021
Merged

Conversation

japaric
Copy link
Member

@japaric japaric commented Apr 29, 2021

probe-rs' memory access API does not perform bounds checking -- I assume because the hardware probe
itself does not do any bounds checking -- so reading non-existing memory results in a value of 0
being returned.

When a program has overflowed its stack, the HardFault exception is called. The exception entry
pushed some registers onto the stack (see Stacked struct). Because there's no or little stack memory
in the stack overflow scenario the registers are pushed outside the RAM region; this results in
register information being lost. Without this information we cannot locate the frame preempted by
HardFault and thus cannot unwind the stack.

Before this PR, the unwinding process didn't perform bounds check when reading the stacked
registers; this resulted in zeroes being reported by probe-rs. The unwinding process continued with
these wrong values, printed junk names as part of the backtrace and ultimately failed.

With this PR, Stacked::read performs a bound check so when the unwinder sees that invalid memory is
trying to be access it stops the unwinding process and reports a warning.


New output:

$ cargo run --bin stack-overflow
(..)
stack backtrace:
   0: HardFaultTrampoline
      <exception entry>
  (HOST) WARN  exception entry pushed registers outside RAM; not possible to unwind the stack
  (HOST) ERROR the program has overflowed its stack

Old output:

stack backtrace:
   0: HardFaultTrampoline
      <exception entry>
   1: {"package":"defmt","tag":"defmt_prim","data":"{=__internal_Display}","disambiguator":"9857217393540545671"}
error: the stack appears to be corrupted beyond this point
  (HOST) ERROR the program has overflowed its stack

probe-rs' memory access API does not perform bounds checking -- I assume because the hardware probe
itself does not do any bounds checking -- so reading non-existing memory results in a value of 0
being returned.

When a program has overflowed its stack, the HardFault exception is called. The exception entry
pushed some registers onto the stack (see Stacked struct). Because there's no or little stack memory
in the stack overflow scenario the registers are pushed outside the RAM region; this results in
register information being lost. Without this information we cannot locate the frame preempted by
HardFault and thus cannot unwind the stack.

Before this PR, the unwinding process didn't perform bounds check when reading the stacked
registers; this resulted in zeroes being reported by probe-rs. The unwinding process continued with
these wrong values, printed junk names as part of the backtrace and ultimately failed.

With this PR, Stacked::read performs a bound check so when the unwinder sees that invalid memory is
trying to be access it stops the unwinding process and reports a warning.
.as_ref()
.map(|ram_region| ram_region.range.clone())
// if no device-specific information, use the range specific in the Cortex-M* Technical Reference Manual
.unwrap_or(0x2000_0000..0x4000_0000);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although this is what the TRM specifies I know some vendors use 0x1000_0000 for CCRAM (core-coupled RAM)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like somewhere you might want to put the stack... hmm, can we use the initial value of SP here and round it down to a multiple of 0x1000_0000?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have only ever seen the start of RAM being aligned to 0x1000_0000 but I'm not aware of that being a requirement of the architecture. I do know that VTOR needs to be aligned to a power of 2 that's proportional to the number of vectors in the vector table (learned that one the hard way).

That being said, is there any reason the sp_ram_region value can be None: that is it's not possible to locate the RAM region the SP belongs to? Perhaps the RAM region data that comes from probe-rs is not reliable? Unsure if that info is retrieved from the device itself or hardcoded in probe-rs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this data comes from a fixed database in probe-rs, which in turn is derived from vendor-provided documentation, so it might not be correct. I haven't seen that happen so far though.

If we do want to be smarter with the fallback here, this isn't really something that needs fixing before this can be merged though.

src/stacked.rs Show resolved Hide resolved
Copy link
Contributor

@jonas-schievink jonas-schievink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good so far, added some minor comments

@japaric
Copy link
Member Author

japaric commented Apr 29, 2021

bors r+

@bors bors bot merged commit 0225ddc into main Apr 29, 2021
@bors bors bot deleted the add-bounds-checking-to-unwinding branch April 29, 2021 12:46
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants