Fix hardfault in MPU configuration #1193
Merged
While adding caboose support to `control-plane-agent`, I noticed a dead-reproducible issue which put the system into a weird state:

Note that no task is marked as running, and system time is not advancing.
Here are the steps to reproduce:

1. Flash 7c1c470aa4c5a229bd8 (latest commit on the `caboose-mgs-api` branch) onto a Gimletlet with a NIC
2. Run `faux-mgs -ltrace --interface en0 state` (substitute `en0` for your favorite interface)

Luckily, we got a good backtrace using `humility gdb --run-openocd`:

Here's the relevant disassembly:
and register:
This is right about here in MPU configuration, about to write region 3's address to `MPU_RASR`.

Here's the relevant region:
Sure enough, this is the caboose region that we just added to the MPU tables!
It's correctly aligned for a 512-byte region, so what gives?
Well, turns out that we can't hot-configure the MPU; we're writing a 512-byte-aligned address to `RBAR`, but `RASR`
may be configured to require a more stringent alignment (because it could be set to a larger size from a previous task). This means that the MPU is briefly in an invalid state, which is Bad News.

The fix is to disable the MPU region when configuring it, then re-enable it. This means we have to go through the `RNR` register instead of being able to simultaneously select the region and address; alas.
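To make the failure mode concrete, here is a rough sketch using a toy software model of an ARMv7-M-style MPU. It is not the kernel code from this PR. `ToyMpu`, `Fault`, `configure_hot`, and `configure_safe` are hypothetical names, and the model assumes that an enabled region whose base is not aligned to its size is reported as an error (on real hardware the observed result was a hardfault). The register layout follows ARMv7-M: the base address lives in `RBAR`, while the size and enable bit live in `RASR`.

```rust
// Toy model only: every name here is hypothetical, not the Hubris kernel API.
pub const REGIONS: usize = 8;
const RASR_ENABLE: u32 = 1 << 0; // RASR bit 0: region enable
const RBAR_VALID: u32 = 1 << 4; // RBAR bit 4: "also update RNR from bits 3:0"

#[derive(Debug, PartialEq)]
pub enum Fault {
    /// An enabled region's base is not aligned to its size (modelling assumption).
    Misaligned { region: usize },
}

#[derive(Default)]
pub struct ToyMpu {
    rnr: usize,
    rbar: [u32; REGIONS],
    rasr: [u32; REGIONS],
}

/// Build an enabled RASR value; the SIZE field (bits 5:1) encodes 2^(SIZE+1) bytes.
pub fn rasr(size_bytes: u32) -> u32 {
    assert!(size_bytes.is_power_of_two() && size_bytes >= 32);
    ((size_bytes.trailing_zeros() - 1) << 1) | RASR_ENABLE
}

impl ToyMpu {
    /// Check the currently selected region after every register write.
    fn check(&self) -> Result<(), Fault> {
        let r = self.rnr;
        if self.rasr[r] & RASR_ENABLE != 0 {
            let size = 1u64 << (((self.rasr[r] >> 1) & 0x1f) + 1);
            if u64::from(self.rbar[r]) & (size - 1) != 0 {
                return Err(Fault::Misaligned { region: r });
            }
        }
        Ok(())
    }

    pub fn write_rnr(&mut self, region: usize) {
        self.rnr = region % REGIONS;
    }

    pub fn write_rbar(&mut self, value: u32) -> Result<(), Fault> {
        if value & RBAR_VALID != 0 {
            self.rnr = (value & 0xf) as usize % REGIONS;
        }
        self.rbar[self.rnr] = value & !0x1f;
        self.check()
    }

    pub fn write_rasr(&mut self, value: u32) -> Result<(), Fault> {
        self.rasr[self.rnr] = value;
        self.check()
    }
}

/// Hot reconfiguration: select the region and set its base in a single RBAR
/// write, while the old size in RASR is still live.
pub fn configure_hot(mpu: &mut ToyMpu, region: u32, base: u32, size: u32) -> Result<(), Fault> {
    mpu.write_rbar(base | RBAR_VALID | region)?;
    mpu.write_rasr(rasr(size))
}

/// Safe reconfiguration: select via RNR, disable the region, then write the
/// new base and finally the new size with the enable bit set.
pub fn configure_safe(mpu: &mut ToyMpu, region: u32, base: u32, size: u32) -> Result<(), Fault> {
    mpu.write_rnr(region as usize);
    mpu.write_rasr(0)?; // region disabled: the stale size no longer applies
    mpu.write_rbar(base)?; // VALID clear, so the RNR selection is kept
    mpu.write_rasr(rasr(size))
}
```

In this model, if region 3 was previously a 1 KiB region and a task switch then installs a 512-byte region at an address that is only 512-byte aligned (say, a made-up `0x0800_0200`), `configure_hot` reports `Fault::Misaligned` on the `RBAR` write, while `configure_safe` succeeds. The cost is the extra `RNR` and `RASR` writes per region.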