-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sys/pm: Correctly access pm_blocker #13977
Conversation
6bbbc9a
to
a689b9d
Compare
unsigned state = irq_disable(); | ||
pm_blocker_t result = pm_blocker; | ||
irq_restore(state); | ||
return result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should only be necessary on non-32 bit machines, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There sadly is no guarantee, that the compiler will generate a single read whenever possible. There were examples on GCC generating two 32 bit stores on x86_64 when writing to a volatile
, as two stores with an immediate where faster than preparing the correct value in a register and writing that back to memory. As a result, some linux drivers stopped working when compiled with that GCC version. I think it is safe to say that the Linux kernel guys convinced the compiler guys that their compilers have no host to run on if they break kernels by splitting a single volatile
accesses into two accesses. But for non-volatile
accesses we should still be careful to do any assumptions on how the generated assembly looks like.
So what we could IMO safely do is something like this:
pm_blocker_t pm_get_blocker(void)
{
pm_blocker_t result;
if (ARCH_32BIT) {
volatile uint32_t *tmp = &pm_blocker.val_u32;
result.val_u32 = *tmp;
}
else {
unsigned state = irq_disable();
result = pm_blocker;
irq_restore(state);
}
return result;
}
It would be nice to make the FEATURES
like arch32_bit
available to C code for this. The compiler should than drop the dead branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARCH_32_BIT
is now available, can you adapt to your suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @maribu :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would merge without optimizations, that PR can take time and this is still potential source for issues.
Tested on nucleo-l152re
|
@maribu can you push to trigger github-actions on this one as well? |
Replace `volatile` access to pm_blocker by guarding the accesses with `irq_disable()` ... `irq_restore()`. `volatile` does only guarantee that no compiler optimizations are performed on a variable access, but does not provide atomic access. E.g. on systems with a memory bus of less than 32 bit, the access to pm_blocker cannot be done with a single CPU instruction. Thus, resorting to disabling IRQs is the easiest and most portable way to actually achieve atomic access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK, there are possible optimization for 32bit platforms as mentioned in #13977 (comment), it was agreed to postpone those to after #14331 is in.
Thanks :-) |
Contribution description
Replace
volatile
access to pm_blocker by guarding the accesses withirq_disable()
...irq_restore()
.volatile
does only guarantee that no compiler optimizations are performed on a variable access, but does not provide atomic access. E.g. on systems with a memory bus of less than 32 bit, the access to pm_blocker cannot be done with a single CPU instruction. Thus, resorting to disabling IRQs is the easiest and most portable way to actually achieve atomic access.Testing procedure
tests/periph_pm
should still workIssues/PRs references
Split out of #13973
Depends on and contains #13976