sys/pm: Correctly access pm_blocker #13977

Merged on Jul 15, 2020 (1 commit)

Conversation

maribu
Member

@maribu maribu commented Apr 29, 2020

Contribution description

Replace volatile access to pm_blocker by guarding the accesses with irq_disable() ... irq_restore().

volatile only guarantees that the compiler performs no optimizations on a variable access; it does not provide atomic access. For example, on systems with a memory bus narrower than 32 bits, pm_blocker cannot be accessed with a single CPU instruction. Thus, disabling IRQs is the easiest and most portable way to actually achieve atomic access.
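
As a rough illustration of the guarded-access pattern described above, here is a minimal host-side sketch. It is not the code of this PR: every `sketch_` name is hypothetical, the `val_u8` member and the number of modes are assumptions modelled on the `val_u32` member that comes up in the review discussion, and `irq_disable()` / `irq_restore()` are replaced by trivial stand-ins so that the example builds outside RIOT.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_MODES 4U  /* hypothetical number of power modes */

/* Host-side stand-ins for irq_disable()/irq_restore() so the sketch builds
 * outside RIOT; on a real MCU these mask and restore interrupts. */
unsigned sketch_irq_enabled = 1;

unsigned irq_disable(void)
{
    unsigned previous = sketch_irq_enabled;
    sketch_irq_enabled = 0;
    return previous;
}

void irq_restore(unsigned state)
{
    sketch_irq_enabled = state;
}

/* One counter per power mode, also readable as a single 32-bit word.
 * The val_u8 member is an assumption made for this sketch. */
typedef union {
    uint32_t val_u32;
    uint8_t val_u8[SKETCH_NUM_MODES];
} sketch_blocker_t;

sketch_blocker_t sketch_blocker;  /* plain, non-volatile variable */

void sketch_block(unsigned mode)
{
    assert(mode < SKETCH_NUM_MODES);
    unsigned state = irq_disable();
    sketch_blocker.val_u8[mode]++;  /* read-modify-write, not interruptible here */
    irq_restore(state);
}

void sketch_unblock(unsigned mode)
{
    assert(mode < SKETCH_NUM_MODES);
    unsigned state = irq_disable();
    assert(sketch_blocker.val_u8[mode] > 0);
    sketch_blocker.val_u8[mode]--;
    irq_restore(state);
}

sketch_blocker_t sketch_get_blocker(void)
{
    unsigned state = irq_disable();
    sketch_blocker_t result = sketch_blocker;  /* consistent snapshot */
    irq_restore(state);
    return result;
}
```

Because irq_restore() puts back whatever state irq_disable() returned, the guarded sections can be nested or called with interrupts already disabled without re-enabling them too early.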

Testing procedure

tests/periph_pm should still work

Issues/PRs references

Split out of #13973

Depends on and contains #13976

@maribu maribu added the Type: bug, Impact: minor, State: waiting for other PR, and Area: pm labels Apr 29, 2020
@benpicco benpicco added the CI: ready for build label and removed the State: waiting for other PR label Apr 29, 2020
Comment on lines +86 to +89
unsigned state = irq_disable();
pm_blocker_t result = pm_blocker;
irq_restore(state);
return result;
Contributor

This should only be necessary on non-32-bit machines, no?

Member Author

Sadly, there is no guarantee that the compiler will generate a single read whenever possible. There have been cases of GCC generating two 32-bit stores on x86_64 when writing to a volatile, because two stores with an immediate were faster than preparing the correct value in a register and writing that back to memory. As a result, some Linux drivers stopped working when compiled with that GCC version. I think it is safe to say that the Linux kernel guys convinced the compiler guys that their compilers have no host to run on if they break kernels by splitting a single volatile access into two accesses. But for non-volatile accesses, we should still be careful about making any assumptions about what the generated assembly looks like.
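
To make the risk of a split access concrete, here is a small host-side simulation. It does not reproduce the GCC behaviour mentioned above; it only models a 32-bit store performed as two separate halves, with a callback standing in for an interrupt that fires in between. All `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit value kept as two 16-bit halves, mimicking a store that was
 * split into two instructions. */
typedef struct {
    uint16_t lo;
    uint16_t hi;
} sketch_split32_t;

uint32_t sketch_split32_read(const sketch_split32_t *v)
{
    assert(v);
    return ((uint32_t)v->hi << 16) | v->lo;
}

/* Writes the halves one after the other. `between` stands in for an
 * interrupt that fires after the first store; pass 0 for none. */
void sketch_split32_write(sketch_split32_t *v, uint32_t value,
                          void (*between)(void))
{
    assert(v);
    v->lo = (uint16_t)(value & 0xFFFFU);
    if (between) {
        between();
    }
    v->hi = (uint16_t)(value >> 16);
}

sketch_split32_t sketch_shared = { 0xFFFFU, 0x0000U };  /* 0x0000FFFF */
uint32_t sketch_seen_by_isr;

/* Simulated ISR: takes a snapshot of the shared value */
void sketch_fake_isr(void)
{
    sketch_seen_by_isr = sketch_split32_read(&sketch_shared);
}
```

Calling sketch_split32_write(&sketch_shared, 0x00010000, sketch_fake_isr) updates the value from 0x0000FFFF to 0x00010000, but the simulated ISR observes 0x00000000, which is neither the old nor the new value.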

So what we could IMO safely do is something like this:

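pm_blocker_t pm_get_blocker(void)
{
    pm_blocker_t result;
    if (ARCH_32BIT) {
        /* 32-bit CPU: a single volatile 32-bit load reads the whole value */
        volatile uint32_t *tmp = &pm_blocker.val_u32;
        result.val_u32 = *tmp;
    }
    else {
        /* narrower CPU: the copy takes several instructions, so mask IRQs */
        unsigned state = irq_disable();
        result = pm_blocker;
        irq_restore(state);
    }
    return result;
}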

It would be nice to make FEATURES such as arch32_bit available to C code for this. The compiler should then drop the dead branch.

Contributor

ARCH_32_BIT is now available, can you adapt to your suggestion?

Contributor

ping @maribu :)

Member Author

Sorry for the delay. I still had #14331 in the pipeline, which is a better solution than what I suggested above.

I'm not sure if we should just stall this PR until #14331 is merged, or get it in without the optimization for now.

Contributor

I would merge without the optimization; that PR may take time, and this is still a potential source of issues.

@fjmolinas
Contributor

Tested on nucleo-l152re

2020-06-16 13:46:25,100 #  help
2020-06-16 13:46:25,102 # Command              Description
2020-06-16 13:46:25,106 # ---------------------------------------
2020-06-16 13:46:25,113 # unblock_rtc          temporarily unblock power mode
2020-06-16 13:46:25,115 # reboot               Reboot the node
2020-06-16 13:46:25,119 # version              Prints current RIOT_VERSION
2020-06-16 13:46:25,124 # pm                   interact with layered PM subsystem
2020-06-16 13:46:25,126 # rtc                  control RTC peripheral interface
pm unblock 1
2020-06-16 13:46:32,541 #  pm unblock 1
2020-06-16 13:46:32,547 # Unblocking power mode 1.
> unblock_rtc 1 5
2020-06-16 13:46:36,423 #  unblock_rtc 1 5
2020-06-16 13:46:36,430 # Unblocking power mode 1 for 5 seconds.
> a
a
a

help
2020-06-16 13:46:43,047 #  help
2020-06-16 13:46:43,048 # Command              Description
2020-06-16 13:46:43,054 # ---------------------------------------
2020-06-16 13:46:43,059 # unblock_rtc          temporarily unblock power mode
2020-06-16 13:46:43,061 # reboot               Reboot the node
2020-06-16 13:46:43,072 # version              Prints current RIOT_VERSION
2020-06-16 13:46:43,074 # pm                   interact with layered PM subsystem
2020-06-16 13:46:43,077 # rtc                  control RTC peripheral interface

@fjmolinas
Contributor

@maribu can you push to trigger github-actions on this one as well?

Replace `volatile` access to pm_blocker by guarding the accesses with
`irq_disable()` ... `irq_restore()`.

`volatile` only guarantees that no compiler optimizations are performed on
a variable access; it does not provide atomic access. For example, on systems
with a memory bus narrower than 32 bits, pm_blocker cannot be accessed with a
single CPU instruction. Thus, disabling IRQs is the easiest and most portable
way to actually achieve atomic access.
@cgundogan cgundogan added and removed the CI: ready for build label Jul 15, 2020
Contributor

@fjmolinas fjmolinas left a comment

ACK. There are possible optimizations for 32-bit platforms, as mentioned in #13977 (comment); it was agreed to postpone those until after #14331 is in.

@fjmolinas fjmolinas merged commit 3934f9f into RIOT-OS:master Jul 15, 2020
@maribu
Member Author

maribu commented Jul 15, 2020

Thanks :-)

@maribu maribu deleted the pm_atomic_access branch July 15, 2020 18:15
Labels
Area: pm, CI: ready for build, Impact: minor, Type: bug

4 participants