Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys/pm: Bugfixes and improvements #13973

Closed
wants to merge 1 commit into from
Closed

Conversation

maribu
Copy link
Member

@maribu maribu commented Apr 29, 2020

Contribution description

  • Don't use volatile for pm_blocker
    • This does not provide atomic access (e.g. on devices with a memory bus of less than 32 bit)
  • Add pm_blocker_t pm_get_blocker()
    • This functions allows atomic access to the current state of pm_blocker
    • The variable pm_blocker is made static to prevent non-atomic access by applications
  • pm_set_lowest(): Simplify logic by disabling IRQs before searching the lowest allowed mode
    • Comparing up to four bytes is so fast, so no need to do this without IRQs disabled
  • pm_block() and pm_unblock(): Move the assert() into the critical core to
    • Otherwise we have TOC/TOU issues
  • Fix race condition pm_off() fallback implementation
    • The list of blockers was previously cleared with IRQs still enabled
      ==> It could occur that after clearing it and before pm_set_lowest() disables IRQs another thread blocks PM modes again. As a result, the pm_off() would have consumed more power than needed
    • Simply disabling IRQs makes sure that no additional power is consumed. In addition, this will prevent any thread to run after pm_off() is called, as IRQs will no longer wake up the device

Testing procedure

  • tests/periph_pm should still wore as expected
  • The recently added pm shell command should still work as expected

Issues/PRs references

None

- Don't use volatile for pm_blocker
    - This does not provide atomic access (e.g. on devices with a memory bus of
      less than 32 bit)
- Add pm_blocker_t pm_get_blocker()
    - This functions allows atomic access to the current state of pm_blocker
    - The variable pm_blocker is made static to prevent non-atomic access by
      applications
- pm_set_lowest(): Simplify logic by disabling IRQs before searching the lowest
  allowed mode
    - Comparing up to four bytes is so fast, so no need to do this without IRQs
      disabled
- pm_block() and pm_unblock(): Move the assert() into the critical core to
    - Otherwise we have TOC/TOU issues
- Fix race condition pm_off() fallback implementation
    - The list of blockers was previously cleared with IRQs still enabled
  ==> It could occur that after clearing it and before pm_set_lowest() disables
      IRQs another thread blocks PM modes again. As a result, the pm_off() would
      have consumed more power than needed
    - Simply disabling IRQs makes sure that no additional power is consumed. In
      addition, this will prevent any thread to run after pm_off() is called,
      as IRQs will no longer wake up the device
@maribu maribu added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation Area: pm Area: (Low) power management labels Apr 29, 2020
@maribu
Copy link
Member Author

maribu commented Apr 29, 2020

Maybe @benemorius could also have a look to confirm this doesn't break the pm shell command added in his PR?

Copy link
Member

@roberthartung roberthartung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from my comments, code looks good. Why wasnt pm_set not used in pm_set_lowest before?

}

#ifndef PROVIDES_PM_LAYERED_OFF
void pm_off(void)
{
irq_disable();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that disabling IRQ makes sure we sleep, but still doesnt feel right to add this change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh? The function is called pm_off(). We should be sure that after this call nothing else gets executed.

Assume some one writes an application and uses it for a board that has full hardware support for all pm features, but only a CPU idle is implemented now and now pm_off(). Let's also assume that application calls due to a bug pm_off() in a low prio thread after that thread completes. In that case, the board would just continue to run the application, but with excessive power consumption: Once the pm_set_lowest() returns, the board will remain in the highest power consumption mode.

The excessive power consumption is only one of the issues. If that application is build against a new release of RIOT which ships an implementation of pm_off() for that board, the application behavior would change: The moment the low prio thread calls pm_off() all other threads are never run again. The user will assume the bug is in the new RIOT release, rather than in the premature call of pm_off().

Comment on lines +72 to +78
pm_blocker_t pm_get_blocker(void)
{
pm_blocker_t result;
unsigned state = irq_disable();
pm_blocker.val_u8[mode]--;
result = pm_blocker;
irq_restore(state);
return result;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems overly complicated, you want to create a copy for it, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The read to the stack variable result is guarded in irq_disable() and irq_restore(). This prevents e.g. on AVR devices that the read sequence is interrupted. Direct access to the variable could result in reading an inconsistent state, as some bytes could be retrieved at time A and the remaining bytes at time B. The result returned would be a blocker state that is potentially different to both time A and time B. This is something we have to prevent.

Additionally: irq_disable() and irq_restore() are memory barriers, which prevent elimination of seemingly unessecary reads.

E.g. a thread calling pm_get_blocker() in a loop and printing the result every 1 second. Without the memory barriers, the compiler would assume that the value read in the first iteration of the loop is still the same as in the second, as it is not changed by the current thread of execution. But as other threads or ISRs could have changed it, it is needed to be read again. https://lwn.net/ recently had a nice serious about compiler optimizations that are fully valid in terms of the C standard but render your program unusable, unless the programmer uses techniques like memory barriers to tell the compiler which time of optimizations are not applicable in a specific context.

@kaspar030
Copy link
Contributor

IMO, too many changes in one commit or even PR. Some of them I totally agree with, others I think need evaluation.

@maribu
Copy link
Member Author

maribu commented Apr 29, 2020

IMO, too many changes in one commit

I'll split it into multiple commits.

or even PR.

Are you serious? This PR changes less than 50 lines of code mostly in a single C file.

@kaspar030
Copy link
Contributor

kaspar030 commented Apr 29, 2020

Are you serious? This PR changes less than 50 lines of code mostly in a single C file.

Well, a PR that introduces pm_get_blocker() and uses it where needed, I review just by the looks.
Adding irq_disable() to pm_off() too.
"simplifying" pm_set_lowest() (which was optimized to minimize time in critical section) I'd like to look at the assembly output. Also, even with the change in its own commit, this would be juggling with commits compared to master.
Moving around asserts, didn't look into it yet at all.

Them being in one PR, the discussions get cluttered, the concentration split, ...
And if any issue arises, the overhead of figuring out what caused them is multiplied.

Line count doesn't matter, IMO.

@maribu
Copy link
Member Author

maribu commented Apr 29, 2020

OK, I'll split this in separate PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: pm Area: (Low) power management Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants