sys/pm: Bugfixes and improvements #13973

maribu · 2020-04-29T06:23:58Z

Contribution description

Don't use volatile for pm_blocker
- This does not provide atomic access (e.g. on devices with a memory bus of less than 32 bit)
Add pm_blocker_t pm_get_blocker()
- This functions allows atomic access to the current state of pm_blocker
- The variable pm_blocker is made static to prevent non-atomic access by applications
pm_set_lowest(): Simplify logic by disabling IRQs before searching the lowest allowed mode
- Comparing up to four bytes is so fast, so no need to do this without IRQs disabled
pm_block() and pm_unblock(): Move the assert() into the critical core to
- Otherwise we have TOC/TOU issues
Fix race condition pm_off() fallback implementation
- The list of blockers was previously cleared with IRQs still enabled
  ==> It could occur that after clearing it and before pm_set_lowest() disables IRQs another thread blocks PM modes again. As a result, the pm_off() would have consumed more power than needed
- Simply disabling IRQs makes sure that no additional power is consumed. In addition, this will prevent any thread to run after pm_off() is called, as IRQs will no longer wake up the device

Testing procedure

tests/periph_pm should still wore as expected
The recently added pm shell command should still work as expected

Issues/PRs references

None

- Don't use volatile for pm_blocker - This does not provide atomic access (e.g. on devices with a memory bus of less than 32 bit) - Add pm_blocker_t pm_get_blocker() - This functions allows atomic access to the current state of pm_blocker - The variable pm_blocker is made static to prevent non-atomic access by applications - pm_set_lowest(): Simplify logic by disabling IRQs before searching the lowest allowed mode - Comparing up to four bytes is so fast, so no need to do this without IRQs disabled - pm_block() and pm_unblock(): Move the assert() into the critical core to - Otherwise we have TOC/TOU issues - Fix race condition pm_off() fallback implementation - The list of blockers was previously cleared with IRQs still enabled ==> It could occur that after clearing it and before pm_set_lowest() disables IRQs another thread blocks PM modes again. As a result, the pm_off() would have consumed more power than needed - Simply disabling IRQs makes sure that no additional power is consumed. In addition, this will prevent any thread to run after pm_off() is called, as IRQs will no longer wake up the device

maribu · 2020-04-29T06:26:05Z

Maybe @benemorius could also have a look to confirm this doesn't break the pm shell command added in his PR?

roberthartung

Apart from my comments, code looks good. Why wasnt pm_set not used in pm_set_lowest before?

roberthartung · 2020-04-29T07:09:44Z

sys/pm_layered/pm.c

 }

 #ifndef PROVIDES_PM_LAYERED_OFF
 void pm_off(void)
 {
+    irq_disable();


I understand that disabling IRQ makes sure we sleep, but still doesnt feel right to add this change.

Huh? The function is called pm_off(). We should be sure that after this call nothing else gets executed.

Assume some one writes an application and uses it for a board that has full hardware support for all pm features, but only a CPU idle is implemented now and now pm_off(). Let's also assume that application calls due to a bug pm_off() in a low prio thread after that thread completes. In that case, the board would just continue to run the application, but with excessive power consumption: Once the pm_set_lowest() returns, the board will remain in the highest power consumption mode.

The excessive power consumption is only one of the issues. If that application is build against a new release of RIOT which ships an implementation of pm_off() for that board, the application behavior would change: The moment the low prio thread calls pm_off() all other threads are never run again. The user will assume the bug is in the new RIOT release, rather than in the premature call of pm_off().

roberthartung · 2020-04-29T07:10:28Z

sys/pm_layered/pm.c

+pm_blocker_t pm_get_blocker(void)
+{
+    pm_blocker_t result;
    unsigned state = irq_disable();
-    pm_blocker.val_u8[mode]--;
+    result = pm_blocker;
    irq_restore(state);
+    return result;


This seems overly complicated, you want to create a copy for it, right?

The read to the stack variable result is guarded in irq_disable() and irq_restore(). This prevents e.g. on AVR devices that the read sequence is interrupted. Direct access to the variable could result in reading an inconsistent state, as some bytes could be retrieved at time A and the remaining bytes at time B. The result returned would be a blocker state that is potentially different to both time A and time B. This is something we have to prevent.

Additionally: irq_disable() and irq_restore() are memory barriers, which prevent elimination of seemingly unessecary reads.

E.g. a thread calling pm_get_blocker() in a loop and printing the result every 1 second. Without the memory barriers, the compiler would assume that the value read in the first iteration of the loop is still the same as in the second, as it is not changed by the current thread of execution. But as other threads or ISRs could have changed it, it is needed to be read again. https://lwn.net/ recently had a nice serious about compiler optimizations that are fully valid in terms of the C standard but render your program unusable, unless the programmer uses techniques like memory barriers to tell the compiler which time of optimizations are not applicable in a specific context.

kaspar030 · 2020-04-29T08:14:48Z

IMO, too many changes in one commit or even PR. Some of them I totally agree with, others I think need evaluation.

maribu · 2020-04-29T08:46:47Z

IMO, too many changes in one commit

I'll split it into multiple commits.

or even PR.

Are you serious? This PR changes less than 50 lines of code mostly in a single C file.

kaspar030 · 2020-04-29T08:58:20Z

Are you serious? This PR changes less than 50 lines of code mostly in a single C file.

Well, a PR that introduces pm_get_blocker() and uses it where needed, I review just by the looks.
Adding irq_disable() to pm_off() too.
"simplifying" pm_set_lowest() (which was optimized to minimize time in critical section) I'd like to look at the assembly output. Also, even with the change in its own commit, this would be juggling with commits compared to master.
Moving around asserts, didn't look into it yet at all.

Them being in one PR, the discussions get cluttered, the concentration split, ...
And if any issue arises, the overhead of figuring out what caused them is multiplied.

Line count doesn't matter, IMO.

maribu · 2020-04-29T09:02:37Z

OK, I'll split this in separate PRs.

maribu added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation Area: pm Area: (Low) power management labels Apr 29, 2020

maribu requested review from aabadie and kaspar030 April 29, 2020 06:23

maribu requested review from MrKevinWeiss and roberthartung as code owners April 29, 2020 06:23

roberthartung reviewed Apr 29, 2020

View reviewed changes

maribu closed this Apr 29, 2020

maribu deleted the pm_fixes branch April 29, 2020 09:03

This was referenced Apr 29, 2020

sys/pm: Add pm_get_blocker() #13976

Merged

sys/pm: Correctly access pm_blocker #13977

Merged

sys/pm: Fix behavior of fallback pm_off() implementation #13978

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sys/pm: Bugfixes and improvements #13973

sys/pm: Bugfixes and improvements #13973

maribu commented Apr 29, 2020

maribu commented Apr 29, 2020

roberthartung left a comment

roberthartung Apr 29, 2020

maribu Apr 29, 2020

roberthartung Apr 29, 2020

maribu Apr 29, 2020

kaspar030 commented Apr 29, 2020

maribu commented Apr 29, 2020

kaspar030 commented Apr 29, 2020 •

edited

Loading

maribu commented Apr 29, 2020

sys/pm: Bugfixes and improvements #13973

sys/pm: Bugfixes and improvements #13973

Conversation

maribu commented Apr 29, 2020

Contribution description

Testing procedure

Issues/PRs references

maribu commented Apr 29, 2020

roberthartung left a comment

Choose a reason for hiding this comment

roberthartung Apr 29, 2020

Choose a reason for hiding this comment

maribu Apr 29, 2020

Choose a reason for hiding this comment

roberthartung Apr 29, 2020

Choose a reason for hiding this comment

maribu Apr 29, 2020

Choose a reason for hiding this comment

kaspar030 commented Apr 29, 2020

maribu commented Apr 29, 2020

kaspar030 commented Apr 29, 2020 • edited Loading

maribu commented Apr 29, 2020

kaspar030 commented Apr 29, 2020 •

edited

Loading