Skip to content

Error path tightening: use MBED_NORETURN; add+use core_util_atomic_flag #8328

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 30, 2018

Conversation

kjbracey
Copy link
Contributor

@kjbracey kjbracey commented Oct 4, 2018

Description

Save some ROM space by putting MBED_NORETURN attributes on error functions and failed asserts.

As part of this, adjust error paths to make sure they don't return.

core_util_atomic_flag added to support the above.

See commit messages for more detail.

Fixes #8496.

Pull request type

[ ] Fix
[X] Refactor
[ ] Target update
[ ] Functionality change
[ ] Breaking change

@kjbracey
Copy link
Contributor Author

kjbracey commented Oct 4, 2018

@c1728p9 - you originally put the silent-exit recursion check in. Does this "insta-death" alternative suit as an alternative?

(I also note that mbed_error is reusing error's error_in_progress recursion flag as a buffer spinlock/mutex, which seems problematic - eg if someone called error inside a mbed_warning, it would be missed. Might need another patch to separate that, but worried about opening a can of worms about the spinlock).

@kjbracey
Copy link
Contributor Author

kjbracey commented Oct 4, 2018

Measured ROM saving is 464 bytes in a GCC_ARM K64F develop build of mbed-os-mesh-minimal. 604 bytes with ARM compiler.

(My first test saved 200 Kbytes, due to me having an error buffering test for #8076 in my main.cpp, and the attribute meant the compiler realised it didn't need to include any of the networking stuff used by the second half of main.)

@c1728p9
Copy link
Contributor

c1728p9 commented Oct 4, 2018

@kjbracey-arm the recursion check in mbed_halt_system was because in some cases error() led to a call to exit() which triggerd another error() and caused a loop. I think this happened when error() was called in interrupt context, but it has been a while.

It may be safer to leave leave the error_in_progress check in, since if any of the calls in mbed_die call error again this same problem will occur. The functions gpio_write, and gpio_init_out are implemented by the HAL so it would be hard to ensure that they never call error. Also in #8189 the function wait_ms may be changed to trap if called in interrupt context, which would be problematic as well.

One alternative to keeping error_in_progress is to make another function that guarantees no recursion by just spinning. Maybe something like like mbed_die_for_real.

@0xc0170 0xc0170 requested review from c1728p9 and a team October 4, 2018 15:26
{

// Prevent recursion if error is called again
if (error_in_progress) {
return;
mbed_die();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do mbed_die, previous error in progress will not be captured.

@kjbracey
Copy link
Contributor Author

kjbracey commented Oct 5, 2018

The code here is inconsistent - there's no recursion-stopping check on mbed_error, although I guess it tries to sidestep that original "from interrupt" case which originally caused it by not using exit if it thinks it's in IRQ context.

Also in #8189 the function wait_ms may be changed to trap if called in interrupt context, which would be problematic as well.

Good point - that might need a specialised wait function. I also note the mbed_die code has some sort of ifdef fudge which the log justifies as

a disable_irq() exception in common/board.c because the wait function is implemented in a interrupt for both NRF and SiLabs boards

Any idea what that's talking about? Presumably no longer relevant. Really don't want mbed_die leaving interrupts on - it should be stopping the whole system, not just blocking the current thread.

If we do mbed_die, previous error in progress will not be captured.

The previous error presumably has already been recorded. Note that error and mbed_error are using the same variable for two totally different meanings which conflict. error thinks it's an "error has happened" flag, so it's trying to catch recursion from "exit". mbed_error thinks it's an "I'm currently modifying the error buffer" spinlock.

@kjbracey kjbracey force-pushed the noreturn branch 2 times, most recently from afb68a7 to 2bc975d Compare October 5, 2018 13:53
@kjbracey
Copy link
Contributor Author

kjbracey commented Oct 5, 2018

Right, started digging a bit deeper into this - separated out the "in progress" flags, and put in a "simple halt" alternative.

@0xc0170 0xc0170 requested review from SenRamakri and bulislaw October 9, 2018 18:44
@cmonr
Copy link
Contributor

cmonr commented Oct 19, 2018

@deepikabhavnani @bulislaw @c1728p9 @SenRamakri When y'all are available for a re-review.

@kjbracey-arm This'll need to be rebased.

@kjbracey
Copy link
Contributor Author

Rebased. Reviewers - note that the commit message for the 2nd commit lists the changes and reasoning for stuff done to mbed_error exit paths.

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 24, 2018

@SenRamakri @deepikabhavnani Before another rebase, please review. I'll review shortly as well

* unless explicitly initialised with CORE_UTIL_ATOMIC_FLAG_INIT.
*/
typedef struct core_util_atomic_flag {
uint8_t _flag;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query - Should atomic operations be extended to 16-bit and 32-bit flags as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flag is inherently an opaque 1-bit true/false boolean from an API point of view - the 8-bit storage is an implementation detail.

And I'm using uint8_t because the LDREXB function/intrinsic takes uint8_t *.

It would probably be okay to have a bool there and cast the pointer for LDREXB, but I hate casts. This way makes the code look neater, and means I'm not accidentally invoking undefined behaviour.

Only downside is that the compiler will insist on adding an extra instruction when I assign the _flag to the bool output of test-and-set - it thinks it needs to ensure all values from 1-255 come out as 1, when we know the thing can only be holding 0 or 1 in the first place.

Copy link

@deepikabhavnani deepikabhavnani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me, but PR header and commits do not match. We have 3 commits here and header does not say anything about new atomic API's core_util_atomic_flag_test_and_set and core_util_atomic_flag_clear

@AnotherButler - Handbook update needed as well

@kjbracey kjbracey changed the title Add MBED_NORETURN attributes Error path tightening: use MBED_NORETURN; add+use core_util_atomic_flag Oct 24, 2018
@cmonr
Copy link
Contributor

cmonr commented Oct 25, 2018

Will start CI since it's quite light at the moment, but this needs a rebase first.

@cmonr
Copy link
Contributor

cmonr commented Oct 25, 2018

@bulislaw @SenRamakri Any thoughts?

@cmonr
Copy link
Contributor

cmonr commented Oct 25, 2018

@ARMmbed/mbed-os-maintainers

It looks like this PR could come into the next patch release, and would probably help the author(s) of #8496, but this refactor adding new functionality. Thoughts?

@kjbracey
Copy link
Contributor Author

#8496 arises from #8255 which didn't go to a patch release - they're testing master. This just needs to go in before 5.11.

@mbed-ci
Copy link

mbed-ci commented Oct 25, 2018

@mbed-ci
Copy link

mbed-ci commented Oct 25, 2018

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 25, 2018

Pipe closed error, restarting

/morph export-build

However tests - I see there 2-4 tests failing across all targets and toolchains (just one exception)

@cmonr
Copy link
Contributor

cmonr commented Oct 25, 2018

Test failures consistently around tests-mbed_hal-sleep_manager. Please take a look.

@kjbracey
Copy link
Contributor Author

Wow. There are tests that define stub versions of mbed_error that return.

That... won't work...

Might be able to do something with a longjmp?

@mbed-ci
Copy link

mbed-ci commented Oct 25, 2018

It will be too hard to try to intercept and continue from a trapped
error once error functions are marked [[noreturn]], so make the error
return tests conditional on error trapping being disabled.
@kjbracey
Copy link
Contributor Author

kjbracey commented Oct 29, 2018

Tests adjusted to not deliberately provoke mbed_error while intercepting it.

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 29, 2018

Travis docs fails : test_lock_gt_ushrt_max was removed so also should be from the docs (one more function as well, they are linked there)

Intercepting mbed_error will be too hard after mbed_error becomes
[[noreturn]], so remove tests that do this.
An atomic flag primitive is sometimes wanted, and it is cumbersome to
create it from the compare-and-swap operation - cumbersome enough that
people often don't bother.

Put in a core_util_atomic_flag that follows the C11/C++11 atomic_flag
API, such that it could be mapped to it with #define later.
Various fixes in preparation for making sure error calls do not return.

* Clear out handle_error's use of error_in_progress as a sort of spin
  lock; this is most likely to deadlock if ever activated, and conflicts
  with error's use of error_in_progress. Use a normal critical section lock.

* Make error use same mbed_halt_system helper as mbed_error.

* Make error's recursion check avoid print and proceed to halt, rather
  than returning.

* Make mbed_error use error_in_progress to avoid recursion in same way
  as error() does.

* Give mbed_halt_system its own recursion check in case of error in
  mbed_die - give it a simple fallback.

* Make the in_progress things properly atomic, just in case.
Save some ROM space by putting MBED_NORETURN attributes on error
functions and failed asserts.

mbed_error was documented as returning an error code. It never
actually could return, so documentation updated, but return type
kept.
Don't need loops at two layers.

Also tighten up slightly-invalid extern "C" markings.
@cmonr
Copy link
Contributor

cmonr commented Oct 29, 2018

/morph build

@mbed-ci
Copy link

mbed-ci commented Oct 29, 2018

Build : SUCCESS

Build number : 3500
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/8328/

Triggering tests

/morph test
/morph export-build
/morph mbed2-build

@mbed-ci
Copy link

mbed-ci commented Oct 29, 2018

@mbed-ci
Copy link

mbed-ci commented Oct 30, 2018

@cmonr cmonr merged commit 5ed07c2 into ARMmbed:master Oct 30, 2018
@cmonr cmonr removed the needs: CI label Oct 30, 2018
@mattbrown015
Copy link
Contributor

Hi @kjbracey-arm,

Could/should system_reset also be made MBED_NORETURN?

I tried calling it from our definition of mbed_die and got the following warning:

[Warning] main.cpp@423,1: 'noreturn' function does return

This generates the warning:

MBED_NORETURN void mbed_die() {
    system_reset();
}

but this doesn't:

MBED_NORETURN void mbed_die() {
    NVIC_SystemReset();
}

@kjbracey
Copy link
Contributor Author

Could/should system_reset also be made MBED_NORETURN?

Yes it should.

@kjbracey
Copy link
Contributor Author

PR created. The warning you're getting is harmless - you know system_reset doesn't return even if the compiler doesn't.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants