MBED_ASSERT No Longer Calls mbed_die #8496

mattbrown015 · 2018-10-22T16:34:17Z

Description

#8255 changed MBED_ASSERT so that it uses mbed_error rather than calling mbed_die directly.

While I have some sympathy for the intent of this PR it has changed the behaviour of our application when asserts fire.

mbed_error can eventual call mbed_die via mbed_halt_system but it won't in the case of MBED_ASSERT because mbed_assert_internal calls core_util_critical_section_enter.

This is important to our application because it is a remote battery powered device. We want our device to just restart/reboot if an assert fires. Spinning is no use to us because it will just take the device offline and flatten the battery.

We have provided our own definition of mbed_die that just calls system_reset. This worked for us before #8255.

When we started our application, back with V5.9, we didn't really understand the error API and used MBED_ASSERT. It's too late to change all our calls to MBED_ASSERT to MBED_ERROR and I'm not entirely sure they are equivalent.

To some extent I think assert should be simple and I don't think mbed_error is simple. Asserts shouldn't fire and if they do it means no one knows what has happened. Error logging is very noble but in an embedded system that is crashing is it going to achieve anything? In other words, I think I liked MBED_ASSERT better before.

Another fix for my problem would be to remove the call to core_util_critical_section_enter. Is it really necessary to call core_util_critical_section_enter from mbed_assert_internal? This call was added a long time ago and when mbed_assert_internal was very different.

I also appreciate that we should have a watchdog enabled but at the moment we don't and implementing one correctly is not that straightforward.

Issue request type

[ ] Question
[ ] Enhancement
[x] Bug

The text was updated successfully, but these errors were encountered:

linlingao · 2018-10-22T21:31:09Z

Internal Jira reference: MBOTRIAGE-1736

kjbracey · 2018-10-23T07:57:42Z

I think #8255 is reasonable - the problem is the implementation of mbed_error. I have a follow-up PR which covers this here: #8328

See if that covers your issue. It should attempt to call mbed_die in the first instance as it goes MBED_ASSERT->mbed_error->mbed_halt_system->mbed_die. The "spin and halt" behaviour is reduced to back-up only in the event of recursive crash inside mbed_die.

mattbrown015 · 2018-10-23T08:12:43Z

I think #8255 is reasonable - the problem is the implementation of mbed_error.

Agreed! :-)

See if that covers your issue.

It looks like it should and it makes sense to me. I will try eventually.

I've been having a real ding-dong with mbed-os in the last week or so, several PRs have broken our app and really slowed us down. I'm currently stuck on LoRaWAN PR which increases the app size so that we no longer fit in flash. I think I'll be writing another issue soon and then I'll try #8328.

mattbrown015 · 2018-10-23T13:38:33Z

I tried f899341 and 4915cbe and believe they do what I want! :-)

With these changes and our mbed_die definition MBED_ASSERT, MBED_ERROR, exit(1) and abort all result in a restart.

0xc0170 · 2018-10-23T13:59:04Z

I've been having a real ding-dong with mbed-os in the last week or so, several PRs have broken our app and really slowed us down. I'm currently stuck on LoRaWAN PR which increases the app size so that we no longer fit in flash. I think I'll be writing another issue soon and then I'll try #8328.

Can you please report all to us, I would like to understand what PR broke your app (doing retrospective) ? Good start would be to know the version diff - what you used and which one broke.

I tried f899341 and 4915cbe and believe they do what I want! :-)

👍

mattbrown015 · 2018-10-23T15:04:09Z

Can you please report all to us, I would like to understand what PR broke your app (doing retrospective) ?

Oh, don't worry, I've been raising issues! :-)

As of a couple of hours ago we're using the latest mbed-os #ba23fef.

#8343
#8473
#8496
#8500

The first 2 were the most serious and cryptic and were STM32 specific. These things happen, right??
And, the ST team have got them fixed now.

Possibly the ST team need some more tickless, RTC tests.

ciarmcom · 2018-10-23T16:22:17Z

Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-18

linlingao added the mirrored manually label Oct 22, 2018

linlingao added component: rtos labels Oct 22, 2018

ciarmcom added type: bug Jira status: OPEN labels Oct 23, 2018

kjbracey mentioned this issue Oct 24, 2018

Error path tightening: use MBED_NORETURN; add+use core_util_atomic_flag #8328

Merged

cmonr closed this as completed in #8328 Oct 30, 2018

ciarmcom added Jira status: CLOSED and removed Jira status: OPEN labels Oct 30, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MBED_ASSERT No Longer Calls mbed_die #8496

MBED_ASSERT No Longer Calls mbed_die #8496

mattbrown015 commented Oct 22, 2018 •

edited by ciarmcom

Loading

linlingao commented Oct 22, 2018

kjbracey commented Oct 23, 2018 •

edited

Loading

mattbrown015 commented Oct 23, 2018

mattbrown015 commented Oct 23, 2018

0xc0170 commented Oct 23, 2018 •

edited

Loading

mattbrown015 commented Oct 23, 2018

ciarmcom commented Oct 23, 2018

MBED_ASSERT No Longer Calls mbed_die #8496

MBED_ASSERT No Longer Calls mbed_die #8496

Comments

mattbrown015 commented Oct 22, 2018 • edited by ciarmcom Loading

Description

Issue request type

linlingao commented Oct 22, 2018

kjbracey commented Oct 23, 2018 • edited Loading

mattbrown015 commented Oct 23, 2018

mattbrown015 commented Oct 23, 2018

0xc0170 commented Oct 23, 2018 • edited Loading

mattbrown015 commented Oct 23, 2018

ciarmcom commented Oct 23, 2018

mattbrown015 commented Oct 22, 2018 •

edited by ciarmcom

Loading

kjbracey commented Oct 23, 2018 •

edited

Loading

0xc0170 commented Oct 23, 2018 •

edited

Loading