Skip to content

MBED_ASSERT No Longer Calls mbed_die #8496

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mattbrown015 opened this issue Oct 22, 2018 · 7 comments
Closed

MBED_ASSERT No Longer Calls mbed_die #8496

mattbrown015 opened this issue Oct 22, 2018 · 7 comments

Comments

@mattbrown015
Copy link
Contributor

mattbrown015 commented Oct 22, 2018

Description

cc @MateuszMaz

#8255 changed MBED_ASSERT so that it uses mbed_error rather than calling mbed_die directly.

While I have some sympathy for the intent of this PR it has changed the behaviour of our application when asserts fire.

mbed_error can eventual call mbed_die via mbed_halt_system but it won't in the case of MBED_ASSERT because mbed_assert_internal calls core_util_critical_section_enter.

This is important to our application because it is a remote battery powered device. We want our device to just restart/reboot if an assert fires. Spinning is no use to us because it will just take the device offline and flatten the battery.

We have provided our own definition of mbed_die that just calls system_reset. This worked for us before #8255.

When we started our application, back with V5.9, we didn't really understand the error API and used MBED_ASSERT. It's too late to change all our calls to MBED_ASSERT to MBED_ERROR and I'm not entirely sure they are equivalent.

To some extent I think assert should be simple and I don't think mbed_error is simple. Asserts shouldn't fire and if they do it means no one knows what has happened. Error logging is very noble but in an embedded system that is crashing is it going to achieve anything? In other words, I think I liked MBED_ASSERT better before.

Another fix for my problem would be to remove the call to core_util_critical_section_enter. Is it really necessary to call core_util_critical_section_enter from mbed_assert_internal? This call was added a long time ago and when mbed_assert_internal was very different.

I also appreciate that we should have a watchdog enabled but at the moment we don't and implementing one correctly is not that straightforward.

Issue request type

[ ] Question
[ ] Enhancement
[x] Bug

@linlingao
Copy link
Contributor

Internal Jira reference: MBOTRIAGE-1736

@kjbracey
Copy link
Contributor

kjbracey commented Oct 23, 2018

I think #8255 is reasonable - the problem is the implementation of mbed_error. I have a follow-up PR which covers this here: #8328

See if that covers your issue. It should attempt to call mbed_die in the first instance as it goes MBED_ASSERT->mbed_error->mbed_halt_system->mbed_die. The "spin and halt" behaviour is reduced to back-up only in the event of recursive crash inside mbed_die.

@mattbrown015
Copy link
Contributor Author

I think #8255 is reasonable - the problem is the implementation of mbed_error.

Agreed! :-)

See if that covers your issue.

It looks like it should and it makes sense to me. I will try eventually.

I've been having a real ding-dong with mbed-os in the last week or so, several PRs have broken our app and really slowed us down. I'm currently stuck on LoRaWAN PR which increases the app size so that we no longer fit in flash. I think I'll be writing another issue soon and then I'll try #8328.

@mattbrown015
Copy link
Contributor Author

I tried f899341 and 4915cbe and believe they do what I want! :-)

With these changes and our mbed_die definition MBED_ASSERT, MBED_ERROR, exit(1) and abort all result in a restart.

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 23, 2018

I've been having a real ding-dong with mbed-os in the last week or so, several PRs have broken our app and really slowed us down. I'm currently stuck on LoRaWAN PR which increases the app size so that we no longer fit in flash. I think I'll be writing another issue soon and then I'll try #8328.

Can you please report all to us, I would like to understand what PR broke your app (doing retrospective) ? Good start would be to know the version diff - what you used and which one broke.

I tried f899341 and 4915cbe and believe they do what I want! :-)

👍

@mattbrown015
Copy link
Contributor Author

Can you please report all to us, I would like to understand what PR broke your app (doing retrospective) ?

Oh, don't worry, I've been raising issues! :-)

As of a couple of hours ago we're using the latest mbed-os #ba23fef.

#8343
#8473
#8496
#8500

The first 2 were the most serious and cryptic and were STM32 specific. These things happen, right??
And, the ST team have got them fixed now.

Possibly the ST team need some more tickless, RTC tests.

@ciarmcom
Copy link
Member

Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-18

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants