_sbrk heap & stack collision check is risky #9542

alzix · 2019-01-30T09:31:02Z

Description

Reopening old discussion #1561 as the SW and HW has changed since last revisited.

Stack and Heap collision checked in _sbrk implementation by comparing to get_MSP() is risky.

                +-----------+ 0x00000000
                |           |
                |           |
                +-----------+
                |           |
                |    heap   |
                |     |     |
                |     |     |
                |     V     |
                +-----------+ <-- MSPLIM (ARMv8-M)
                |     ^     |
                |     |     |
                |     |     |
                |   stack   |
                |           |
   get_MSP() -->|           |
                |           |
                +-----------+
                |           |
                |           |
                +-----------+ 0xFFFFFFFF

In current mbed-os implementation MSP is used mainly for during initialization sequence and by interrupt handlers. MSP size can be predicted and carefully chosen by system integrator according to the application use case.
Current implementation assumes dynamic stack limit watermark and makes system unpredictable. Big allocations may work before stack is full but fail afterwards.

In addition as mentioned by @pan- #1561 (comment) , the check is not atomic and we do know for sure that MSP is used by IRQ handlers. This can lead to corruption.

Now we have new HW which was not considered during last discussion. For ARMv8-M we have stack limit registers. These registers will trigger fault once push operations are executed. This dynamic stack limit approach makes them unusable, as regular write to memory will be permitted.

Issue request type

[ ] Question
[x] Enhancement
[ ] Bug

CC @sg- @erezlandau

The text was updated successfully, but these errors were encountered:

kjbracey · 2019-01-30T10:21:08Z

The check is not atomic and we do know for sure that MSP is used by IRQ handlers. This can lead to corruption.

On this particular point - there's no atomicity issue here. MSP is totally atomic from the point of view of anyone calling malloc/_sbrk. And actually it's totally constant if in an RTOS environment - the main stack is always empty when queried by a thread (apart from the boot callstack leading up to osKernelStart).

Yes, interrupt handlers will change MSP while they're running, but that will be invisible to a thread/process reading it, because they put it back before returning. (Just like they change the PC but put it back).

So comparison against MSP is fine in principle - the problem is just that you're not allowing space for the interrupt handlers - if you want to base the calculation on MSP you need to be checking against CurrentMSP - StackSpaceAllowedForInterrupts - it seem's that was @pan-'s point in that comment.

Does PR #9092 change anything here? That was doing something to regularise main stack sizes. Maybe platforms are better at specifying their limit explicitly now such that the MSP check is redundant anyway - they could look at a stack base symbol? And yes, that explicit limit could also be programmed into MSPLIM.

pan- · 2019-01-30T11:55:38Z

@kjbracey-arm Thanks, that was my point.

I had a quick look at #9092 and it seems like we can use mbed_stack_isr_start as it is defined globally:

if (new_heap >= (unsigned char *)__get_MSP()) can be transformed into if (new_heap >= mbed_stack_isr_start)

ciarmcom · 2019-01-30T12:37:12Z

Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-813

kjbracey · 2019-01-30T12:47:10Z

I don't think we necessarily to rule out dynamic behaviour in ARMv8 either - MSPLIM could be dynamically set to current sbrk position, rather than mbed_stack_isr_start, so it can use any memory not used by the heap.

Now, that's not very useful in the RTOS setup, where the main stack is just for interrupts, but conceivably it remains useful in non-RTOS builds where the so-called "isr" stack is actually where main is running. I don't know if we set "ISR_STACK_SIZE" bigger for non-RTOS builds?

You might still want the MSP-based comparison if the MSP is allowed to spill below the nominal "ISR_start".

pan- · 2019-01-30T14:45:38Z

Agreed that MSPLIM can be dynamically adjust but we may miss cases where the ISR stack goes beyond its limit. Therefore an application may appear to work when memory pressure is low and crashes when the memory pressure is high. Wouldn't it be better to have it crashing all the time ?

I don't know if we set "ISR_STACK_SIZE" bigger for non-RTOS builds?

From what I read we don't. It should be at least equal to the main thread stack size and at most it should be equal to the stack size reserved for IRQ processing and the stack size reserved for the main thread.

deepikabhavnani · 2019-01-30T19:24:48Z

Thanks @alzix for raising this issue, I will like to add few things here:

The default model for GCC_ARM is one region ram model, which uses this implementation of _sbrk(). Its mainly for non-RTOS version. Most partners have added their own version wrap_sbrk() and _sbrk() to have 2-region ram model because of few issues listed below.

This is good time to rethink on single RAM region model (stack and heap together) for RTOS less build reasons being:

GCC allocates memory at 4K boundary. ISR stack in Mbed-2 is 4K but in case of Mbed OS 5 it is 1K. Placing stack always at the bottom or after heap, we leave 3K of RAM memory unused. malloc unexpected behaviour on GCC_ARM compiler #5386
In case of separate RAM banks we cannot extend Heap to multiple banks, and hence users prefer separate region for heap memory. Enable split heap in SRAM1 and SRAM2 on Nucleo L476RG #8963

Also #9092 is breaking change for all non-RTOS builds, we will need 2 ram region model to fix this. #9523

deepikabhavnani · 2019-01-30T19:29:28Z

I don't know if we set "ISR_STACK_SIZE" bigger for non-RTOS builds?

ISR stack size for non-RTOS builds is 4K and 1K for RTOS build.
Default stack size is in targets.json for RTOS less build https://github.com/ARMmbed/mbed-os/blob/master/targets/targets.json#L30

Overwritten in RTOS lib.json for RTOS build https://github.com/ARMmbed/mbed-os/blob/master/rtos/mbed_lib.json#L29

deepikabhavnani · 2019-02-11T13:55:18Z

@alzix - Stack and heap collision is taken care of in #9571.
Stack and heap are now defined as two separate memory regions. Please review #9571 if this looks good for Armv8M?

deepikabhavnani · 2019-03-01T16:25:33Z

~~Closing this as fix is applied in #9571. Feel free to reopen if issue still exists~~

@ARMmbed/mbed-os-maintainers - I cannot close this issue. Did something change ?

ciarmcom added type: enhancement labels Jan 30, 2019

ciarmcom closed this as completed Mar 1, 2019

ciarmcom added Jira status: CLOSED and removed Jira status: OPEN labels Mar 1, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

_sbrk heap & stack collision check is risky #9542

_sbrk heap & stack collision check is risky #9542

alzix commented Jan 30, 2019

kjbracey commented Jan 30, 2019

pan- commented Jan 30, 2019

ciarmcom commented Jan 30, 2019

kjbracey commented Jan 30, 2019

pan- commented Jan 30, 2019

deepikabhavnani commented Jan 30, 2019

deepikabhavnani commented Jan 30, 2019 •

edited

Loading

deepikabhavnani commented Feb 11, 2019

deepikabhavnani commented Mar 1, 2019 •

edited

Loading

_sbrk heap & stack collision check is risky #9542

_sbrk heap & stack collision check is risky #9542

Comments

alzix commented Jan 30, 2019

Description

Issue request type

kjbracey commented Jan 30, 2019

pan- commented Jan 30, 2019

ciarmcom commented Jan 30, 2019

kjbracey commented Jan 30, 2019

pan- commented Jan 30, 2019

deepikabhavnani commented Jan 30, 2019

deepikabhavnani commented Jan 30, 2019 • edited Loading

deepikabhavnani commented Feb 11, 2019

deepikabhavnani commented Mar 1, 2019 • edited Loading

deepikabhavnani commented Jan 30, 2019 •

edited

Loading

deepikabhavnani commented Mar 1, 2019 •

edited

Loading