STM32L4 Stack and Heap overlap problem #1348

Closed
helmut64 opened this issue Sep 24, 2015 · 35 comments

Comments

@helmut64
Contributor

On the new STM32L4 I found a major problem: the first 4k of the heap overlaps with the stack!
More details here: https://developer.mbed.org/questions/60933/STM32L4-Stack-and-Heap-overlap-Disco/

I believe the problem is in startup_stm32l476xx.s (Disco & Nucleo)

The following line must be inserted/changed (TOOLCHAIN_ARM_STD and TOOLCHAIN_ARM_MICRO): __initial_sp EQU 0x20018000 ; Top of RAM1 96k

This would at least enable the 96k of SRAM1. However, there is another 32k of SRAM2 at 0x10000000-0x10008000. My preference would be to use SRAM2 for the stack and heap, and add SRAM1 as additional heap.
Regards Helmut

@0xc0170
Contributor

0xc0170 commented Sep 24, 2015

Thanks for reporting. I just referenced the issue in the PR, which is now up for review. Please review.

@helmut64
Contributor Author

The change looks good. However, at present I have no local ARM compiler set up for testing, so I will test it on mbed.org when it is available there.

@helmut64
Contributor Author

I found another problem in the stm32l476xx.sct config files (Nucleo and Disco).
TOOLCHAIN_ARM_STD and TOOLCHAIN_ARM_MICRO:
old: RW_IRAM1 (0x20000000+0x188) (0x20000-0x188) { ; RW data
new: RW_IRAM1 (0x20000000+0x188) (0x18000-0x188) { ; RW data
TOOLCHAIN_GCC_ARM: STM32L476XX.ld
old: RAM (rwx) : ORIGIN = 0x20000188, LENGTH = 128K - 0x188
new: RAM (rwx) : ORIGIN = 0x20000188, LENGTH = 96K - 0x188
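
The numbers proposed so far can be checked with a little arithmetic. The following is only a sketch: it assumes the SRAM1 base address, the 96k size and the 0x188 offset quoted in the comments above, and the constant and function names are made up for illustration (they do not appear in the mbed sources).

```cpp
#include <cstdint>

// Figures quoted in the thread; the names are illustrative only.
constexpr std::uint32_t kSram1Base = 0x20000000u;
constexpr std::uint32_t kSram1Size = 96u * 1024u;  // 0x18000
constexpr std::uint32_t kReservedBytes = 0x188u;   // offset used in the .sct/.ld lines

// Address one past the last byte of a region. A descending stack starts here.
constexpr std::uint32_t region_top(std::uint32_t base, std::uint32_t size) {
    return base + size;
}

// Bytes left for RW data once the reserved area at the start is skipped.
constexpr std::uint32_t usable_length(std::uint32_t size, std::uint32_t reserved) {
    return size - reserved;
}

// 96k of SRAM1 ends at the proposed initial stack pointer.
static_assert(region_top(kSram1Base, kSram1Size) == 0x20018000u, "top of 96k SRAM1");
// The old 128k length would end 32k beyond the real SRAM1.
static_assert(region_top(kSram1Base, 128u * 1024u) == 0x20020000u, "128k overshoots");
// Matches the corrected length 0x18000-0x188 (96K - 0x188).
static_assert(usable_length(kSram1Size, kReservedBytes) == 0x17E78u, "96K - 0x188");
```

The static_asserts are evaluated at compile time, so simply compiling the file confirms the values.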

@0xc0170
Contributor

0xc0170 commented Sep 25, 2015

@helmut64 Can you send a pull request?

cc @bcostm

@helmut64
Contributor Author

helmut64 commented Oct 4, 2015

The stack problem is now fixed with my previous check-ins. The Nucleo L4 board works fine and the stack is correct, however there are remaining problems/limitations (I am using the mbed.org online compiler):
a) The Disco L4 board crashes (it looks like before main).

b) The Nucleo L4 board works fine, however the main memory heap/data region is mapped to 0x20000000 and should be mapped to 0x10000000 as defined in the .sct file.

c) When allocating memory in 4k blocks via malloc, it crashes after 84k. I was expecting a NULL return instead of a crash. I also expected to be able to allocate about 124k before getting a malloc error.

Could it be that the mbed.org compiler has a linker file somewhere which needs to be updated?
Any ideas?

@0xc0170
Contributor

0xc0170 commented Oct 14, 2015

@helmut64 Still having problems with L4 and allocations?
The linker file should be the same as the one here (if using mbed-src), or at least include the fixes you sent; they are in the latest mbed library release.

You can test your example using sources from this repo to compare results.

@helmut64
Contributor Author

@MartinKojtal, the problem is not fully resolved.

  • The stack location problem is resolved.
    On the L4 Nucleo it works: no stack crashes, correct stack address.
  • However, there are still memory size problems.
  • On the L4 Disco it crashes before main.
    I got a new Keil uVision this week which supports the L4. My plan was to investigate further using the debugger, however my time is very limited for the next 14 days.

The intention is that data/heap memory is first taken from SRAM2 at 0x10000000 (32k), up to where the stack starts, and then from SRAM1 at 0x20000000 once SRAM2 is used up. I configured this in the .sct files, however data/heap starts at 0x20000000 (SRAM1). So somewhere there is a different config, which results in problems:

  • Once 96k (SRAM1) are used by my malloc calls (4k blocks), the last malloc crashes. Expected is the total of 128k (32k+96k), less stack, data, etc.

Regards
Helmut

@ghost

ghost commented Jan 13, 2016

Is there still some problem with the STM32L4 memory configuration?
I have a NUCLEO-L476RG board and I made some tests with it.

  • Using the online compiler I built the blinky example, but there was no blinking on the board.
  • I cloned mbed-src and built it for my board, then built the blinky example, but still no blinking.
  • I modified the linker script so that the heap and stack are placed in SRAM1 and rebuilt. It blinks!

The toolchain used for the offline builds was GCC ARM.
I have not tested with any other toolchain. Also I cannot say if this is a valid fix for an issue or just a workaround that happens to hide a real issue underneath.

@MarceloSalazar

@0xc0170 could you please have a look into this?

@0xc0170
Contributor

0xc0170 commented Jan 13, 2016

@melomaa Are you using the latest master? This issue was for uVision, which was supposed to be fixed. We shall have a look at the GCC ARM linker script file. Can you point out what exactly you have changed?

cc @bcostm @adustm

@helmut64
Contributor Author

I have not worked on this for some time. I verified today via the online compiler that the STM32L4 Nucleo board works OK; the STM32L4 Discovery still crashes before main. My comment above from "Oct 4, 2015" is still valid. My testing was done with the mbed online compiler.

My plan was to debug it using the uVision IDE or gcc/gdb, which I have not done so far. For gcc/gdb I am still looking for setup options on my Mac. I got uVision running on my Windows box, however I may need to purchase an unlimited copy because the free version has a 32k limit.

@ghost

ghost commented Jan 15, 2016

@0xc0170
I was using the release 111.

In the linker script file the memories are defined as
SRAM2 (rwx) : ORIGIN = 0x10000188, LENGTH = 32k - 0x188
SRAM1 (rwx) : ORIGIN = 0x20000000, LENGTH = 96k

I changed the heap and stack sections:
old: } > SRAM2
new: } > SRAM1

and then the stack top definition:
old: __StackTop = ORIGIN(SRAM2) + LENGTH(SRAM2);
new: __StackTop = ORIGIN(SRAM1) + LENGTH(SRAM1);

@helmut64
Contributor Author

@melomaa,

Below is a little test I wrote to verify the memory locations. Please note that when I call MaxMemoryBlockAvailable (enable it first), the system hangs at the end of the memory; it should return NULL instead, which works fine on the STM32L1. Second, we should get almost 128 kB (32+96, minus stack), which is not the case. Third, starting with the 32 kB SRAM2 and keeping the stack at the top of SRAM2 has a major advantage: we can disable SRAM1 in deep sleep mode, which allows further sleep modes and energy savings. I am available to discuss this further.

Regards Helmut

#include "mbed.h"
#include <new>      // std::nothrow
#include <cstdlib>  // malloc
#include <cstring>  // memset

DigitalOut myled(LED1);

Serial *bs;

// Print the address of a local in a nested call to show where the stack is.
void SunFunc() {
    uint32_t temp[16];
    bs->printf("stackSunFunc=0x%x\r\n", (unsigned int)&temp);
}

// Find the largest single block (in kB steps) that can currently be allocated.
size_t
MaxMemoryBlockAvailable(bool print)
{
    int i = 0;

    for (i = 1; i < 1024; i++) {
        char *p = new (std::nothrow) char[i * 1024];
        if (p)
            delete[] p;
        else {
            i--;
            break;
        }
    }
    bs->printf("MaxMemoryBlockAvailable: %d kB\r\n", i);
    return i * 1024;
}

int main() {
    uint32_t temp = 0;
    bs = new Serial(USBTX, USBRX);
    bs->baud(115200);

    printf("Hello\r\n");
    printf("Hello2\r\n");
    // MaxMemoryBlockAvailable(true);

    char *p1 = new char[4096];

    memset(p1, 0, 4096);

    // Print code, data, stack and heap addresses (32-bit target assumed).
    bs->printf(" main=0x%x\r\n", (unsigned int)&main);
    bs->printf(" data=0x%x\r\n", (unsigned int)&bs);
    bs->printf("stack=0x%x\r\n", (unsigned int)&temp);
    bs->printf("   p1=0x%x-0x%x\r\n", (unsigned int)p1, (unsigned int)(p1 + 4096));
    SunFunc();

    for (int i = 0; i < 5; i++)
        bs->printf("Hello(%d) ", i);
    bs->printf("\r\n");

    for (int i = 0; i < 21; i++) {
        void *t = malloc(4096);

        // Check for NULL before touching the block.
        if (t == NULL) {
            bs->printf("malloc: Out of memory\r\n");
            break;
        }
        memset(t, 0, 4096);
        bs->printf(" data%d=0x%x\r\n", i, (unsigned int)t);
    }

    while (1) {
        myled = 1; // LED is ON
        wait(0.2); // 200 ms
        myled = 0; // LED is OFF
        wait(1.0); // 1 sec
    }
}

@star297
Contributor

star297 commented Jan 15, 2016

To start with, the online compiler needs adjusting: build details report 32K RAM max (unless this has been corrected since I last updated/checked), while these targets have 128K. I would start there first and get this corrected. The .sct files are the same for both targets, so I can't see why there is a problem between them; perhaps check the backend setup on the Mbed compiler.

TBH I have been using both boards without problems, probably lucky so far.

The other problem you will run up against is the interface MCU will not always 'connect' correctly to the main MCU for programming. Downloading user code will sometimes 'brick' the MCU and will no longer program the MCU unless you power cycle the board to get the interface MCU led back to 'RED' condition, reload user code and it will be working again.

Using an external programmer (SEGGER FLASHER) has no problems at all. So I would check the algos for the L476 targets in the ST-Link firmware.
This misled me into thinking I had code lock-up issues, in particular when interrupt and sleep code is employed, however this turned out to be caused by the interface firmware.

@helmut64
Contributor Author

@star297, I have multiple STM32L4 Disco and Nucleo boards, and none of them are bricked; I can load the original factory sample app using the STLink utility. The initial version had a wrong stack location, which I fixed a few months ago. We can probably go back to the original config and swap the SRAM1 and SRAM2 configs, which means the 96k come first.

The mbed drive based flashing works, I can flash stuff.

Question: you are telling me that the Disco board works with your app. Are you using the online compiler?

@star297
Contributor

star297 commented Jan 17, 2016

Yes, working Helmut. I only use Mbed online with the Nucleo board to develop my code then use the Segger Flasher to program my prototype board. I have only used 16K of RAM so far so maybe not hit the Stack/Heap issue. But I do use deepsleep extensively so will need this part retained in this mode.

I have clock set up for 80MHz, found that Erik's timer WakeUp library would not work with the 48MHz set up. But that suits me better as I need high speed for a short period of time 100mS or so every minute then deepsleep for the rest of the time.

Picture below is using quite a few resources 2 x i2c, SPI, AnalogIn, RTC and InterruptIn. Has been running for a couple of weeks, 22uA deepsleep maintaining the Sharp memory LCD and 15mA wake up.

I have been using the disco board as well but not to the same extent. I will try the same code on this board too and see what happens, perhaps there is a problem in the set up there.
I did check the .sct in TOOLCHAIN_ARM_STD on both targets and they are the same.
This is defined with the 32k as retained:

RW data 32k L4-ECC-SRAM2 retained in standby

I have not checked the manual, perhaps this is the only option with these two RAM blocks.

But that Mbed build report of 32K RAM max is worrying, maybe the online compiler only counts the first block, not including the second block. In which case perhaps there is no problem there.

Paul

[Image: l476]

@helmut64
Contributor Author

@star297, thanks for your explanation, and nice work on your project; something similar is also my target. At present I work with the Nucleo and Disco boards. I started with the L1, now the L4, and next is my own PCB.

Regarding the max memory info, this may be the source of the problem. I organized the SCT so that the 32k comes first, the stack is on top of the 32k, and the 96k comes second. My idea was that we can enhance the sleep mode by having an option to turn off the 96k when needed. Certainly this needs to be arranged with your project, maybe by reserving the 96k on startup and managing the 96k separately.

Here are some of the existing problems:

  • The mbed IDE shows only 32k instead of 128k (the sum of SRAM1 and SRAM2).
  • Allocating memory via "new (std::nothrow)" (see MaxMemoryBlockAvailable() above) crashes
    the system once we run out of memory. The same test loop works fine on the L1 system, which
    returns NULL when it runs out of memory.
  • Allocating memory via new gives us only up to 32k instead of a little bit less than 128k
    (minus code, data, stack).

Somebody who knows the mbed and ARM runtime and linker internals should be able to help.

Regards Helmut

PS: For programming you should also be able to use the Nucleo ST-Link interface which allows flashing via the USB drive and tty output via USB.

@star297
Contributor

star297 commented Feb 8, 2016

Helmut, could you run your tests again, the Flash/RAM size settings are now correctly set up on Mbed.

@helmut64
Contributor Author

helmut64 commented Feb 9, 2016

Dear Paul,

the Flash/RAM sizes look correct in mbed, however there are two remaining problems:
b) Nucleo L4: allocating more than the available memory freezes instead of returning NULL:
#include <new>
char *p = new (std::nothrow) char[96*1024];
I use the same check on the L1 and it works great.

a) Discovery L4: crashes before main. (same as before).

Regards Helmut

@star297
Contributor

star297 commented Feb 18, 2016

Helmut, we now have a fix for the interrupt problem.
It transpired that NVIC_RAM_VECTOR_ADDRESS is set to 0x20000000 instead of 0x10000000 in the cmsis_nvic.c file.
Setting this correctly now gives reliable interrupt function.
This may well have an effect on the issue you found here.
Could you make the NVIC change and retry?
Regards Paul

@helmut64
Contributor Author

@paul, regular RAM starts at 0x20000000, so this is certainly a problem. I updated the stm32l476xx.sct, which has the vectors at RW_IRAM1 (0x10000000), and missed the cmsis_nvic.c file.

I don't have a local ARM compiler (at present I use only the mbed.org one), however I can make the changes on GitHub.
Have you done a local compilation of mbed (which compiler/debugger)?

Thank you, Helmut
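
Why the wrong NVIC_RAM_VECTOR_ADDRESS matters can be illustrated with a simple overlap check. This is only a sketch under the assumptions stated in the thread: a 0x188-byte RAM vector table, an RW region in SRAM2 starting at 0x10000188, and general-purpose RAM starting at 0x20000000. The struct and function names are hypothetical and not part of mbed or CMSIS.

```cpp
#include <cstdint>

// A half-open address range [start, start + size).
struct AddressRange {
    std::uint32_t start;
    std::uint32_t size;
};

// True when the two ranges share at least one byte.
constexpr bool overlaps(AddressRange a, AddressRange b) {
    return a.start < b.start + b.size && b.start < a.start + a.size;
}

// Values taken from the thread; the names are illustrative only.
constexpr std::uint32_t kVectorTableBytes = 0x188u;
constexpr AddressRange kSram2RwRegion = {0x10000188u, 32u * 1024u - 0x188u};
constexpr AddressRange kSram1Region   = {0x20000000u, 96u * 1024u};

// Vectors copied to 0x20000000 land inside the RAM used for data/heap.
static_assert(overlaps({0x20000000u, kVectorTableBytes}, kSram1Region),
              "vector table at 0x20000000 collides with SRAM1 contents");
// Vectors at 0x10000000 sit in the area the linker script left free.
static_assert(!overlaps({0x10000000u, kVectorTableBytes}, kSram2RwRegion),
              "vector table at 0x10000000 stays clear of the RW region");
```

With the table at 0x20000000, anything the linker places at the start of SRAM1 shares its bytes with the relocated vectors, which would be consistent with the unreliable interrupts described above.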

@bcostm
Contributor

bcostm commented Feb 19, 2016

We have also changed the startup file for ARMCC and I can pass helmut's stack/heap example. I mean a NULL pointer is returned now without crash. I need to continue testing it but it looks ok... If you want I can send this startup file for you to test it also. Tell me.

@helmut64
Contributor Author

@bcostm, at present I don't own an ARM local compiler, I have used mbed only. My µVision has a 32kb limit which is good for debugging my code, but it cannot handle the entire mbed size.

@bcostm
Contributor

bcostm commented Feb 19, 2016

No problem Helmut. I have tested your stack/heap example with an update of the ARM_MICRO startup file on Nucleo and Disco L4 boards with uVision, and everything looks OK. For ARM_STD the example works without changing the startup file. The new startup file is here if someone wants to test it before I send a PR:

startup_stm32l476xx_ARM_MICRO_new.zip

@helmut64
Contributor Author

@bcostm, I have a question about your change. You specified a heap size of "0x8000 ; 32KB", however there are 98 vectors = 392 bytes (0x188) at the beginning of RW_IRAM1.

My original idea was that the vectors and the stack are in the 32 kB segment, and that the heap also starts in the 32 kB segment and continues in the 96k segment once the 32k are filled up. The benefit would be that the 32k supports the higher power saving mode where the 96k are switched off.
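
The 0x188 figure follows from the vector count, and it also determines how much of the 32 kB segment is left over. A minimal sketch, assuming 98 vectors of 4 bytes each as stated above; the names are made up for illustration.

```cpp
#include <cstdint>

constexpr std::uint32_t kVectorCount = 98u;        // as stated in the comment
constexpr std::uint32_t kBytesPerVector = 4u;      // one 32-bit address per entry
constexpr std::uint32_t kSram2Size = 32u * 1024u;  // 0x8000

// Size of a vector table with the given number of entries.
constexpr std::uint32_t vector_table_bytes(std::uint32_t count) {
    return count * kBytesPerVector;
}

// Bytes left in a region after the vector table at its start.
constexpr std::uint32_t bytes_after_vectors(std::uint32_t region_size,
                                            std::uint32_t count) {
    return region_size - vector_table_bytes(count);
}

static_assert(vector_table_bytes(kVectorCount) == 0x188u, "98 * 4 = 392 = 0x188");
static_assert(bytes_after_vectors(kSram2Size, kVectorCount) == 0x7E78u,
              "32376 bytes remain in the 32 kB segment");
```

So a full 0x8000-byte heap plus the vector table cannot both fit inside a single 32 kB segment; whatever shares the segment with the vectors has at most 0x7E78 bytes.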

@bcostm
Contributor

bcostm commented Feb 21, 2016

Your idea is great but I don't know how to do it...
@betzw do you know if this is possible ?

@betzw
Contributor

betzw commented Feb 22, 2016

I do not think it is possible to specify that the heap is built up of several non-contiguous memory regions. As far as I know, the uClib of the ARM toolset supports only two variables to specify the heap: __heap_base & __heap_limit. This does not seem enough to me to describe the actual situation on the STM32L4.
Please also consider that the heap is now allocated (entirely) in the RAM bank starting at 0x20000000!
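
To make the limitation concrete: a single base/limit pair describes one contiguous block, so spanning two banks needs an allocator that knows about both. The sketch below is a hypothetical bump allocator written for illustration only. It is not how the ARM C library or mbed manages the heap, it never frees memory, and the class and member names are invented.

```cpp
#include <cstddef>
#include <cstdint>

// One contiguous block of memory handed out front to back.
struct Region {
    std::uint8_t *base;
    std::size_t size;
    std::size_t used;
};

// Hypothetical allocator: tries the first region, then falls back to the second.
class TwoRegionAllocator {
public:
    TwoRegionAllocator(std::uint8_t *first, std::size_t first_size,
                       std::uint8_t *second, std::size_t second_size)
        : regions_{{first, first_size, 0}, {second, second_size, 0}} {}

    // Returns nullptr when neither region can hold the request.
    // A single block never straddles the gap between the two regions.
    void *allocate(std::size_t bytes) {
        const std::size_t rounded = (bytes + kAlign - 1) / kAlign * kAlign;
        if (rounded < bytes) {
            return nullptr;  // size overflowed while rounding up
        }
        for (Region &r : regions_) {
            if (rounded <= r.size - r.used) {
                void *p = r.base + r.used;
                r.used += rounded;
                return p;
            }
        }
        return nullptr;
    }

private:
    static constexpr std::size_t kAlign = 8;  // assumed alignment of the buffers
    Region regions_[2];
};
```

On the STM32L4 the two buffers would stand for the free parts of SRAM2 and SRAM1. Note that even with such a scheme the largest single allocation is bounded by the larger region, not by the 128k total.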

@helmut64
Contributor Author

@betzw, Hallo Wolfgang, I cannot imagine that there are not multiple regions where heap memory is located. There are many cases where an MCU has built-in memory and add-on memory which is all mapped into the main memory area at different locations.

Independent of this I believe a working solution for the STM32L4 series has priority because we never had a reliable working mbed version for the L4 series.

Regards Helmut

@betzw
Contributor

betzw commented Feb 22, 2016

Regarding multiple regions we need to ask ARM.
Regarding a reliable L4 mbed version, Bruno's PRs should get us closer to this objective.

@helmut64
Contributor Author

helmut64 commented Mar 4, 2016

@bcostm, @betzw, I tested the L4 Disco and Nucleo boards today, and they both work with the sample applications, etc. Regarding the memory test, I get 96 kB as expected, however MaxMemoryBlockAvailable() hangs when it tries to allocate 97 kB. On the L1 Nucleo board it returns NULL as expected when it runs out of memory.

I expect to get soon a Keil uVision license which allows me to debug this.

The original problems are fixed. Thanks for helping.

@helmut64 helmut64 closed this as completed Mar 4, 2016
@betzw
Contributor

betzw commented Mar 7, 2016

@helmut64 which toolchain did you use?

@helmut64
Contributor Author

helmut64 commented Mar 7, 2016

@betzw, I used the mbed.org online compiler.

@bcostm
Contributor

bcostm commented Mar 8, 2016

Hi @helmut64, I think it is normal that the MaxMemoryBlockAvailable() function still hangs with the mbed IDE. It's because the PR #1554 is not yet merged.

@helmut64
Contributor Author

helmut64 commented Mar 8, 2016

@betzw Hi Wolfgang, I just know that this was working as expected on the STM32L1 Nucleo board, while on the L4 board it hangs. In my app I regularly check the available memory via such a loop, to report how much memory is available and to detect low-memory situations.

PS: I purchased an ARM Keil MDK µVision compiler today, which means I can do true debugging without any code limits.

@betzw
Contributor

betzw commented Mar 8, 2016

Ciao @helmut64, you might want to take a close look at issue #1561 and the discussion there, above all regarding what people at ARM say about the mbed-classic memory model. It might mean that the way you try to determine the currently available amount of heap memory can make your application crash ...
