-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Stack] Increase stack to 5k and reduce stack allocations in rules #1858
Conversation
Removed some unneeded stack allocations of 240 bytes in the rules processing. Also gave the `process_system_event_queue()` a bit more priority to make sure it will be looked at at least once every 500 msec.
why not just drop -DNO_EXTRA_4K_HEAP? |
For core 2.4.1 it doesn't do anything. |
just a question for my understanding: if I add -DNO_EXTRA_4K_HEAP as compile-flags in Arduino Framework, my nodes go into boot-loops. does this change something in the settings structure/place or similar? Do I need to start with a "clean" node if I change this option (eg. zero all setings/memory before flashing)? after removing this option and compiling/flashing again, the nodes worked as before... btw: I use latest esp git core normally.... |
@clumsy-stefan You should no go into boot loops. But you may need to recompile everything (make clean) |
ok, thanks. I did recompile (incl. the lwip2 libs) and did get these loops... I'll investigate a bit further at the weekend why this happens... it seems to happen just when it tries to connect to the WiFi as the "startup" rule gets processed (plays a sound on the speaker) |
Hmm, reboots like that (wifi related) also happened a while ago when the 'nightly builds' resulted in faulty compiles. |
I always ran my own builds, as I need some plugins that are not in the standard build. I compile it on a Mac in the Arduino framework. but as said, it just happened when I addeded -DNO_EXTRA_4K_HEAP -DCONT_STACKSIZE=5120 |
As suggested by @s0170071 [here](letscontrolit#1774 (comment))
Core is now set to 2.4.1 again, so this flag is not needed anymore. Also not sure if it did fix something in 2.4.2 core.
@TD-er: if I add -DCONT_STACKSIZE=5120 to the Arduino build environment and compile it I run into bootloops (serial log attached, the wdt reset happens all at the end, before it connects to the WiFi...) Not sure what's different between the Arduino Framework and the plattformio build.... but for sure this define kills my builds.... any idea where I could/should look? ets Jan 8 2013,rst cause:2, boot mode:(3,7) load 0x4010f000, len 1384, room 16 RuleDebug Processing:rules1.txt RuleDebug: 100: system#sleep do RuleDebug: 101: rtttl,15:d=4,o=5,b=125:c.,c,8c,c.,d#,8d,d,8c,c,8c,2c. ets Jan 8 2013,rst cause:4, boot mode:(3,7) wdt reset RuleDebug Processing:rules1.txt INIT : Booting version: SMY_2.07 Oct 6 2018 08:25:31 (ESP82xx Core 9bc8ea1b, NONOS SDK 2.RuleDebug: 101: rtttl,15:d=8,o=5,b=150:8f |
tested with my oscillating rules: Free Mem: | 13168 (11824 - sendContentBlocking) <<- draining at 1k/s If you disable the rules processing immediately before it runs out of memory, it will fully recover. |
@TD-er I do also have trouble with a larger stack and the extra4kHeap. Could it be that the larger cont stack does overlaps too much with the sys stack when you define extra4kheap ? |
@TD-er sorry to spam you, but I just got an idea. The last time I have seen a memory drain that recovers (see 2 posts above) was when the ESP got wifi packets that no one read. The packets are stored by the core/lwip until it crashes. The core guys consider this a feature, not a bug. Its a bug on our side. |
Just to make sure we're talking about the same thing here. For core 2.4.2 and up, not adding this define does yield 4k extra RAM, but it lowers the location of the Arduino stack and overlaps the system stack. The 'cont' stack does grow when defining |
@s0170071 Do you have some links on how to detect if there is data in the LWIP buffers and how to empty them? |
I think you just have to call the various handle functions often enough.... |
And what if there are packages sent to the node for which no handle function is set? |
Then it would not recover if I stop the rules ;-) |
And what calls do you have in the rules that read from the LWIP buffers? |
None. |
@TD-er NO_EXTRA_4K_HEAP also enables WPS support |
@uzi18 That's good to know, and also a very big "why would anyone ever do that???" |
found a bug: static byte nestingLevel = 0; If you initialize it again and again each time you come by, what good is a check of that byte two lines later... ?
edit: it is in misc.ino |
A static variable will only be initialized once. So this is not a bug. |
@TD-er sorry, my bad. You're right.
To save you some searching, the rules I use are:
|
There is an f.close() missing at the end of function rulesProcessingFile(String fileName, String& event) |
@s0170071 That's silly, I looked at the code of FS.h and it seems there is no destructor of the object. |
It is not resolving the issue but f.close seems to delay the inevitable crash a bit. |
I am already adding a f.close() to all instances of fs::File And it may also cause strange corruptions I guess when there remains a lock to a file. |
ok, I got a theory, just right now not the time to prove it: Looking at function
|
It looks like this Only thing is that this |
As suggested [here](letscontrolit#1858 (comment)) the previous used flag may also enable WPS. So I used [this suggestion](esp8266/Arduino#5148 (comment)) to only re-allocate the stack at link time.
@TD-er some simplification in misc.ino, rulesProcessingFile()
is pretty much the same as:
and you won't need the buf furthermore in rulesProcessing, if I set event to "" for the empty rules files, the memory drain is gone.
so runnig an event to an empty file via rulesProcessingFile(fileName, event); drains memory. |
Maybe because there isn't a newline at the end of an empty file? |
Just a quick heads-up: The unit from tonight that only processes one rules file is approaching
rules were running all night. disabling rules, no reboot shows:
with reboot:
|
Arggghh !, I know why it worked better when only processing rules1.txt. Still, the basic question remains, where did the memory go? |
Removed some unneeded stack allocations of 240 bytes in the rules processing.
Also gave the
process_system_event_queue()
a bit more priority to make sure it will be looked at at least once every 500 msec.