Calls made to malloc/free will enable interrupts on return #6246
Comments
Thanks for the analysis. The core currently doesn't support masking a specific interrupt level. That doesn't mean it can't be implemented, only that locking is currently implemented as all or nothing. In any case, your investigation hints at a deeper underlying issue with very wide scope. Dev notes: use case as follows: Considerations: |
Your sketch passes all three tests with:
It's not as simple as that because of definitions in Arduino.h, which is included from everywhere, so I ended up copying the class into umm for testing. Some include-and-definition-location rework is required (I can try to make a PR to prepare for that change if you need it). Thanks |
@d-a-v, thanks, but I am good. I have already worked through this issue for my needs. I saw this issue in the code while exploring how UMM_POISON worked. I ended up looking at how umm_malloc.cpp worked for a while and made a few other changes. For example, I monitored the min/max timings for the locks in info/malloc/free/realloc when running my project. Now that I have my project working, I am making my way through the different things I have changed and identifying the ones that are problems and need to be reported or fixed. Writing does not come easy for me. It takes me quite a few drafts before hitting send. Thanks to everyone for all that you do here. I don't know how you find the time to do so much. |
@mhightower83 your writing was good enough and your analysis detailed enough to lead my thinking right to an unknown problem that could explain a lot of odd crashes. Please uphold your practice of reporting in that way! |
@d-a-v is there anywhere in the umm code where: |
UMM's very simple, basically a single file of straight C. There are no object delete/creates, so (b) should not be an issue. There is only 1 UMM_CRITICAL_ENTRY per function, so that's fine. There are nested calls (realloc sets CRITICAL but then can call free(), which also sets CRITICAL). However, that's not a problem since only the first one will do anything. |
@earlephilhower the nested case you mention seems to me like a variation of the use case I described above. I took a quick look at umm, and realloc is the worst case: it calls free and malloc from inside the critical section:
The code from point A to point B inclusive won't be protected as is expected, unless we implement nested locks. |
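A host-side sketch of the nested case, under stated assumptions: the interrupt level is simulated with a plain variable, `sim_intr_lock()`/`sim_intr_unlock()` are hypothetical stand-ins that mimic the all-or-nothing behaviour described in this thread (lock raises to level 3, unlock always drops to 0), and the placement of points A and B is assumed for illustration.

```cpp
#include <cstdint>
#include <vector>

// Host-side simulation only: `intlevel` stands in for the Xtensa PS.INTLEVEL field.
uint32_t intlevel = 0;

// Hypothetical stand-ins, not the ROM functions: lock raises the level to 3,
// unlock always drops it to 0, whatever it was before.
void sim_intr_lock()   { intlevel = 3; }
void sim_intr_unlock() { intlevel = 0; }

// Interrupt level recorded at the points of interest.
std::vector<uint32_t> trace;

void sim_free() {
    sim_intr_lock();
    // ... release the block ...
    sim_intr_unlock();          // re-enables interrupts unconditionally
}

void sim_realloc() {
    sim_intr_lock();            // outer critical section begins
    trace.push_back(intlevel);  // before the nested call: protected (3)
    sim_free();                 // point A: nested lock/unlock pair
    trace.push_back(intlevel);  // point B: heap still being edited, level is 0
    // ... copy data, relink blocks ...
    sim_intr_unlock();          // outer critical section ends
}
```

After `sim_realloc()` runs, `trace` holds the level before the nested call and the level at point B, which makes the unprotected window visible.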
Thinking further, nested locks are actually already supported via multiple instances of InterruptLock, where each instance represents one level of nesting (the instance saves the previous state, then restores it on destruction). However, @d-a-v 's solution only applies to code under our control. Is there any way to know if the ets_intr_blah() functions are called from anywhere else, e.g.: the closed libs? |
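The save-then-restore idea can be sketched on a host machine as below. This is not the core's InterruptLock class: `ScopedIntLock` and `sim_ps_intlevel` are hypothetical names, and the interrupt level is a plain variable rather than a hardware register.

```cpp
#include <cstdint>

// Host-side stand-in for the PS.INTLEVEL field; not real hardware access.
uint32_t sim_ps_intlevel = 0;

// Hypothetical RAII guard: saves the current level, raises it, and restores
// the saved value (not 0) when the scope ends.
class ScopedIntLock {
public:
    explicit ScopedIntLock(uint32_t level = 15) : saved_(sim_ps_intlevel) {
        sim_ps_intlevel = level;
    }
    ~ScopedIntLock() { sim_ps_intlevel = saved_; }

    // Copying a guard would restore the same state twice, so forbid it.
    ScopedIntLock(const ScopedIntLock&) = delete;
    ScopedIntLock& operator=(const ScopedIntLock&) = delete;

private:
    uint32_t saved_;
};

// Returns the level observed after the inner guard has been destroyed
// while the outer guard is still alive.
uint32_t level_after_inner_scope() {
    ScopedIntLock outer;        // saves 0, sets 15
    {
        ScopedIntLock inner;    // saves 15, sets 15
    }                           // inner destructor restores 15, not 0
    return sim_ps_intlevel;
}
```

Each guard instance is one nesting level, so the code between the inner and outer scope ends stays protected.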
You can get a list of imports (undefined externals) easy enough for the |
@devyte @mhightower83 This |
@d-a-v I did not go into it in the issue description; however, I placed wrappers around ets_intr_lock/unlock, thinking I would replace them with something that worked better, handled nested calling, and preserved INTLEVEL. I did that and things didn't make sense. I took a step back and just built a monitor wrapper around ets_intr_lock/unlock, monitoring lock nesting depth, some lock time durations, max INTLEVEL, etc. Things got weird. I found that lock depths go negative and then return to 0! I have come to the conclusion that ets_intr_lock/unlock may be doing exactly what Espressif wants for the blob, but not what we need or thought it does. This is just a hypothetical. I have no idea if anything like this happens in the blob:
As best I can tell, ets_intr_lock/unlock is not documented. It is incidentally used in an example by Espressif, and we see the function name in the ROM linker table. I think that, based on their names, we have made assumptions about what they are for and how they are to be used, which is all we have to go on when documentation is weak. IMHO I would not try to fix ets_intr_lock/unlock; it may not be broken. However, I would not use it for any purpose in my work. For its length, this message was a bit quick for me. Not sure all my thoughts made it into words. I hope this is clear. |
Again, thanks for these details and your findings. After discussion with @earlephilhower and @devyte, we plan to remove exposure of To manage nested blocks, the implementation could rely on an internal stack or, even better, on @earlephilhower's idea: just a counter and the initial saved state that would be restored on the last
Do you mean that there can be more |
Yes, the depth range I saw was -1 to 2. That is why I think ets_intr_lock()/ets_intr_unlock() should be left alone just for the blob to use. I don't think they are using it for nesting locks in any way or form. Just an interrupt disable/enable method. And, maybe look back to see if they were called with interrupts enabled. BTW: I went through variations on the counter thing, saved/restored INTLEVEL on an array as the depth stepped up and down. The down is where things got weird and went negative. (my depth variable was originally unsigned :P) My conclusion was to just stay away from ets_intr_lock()/ets_intr_unlock(), far far away. Let the blob have it. |
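The counter idea discussed above (a nesting depth plus the state saved by the outermost lock) might look like the following host-side sketch. `NestedLock` and `sim_intlevel` are hypothetical names. On real hardware the read-and-raise step would have to be a single atomic operation, which this simulation does not model.

```cpp
#include <cstdint>

uint32_t sim_intlevel = 0;   // host-side stand-in for PS.INTLEVEL

// Hypothetical nested lock: one counter and one saved state.
struct NestedLock {
    int depth = 0;           // signed, so an unbalanced unlock is detectable
    uint32_t saved = 0;      // level in effect before the outermost lock

    void lock() {
        uint32_t previous = sim_intlevel;
        sim_intlevel = 15;                   // mask everything
        if (depth++ == 0) saved = previous;  // only the outermost lock remembers
    }

    // Returns false on an unbalanced unlock instead of letting depth go negative.
    bool unlock() {
        if (depth <= 0) return false;
        if (--depth == 0) sim_intlevel = saved;  // restore on the last unlock only
        return true;
    }
};
```

A caller that pairs every `lock()` with an `unlock()` gets the original level back only when the outermost pair closes; an extra `unlock()` is reported rather than silently driving the depth to -1.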
Oh, I wasn't sure how this was going to flow.
|
Some of my forgotten thoughts and concerns on the method used for the critical section:
|
Your suggestion of moving the memmove out of the critical section seems like a simple way to get a much shorter period of IRQs disabled! (I'd go further and say that if you're doing malloc/free/realloc/new/delete inside an ISR, you're doing something very wrong, but most users here aren't professional or shipping critical embedded systems.) FWIW, we've not seen the time spent with IRQs disabled in UMM as a source of problems. The bug of re-enabling IRQs even when they were disabled, is of course something that needs fixing. But the actual UMM operations themselves seem pretty solid as-is. |
Yes, I think that problem only surfaced when you used the print option. You fixed it before I could report it :). For me, the problem showed up when I tried to see what a UMM_POISON failure looks like.
However, when you use Arduino/cores/esp8266/umm_malloc/umm_malloc.cpp Lines 1768 to 1771 in adf2b14
And for some, a lot of calls are made to ESP.getFreeHeap() to track leaks.
By fixing the critical section operation, the IRQs disabled time may increase, as memcpy will then run behind a critical section. |
This is really opening a can of worms, but it seems to me there's no reason |
I remember it goes through the whole linked list every time. Edit: nice!
side note: If we improve umm, maybe we could do it with minimum changes in file structure so we can also propose them upstream. |
Sorry, I am sometimes too subtle. I have changes for that. I'll do a pull request as a sneak peek at what I have when I get back home (in about an hour), and we can reduce it from there. |
Thanks for all details. To summarize my understanding:
side notes:
|
Actually,
I'll have to take another look at that. I remember following that part of the code trying to understand how the error was handled. And looked up how it was supposed to be handled. I must have had one of my moments of confusion. [Edited] I thought it worked. |
@d-a-v before I forget again, thank you for the summary acknowledgment that was reassuring.
For "maybe fragmentation too" I went back to some of my earlier edits of umm_malloc.cpp to try and remember some details. My concern with the fragmentation calculation involves the multiplies: I was concerned that they could add significantly to the run time of a malloc or free; however, I never explored whether that was true. On the other hand, unless I am in error, I think the existing calculation for freeSize2 could be improved by moving the summing of the square of sizeof(umm_block) outside the while loop. Then freeEntries * square of sizeof(umm_block) would give you the final adjustment to add in. Also, there was something about removing DBG_LOG_FORCE with forced printing that improved something. I think it was mainly size (I did try this in IRAM for a while), and maybe a little speed, since it removes an if from the loop. But then I don't know if this is a valued feature. Maybe support it as a build option. Sorry, recollection is weak on this one. Hmm, I am starting to realize that there isn't much I haven't touched in this module. I get distracted and go in way too many directions when I work on a problem. I guess I have a little ADHD.
I think this is something different, I wrote I was thinking of this change you made: d-a-v@0b68126 |
The metric is the Euclidean/SRSS/L² norm on free space chunks (free chunk = contiguous free blocks): According to @earlephilhower we don't have HW multiplier for 32bits*32bits=>64bits but I think we have it for 32bits*32bits=>32bits and that's the one used because the result is stored into a 32bits int. This may have to be checked. And even in that case I don't know the cost. I think however this calculation could only be enabled in debug mode.
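As a rough illustration of an L² norm over free chunks, the sketch below turns a list of free chunk sizes into a percentage. The normalisation used here, 100 - 100 * sqrt(sum of squares) / sum, is an assumption made for this example, and it uses floating point on a host rather than the integer arithmetic the core would need. `fragmentation_percent` is a hypothetical name.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Fragmentation in percent from free chunk sizes (any consistent unit).
// Assumed normalisation: 100 - 100 * sqrt(sum(size^2)) / sum(size).
int fragmentation_percent(const std::vector<uint32_t>& free_chunks) {
    uint64_t sum = 0;
    uint64_t sum_sq = 0;
    for (uint32_t size : free_chunks) {
        sum += size;
        sum_sq += static_cast<uint64_t>(size) * size;  // 64-bit to avoid overflow
    }
    if (sum == 0) return 0;  // no free space: report as not fragmented
    double norm = std::sqrt(static_cast<double>(sum_sq));
    double percent = 100.0 - 100.0 * norm / static_cast<double>(sum);
    return static_cast<int>(std::lround(percent));
}
```

With this normalisation a single free chunk gives 0, and splitting the same free space into more equal chunks pushes the value up, for example four equal chunks give 50.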
Only in debug mode no ?
Yes, it's the logic from upstream that has been improved by its author I think, but I haven't looked into details. I was simply mentioning it because sources are about to diverge. |
Actually, the define for DBG_LOG_FORCE is always there. There is currently no conditional for that debug macro. Arduino/cores/esp8266/umm_malloc/umm_malloc.cpp Lines 594 to 600 in 7c67015
Sorry I got my factorization wrong. I believe there is a square constant that can be factored out and multiplied back at the end of the sum. Reducing the total number of multiplies within the while loop. Maybe this will illustrate my thinking:
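A minimal sketch of the factoring idea, not the author's original snippet: since (n * B)² = n² * B², the constant B² can be applied once after the loop. `kBlockSize` is an assumed value standing in for sizeof(umm_block), and both function names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

constexpr uint32_t kBlockSize = 8;  // assumed sizeof(umm_block), for illustration

// Converts each chunk to bytes inside the loop: two multiplies per chunk.
uint32_t sum_squares_naive(const std::vector<uint32_t>& blocks_per_chunk) {
    uint32_t total = 0;
    for (uint32_t n : blocks_per_chunk) {
        uint32_t bytes = n * kBlockSize;
        total += bytes * bytes;
    }
    return total;
}

// Sums the squared block counts, then applies the constant once:
// one multiply per chunk inside the loop, one after it.
uint32_t sum_squares_factored(const std::vector<uint32_t>& blocks_per_chunk) {
    uint32_t total = 0;
    for (uint32_t n : blocks_per_chunk) {
        total += n * n;
    }
    return total * (kBlockSize * kBlockSize);
}
```

Both versions use 32-bit unsigned arithmetic, so they agree even when the sum wraps around, because the factoring holds modulo 2³².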
|
I just had a look at the esp8266 boot rom disassembly by @trebisky and a read of the Xtensa instruction set reference manual.... ets_intr_lock() actually returns the old interrupt level - by accident or design I'm not sure, but it's baked in rom now... Looking at the disassembly:
rsil reads the register into a2, then updates it with '3'.
proves it works, and
will restore the register as appropriate. note the (saved & 0x0F) is necessary as the whole register is returned and only the low 4 bits contain the interrupt level. We could certainly do with some better locking mechanisms, and applying this knowledge to my copy of umm_malloc and lwIP doesn't seem to have made any difference, but hey, there it is :-) |
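The masking step can be illustrated with a host-side simulation. The PS register is a plain variable here, `sim_rsil`, `sim_lock`, and `sim_restore` are hypothetical names, and the extra bit set in the initial value is arbitrary; only the layout "INTLEVEL in the low 4 bits" and the lock level of 3 are taken from the description above.

```cpp
#include <cstdint>

// Host-side stand-in for the PS register. The low 4 bits hold INTLEVEL;
// the 0x20 bit is an arbitrary "other" bit, set only for illustration.
uint32_t sim_ps = 0x20;

// Mimics the rsil behaviour described above: returns the whole old register,
// then replaces only the INTLEVEL field.
uint32_t sim_rsil(uint32_t level) {
    uint32_t old = sim_ps;
    sim_ps = (sim_ps & ~0x0Fu) | (level & 0x0Fu);
    return old;
}

// Lock raises to level 3 and hands back the previous register value.
uint32_t sim_lock() { return sim_rsil(3); }

// Restore uses only the low 4 bits of the saved value, since the saved
// word also carries bits that are not part of the interrupt level.
void sim_restore(uint32_t saved) { sim_rsil(saved & 0x0F); }
```

A caller that was already at level 2 gets level 2 back after `sim_restore(saved)`, rather than being dropped to 0.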
The ets_intr_blah() functions have one major flaw: the intr reg stored/returned is never restored; instead, the unlock re-enables all interrupts. That means, e.g., you can't have nested critical sections, such as is currently the case in umm realloc, because the code between the inner unlock and the outer unlock won't be protected. I don't know if any of the above is the cause of what you're seeing, but any help is appreciated, either confirming, implementing, finding some other cause of instability, etc. |
Agreed that not being able to select the interrupt level is a flaw, and there needs to be the option to use that if required. |
#6274 is merged, the original issue here is addressed. For whatever is pending, let's discuss in a new issue. |
Basic Infos
Platform
Settings in IDE
Problem Description
To simulate an IRQ handler calling malloc, I used xt_rsil() to set INTLEVEL. When malloc/free are called with an elevated INTLEVEL, INTLEVEL is set to 0 on return.
I noticed that ets_intr_lock() and ets_intr_unlock() are being used to handle the critical section for umm_malloc. ets_intr_unlock() does not restore the previous INTLEVEL; it does an rsil a2, 0. Ref: #2218 (comment). This causes the immediate problem.
MCVE Sketch
Debug Messages
Other Thoughts and Observations
The comments in the code indicate that the locking macros must be allowed to nest. Regarding the comment about _umm_malloc() having a call to _umm_free(), I think the code may have been changed after that was written; however, there are calls to _umm_malloc() and _umm_free() from _umm_realloc() that would require nesting support.
Arduino/cores/esp8266/umm_malloc/umm_malloc_cfg.h Lines 103 to 114 in 5a47cab
I have explored using XTOS_SET_MIN_INTLEVEL and XTOS_RESTORE_INTLEVEL from ESP8266_RTOS_SDK/components/esp8266/include/xtensa/xtruntime.h for the UMM_CRITICAL_ macros. It has been working well for me. And there are comments in xtruntime.h about how it works, something I could never find for ets_intr_lock(). I like that you can specify an INTLEVEL to XTOS_SET_MIN_INTLEVEL and, if the current level is higher, it will keep it instead of demoting it, as ets_intr_lock did.
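The "keep the higher level" behaviour can be sketched on a host as follows. These functions are hypothetical stand-ins that only model the semantics described above; they are not the xtruntime.h macros, and the interrupt level is a plain variable.

```cpp
#include <algorithm>
#include <cstdint>

uint32_t sim_level = 0;  // host-side stand-in for PS.INTLEVEL

// Hypothetical model of "set minimum level": raise the level to at least
// `min_level`, never lower it, and return the previous level for later restore.
uint32_t sim_set_min_intlevel(uint32_t min_level) {
    uint32_t previous = sim_level;
    sim_level = std::max(sim_level, min_level);
    return previous;
}

// Hypothetical model of the matching restore: put back exactly what was saved.
void sim_restore_intlevel(uint32_t previous) { sim_level = previous; }
```

With this model, code entered at level 5 stays at 5 inside a section that asks for level 3, and code entered at level 1 is raised to 3 and then returned to 1.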