-
-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Bugfixes in FX data allocation #4783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
realloc was not handled properly. also added *intermediate* fix for waitForIt(), see wled#4779
|
""" WalkthroughThe changes update memory management in the Changes
Estimated code review effort3 (120 minutes) Possibly related PRs
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
wled00/FX_fcn.cpp (1)
1690-1690: Acknowledge the intermediate timeout fix.The 100ms margin addition provides a safety buffer for the wait timeout. However, as noted in the TODO comment, this appears to be a temporary workaround.
Consider tracking the proper timeout fix mentioned in issue #4779 to ensure this temporary solution gets replaced with a more robust implementation. Would you like me to help create a tracking issue or investigate the root cause of the timeout problem?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
wled00/FX_fcn.cpp(2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: blazoncek
PR: wled/WLED#4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like `if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))` is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: willmmiles
PR: wled/WLED#4712
File: wled00/FX_fcn.cpp:992-993
Timestamp: 2025-06-07T15:58:42.579Z
Learning: In WLED segments, the backing pixel buffer is always allocated to the full physical segment size, but effect rendering only uses a subset when mirror or grouping are enabled. For 2D segments, this subset is vWidth() * vHeight(), and for 1D segments it's vLength(). Color processing operations like fade_out should iterate over this rendering subset, not the full allocated buffer.
Learnt from: blazoncek
PR: wled/WLED#4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the `getSegment()` function guards against out-of-bounds segment IDs by falling back to the main segment ID, and `getFirstSelectedSegId()` falls back to `getMainSegmentId()` if no segments are selected, ensuring no crashes when used through the `setValuesFromFirstSelectedSeg()` macro.
Learnt from: blazoncek
PR: wled/WLED#4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the `getSegment()` function guards against out-of-bounds segment IDs, and `getFirstSelectedSegId()` falls back to `getMainSegmentId()` if no segments are selected, ensuring no crashes when used through the `setValuesFromFirstSelectedSeg()` macro.
Learnt from: blazoncek
PR: wled/WLED#4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the `getSegment()` function guards against out-of-bounds segment IDs, and `getFirstSelectedSegId()` falls back to `getMainSegmentId()` if no segments are selected, ensuring no crashes when used through the `setValuesFromFirstSelectedSeg()` macro.
Learnt from: blazoncek
PR: wled/WLED#4482
File: wled00/udp.cpp:343-347
Timestamp: 2025-02-19T12:41:22.676Z
Learning: The setName() method in WLED's Segment class (FX_fcn.cpp) handles all necessary validation internally, including:
1. Null pointer checks
2. Length validation against WLED_MAX_SEGNAME_LEN
3. Memory allocation/reallocation
No additional validation or name clearing is needed before calling setName().
Learnt from: DedeHai
PR: wled/WLED#4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
Learnt from: blazoncek
PR: wled/WLED#4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: blazoncek
PR: wled/WLED#4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.200Z
Learning: In WLED, maximum segment name length varies by platform:
- ESP8266: 32 characters (WLED_MAX_SEGNAME_LEN = 32)
- ESP32: 64 characters (WLED_MAX_SEGNAME_LEN = 64)
This platform difference can cause truncation when syncing longer names from ESP32 to ESP8266. Additionally, the WLED UI has limitations regarding modified maximum segment name lengths.
Learnt from: DedeHai
PR: wled/WLED#4682
File: wled00/FX.cpp:8997-9005
Timestamp: 2025-05-09T18:43:15.355Z
Learning: In the WLED codebase, SEGMENT.custom3 is always constrained to the range 0-31 and will not exceed this range.
wled00/FX_fcn.cpp (11)
Learnt from: blazoncek
PR: wled/WLED#4667
File: usermods/user_fx/user_fx.cpp:27-30
Timestamp: 2025-04-30T05:41:03.633Z
Learning: In WLED, the Segment::allocateData() method already includes optimization to check if data is allocated and sufficiently sized, handling buffer reuse to reduce memory fragmentation. Adding an external check like `if (SEGENV.data == nullptr && !SEGENV.allocateData(dataSize))` is unnecessary and could be problematic, as it bypasses proper size verification.
Learnt from: willmmiles
PR: wled/WLED#4712
File: wled00/FX_fcn.cpp:992-993
Timestamp: 2025-06-07T15:58:42.579Z
Learning: In WLED segments, the backing pixel buffer is always allocated to the full physical segment size, but effect rendering only uses a subset when mirror or grouping are enabled. For 2D segments, this subset is vWidth() * vHeight(), and for 1D segments it's vLength(). Color processing operations like fade_out should iterate over this rendering subset, not the full allocated buffer.
Learnt from: blazoncek
PR: wled/WLED#4482
File: wled00/udp.cpp:343-347
Timestamp: 2025-02-19T12:41:22.676Z
Learning: The setName() method in WLED's Segment class (FX_fcn.cpp) handles all necessary validation internally, including:
1. Null pointer checks
2. Length validation against WLED_MAX_SEGNAME_LEN
3. Memory allocation/reallocation
No additional validation or name clearing is needed before calling setName().
Learnt from: DedeHai
PR: wled/WLED#4750
File: usermods/audioreactive/audio_reactive.cpp:311-321
Timestamp: 2025-06-27T06:04:44.652Z
Learning: In WLED's AudioReactive usermod, FFT buffer memory (valFFT, windowFFT) is allocated once during initialization and never deallocated, following typical embedded system design where core functionality buffers persist for the application lifetime. Raw buffer pointers for aligned allocations don't need to be tracked in the current implementation since deallocation is never performed.
Learnt from: blazoncek
PR: wled/WLED#4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the `getSegment()` function guards against out-of-bounds segment IDs, and `getFirstSelectedSegId()` falls back to `getMainSegmentId()` if no segments are selected, ensuring no crashes when used through the `setValuesFromFirstSelectedSeg()` macro.
Learnt from: blazoncek
PR: wled/WLED#4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the `getSegment()` function guards against out-of-bounds segment IDs, and `getFirstSelectedSegId()` falls back to `getMainSegmentId()` if no segments are selected, ensuring no crashes when used through the `setValuesFromFirstSelectedSeg()` macro.
Learnt from: blazoncek
PR: wled/WLED#4658
File: wled00/led.cpp:90-91
Timestamp: 2025-04-24T09:31:06.879Z
Learning: In the WLED codebase, the `getSegment()` function guards against out-of-bounds segment IDs by falling back to the main segment ID, and `getFirstSelectedSegId()` falls back to `getMainSegmentId()` if no segments are selected, ensuring no crashes when used through the `setValuesFromFirstSelectedSeg()` macro.
Learnt from: blazoncek
PR: wled/WLED#4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.200Z
Learning: In WLED, maximum segment name length varies by platform:
- ESP8266: 32 characters (WLED_MAX_SEGNAME_LEN = 32)
- ESP32: 64 characters (WLED_MAX_SEGNAME_LEN = 64)
This platform difference can cause truncation when syncing longer names from ESP32 to ESP8266. Additionally, the WLED UI has limitations regarding modified maximum segment name lengths.
Learnt from: blazoncek
PR: wled/WLED#4482
File: wled00/udp.cpp:147-149
Timestamp: 2025-02-19T12:43:34.199Z
Learning: ESP8266 and ESP32 platforms have different maximum segment name lengths in WLED, which can cause truncation when syncing segment names between devices. This platform difference affects the user experience when using the segment name sync feature.
Learnt from: netmindz
PR: wled/WLED#4728
File: wled00/FX.h:378-378
Timestamp: 2025-06-15T09:59:52.720Z
Learning: In WLED's FX.h, MODE_COUNT represents the highest FX_MODE_ ID + 1, not the total count of FX_MODE_ definitions. This is because some effects may have been removed but their IDs are not reused, creating gaps in the numbering sequence. MODE_COUNT is used for array bounds and range checking.
Learnt from: netmindz
PR: wled/WLED#4728
File: wled00/FX.h:378-378
Timestamp: 2025-06-15T09:59:52.720Z
Learning: In WLED's FX.h, MODE_COUNT represents the highest FX_MODE_ ID + 1, not the total count of FX_MODE_ definitions. This is because some effects may have been removed but their IDs are not reused, creating gaps in the numbering sequence. MODE_COUNT is used for array bounds and range checking.
🔇 Additional comments (1)
wled00/FX_fcn.cpp (1)
162-174: Excellent memory management fix!This change properly handles
realloc()failure scenarios and prevents memory leaks. The implementation correctly:
- Uses a temporary pointer to store
d_realloc()result- Preserves the original pointer if reallocation fails
- Properly frees old memory and updates tracking variables on failure
- Maintains accurate
_usedSegmentDataaccountingThis addresses the critical issue described in the PR where the original data pointer was lost if
d_realloc()returned null.
willmmiles
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. No objections to a stopgap for #4779.
|
thanks for reviewing. I found a few more bugs. will add them to this PR. Edit: |
|
I am testing ESP32 for larger setups with the latest code. |
found one of the culprits leading to crashes in 1D->2D expansions. Corner MUST be boundary checked as it blindly writes the max dimension, crashing anything non quadratic immediately. also removed realloc() in other places, improving fragmentation.
|
IIRC, the web server core doesn't typically do any allocations bigger than a single packet (1500 bytes). AsyncJSONResponse serializes on a packet-by-packet basis; it doesn't need a contiguous block either, other than the globally allocated JSON buffer. Do you get HTTP 500s from specific requests? The web server core monitors free space before accepting each call. |
wled00/FX_fcn.cpp
Outdated
| void* newData = d_realloc(data, len); // note: realloc returns null if it fails but does not free the original pointer | ||
| if (!newData || newData != data) // realloc failed or used a new memory block causing fragmentation. free and allocate new block | ||
| d_free(newData); | ||
| d_free(data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to free the old buffer, realloc() has already free()d it for you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(unless it returned null)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but only if it does not fail, i.e. if newData is valid. Or ist that incorrect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
But if realloc did allocate a new buffer for you, there's no point in releasing it and trying again; malloc could just give you the new buffer back, since there wasn't enough space to extend the old one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right, I had an error in my thought process thinking that if a new segment is created and the old data copied it would help fragmentation if deleting all and allocating new but that is exactly what realloc() actually does.
I did not investigate what exactly happens. I looked more into why heap is fragmented so much. Sometimes I do get 503, other times its just stuck on loading without error in the browser. |
Under extreme heap pressure it will just drop the connection entirely. It needs to be pretty extreme though (biggest block <2048, free heap < 8192). |
|
I have the UI failing at ~8k smallest block and 70k free heap. so must be something else at play. I am actually working on PS memory management, these are just things I picked up along the way and had a play and some printouts just to make sure its not the PS causing it. It may still be at fault but its pretty hard to say as I am debugging on several fronts... |
There are a lot of transient allocations in the TCP stack. The view from the web server might be much tighter than elsewhere in the code. You can try building with |
|
This is what I get. 64x32 2D matrix, ESP32, selecting some RAM heavy PS FX: Last FX is running happily, just UI is dead. Edit: after changing FX a few times (even non-PS FX) UI breaks once more. |
|
got the other error again: it just stops responding, largest free block is only 1.2k at this point: after this, largest free block increases to 4k but UI never recovers. |
Sorry the output is somewhat inscrutable. The values are pointer to the AsyncClient object, the remote port number, the number of web requests queued, the largest free RAM block, and the available heap. So yeah, it's just out of memory. :( |
wled00/FX_fcn.cpp
Outdated
| if (length() != oldLength) { | ||
| if (pixels) pixels = static_cast<uint32_t*>(d_realloc(pixels, sizeof(uint32_t) * length())); | ||
| else pixels = static_cast<uint32_t*>(d_malloc(sizeof(uint32_t) * length())); | ||
| if (pixels) d_free(pixels); //pixels = static_cast<uint32_t*>(d_realloc(pixels, sizeof(uint32_t) * length())); // using realloc can cause additional fragmentation instead of reducing it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a misleading comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is what I have seen in tests: when using realloc, fragmentation was worse. When using free/malloc it was somewhat better but both approaches do cause fragmentation. The idea behind this is that free + malloc allows it to fill gaps left behind.
Got any test scenario that is good to test which works better in practice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sole purpose of realloc() functions is to reduce fragmentation. free + malloc will only fill gaps if garbage collector did its job. Unfortunately I have no knowledge of MM on ESP platforms.
IIRC @softhack007 was having a lot of issues with memory fragmentation in the past but I no longer remember what was the outcome (except that FX memory is no longer released and it only grows once allocated).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is zero garbage collection, its first come, first serve from the bottom up.
so if a block of memory divides a larger block, free -> malloc should move it to the bottom, realloc will keep it there if it fits.
Using realloc can prevent fragmentation but it can also make it worse if large buffers are at play. At least that is my understanding and also what I have seen in tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then go a step further and simplify p_realloc and d_realloc to use p_free/p_malloc and d_free/d_malloc instead.
It is the cleanest solution.
However ESP8266 will need the same.
memory is a bit of an issue currently. If I set up: 64x64 using 8 Panels, 512 LEDs each and have a bootup preset = solid the UI already does not work anymore. there is "plenty" of free heap (60k) but largest block is too small it seems (around 10k). |
|
@DedeHai this was a great find and a poor performance on my side. I should've thought this better. However I would prefer to fix the issue at its core, like so: void *p_realloc(void *ptr, size_t size) {
int caps1 = MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT;
int caps2 = MALLOC_CAP_DEFAULT | MALLOC_CAP_8BIT;
void *newbuf = nullptr;
if (psramSafe) {
if (heap_caps_get_free_size(caps2) > 3*MIN_HEAP_SIZE && size < 512) std::swap(caps1, caps2); // use DRAM for small alloactions & when heap is plenty
newbuf = heap_caps_realloc_prefer(ptr, size, 2, caps1, caps2); // otherwise prefer PSRAM if it exists
if (newbuf) return newbuf; // realloc successful
else {
p_free(ptr); // free old buffer if realloc failed (to keep consumer allocation logic simple)
return p_malloc(size); // fallback to malloc if realloc failed (buffer will not be copied!!!)
}
}
newbuf = heap_caps_realloc(ptr, size, caps2); // fallback to default realloc
if (newbuf) return newbuf; // realloc successful
else {
p_free(ptr); // free old buffer if realloc failed
return heap_caps_malloc(size, caps2); // fallback to malloc if realloc failed
}
}I do agree that this hides the issue of failed reallocations downstream but IMO that is an acceptable tradeoff as the code is simpler. |
Indeed it is. 😁 Inventing the logic to switch between DRAM and PSRAM is the key. And if the PSRAM is fast enough for LED and effect data. |
- realloc does not free original buffer on fail - see also wled#4783 - align DRAM alloc to 32bit read (experimental)
|
@blazoncek can you explain the intent of this function? void *d_malloc(size_t size) {
int caps1 = MALLOC_CAP_DEFAULT | MALLOC_CAP_8BIT;
int caps2 = MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT;
if (psramSafe) {
if (size > MIN_HEAP_SIZE) std::swap(caps1, caps2); // prefer PSRAM for large alloactions
return heap_caps_malloc_prefer(size, 2, caps1, caps2); // otherwise prefer DRAM
}
return heap_caps_malloc(size, caps1);
}is the idea to put all rendering buffers in PSRAM, even if there is plenty of DRAM available? What am I missing? |
|
FWIW, deferring the mapping to the last step in the render process - as we now do with blending - should make things more performant when using PSRAM. PSRAM on ESP32 has a cache in SRAM which is paged in and out: so when you hit a PSRAM address, that page gets swapped in, and then "nearby" accesses are just as fast as any other SRAM until the page needs to be swapped out again. So if you're doing a lot of linear accesses (ie. start to end, in row order) it's pretty fast; if you're doing a lot of random or non-linear accesses (such as when applying ledmaps, inversions, mirroring, 2d expansion, etc.) it can be really slow paging in and out all the time. The more of the mapping logic that can be done later, the faster it will go. |
|
d_ should prefer DRAM while p_ should prefer PSRAM. Assuming PSRAM is available. |
Is that true for rev.1 and rev.3 ESPs? @softhack007 had very bad results with PSRAM on classic ESP32 in the past. I also have a test units rev.1 and rev.3 so can compare when time permits. |
I'll review it in more detail, but as far as I could tell, it's fundamental to the memory architecture -- there isn't any way for the CPU to directly address PSRAM, it all goes through the MMU and is backed by some SRAM. MMU bugs in those accesses are entirely possible, though. |
|
Especially MMU and paging issues with dual-core systems -- there seem to be a lot of SMP coherency-related fixes and workarounds in the ESP-IDF source code... |
huge as in larger than MIN_HEAP_SIZE (which is only 2k)? |
Whatever the code says. 😄 BTW at the time of writing MIN_HEAP_SIZE was 8k. |
|
8k min heap makes a lot more sense :) at least for ESP32's. webUI / webserver issue: I ran some testing on ESP32: as you can see in the logs above, at 8k of heap (actual usable heap) its already very unstable. @willmmiles the reason the webUI does not work right is closely tied to available heap (obviously) but once it breaks its broken for good: I can get it to half-load the UI, throwing some "json abruptly ends" kind of errors, once the heap recovers to over 20k with a 10k contiguous part it still stays broken until reboot. So keeping a minimum heap of about 8k makes sense but something still breaks the UI. 2k min heap is uselsess. Also: on ESP32 @blazoncek I found one more subtle bug I could not pinpoint: when transitions are enabled but allocation of pixel buffer fails, FX are still executed with |
|
@willmmiles I see you changed the MIN_HEAP_SIZE to 2k in your async webserver update. at the same time WLED_REQUEST_HEAP_USAGE is set to 12k. Does that mean you expect any request to not be processed correctly below 12k of heap? If so, the MIN_HEAP_SIZE should be larger than that no? |
Not a bug of segment logic per-se. If a request for segment's pixel buffer fails (during start of transition), a copy of "old" segment is made, however its We can add |
- stop transition - add aditional check in service() - see discussion in wled#4783
|
I'm aware there are still unresolved issues with surviving heap exhaustion in the web server stack. I haven't gone all the way to the bottom yet - I believe there are some problems even in the wifi drivers themselves where they can hang up connections, leaking packet buffers, if there's an allocation failure on some critical path. I'd tried to track down as many places as possible and make sure things are handled gracefully, but I haven't managed to get them all yet. Sorry - still some work to do there. :( The heap availability checks in AsyncWebServerWLED do use MALLOC_CAPS to check only for memory that There are actually several layers of heap checks in the web server:
|
- ESP32 C3 has no PSRAM, it now uses default alloc functions - also added missing UI info for "Error 7"
|
Update:
from my side this is ready to merge. |
Why not also add a fix for inactive "old segment"? All it needs is add
It will solve this. BTW I also noticed (while poking in my fork) that there are no safeguards if |
that should be a seperate PR, I already wrote it down in my notes.
I also have a fix for that: release FX data if ram runs low. PS can allocate quite a lot of FX memory which should be released but not if there is no shortage of ram. edit: |
|
Very happy to see some optimisations being done for better memory management for WLED. Not something for this PR, but at some point I think it would be good to a bit of a review of PSRAM usage overall as especially with octal, we could make greater use of it and in general I'm a big fan of providing stability where we can - i.e better to perhaps drop the frame rate than be unresponsive in the UI or reboot rather than avoiding SPRAM due to speed concerns |
the new alloc functions should take care of that, parameters could be fine-tuned a bit more for specific use-cases but the general approach of @blazoncek d_alloc/p_alloc is sound. |
| } | ||
|
|
||
| void *d_calloc(size_t count, size_t size) { | ||
| int caps1 = MALLOC_CAP_DEFAULT | MALLOC_CAP_8BIT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't new in this commit, but one thing to be aware of is that MALLOC_CAP_DEFAULT by itself doesn't mean "use internal memory" -- it means "use the memory pool that malloc() draws from". This can matter if PSRAM is available to malloc(), either because the IDF was built with CONFIG_SPIRAM_USE_MALLOC, or if heap_caps_malloc_extmem_enable() was called; both allow this to return a PSRAM buffer. This isn't necessarily wrong per se -- certainly p_*alloc() above doesn't care -- but it might be worth thinking about here in the d_*alloc() functions.
If we want to insist on using internal memory, use MALLOC_CAP_INTERNAL. If we want to limit it to the pools that malloc() can also reach, use MALLOC_CAP_INTERNAL | MALLOC_CAP_DEFAULT. As you've noticed, there are some internal SRAM pools that malloc() won't touch, such as the block of shared data/instruction RAM intended for dynamically loadable or self-modifying code; and when SPI RAM is enabled, the IDF configuration can reserve some internal memory in a separate non-malloc()-accessible pool for DMA buffers. Using only MALLOC_CAP_INTERNAL will permit usage of those pools, if we want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By all means! I set up the framework, more skilled people should tune it. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using only
MALLOC_CAP_INTERNALwill permit usage of those pools, if we want to.
you beat me by a few hours, I ran tests on exactly this and already prepared a few new function calls that work as intended. Conclusion of my endevours in a nutshell:
S3, S2 and C3 all have IRAM and DRAM as well as RTCRAM enabled to be used as heap. I initially thought they do not but turns out that was a false test.
ESP32 however does not and can not BUT: as we are using all 32bit colors now, we can tap into IRAM which is 32bit accessible only. I tested that and it works, just need to run some more tests and make sure the buffers NEVER get accessed in any other way. It will give us about 50k of extra internal RAM (until more functions are put in IRAM that is).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By all means! I set up the framework, more skilled people should tune it. 😉
Now if you really want to get fancy, you can play fun and games with the MMU to coalesce fragmented blocks to a region of contiguous virtual address space so they look and feel like a flat buffer ... ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now if you really want to get fancy, you can play fun and games with the MMU to coalesce fragmented blocks to a region of contiguous virtual address space so they look and feel like a flat buffer ... ;)
Eh, scrub that. The MMU implementation isn't general enough to be useful for this purpose. It looks like it's meant to be used for task stacks and flash/SPIRAM caching only, pretty much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to insist on using internal memory, use
MALLOC_CAP_INTERNAL
Causes bootloop when allocating (and initialising) ledmap on ESP32.
EDIT: ... when paired with MALLOC_CAP_32BIT.
All is well.
|
Any objectons to merge this as-is and then build upon it instead of polishing what was supposed to be a quick bugfix? we can continue the discussion in a follow-up PR with improvements to allocation handling. |
Why? It is a FX allocation bugfix. |
to me it is a different issue, related to segment buffer and has different implications I need to look into and test. |
|
+1 for merge now. I've been looking at locking/task safety for #4779 -- I think one of the sub-tasks on the way there will be to make |
So why then this? It does not belong in this PR then. |
you are correct, however I already tested that along the way. my comment was less about mixing up PRs and more about getting this one finished. |
d_realloc()returned null . This fixes the memory leak_usedSegmentDatawas also wrong:Segment::addUsedSegmentData(-_dataLen);must not be called ifd_malloc()fails, only for realloc.Summary by CodeRabbit