Missing mmap()/munmap()/mremap() features for in-place adjustment of anonymous mappings #21816
Comments
My first comment here would be regarding the fact that "PHP's memory manager relies heavily on mmap() and munmap() for allocating and reallocating memory as anonymous mappings". As you may be aware, emscripten (and wasm in general) does not support mmap at all, and simply attempts to fake some parts of it by calling malloc/free. While we can continue to improve our fake mmap support, it's normally best for serious applications to avoid using this fake mmap at all and instead fall back to I'm certainly open to accepting improvements to our fake mmap support, but I would perhaps consider modifying
If you want to see how fake our mmap support is or if you want to try to improve it the code is here: emscripten/system/lib/libc/emscripten_mmap.c Lines 114 to 138 in 140a17c
Thank you for your thoughtful feedback, @sbc100. TL;DR -- After writing everything out, I think you are correct that it would be better to use more direct memory management APIs. In addition, I don't know whether it really makes sense for PHP to be using In case you are interested, here are the details:
I initially intended to pursue adding deeper support for manipulating anonymous mapped regions, but after taking the time to write out the tradeoffs, I agree that it is probably better to use more direct allocation methods.
On the downsides of mmap
I don't know the history or thought behind PHP's memory management implementation but will attempt to talk this out based on a reading of the implementation. One of the tricky things with PHP memory allocation is that it prefers allocating 2MB chunks aligned to 2MB. This alignment is not naturally guaranteed by There are trade-offs when choosing whether to allocate 2MB-aligned memory using
Initial aligned allocation
In contrast, 👉 For initial allocation,
Growing an existing aligned allocation
AFAIK, for memory allocated using
The known cost of growing 2MB-aligned memory with
Perhaps 👉 Given the above comparisons, simply allocating a larger 2MB-aligned region with
On official support for
Before closing this, let's double check in case I am missing something: Are there any memory management APIs other than mremap() that could be used to attempt growing memory in place without the risk of data being moved to an address that is not aligned to a 2MB boundary? I do not currently believe so and understand that Emscripten does not support emscripten/system/lib/libc/emscripten_syscall_stubs.c Lines 213 to 216 in 7e7c057
No, I don't think so. Because we don't have virtual memory support, it's not really possible to grow without moving (unless you get lucky and there happens to be some free space after your allocation, i.e. the happy path of
Sorry, @sbc100. It turns out I was wrong about the worst mmap case in the above comment. This is incorrect:
The potential worst case of growing a 2MB-aligned memory region with mmap within PHP is actually So the constant cost of "reallocating" memory with And the comparative costs of using mremap/mmap are:
It's not such a dramatic difference as I made it seem above, but I'm not sure it is worth the work to sometimes obtain the lower
Thanks for confirming.
Thanks for the pointer. That sounds like really interesting work. Initially, when considering making a PR, it seemed like improving mmap()/mremap()/munmap() support would require adding explicit support to one of the allocators and conditionally using that support depending on the selected allocator. Would that kind of allocator-specific change be considered or does the solution need to be more universal? (If so, it seems like starting with emmalloc would be simpler than dlmalloc)
We've just landed our prototype mmap/munmap replacement to start testing it out in our runtime. If it shakes out well I may look into what it would take to expand its support to the point that we could consider upstreaming it into emscripten. It would be good to understand what features are actually necessary for it to be sufficient as a foundation for emscripten dlmalloc/mimalloc. Stuff that's missing right now:
Great! I think multithreading and 64-bit are both important. None of the other things matter to malloc implementations, I think. Specifically, map_fixed seems basically impossible without an actual virtual address space, and custom alignment I don't think matters since mmap doesn't support that anyway. It should always be page aligned.
Version of emscripten/emsdk:
3.1.57
The WordPress Playground project uses Emscripten to compile the PHP runtime as Wasm, and it uses those builds to run PHP in the browser and on Node.js.
PHP's memory manager relies heavily on mmap() and munmap() for allocating and reallocating memory as anonymous mappings. There are two things PHP cannot do properly using Emscripten's mmap() and munmap() implementations:
For the PHP runtime, the primary thing that is missing from Emscripten is a munmap() implementation that supports partial unmapping of anonymous mappings. This would cover item 1 and half of item 2 (truncation). An example of aligned allocation with mmap()/munmap() can be found here with the actual munmap() call here. An example of truncation can be found here which leads to the same actual munmap() call.
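To make the dependency on partial unmapping concrete, here is a minimal sketch of the over-allocate-and-trim pattern on a platform with real virtual memory. It is not PHP's actual code: `aligned_map`, `truncate_map`, and `CHUNK_ALIGNMENT` are hypothetical names used only for illustration, and the sketch assumes both sizes are multiples of the page size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS under strict -std modes on Linux */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define CHUNK_ALIGNMENT ((size_t)2 * 1024 * 1024) /* 2MB, as described above */

/* Hypothetical helper: map `size` bytes at an address aligned to `alignment`.
 * Assumes `alignment` is a power of two and both values are page multiples. */
void *aligned_map(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Over-allocate so that an aligned start address must lie inside. */
    size_t padded = size + alignment;
    void *raw = mmap(NULL, padded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        return NULL;
    }

    uintptr_t start = (uintptr_t)raw;
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    size_t head = (size_t)(aligned - start);
    size_t tail = padded - head - size;

    /* Partial unmapping: return the unaligned head and the unused tail. */
    if (head != 0) {
        munmap(raw, head);
    }
    if (tail != 0) {
        munmap((void *)(aligned + size), tail);
    }
    return (void *)aligned;
}

/* Hypothetical helper: shrink a mapping in place by unmapping its tail. */
int truncate_map(void *ptr, size_t old_size, size_t new_size)
{
    assert(new_size <= old_size);
    if (new_size == old_size) {
        return 0;
    }
    return munmap((char *)ptr + new_size, old_size - new_size);
}
```

Both trimming steps and the truncation rely on munmap() releasing only part of an existing mapping. If those calls fail or are ignored, the padding is never returned, which is the leak scenario this issue describes.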
Secondly, PHP attempts to extend anonymous mappings in-place using either mremap() or mmap(). When using mremap(), no flags are provided. When using mmap(), PHP passes one of the following combinations of flags:
MAP_PRIVATE | MAP_ANON | MAP_FIXED | MAP_EXCL
MAP_PRIVATE | MAP_ANON | MAP_TRYFIXED
When PHP cannot extend memory in-place, it resorts to allocating another aligned chunk of memory and copying the contents of the current chunk into it. This increases memory requirements and leads to more frequent out of memory conditions than when memory can be extended in-place.
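The extend-or-copy behavior described above can be sketched roughly as follows. This is a Linux-only illustration, not PHP's implementation: it uses mremap() with no flags (which either resizes in place or fails) and falls back to a fresh aligned mapping plus a copy. `grow_chunk` and `aligned_map` are hypothetical names, and sizes are assumed to be page multiples.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* required for mremap() on Linux */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define CHUNK_ALIGNMENT ((size_t)2 * 1024 * 1024) /* 2MB chunks */

/* Hypothetical helper: over-allocate, then trim to an aligned region. */
void *aligned_map(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    size_t padded = size + alignment;
    void *raw = mmap(NULL, padded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        return NULL;
    }
    uintptr_t start = (uintptr_t)raw;
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    size_t head = (size_t)(aligned - start);
    size_t tail = padded - head - size;
    if (head != 0) {
        munmap(raw, head);
    }
    if (tail != 0) {
        munmap((void *)(aligned + size), tail);
    }
    return (void *)aligned;
}

/* Hypothetical helper: grow an aligned chunk, preferring in-place growth. */
void *grow_chunk(void *ptr, size_t old_size, size_t new_size)
{
    /* With flags == 0 the kernel may not move the mapping, so on success
     * the address (and therefore the alignment) is unchanged. */
    void *same = mremap(ptr, old_size, new_size, 0);
    if (same != MAP_FAILED) {
        return same;
    }

    /* Fallback: old and new regions are both live during the copy,
     * which is where the extra memory pressure comes from. */
    void *fresh = aligned_map(new_size, CHUNK_ALIGNMENT);
    if (fresh == NULL) {
        return NULL; /* the old chunk is left untouched */
    }
    memcpy(fresh, ptr, old_size);
    munmap(ptr, old_size);
    return fresh;
}
```

Either path returns a 2MB-aligned address, but only the first avoids holding two copies of the data at once.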
Without in-place adjustment of anonymous mappings
Without the ability to adjust anonymous mappings in-place, PHP mishandles memory and incorrectly assumes partial unmapping works, leading to large memory leaks in certain cases.
As a workaround, Playground currently uses a PHP extension to install alternate memory allocation handlers that can only allocate aligned memory and free it. No in-place adjustment is supported. Reallocation of aligned regions requires maintaining old and new memory regions at the same time while the old is copied to the new.
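For readers unfamiliar with this kind of workaround, a rough sketch of allocate/free-only handlers might look like the following. This is not the actual Playground extension code: the `chunk_alloc`, `chunk_free`, and `chunk_realloc` names are hypothetical, and the sketch assumes posix_memalign() is available in the target libc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler: allocate `size` bytes aligned to `alignment`
 * (a power of two) without going through mmap(). */
void *chunk_alloc(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return NULL;
    }
    return ptr;
}

/* Hypothetical handler: release a chunk obtained from chunk_alloc(). */
void chunk_free(void *ptr)
{
    free(ptr);
}

/* Hypothetical handler: "reallocate" by allocating a new aligned chunk,
 * copying, and freeing the old one. No in-place adjustment is attempted. */
void *chunk_realloc(void *old, size_t old_size, size_t new_size,
                    size_t alignment)
{
    void *fresh = chunk_alloc(new_size, alignment);
    if (fresh == NULL) {
        return NULL; /* the old chunk remains valid */
    }
    /* Old and new chunks coexist here, so peak usage is old + new. */
    memcpy(fresh, old, old_size < new_size ? old_size : new_size);
    chunk_free(old);
    return fresh;
}
```

The cost of this approach is visible in `chunk_realloc`: every resize temporarily needs room for both regions.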
Could Emscripten be updated to support in-place adjustment of anonymous mappings?
cc @ThomasTheDane