-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OOM during std::vector allocation doesn't terminate the program #11042
Comments
We have to catch errors there to know if allocation failed? If it failed then the VM has no more memory to give us. Then we return an error code to our caller, which ends up as a malloc failure. By default But maybe I don't understand your question? |
Do you see a case where it should halt but doesn't? (or should return 0 from |
Maybe I was misled by the comment -
Does undefined here mean "JS undefined" or "undefined behavior"?
I am not sure I see this happening... With these command line options, targeting a .js file (so .js + .wasm are produced):
The behavior that I observe is that on an OOM coming from STL (std::vector => operator new), I see the following output:
Does that indicate that abort was triggered? This is with 1.39.12; with an earlier Emscripten version I get this instead:
But maybe there was a fix in 1.39.12 around OOM error handling, unsure. |
Oh, heh, sorry for the confusion! It returns a JS undefined. It could also have a If your example is OOMing around the 2GB or 4GB limits, then recent fixes may be necessary. One has landed in #10811 (for 2GB) and one is in #11047 (for 4GB). Both of the call stacks you mention would be bugs in emscripten - the error should be clearly from malloc (or sbrk, I forget). If those 2 PRs don't fix things, let me know! |
With 1.39.13 the behavior seems to depend on the presence of Without -g:
With -g (not sure what prints the 'undefined' text):
With -g2:
The changes you linked don't seem to be in 1.39.13 so I can check back after 1.39.14. |
Ok, 1.39.14 was just tagged, and all the changes should be in there. |
Still happens with 1.39.14:
I'll try to debug this to see what's going on. |
Ah, can you maybe get me a testcase to debug, or simple steps to build + reproduce? |
This seems to reproduce the problem: #include <vector>
#include <stdio.h>
int main()
{
std::vector<std::vector<int>> v;
for (int i = 0; ; ++i)
{
printf("iteration %d\n", i);
std::vector<int> vv(1024*1024, i);
v.push_back(std::move(vv));
}
} Compile with:
Running with latest node I get this output:
|
I see, thanks @zeux ... so the issue here is exceptions. The libc++ I'm not totally sure what to do here. On the one hand, this seems to work as intended. But it also seems like a problem that might be common (but I don't recall it being). Separately, it looks like growth disables // Setting this option [ALLOW_MEMORY_GROWTH] on will disable ABORTING_MALLOC, in other words,
// ALLOW_MEMORY_GROWTH enables fully standard behavior, of both malloc
// returning 0 when it fails, and also of being able to allocate more
// memory from the system as necessary. Perhaps we should change that - it seems like we should, but this is very old behavior that hasn't been an issue til now... |
So if exceptions are disabled and memory allocation fails, I'd expect something like std::terminate to get called from the new handler? https://en.cppreference.com/w/cpp/memory/new/set_new_handler That is, I'd expect that if exception support is compiled out, the (throwing) new operator terminates the program instead of returning 0. If this did happen I'd be happy because I just want the OOM to be prominently highlighted as such at the correct source location. Alternatively, supporting aborting malloc in memory growth mode would work for me as well. |
Interesting, yeah, the docs you linked to say
But the libc++ code has this: If exceptions are disabled, that does not throw Hopefully people that understand the c++ spec better than me can clarify, I'm confused 😕 |
Using set_new_handler that calls abort() produces this btw:
Which is much better, although I'd love to not see |
Ah, I guess the JS abort prints undefined twice (to stdout & stderr), and C abort is mapped to JS abort.
So I suppose the verdict is that things work mostly as they should, it's just that the behavior isn't particularly intuitive. I'm not sure that there are good options here, since changing libc++ or installing a default new handler seem slightly odd, and aborting malloc doesn't seem ideal either as that's not what you'd expect a C malloc to do in OOM cases. |
Yeah that double printing of error messages I find annoying .. we should look into fixed that. Looks like it could even print 3 times if you have an |
I think the recommendation to build with Perhaps we could rephrase is as |
This makes sense to me for any case where abort is called from Emscripten JS library itself, but not necessarily if it's called from C - if a user says Of course if the Emscripten C standard library has any conditional prints before calling C abort, then passing ASSERTION=1 might still be valuable. |
I thought about this a bit more and I feel like there needs to be some change in Emscripten. The reason for this is that I realized that the problem isn't merely that operator new returns 0 - it's that 0 is a valid pointer in Wasm. So consider the default 2GB heap, and a program like the above that allocates many vectors that add up to more than 2 GB. The vectors that don't get enough memory get allocated with array data = 0, but then the code proceeds to write to that pointer. Effectively this results in silently corrupting a lot of program state. I think this is why the errors that I've been getting from this are so random and odd - you aren't going to get an out-of-bounds access error on the vector that couldn't be allocated, it'll be a random error somewhere else. What's worse, with a 4GB heap basically all pointers are valid, so you won't ever get an out-of-bounds access error. Of course this also applies to malloc, but at least for malloc it's normal to expect the users to check for NULL, whereas for C++ containers OOM should lead to an exception or a program crash - and in libc++ in other environments there'd be a crash when dereferencing NULL shortly after allocation at least... |
I agree. I doesn't seem to make sense that we should disable exceptions and also not terminate when we would otherwise throw |
It looks like chromium installs a new handler for this purpose: https://source.chromium.org/chromium/chromium/src/+/master:base/process/memory_linux.cc;drc=ccf17f2881e6c077f20d4421ebcee2bbd5486db2;bpv=1;bpt=1;l=58?q=set_new_handler&ss=chromium&originalUrl=https:%2F%2Fcs.chromium.org%2F I wonder if it make sense to install something like this by default in emscripten? |
Yeah, thinking on this some more, I strongly agree we should change this. In fact I remember a previous report of this issue, so it's recurring. It's just bad for Specifically, I think we should modify the one line in libc++/libc++abi to |
I'd rather do it standard way via set_new_handler TBH, but lets look that costs of doing that in some kind of static ctor over hacking libc++. And we need to remember to only do this when exception are disabled. |
…ory growth (#11079) When libc++/libc++abi are built with exceptions disabled, the new implementation there does not throw an exception for an error, but it also does nothing else. So new ends up returning a zero if malloc did, which can break programs. Technically libc++/libc++abi are doing a reasonable thing here, just removing all exceptions-related code when exceptions are disabled. The assumption is likely that a user program would set a new_handler if an error is desired. For us, we have to change this as our default mode is to have exceptions disabled, and we don't want users to need to know they need to do anything. This makes it abort instead. (Note that without growth this happened to always work, since we abort on any failing allocation. With growth enabled, though, malloc returns 0, and we end up in this situation.) Fixes #11042 As discussed there I also looked at the option of installing a set_new_handler that does an abort. That ends up increasing code size by a little (bit less than 1%) because if adds a global constructor, a function to the table, and some memory operations. It seems better to just modify new itself which avoids all that.
The behavior of code that OOMs right now is odd and unexpected:
By default, the error thrown by WebAssembly.memory.grow is caught and swallowed in emscripten_realloc_buffer; some other heap access in the future dies with "out of bounds memory access".
When assertions are enabled, an error message is printed to the output but it doesn't halt execution either.
Is there a reason why the exception is being caught here in the first place? It would be good to not do that at least if assertions are disabled, and maybe even if they are enabled (& use finally to log the error instead). This will simplify root cause analysis for OOMs.
The text was updated successfully, but these errors were encountered: