Callstack unwind during out-of-memory error handling may trigger another error #476
Hmm, assuming there's space in the value stack (which is the normal situation at any point), Duktape should first try to push an error, fail that with out-of-memory, and fall back to pushing a preallocated DoubleError object on the value stack. The double error doesn't require object allocations because the instance is created beforehand, so the only thing this needs is the ability to write a tagged value. Do you have an example which reproduces the case? Using a pooled allocator I either get an "Error: alloc failed" or similar, or a "double error" no matter what I try manually:
However I'm testing with Ecmascript code so maybe |
I'm trying to reproduce this outside my environment. In the meanwhile, here's the call stack
Ok, I think the issue here is that the error is handled more or less as intended, but the callstack unwind then closes an environment record, which tries to allocate memory when it's copying over variables to a closure. Doing that is mandatory even when unwinding due to an error, because function instances referencing the variables might have been created before the error occurred. Not sure what the best fix would be, I'll have to think about this a bit. One will necessarily need to sacrifice some semantics related to scopes to be able to unwind even when scopes cannot be closed (or the scope objects would have to be allocated on function entry to ensure allocations are not needed on an unwind). This is probably not a major issue for sandboxing, and it's more important to shut down cleanly. Using another setjmp catchpoint for the unwind would catch the error. But that'd carry some overhead for all calls to handle a relatively rare case (an important case for sandboxing, of course). It'd also be possible to rework the scope handling to use a lower level interface to manage the scope objects, so that throwing would be avoided. For instance, the scope objects could be resized before copying any properties over, and if that failed, just skip copying variables. If the resize succeeded, the variables could be copied without a risk of an error. This would also be a bit faster than the pretty simple explicit property handling. |
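A minimal sketch of the "resize first, then copy" idea described above. It is not Duktape code: `scope_t`, `scope_reserve`, `scope_copy`, and `scope_close` are hypothetical names, and a plain array of `int` slots stands in for the scope object's property table. The point is only that the single fallible step (growing the storage) happens before anything is copied, so the unwind path can skip the copy instead of throwing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a scope object: a growable array of slots. */
typedef struct {
    int *slots;
    size_t used;
    size_t capacity;
} scope_t;

/* Step 1: the only step that can fail. Grows capacity, copies nothing.
 * On failure the scope is left exactly as it was. */
static int scope_reserve(scope_t *s, size_t extra) {
    size_t needed;
    int *p;

    assert(s != NULL);
    needed = s->used + extra;
    if (needed <= s->capacity) {
        return 1;
    }
    p = realloc(s->slots, needed * sizeof *p);
    if (p == NULL) {
        return 0;
    }
    s->slots = p;
    s->capacity = needed;
    return 1;
}

/* Step 2: cannot fail, because the space was reserved up front. */
static void scope_copy(scope_t *s, const int *regs, size_t n) {
    if (n == 0) {
        return;
    }
    memcpy(s->slots + s->used, regs, n * sizeof *regs);
    s->used += n;
}

/* Unwind helper: never raises an error. Returns 0 if the variables
 * had to be skipped because the reservation failed. */
static int scope_close(scope_t *s, const int *regs, size_t n) {
    if (!scope_reserve(s, n)) {
        return 0;
    }
    scope_copy(s, regs, n);
    return 1;
}
```

A caller on the unwind path would treat a zero return from `scope_close` as "variables not captured" and keep unwinding, which is the semantic sacrifice mentioned above.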
It might also be useful for an allocation error (which only happens when emergency garbage collection has failed) to propagate out as an uncatchable error similar to script timeout errors. It's not really safe to continue, and as described above it may not be possible to respect all required Ecmascript semantics after the error. |
Going back to the original question, it'd of course be possible to expose some heap state to the application. But doing so creates a compatibility promise so it'd need to be something which can be guaranteed to exist and be relevant over time. One possible thing to expose would be to communicate with the user application about the need for emergency GC. For example:
Requesting a forced shutdown could work similarly to how a script timeout propagates, i.e. as an "uncatchable" error which isn't catchable using Ecmascript code but respects Duktape/C protected calls to allow native resources to be released. See #474. Even with this kind of integration in place, there's not necessarily an easily determined upper bound on how much memory is needed to shutdown cleanly (to respect finalizer guarantees etc). It'd be relatively easy to shut down forcibly without respecting e.g. finalizer guarantees - but that's also problematic for sandboxing. |
Wait, you need to allocate memory to unwind? That seems backwards, although I admit I'm not familiar with all the semantics related to Ecmascript call handling (the whole spec is quite convoluted). |
Yes, as things are now, the unwind creates an explicit scope object. Unless that object were allocated on entry, memory is needed to unwind. This is not an Ecmascript requirement but a Duktape internal design issue. The memory could be allocated on entry instead, which should fix this issue, but I'll need to check the unwind path to see if similar issues exist elsewhere. Ecmascript does require e.g. execution of "finally" blocks, which will definitely need memory. Duktape also executes finalizers, which needs memory. But either of these can fail without impacting the unwind process itself. The unwind is the issue here, because an error during unwind is not inside the same protection as the call itself. Not having enough memory to run finalizers may be a problem by itself, of course, because you don't necessarily get an actual chance to free native resources. Dealing with actual out-of-memory is quite hard if you simultaneously want some guarantees (usually one needs preallocation to get them). |
That's all true. Although I've learned that if you're truly out of memory enough that malloc() starts to fail, you're pretty much dead in the water already; continuing with normal operation is not going to work so you may as well just clean up what you can and pack it in. |
Anyway, what I said about I meant in the more general sense; it isn't (necessarily) relevant here because the allocations are sandboxed, they may not "have to" fail. If I understand the issue correctly, a custom allocator is being used and needs to know whether an important allocation is being requested so the sandbox environment can exempt it from some quota. In that case a workable solution that would avoid exposing internal heap state would be to allow an additional parameter for the custom allocator which indicated the relative "importance" of the allocation. Normal heap allocations and anything else related to script execution would be zero, higher values would be for more critical needs (unwinding, errors originating from Duktape, etc.). Doing this without breaking API compatibility for the purposes of semantic versioning is left as an exercise for the reader. :) |
Yet another solution would be to have two callbacks: one to allocate memory during normal bytecode execution (which should always be sandboxed), and another to allocate memory for internal Duktape structures (which might be more useful to allow lenience). If the app needs both of these to be sandboxed, it will just use the same function for both, else it can provide separate functions. |
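To make the two-callback suggestion concrete, here is a rough sketch under stated assumptions. Duktape does not offer separate "normal" and "internal" allocation callbacks; `sandbox_t`, `alloc_normal`, `alloc_internal`, and `sandbox_free` are hypothetical, and only the `(void *udata, size_t size)` shape mirrors how a custom allocator receives its user data. Both callbacks draw on one budget, but the internal one may also use a reserve.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sandbox budget shared by both callbacks. */
typedef struct {
    size_t used;     /* bytes currently handed out */
    size_t quota;    /* limit for script-driven allocations */
    size_t reserve;  /* extra headroom for engine-internal allocations */
} sandbox_t;

/* Size header stored in front of each block so frees can be accounted.
 * The union keeps the user pointer suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} header_t;

static void *sandbox_alloc(sandbox_t *sb, size_t size, size_t limit) {
    header_t *h;

    assert(sb != NULL);
    if (size == 0 || sb->used > limit || size > limit - sb->used) {
        return NULL;  /* over budget for this class of allocation */
    }
    h = malloc(sizeof *h + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    sb->used += size;
    return h + 1;
}

/* Callback for normal bytecode execution: strict quota. */
static void *alloc_normal(void *udata, size_t size) {
    sandbox_t *sb = udata;
    return sandbox_alloc(sb, size, sb->quota);
}

/* Callback for engine-internal structures: quota plus reserve. */
static void *alloc_internal(void *udata, size_t size) {
    sandbox_t *sb = udata;
    return sandbox_alloc(sb, size, sb->quota + sb->reserve);
}

static void sandbox_free(void *udata, void *ptr) {
    sandbox_t *sb = udata;
    header_t *h;

    if (ptr == NULL) {
        return;
    }
    h = (header_t *) ptr - 1;
    sb->used -= h->size;
    free(h);
}
```

An application that wants both classes sandboxed identically would set `reserve` to zero or register the same function twice. The sketch leaves open how large the reserve should be.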
Yes, it's often viable to have some reserve. But how much? If the amount of memory needed for an unwind cannot be bounded, it's a workable solution for some cases, but not all. In particular, it will be easy to craft code which will exceed that reserve. |
That's why I suggested leaving it up to the allocation callback(s) still--the app still gets to decide how much to provide as "reserve". |
How would one arrive at a safe value? |
Well, the exact amount of reserve space is out of Duktape's scope, I'm just suggesting ways Duktape could communicate the relative "importance" of the request, with the application still ultimately in control of what to do with it. |
I know. But what does that solve? If the sandbox is running potentially unknown code, that code can always contrive to exceed whatever reserves are in place. The application implementing the sandbox is then responsible for figuring out a reserve which works - which is in general not possible to know in advance. Because there are solutions which work in all cases (and at least allow the heap to be terminated safely), I'd much rather do something like that. |
At a high level the two basic approaches which have consistent behavior would probably be:
I was mostly going on @jsas original feature request, to expose the heap Of course as touched on in your post above, if we can remove the requirement for allocations on unwind entirely, this whole point is moot. Even so, giving the user the ability to prioritize "infrastructure" allocations (even if the ultimate quota is the same) might not be a bad idea. |
Like I said above, I think it's important that allocation errors are handled reliably in all cases, so that we don't just shift the problem from one place to another. But having said that, it might still be useful to know some context to an allocation. Considering a "this is an infrastructure allocation" flag: in practice that would be quite difficult to implement. For example, some helper code might be called in response to Duktape internals doing something quite far away from the helper, or as a result of a public API call calling into Duktape internals which eventually calls the helper. There may be a lot of call levels in between; unless that "originating request" is somehow carried all through the call layers, it wouldn't work in practice and you'd have something similar to priority inversion. It would also be possible to set some heap level "unwinding" flag which would then be carried to all allocations while that is in progress. But it would then also apply to e.g. user finalizers which may do an arbitrary amount of work which probably shouldn't be flagged critical. So while it'd maybe be useful, it's quite difficult for the "this is an infrastructure allocation" flag to come out right. Providing a callback or other information about allocations which fail even after garbage collection would be possible and quite straightforward. This might be useful if an allocation error leads to a hard exit so that some memory could be added to (mostly) handle the unwinding cleanly (e.g. allow finalizers to run). However, this would still rely on an actual solution to make the unwinding safe even if the memory added wasn't enough for a clean unwind. |
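The "callback on allocation failure" idea could look roughly like the sketch below. Everything here is an assumption for illustration: `heap_t`, `alloc_fail_fn`, `heap_alloc`, and `release_reserve` are hypothetical names, not Duktape API. Frees are not accounted for, to keep the example short. The allocator calls the hook only after the allocation has already failed, and retries exactly once if the hook reports that memory was added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct heap heap_t;

/* Hypothetical hook: called when an allocation still fails after GC.
 * Returns nonzero if more memory was made available. */
typedef int (*alloc_fail_fn)(heap_t *heap, size_t size);

struct heap {
    size_t used;           /* bytes handed out so far (frees not tracked) */
    size_t limit;          /* current budget; invariant: used <= limit */
    alloc_fail_fn on_fail; /* may be NULL */
    int fail_count;        /* how many times the budget was exceeded */
};

static void *heap_alloc(heap_t *heap, size_t size) {
    void *p;

    assert(heap != NULL);
    if (size > heap->limit - heap->used) {
        /* A real engine would run its emergency GC before this point. */
        heap->fail_count++;
        if (heap->on_fail == NULL || !heap->on_fail(heap, size)) {
            return NULL;
        }
        if (size > heap->limit - heap->used) {
            return NULL;  /* hook did not add enough; retry only once */
        }
    }
    p = malloc(size);
    if (p != NULL) {
        heap->used += size;
    }
    return p;
}

/* Example hook: release a one-shot emergency reserve of 64 bytes. */
static int release_reserve(heap_t *heap, size_t size) {
    (void) size;
    if (heap->fail_count > 1) {
        return 0;  /* reserve already spent */
    }
    heap->limit += 64;
    return 1;
}
```

As noted above, a hook like this only buys headroom. The second failure still has to be handled safely by the unwind itself.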
Yeah, that's true. I should really have known better about the helpers, having seen what the bytecode executor looks like firsthand. :) |
I'll give the unwind path scope handling a little test once I have 1.4 stuff in a reasonable shape. It should be relatively simple to just preallocate the properties for the scope object on function entry. |
#477 provides a testcase which seems to reproduce the issue using the "ajduk" command line tool. This should allow testing of a possible fix. |
What is ajduk, how is it different from normal |
Add testcase for out-of-memory unwind (GH-476)
It's the "duk" command line which uses the pooled allocator from AllJoyn.js. When compiled for x86 with pointer compression etc, it quite closely mimics very low memory targets. |
Just as a quick test I disabled variable value copying in the scope handling done on callstack unwind: as a result "ajduk" now fails gracefully:
Before the change there was a fatal error, so perhaps removing any mandatory allocations from scope unwinding would be enough. |
@jsas Here's a branch which disables the scope-related memory allocation on unwind: It breaks compliance so it's just a test, but it'd be interesting to hear if it fixes your fatal error issue. If so, we'd know the root cause is not something else. Here's a prebuilt dist package if that's useful to you: |
That build did the trick (duk_safe_call returned 1 instead of calling the fatal_error handler). Of course, there's no error on the stack to pick up at this point, so would it be possible to have the return value be an error code (like -53)? |
@jsas What does your |
The reason I ask is because of this:
@jsas That's good to hear - that branch is not usable as a solution because it just skips full scope handling in unwind, but it does identify the root cause and if this is the only concrete problem it should be relatively easy to fix. I'll need to review the unwind path fully to make sure though. There should be an error object on top of the stack, but it'd simply be a generic Error in this case - either an "alloc error" or similar, or if there wasn't memory to create an actual error, it would be a "double error". Both are simple |
As things are now, error objects are modeled after Ecmascript errors. They are not very verbose from a programmatic point of view: errors have a name, a message, and a stack trace, which are useful as human readable descriptions but programmatically you can basically rely on the error name and/or the error code. They don't provide a detailed distinction between error sources or error severity right now. One thing which is not very nice now is that Duktape specific (non-standard) errors are not proper subclasses but direct instances of
It would also be possible to add some additional boolean flags to errors to indicate severity somehow. For example, error objects could have a Another related improvement would be to change the simple "success vs. error" return code indication of safe calls to provide a three level indication of "success vs. normal error vs. severe error". I'm not sure how good this change would be in practice: would two error levels be enough, would drawing the line between error types be clean, etc. So I'd perhaps favor having that information in the error object itself and provide good accessors for that. Anyway, from sandboxing point of view, you should treat the error on the value stack carefully so that you don't invoke new memory allocations when dealing with the error. Because the value stack is preallocated the following should be safe at least:
Coercing the error to a string is not generally safe because it involves a temporary string which needs to be interned. The |
The Error won't be on the stack if nrets = 0 for the safe call, which is why I asked what his |
@fatcerberus nrets is indeed 0 for that particular call. @svaarala thx for tip about interning string if coercing error. makes sense. |
Yeah, you'll need to pass nrets == 1 if you want the error on the stack (you can pop the unneeded value off afterwards). |
When a custom allocator is used, there is no way to tell if the Duktape handling_error flag is set so that an allocation for the error can occur. This makes "safe call" wrappers unable to catch allocation errors (because an error cannot be pushed onto the stack, resulting in the fatal_handler being called).
Is it feasible to expose some properties of the current state of the heap? For example, the internal handling_error flag. In that case, my custom allocator could detect if duktape is trying to allocate memory for an error to push onto the stack.