Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unremovable Javascript Bindings #146

Closed
hashtag-smashtag opened this issue Apr 11, 2019 · 19 comments
Closed

Unremovable Javascript Bindings #146

hashtag-smashtag opened this issue Apr 11, 2019 · 19 comments
Assignees
Labels
enhancement New feature or request interop Polyglot Language Interoperability

Comments

@hashtag-smashtag
Copy link

Context bindings added by javascript are not removable.

Example test written in Groovy:

@Test
void "js-added bindings cannot be removed"() {
    def context = Context.create("js")
    context.eval("js", """
        let myLet;
        var myVar;
        const myConst = '';
        let MyNamedLetClass = class MyNamedLetClass {}
        const MyNamedConstClass = class MyNamedConstClass {}
        let MyUnnamedLetClass = class {}
        const MyUnnamedConstClass = class {}
        class MyClass {}
        function myFunction() {}
    """)

    def bindings = context.getBindings("js")
    def keys = bindings.getMemberKeys()
    assert keys.size() == 9
    keys.each { it ->
        try {
            bindings.removeMember(it)
        } catch (UnsupportedOperationException ignored) {}
        try {
            bindings.putMember(it, null)
        } catch (UnsupportedOperationException ignored) {}
        context.eval("js", "delete global.${it}")
        context.eval("js", "delete global['${it}']")
        context.eval("js", "delete ${it}")
    }
    assert bindings.getMemberKeys().size() == 0 // fails, still 9
}

While bindings we add via bindings.putMember are removable via bindings.removeMember.

Part of our design, like others (#67, #121), involves reusing pooled Contexts from a single Engine. We want to clear away bindings on context reuse. The only way we've found to achieve this is to ensure that all Javascript code on the Context runs wrapped in an anonymous function (example in Groovy):

def exports = context.eval("js", """
    (function() {
        class MyClass {}
        function myFunction() {}

        return {
            MyClass: MyClass,
            myFunction: myFunction
        };
    })()
""")

def jsClass = exports.getMember("MyClass");
// Can cache/organize etc along with context for reuse

Ideally bindings.removeMember could work for any binding? Or maybe a bindings.clearAll() method? Another idea is a bindings proxy: #44

@wirthi wirthi self-assigned this Apr 15, 2019
@wirthi wirthi added enhancement New feature or request interop Polyglot Language Interoperability labels Apr 15, 2019
@wirthi
Copy link
Member

wirthi commented Apr 15, 2019

Hi @hashtag-smashtag

Thanks for your report, I can confirm this is the current behavior. I will need some more investigation here. While we do not support the removing of bindings at all from JavaScript, this is somewhat correct as you could also not (successfully) remove the objects you create via delete. We could force that for bindings, but that would violate the JavaScript spec (why should you be allowed to remove a global via bindings, while you could not do the same from the language itself?).

Same applies to bindings.clearAll(). You obviously don't want builtins like Object or Array removed by that - but how should we distinguish? Something like resetToInitialState might make sense - but for that, you can already just create a new Context - that should be fast as well, especially if you use source code caching for your code (the reason you probably use pooling?)

Best,
Christian

@hashtag-smashtag
Copy link
Author

@wirthi thanks for the response. The suggested resetToInitialState would indeed be useful. And/or the ability to reset to some 'snapshot'?

Regarding creating new Contexts: Even using a single Engine and Sources for caching across Contexts (from the same engine) does not provide the performance of reusing pooled/warmed Contexts. We have a real-time low-latency product, so this area is critical for us.

@nhoughto
Copy link

Also looking for a way to "reset" things or restore global to a known state as a workaround for oracle/graal#631.
The most obvious way to not reparse/reprocess things again is to take the expensively computed object out of its original context and add it back into a new context but as per 631 thats not supported. So a workaround could be to take the expensively computed object and re-use the same context (but reset it), but hitting this issue with the reset =)

Can anyone comment on how the graaljs ScriptEngine implementation achieve this? The JSR 223 mandates this behaviour via Bindings, so there should be an example to point at that replicates the kind of "reset" behaviour we are after? Looking at https://github.com/graalvm/graaljs/tree/master/graal-js/src/com.oracle.truffle.js.scriptengine/src/com/oracle/truffle/js/scriptengine but can't see anything obvious..

@nhoughto
Copy link

Did a little more investigation into workarounds for this, including wrapping / scoping to be able to 'reset' the context. It seems like there are a few ways to do this with normal JS, anonymous functions etc as described above. There appears to be a gap around using ES6 modules (which work well in graaljs), because ES6 modules imports are only allow at the root level of a file, not inside any block they aren't wrappable in the same way as a normal function or load() call is. So while we can use wrapping to 'reset' normal JS it isn't possible when using ES6 modules and thus we are back at the beginning needing a supported way to manipulate the bindings.

Seems like we are stuck between this issue (resetting and reuse within the current context) and oracle/graal#631 (reuse an object across contexts). With no options that work for all approaches? 😢

@mellster2012
Copy link

+1 for context reset to a snapshot state. Sharing values across context and/or returning to prior context states would be extremely helpful for our transition from Nashorn to GraalJS.

@wirthi
Copy link
Member

wirthi commented Jun 11, 2019

The solution we suggest is to create a new context and use caching to achieve the necessary performance. https://www.graalvm.org/docs/reference-manual/embed/#code-caching-across-multiple-contexts

Reseting the context to a certain state - e.g. by removing certain bindings - is not possible right now and would not be faster regarding peak performance. @hashtag-smashtag you state this solution would not be fast enough - do you have a working benchmark for that? What do you compare it to, an existing solution in Nashorn?

Best,
Christian

@nhoughto
Copy link

nhoughto commented Jun 11, 2019 via email

@nhoughto
Copy link

nhoughto commented Aug 24, 2019

Came across this, thought it might be a decent workaround, its kinda overkill. Doesn't work OTB either, could likely patch it to work or lift some ideas to solve for immutable global object in graal.

https://github.com/Agoric/realms-shim

Edit: Just realised this obviously doesn't solve for the es6 modules problem, ignore me..

@wirthi
Copy link
Member

wirthi commented Nov 24, 2020

Don't see any immediate TODO here for us. Our preferred and strongly suggested methods for this kind of Problem is: Create a new Context for each iteration and use the Source code sharing to still have the execution fast.

Regarding https://github.com/Agoric/realms-shim: we do adopt ECMAScript proposals once their adoption seems reasonably safe, i.e. when they reach stage 3. Happy to accept contributions earlier, though.

@wirthi wirthi closed this as completed Nov 24, 2020
@fdlk
Copy link

fdlk commented Nov 24, 2020

@wirthi Are you aware that even with source code sharing, loading a library into a clean context is quite slow?

95 millis passed creating 1000 graal contexts
6776 millis passed creating 1000 contexts and loading 34Kb javascript library

I am creating the contexts in the same Engine instance.

@hashtag-smashtag
Copy link
Author

Exactly, Source sharing is not sufficient when one needs to load/eval a bunch of classes/libs into a new Context every time. It's that 'ok this is how I want every new Context to be populated' state that we need on every new Context, quickly.

@wirthi
Copy link
Member

wirthi commented Nov 24, 2020

Can you please share a minified but running example, so we can argue about actual code and inspect its runtime behavior.

@fdlk
Copy link

fdlk commented Nov 24, 2020

Can you please share a minified but running example, so we can argue about actual code and inspect its runtime behavior.

Hi!
I've extracted the relevant bits to https://github.com/fdlk/graaljs-146
Timings are a bit faster but the trend is the same:

Built 1000 contexts in 59ms
Built 1000 contexts + read in library in 3808ms

@wirthi
Copy link
Member

wirthi commented Nov 24, 2020

Hi @fdlk

thanks for your code. Makes it a lot easier to argue.

part 1 - not evaluating anything
That is a bit problematic - you don't do anything in the context. Due to lazy initialization of the engine, this is obviously super-fast (nothing is happening, right?), and not a fair comparison. You need to evaluate at least some piece of dummy code like "1+2" here.

    Source sourceMini = Source.newBuilder("js", "1+2", "test.js").build();
    long start = System.currentTimeMillis();
    for (int i = 0; i < 1000; i++) {
      context = contextBuilder.build();
      context.eval(sourceMini);
      context.close();
    }
    System.out.println("Built 1000 contexts in " + (System.currentTimeMillis()-start) + "ms");

part 2 - evaluating the cached magma.js
This is taking more time (which is expected). After fixing part 1 described above, part 2 takes ~3x the time of part 1 on my machine (1914ms vs 6290ms, your machine seems around twice as fast as mine). But look at the results in more detail, please. I am printing the time per iteration here:

i=0 => 791
i=1 => 71
i=2 => 58
i=3 => 77
i=4 => 80
i=5 => 44
i=6 => 37
i=7 => 43
i=8 => 38
i=9 => 34

Yes, the first few iterations are slow. The very first especially, as this is running mostly interpreted, nothing compiled yet. It gets faster though as during each iteration more code is compiled (and cached), until ultimately:

i=990 => 3
i=991 => 3
i=992 => 2
i=993 => 3
i=994 => 2
i=995 => 3
i=996 => 15
i=997 => 4
i=998 => 4
i=999 => 3

So we are down to 2-4 ms per iteration, while evaling the (cached!) lib of 34kb each time. With one outlier of 15ms here (could be GC, could be another process intervening, etc.).

   Source sourceLib = Source.newBuilder("js", reader, "magma.js").build();
    start = System.currentTimeMillis();
    for (int i = 0; i < 1000; i++) {
      long s2 = System.currentTimeMillis();
      context = contextBuilder.build();
      context.eval(sourceLib);
      context.close();
      System.out.println("i="+i+" => "+(System.currentTimeMillis()-s2));
    }
    System.out.println(
        "Built 1000 contexts + read in library in " + (System.currentTimeMillis()-start) + "ms");

I'd call that: the cache working.

Best,
Christian

@fdlk
Copy link

fdlk commented Nov 24, 2020

@wirthi
Hi!
I updated the project to do a bit more rigorous benchmarking:

Benchmark                       Mode  Cnt  Score   Error  Units
Demo.testEvalOnlyAddition       avgt   25  0,197 ± 0,016  ms/op
Demo.testEvalScriptAndAddition  avgt   25  2,544 ± 0,064  ms/op

It still is a significant difference, and that is also what we are seeing in production.
We have to reuse contexts or suffer a serious performance degradation compared to Nashorn where we solved this by copying the same bindings into a fresh context for each evaluation.

@wirthi
Copy link
Member

wirthi commented Nov 25, 2020

Hi @fdlk

confirmed, I see a 1:6 difference between just 1+2 or adding the library (3036ms VS 17592ms, for 10.000 iterations measured after warmup). So 0.3ms or 1.76ms per full roundtrip (including opening and closing of context).

We are working on all kinds of performance optimizations (especially around warmup, but of course also peak performance), and performance of the API (as in this case; you are not actually running a lot of JS code here) is watched (and improved) a lot. I don't think we can do anything immediate in this very area but we strive to improve here, of course.

Have you measured the performance degradation you care about is actually happening in practice? Assuming Graal.js IS faster than Nashorn when executing JS code (if that is not the case, that is a whole different issue we are happy to have a look at), this slower start per context should be compensated for by faster JS execution afterwards, when warmed up. Or asked concretely: How much is the impact of slower startup on the full execution? What is the whole roundtrip time of an execution (opening context, loading library, executing user code, closing context) on Graal.js VS Nashorn?

Best,
Christian

@fdlk
Copy link

fdlk commented Nov 26, 2020

Hi @wirthi

We really do see something like a factor 4 impact in production, but to be fair the complete comparison with Nashorn is a bit more nuanced.

Some context

We have a mapping project where we map 10000-20000 entities of one type to entities of another type.
The mapping algorithms are user defined and stored in a language called MagmaScript.
The algorithms are defined per column and typically are tiny scripts like $('height').value(), though they can be more intricate, delving through the entity tree, like $('child').attr('height').value() or doing calculations: $('width').times($('height)).value()
The $ function gets bound to the entity and then an attribute's algorithm gets evaluated to get the value for that attribute.

Speedup: Proxying vs Copying

The time-intensive step in Nashorn was creating a Map<String, Object> for the row because Nashorn didn't allow us to proxy native Java entities into the context, and the entities can reference other entities so we had to clone them up to a certain depth.

Now, in GraalJS (like we used to way back when in RhinoScript) we can create an ObjectProxy and need not copy a ton of attributes that the algorithm will only rarely use. This is great, for a typical project with entities that got copied 3 deep I saw a speedup from 3m 37s (Nashorn) to 46s (GraalJS). Yay!

You could say this is more something Nashorn was bad at than something that GraalJS does really great, but I really like it! 👍

Slowdown: Loading the library

But this speedup only works when we reuse the same Context for the entire mapping project. In Nashorn we created a fresh context for each cell and copied the bindings (like MagmaScript object and $ function) into it, to clean up any mess the previous evaluation might have left behind.
Some scripts are pretty wild and declare variables that get stuck in the context.

If we create a new context per entity, the execution goes back up to 1m46s. If we create a new context per attribute, we're back at 3m 33s.

What we did

We decided this was not worth the slowdown and now reuse the same context between evaluations where we have a clear large unit of work, like these mapping jobs. It took some twiddling to prevent an extra Context parameter in each method call of the mapping service.
There are some other places (MagmaScript validation expressions that check added and updated entities) where we do not have a clear unit of work. So for batch adds and updates we can reuse a context but I worry it'll add up to a significant slowdown of places where we lots of single adds occur on different repositories with validation expressions.

TL;DR

When running lots of little scripts that depend on a library, it really hurts if you need to load the library for each execution and we'd be seriously helped if there were some way to clean up a context after use or to copy compiled objects around between contexts. I suspect that others see this too.

@BbIKTOP
Copy link

BbIKTOP commented Feb 10, 2021

Have exactly the same problem here. Even in the same conditions migrating from nashorn. Is there a hope to get a solution?

@pramodanarase
Copy link

It is very important to share binding value across context. if we take example 1MB JSON data evaluation repeatedly with new context , it will hurt the performance.
There should be way to share data across context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request interop Polyglot Language Interoperability
Projects
None yet
Development

No branches or pull requests

7 participants