Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSR 223 different behaviour between compiled execution and evaluated #5876

Open
andsel opened this issue Sep 13, 2019 · 14 comments
Open

JSR 223 different behaviour between compiled execution and evaluated #5876

andsel opened this issue Sep 13, 2019 · 14 comments
Assignees

Comments

@andsel
Copy link
Contributor

andsel commented Sep 13, 2019

Environment

Provide at least:

  • JRuby version (jruby -v) and command line (flags, JRUBY_OPTS, etc)
    jruby-9.2.8.0
  • Operating system and platform (e.g. uname -a)
    Linux 5.0.0-27-generic JRUBY-4445 #28-Ubuntu SMP Tue Aug 20 19:53:07 UTC 2019 x86_64 x86_64
    x86_64 GNU/Linux
    JVM: OpenJDK 12.0.2

Other relevant info you may wish to add:

  • Installed or activated gems
  • Application/framework version (e.g. Rails, Sinatra)
  • Environment variables

Expected Behavior

Both eval invoked on ScriptEngine and on CompiledScript prints same value of local variable

  • Describe your expectation of how JRuby should behave, perhaps by showing how CRuby/MRI behaves.
    Both ways of execution should output the same string

  • Provide an executable Ruby script or a link to an example repository.
    NB put the tests in two separate unit test or run one at a time

@Test
public void testEvaluateCompiled() throws ScriptException {
    ScriptEngineManager manager = new ScriptEngineManager();
    ScriptEngine engine = manager.getEngineByName("jruby");

    Bindings bindings = new SimpleBindings();
    bindings.put("message", "my variable");

    CompiledScript compiled = ((Compilable) engine).compile("puts \">>>>>>\" + message");
    compiled.eval(bindings);
}

@Test
public void testEvaluateInterpreted() throws ScriptException {
    ScriptEngineManager manager = new ScriptEngineManager();
    ScriptEngine engine = manager.getEngineByName("jruby");

    Bindings bindings = new SimpleBindings();
    bindings.put("message", "my variable");

    engine.eval("puts \">>>>>>\" + message", bindings);
}

Actual Behavior

  • Describe or show the actual behavior.
    the test testEvaluateCompiled fails with error while the other print correctly the string >>>>>>global variable

  • Provide text or screen capture showing the behavior.

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.headius.backport9.modules.Modules (file:/home/andrea/.gradle/caches/modules-2/files-2.1/org.jruby/jruby-complete/9.2.8.0/2dbdff9d9f4142e4a0bc2b743ded9487a80459ef/jruby-complete-9.2.8.0.jar) to field java.io.FileDescriptor.fd
WARNING: Please consider reporting this to the maintainers of com.headius.backport9.modules.Modules
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
NameError: undefined local variable or method `message' for main:Object
  <main> at <script>:1



javax.script.ScriptException: org.jruby.embed.EvalFailedException: (NameError) undefined local variable or method `message' for main:Object

	at org.jruby.embed.jsr223.JRubyCompiledScript.wrapException(JRubyCompiledScript.java:103)
	at org.jruby.embed.jsr223.JRubyCompiledScript.eval(JRubyCompiledScript.java:92)
	at org.jruby.embed.jsr223.JRubyCompiledScript.eval(JRubyCompiledScript.java:77)
	at org.logstash.log.Log4jRubyScriptingTest.testEvaluateCompiled(Log4jRubyScriptingTest.java:42)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: org.jruby.embed.EvalFailedException: (NameError) undefined local variable or method `message' for main:Object
	at org.jruby.embed.internal.EmbedEvalUnitImpl.run(EmbedEvalUnitImpl.java:131)
	at org.jruby.embed.jsr223.JRubyCompiledScript.eval(JRubyCompiledScript.java:86)
	... 24 more
Caused by: org.jruby.exceptions.NameError: (NameError) undefined local variable or method `message' for main:Object
	at RUBY.<main>(<script>:1)
@headius
Copy link
Member

headius commented Sep 16, 2019

Huh yeah that doesn't seem right. I wonder if it's something to do with how it's compiling the code? Like perhaps it's compiling message as a method call since there's no context to indicate it's a variable, while the eval version sees it in the binding and makes it a variable access...

@andsel
Copy link
Contributor Author

andsel commented Sep 16, 2019

@headius I give you just little more context, I'm using this code to run as a Script element in Log4j2 and I can't change the way the variables are declared because it's done by Log4J itself.
Having eval and compiled works in the same way, identifying the message as a variable could be a great thing.

@kares
Copy link
Member

kares commented Dec 8, 2020

hmm, not sure whether I did something wrong trying to reproduce, but the test examples given above pass 🟢 fine.

given how log4j2 uses script engine I can see a clear difference (and some problematic expectations) when using the JRuby engine vs. using the Javascript engine:

based on the "THREADING" parameter reported by the script engine factory impl it will either isolate the script into a thread-local or ("THREADING" is set) assume bindings have a thread-local like behaviour.

        ScriptEngineManager manager = new ScriptEngineManager();
        ScriptEngine engine = manager.getEngineByName("jruby");
        System.out.println("JRuby THREADING: " + engine.getFactory().getParameter("THREADING"));

       // JRuby THREADING: THREAD-ISOLATED
        ScriptEngineManager manager = new ScriptEngineManager();
        ScriptEngine engine = manager.getEngineByName("javascript");
        System.out.println("JS THREADING: " + engine.getFactory().getParameter("THREADING"));

        // JS THREADING: null

JRuby by default uses a single (shared) runtime and thus message binding having thread-local semantics will lead to issues.

this can be fixed by using a different local context behavior (leading to multiple runtimes) :

    @Test
    public void testEvaluateCompiledThreadedConcurrent() throws Exception {
        System.setProperty("org.jruby.embed.localcontext.scope", "concurrent");
        //System.setProperty("org.jruby.embed.localcontext.scope", "threadsafe");
        ScriptEngineManager manager = new ScriptEngineManager();
        ScriptEngine engine = manager.getEngineByName("jruby");

        CompiledScript compiled = ((Compilable) engine).compile("puts \"concurent> \" + java.lang.Thread.currentThread.getName() + ' ' + message");

        for (int i=0; i<3; i++) {
            final int tid = i;
            new Thread(() -> {
                Thread.currentThread().setName("concurrent thread-" + tid);
                Bindings bindings = new SimpleBindings();
                bindings.put("message", "jruby threaded " + tid);
                try {
                    //compiled.eval(bindings); // DOES NOT WORK !
                    engine.eval("puts \"Y>\" + message", bindings);
                } catch(ScriptException ex) {
                    System.out.println("concurrent thread failed: " + ex + " " + tid);
                    ex.printStackTrace();
                }
            }).start();
        }
        Thread.currentThread().join(2000);
    }

... and also reproduces the original issue of different behavior with evaluated vs compiled execution.

in conclusion, the confusion comes from the fact that a single runtime is used by default with JSR223, thus a (global) binding for one engine will have potential consequences for others. but also due the fact of setting "THREADING" parameter, we'll need to investigate whether Log4J2's assumption around when the parameter is set are correct (might be the case since it's somehow natural to assume engines to be isolated by default - they sure are e.g. in Nashorn's case).

@kares
Copy link
Member

kares commented Dec 9, 2020

managed to reproduce as expected (actually confused why I had the simple compiled test passing previously).

the compiled script eval issue boils down to the JSR223 embed case creating a root+eval scope, in case of the "pre-compilation" (AOT parsing and keeping the AST node really) the local variable names are not known and thus the eval scope ends up with no variable names compared to engine.eval(string, bindings) the variable names will be known based on passed bindings.

@headius
Copy link
Member

headius commented Dec 9, 2020

There's an opportunity here to make adjustments in this behavior. Perhaps it would make more sense to do what other 223 engines do and have each engine get its own runtime?

kares added a commit to kares/jruby that referenced this issue Dec 10, 2020
when pre-parsing (compiling) scripts the parser will assume
potential loval variables to be method calls,

since the eval scope won't provide any local variable names
(compared to directly eval-ing with bindings passed in).

see jrubyGH-5876
kares added a commit to kares/jruby that referenced this issue Dec 10, 2020
when pre-parsing (compiling) scripts the parser will assume
potential loval variables to be method calls,

since the eval scope won't provide any local variable names
(compared to directly eval-ing with bindings passed in).

see jrubyGH-5876
@kares
Copy link
Member

kares commented Dec 10, 2020

Perhaps it would make more sense to do what other 223 engines do and have each engine get its own runtime?

Yes that would make sense as a default, to me at least. Ideally also not causing the global runtime to be set due javax.script.


Back to the issue here, I've added tests (in #6457) to avoid further confusion (there was a // Bug comment previously).

Also, realized we have some options on "fixing" the compiled script not having local-variable hints.
The only "viable" (simple really) option that would get the use-case log4j2 is using working is to delay 'compilation' -> effectively making it behave like an eval with bindings where we store the eval-ed unit for later re-use:

  • CompiledScript script = engine.compile("puts message") does nothing - merely stores the script string
  • on first script.eval(bindings) we get local-variables and thus can assume whether message is a method or a local

Down side here is - we do not parse and thus at least validate syntax on engine.compile as users might expect - they will only get a syntax wrapped ScriptException on first script.eval(...).
We need to handle engine.compile(Reader reader) as well, this would mean we read and keep the contents (as a string) around, for consistent compile behaviour.

Personally am not strong for the "delayed compilation" but it seems like a reasonable fix and actually what users might expect from a compiled script provided with local-variables despite the introduced negatives. Opinions?

@headius
Copy link
Member

headius commented Dec 10, 2020

It is unfortunate that you can bind new variables after forcing a script to compile, but I worry about deferring the compilation until first use. The delayed syntax checking you mention is one obvious issue, but also the compilation step is typically used by scripters to ensure they have something that is "ready-to-execute". Introducing a potential error and a parse/compile delay into the first invocation would not meet that expectation.

It might be acceptable to allow a recompile when evaluating a script against variable bindings that contain new variables, since that will potentially change the behavior of the code. If no late binding is provided, or the late binding contains no new variable names, we can just go with the initially-compiled code.

I am curious how other 223 languages handle this, since I would expect several of them to require variables to be bound before execution. Perhaps Python or Groovy have a similar issue here?

@kares
Copy link
Member

kares commented Dec 11, 2020

It might be acceptable to allow a recompile when evaluating a script against variable bindings that contain new variables, since that will potentially change the behavior of the code. If no late binding is provided, or the late binding contains no new variable names, we can just go with the initially-compiled code.

Hmm, this sounds a bit against having smt to execute but I get the slight difference - we would at least validate syntax.
Might look into this, but it feels a bit cumbersome to introduce re-compiling based on bindings, can imagine doing it once on first eval but not comparing bindings every time. So kind of like late compilation except we would still do parsing on compile and if there are locals on first (compiled) script.eval than re-parse (since I am not sure we could just swap out the root scope easily).

I am curious how other 223 languages handle this, since I would expect several of them to require variables to be bound before execution. Perhaps Python or Groovy have a similar issue here?

Believe Groovy does only allow method calls wout parenthesis when there isn't ambiguity, but I am certainly no expert.
Bit even just starting a console Groovy considers message a "local" property in case of a println message.
As for others (JS and Python) they should not have any ambiguity as they always require method calls with ().

@headius
Copy link
Member

headius commented Dec 11, 2020

Believe Groovy does only allow method calls wout parenthesis when there isn't ambiguity

Now that I think about it I remember the only cases where you can make a call without parens is if you are accessing a property (a field or a dynamically dispatched property lookup) which would have to be qualified with a ..

This may have changed over the years but it is likely that no other commonly-used language has the binding ambiguity that we do.

This is a bit of a conundrum, and lazily compiling might be the only option. I wish I had noticed this flaw in the design of JSR-223 back in the day.

Any thoughts on this @enebo?

@ccutrer
Copy link
Contributor

ccutrer commented Aug 15, 2022

What about automatically (when a new system property is set) prepend variables from bindings with $ or @? It's a workaround, but would at least allow usage in projects that allow any JSR223 language to be used, but without any specific Ruby support. The script would have to know this is happening, and be written to expect it.

@enebo
Copy link
Member

enebo commented Aug 15, 2022

@ccutrer Perhaps we can flesh out this idea a bit more? If you evaluate some ruby code at that point you enumerate bindings and then only convert those? More specifically anything in the file or just only anything at the top-level script scope?

-Djruby.js223.special.bindings.somename=true
bindings: foo, bar

foo = calculate(bar)

This would read what was in bar and also write back to foo. Would this work?

proc {
  foo = calculate(bar)
}.call

This second question is mostly a matter of seeing how much we should restrict this. I would think we would not read bar and write foo in this second example and limit the specialness of this to just the Ruby top-level script scope. I definitely think enforcing this outside the main file and non-top-level scopes would cause strange side-effects (like load a gem which assigns 'foo').

@ccutrer
Copy link
Contributor

ccutrer commented Aug 16, 2022

I was thinking the opposite direction:

-Djruby.js223.bindings.localinstancevars=true
Java:

...
final ScriptEngine engine = compiledScript.getEngine();
final ScriptContext ctx = engine.getContext();
ctx.setAttribute("input", source_value, ScriptContext.ENGINE_SCOPE);
Object result = compiledScript.eval();

Ruby:

directions = %w[N NE E SE S SW W NW N].freeze

# note that the script uses `@input`, even though the Java
# supplied `input`
if @input == "NULL" || @input == "UNDEF"
  "-"
else
  cardinal = directions[(@input.to_f / 45).round]
  "#{cardinal} (#{@input.to_i}°)"
end

It's up to the user to know to adjust their scripts when setting that property. This avoids having to parse and adjust the actual script, which as you pointed out could be problematic.

@enebo
Copy link
Member

enebo commented Aug 16, 2022

@ccutrer Can you open up a new issue on this? I think this should be pretty doable. It would definitely give more flexibility to just using jsr223 instead of forcing people to our internal redbridge apis. Others can weigh in on whether they can think of any problems in the new issue.

@ccutrer
Copy link
Contributor

ccutrer commented Aug 17, 2022

Oooh, I found a workaround. If you use v ||= nil in your script, it forces it to be recognized as a local variable, and then when you eval with a binding, v is properly set to what was passed in!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants