Unable to call .getStackName on Stack in Java with Node 12.17.0 #8288

Closed
vincent-dm opened this issue May 31, 2020 · 12 comments · Fixed by aws/jsii#1717
Labels: bug (This issue is a bug.) · jsii (This issue originates in jsii, or this feature must be implemented in jsii.) · language/java (Related to Java bindings) · package/tools (Related to AWS CDK Tools or CLI)

Comments

vincent-dm commented May 31, 2020

I am using CDK from a REPL in Clojure (so via the Java bindings). After upgrading Node to 12.17.0 (the latest release of major version 12, which is the line recommended for use with CDK/jsii), commands start failing with a "Stream closed" error.

Reproduction Steps

  • Install Node 12.17.0.
  • Install Clojure (e.g. brew install clojure/tools/clojure).
  • Start a Clojure REPL session with CDK core as a dependency: clj -Sdeps '{:deps {software.amazon.awscdk/core {:mvn/version "1.42.0"}}}'.
  • Evaluate commands in the REPL by pasting them and pressing enter:
    • Load the CDK classes:
    (import software.amazon.awscdk.core.App software.amazon.awscdk.core.Stack$Builder)
    • Define a new Stack object as bar:
    (def bar (.build (Stack$Builder/create (App.) "foo")))
    • Then try to get the name of the stack:
    (.getStackName bar)
  • This throws an exception and shows this abbreviated error:
Execution error (IOException) at java.lang.ProcessBuilder$NullOutputStream/write (ProcessBuilder.java:442).
Stream closed

Error Log

You can see a detailed stack trace of the last exception by evaluating *e in the REPL:

user=> *e
#error {
 :cause "Stream closed"
 :via
 [{:type software.amazon.jsii.JsiiException
   :message "Unable to send request to jsii-runtime: java.io.IOException: Stream closed"
   :at [software.amazon.jsii.JsiiRuntime requestResponse "JsiiRuntime.java" 103]}
  {:type java.io.IOException
   :message "Stream closed"
   :at [java.lang.ProcessBuilder$NullOutputStream write "ProcessBuilder.java" 442]}]
 :trace
 [[java.lang.ProcessBuilder$NullOutputStream write "ProcessBuilder.java" 442]
  [java.io.OutputStream write "OutputStream.java" 157]
  [java.io.BufferedOutputStream flushBuffer "BufferedOutputStream.java" 81]
  [java.io.BufferedOutputStream flush "BufferedOutputStream.java" 142]
  [sun.nio.cs.StreamEncoder implFlush "StreamEncoder.java" 318]
  [sun.nio.cs.StreamEncoder flush "StreamEncoder.java" 153]
  [java.io.OutputStreamWriter flush "OutputStreamWriter.java" 254]
  [java.io.BufferedWriter flush "BufferedWriter.java" 257]
  [software.amazon.jsii.JsiiRuntime requestResponse "JsiiRuntime.java" 84]
  [software.amazon.jsii.JsiiClient getPropertyValue "JsiiClient.java" 111]
  [software.amazon.jsii.JsiiObject jsiiGet "JsiiObject.java" 240]
  [software.amazon.jsii.JsiiObject jsiiGet "JsiiObject.java" 228]
  [software.amazon.awscdk.core.Stack getStackName "Stack.java" 624]
  [jdk.internal.reflect.NativeMethodAccessorImpl invoke0 "NativeMethodAccessorImpl.java" -2]
  [jdk.internal.reflect.NativeMethodAccessorImpl invoke "NativeMethodAccessorImpl.java" 62]
  [jdk.internal.reflect.DelegatingMethodAccessorImpl invoke "DelegatingMethodAccessorImpl.java" 43]
  [java.lang.reflect.Method invoke "Method.java" 566]
  [clojure.lang.Reflector invokeMatchingMethod "Reflector.java" 167]
  [clojure.lang.Reflector invokeNoArgInstanceMember "Reflector.java" 438]
  [user$eval142 invokeStatic "NO_SOURCE_FILE" 1]
  [user$eval142 invoke "NO_SOURCE_FILE" 1]
  [clojure.lang.Compiler eval "Compiler.java" 7177]
  [clojure.lang.Compiler eval "Compiler.java" 7132]
  [clojure.core$eval invokeStatic "core.clj" 3214]
  [clojure.core$eval invoke "core.clj" 3210]
  [clojure.main$repl$read_eval_print__9086$fn__9089 invoke "main.clj" 437]
  [clojure.main$repl$read_eval_print__9086 invoke "main.clj" 437]
  [clojure.main$repl$fn__9095 invoke "main.clj" 458]
  [clojure.main$repl invokeStatic "main.clj" 458]
  [clojure.main$repl_opt invokeStatic "main.clj" 522]
  [clojure.main$main invokeStatic "main.clj" 667]
  [clojure.main$main doInvoke "main.clj" 616]
  [clojure.lang.RestFn invoke "RestFn.java" 397]
  [clojure.lang.AFn applyToHelper "AFn.java" 152]
  [clojure.lang.RestFn applyTo "RestFn.java" 132]
  [clojure.lang.Var applyTo "Var.java" 705]
  [clojure.main main "main.java" 40]]}

Environment

  • CDK Version: 1.42.0 (build 3b64241)
  • Node Version: 12.17.0
  • CLI Version: aws-cli/2.0.17 Python/3.8.3 Darwin/19.5.0 botocore/2.0.0dev21
  • OS: macOS Catalina 10.15.5
  • Language: EN

Other

  • Running the example above with Node 12.16.3 (the last release before 12.17.0) throws no exception and just works.
  • Note that once the exception is thrown, a new REPL session is needed to be able to use CDK again (CDK is the first software I have encountered that exhibits this behavior). You can quit the REPL with ctrl+d.
  • The comment below by @cadnce helped me understand this issue better. It seems related to the fact that the code runs in a REPL, which reads user input after returning. When you paste this entire code snippet into the REPL at once and press enter, it does not throw and correctly returns the name of the stack ("foo"):
(import software.amazon.awscdk.core.App software.amazon.awscdk.core.Stack$Builder)
(let [bar (.build (Stack$Builder/create (App.) "foo"))] (.getStackName bar))
  • Timing seems to be involved too: introducing a wait of 200ms between creating the Stack and getting its name still works:
(import software.amazon.awscdk.core.App software.amazon.awscdk.core.Stack$Builder)
(let [bar (.build (Stack$Builder/create (App.) "foo"))] (Thread/sleep 200) (.getStackName bar))
  • But a wait of 1000ms throws the exception:
(import software.amazon.awscdk.core.App software.amazon.awscdk.core.Stack$Builder)
(let [bar (.build (Stack$Builder/create (App.) "foo"))] (Thread/sleep 1000) (.getStackName bar))
  • There are known issues when running CDK (or jsii?) on Node >= 13.x, but so far the recommendation has always been to downgrade to 12.x. Now that the latest 12.x release is apparently also affected, this workaround is problematic, since a lot of tooling (e.g. homebrew with the versions plugin) doesn't allow installing a specific minor version.

This is a 🐛 Bug Report

vincent-dm added the bug and needs-triage labels May 31, 2020
vincent-dm changed the title from "Unable to synthesize using Node 12.17.0 or higher" to "Unable to call .getStackName on Stack using Node 12.17.0 or higher" May 31, 2020
vincent-dm changed the title from "Unable to call .getStackName on Stack using Node 12.17.0 or higher" to "Unable to call .getStackName on Stack in Java with Node 12.17.0" May 31, 2020
SomayaB added the language/java and package/tools labels Jun 1, 2020

cadnce commented Jun 1, 2020

I can confirm this issue isn't specific to Java; I have reproduced it in Python and was going to report the same myself.

It appears to be caused by the jsii runtime process that sits listening to the Java/Python side (the big "HACK" warning is a bit of a giveaway):

https://github.com/aws/jsii/blob/master/packages/%40jsii/runtime/lib/sync-stdio.ts#L57

As a workaround: we found we were tripping this by taking too long (making an external call) in our Python code, so perhaps you can refactor yours similarly to avoid long pauses.

(I could also see this happening if you were to do something like a pip install to package your Lambda layers.)

You can reproduce this easily on Linux like so, but it doesn't appear to cause the issue natively on macOS (presumably something to do with the socket handling):

import time
from aws_cdk import core
time.sleep(5)


cadnce commented Jun 1, 2020

Thanks for the slightly updated issue with more detail. However, I'm fairly sure that with the new description you're now hitting a completely different error! (It certainly no longer appears to be the call stack overflow from before, so I may have to raise a separate issue, pending someone else reviewing this.)

vincent-dm (Author) commented Jun 1, 2020

@cadnce I accidentally showed only the abbreviated stack trace after rewriting the ticket to have a reproducible example.

I have now added the full stack trace and AFAIK it is still essentially the same ("Unable to send request to jsii-runtime: java.io.IOException: Stream closed").

shivlaks added the jsii label Jun 2, 2020
shivlaks assigned RomainMuller and unassigned shivlaks Jun 2, 2020
shivlaks (Contributor) commented Jun 2, 2020

Adding the jsii label and assigning @RomainMuller to take a look.


cadnce commented Jun 2, 2020

> @cadnce I accidentally showed only the abbreviated stack trace after rewriting the ticket to have a reproducible example.
>
> I have now added the full stack trace and AFAIK it is still essentially the same ("Unable to send request to jsii-runtime: java.io.IOException: Stream closed").

Yes, sorry, perhaps I misread what it said before because it looked too much like my issue!

I thought your backtrace from CDK was originally complaining about RangeError: Maximum call stack size exceeded (which, to me, explained why your Java backtrace says the socket to the jsii runtime is closed).

I've obviously confused the issue here, so I'll raise a separate one for it.

vincent-dm (Author) commented Jun 2, 2020

I did also occasionally get RangeError: Maximum call stack size exceeded in my own application (and I did mention it in my initial issue description, so you're not imagining things 😉), but I haven't been able to reproduce it, so I decided to keep this issue more focused. I agree you should probably open a separate issue.

SomayaB removed the needs-triage label Jun 2, 2020
RomainMuller (Contributor) commented

So the real issue here is actually the RangeError: Maximum call stack size exceeded error.
After it has occurred, the jsii kernel process terminates, and that is what then triggers the Stream closed error.

The issue stems from our reading synchronously from stdin, receiving an EAGAIN error, and retrying: SyncStdio.readLine retries by recursing into itself, eventually running out of stack space.
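
For illustration, a minimal sketch of that failure mode (hypothetical, simplified names; the real SyncStdio.readLine in aws/jsii buffers bytes until a newline, but the recursive retry is the relevant part):

```ts
import * as fs from 'fs';

const STDIN_FD = 0;

// Pre-fix shape of the retry (sketch): when stdin has no data available,
// fs.readSync throws EAGAIN and readLine recurses to try again. Every
// retry adds a stack frame, so a long enough wait for the next request
// ends in "RangeError: Maximum call stack size exceeded".
function readLine(): string | undefined {
  const buffer = Buffer.alloc(1);
  try {
    if (fs.readSync(STDIN_FD, buffer, 0, 1, null) === 0) {
      return undefined; // EOF
    }
    // ...accumulate bytes until '\n' (elided)...
    return buffer.toString('utf-8');
  } catch (e) {
    if ((e as NodeJS.ErrnoException).code === 'EAGAIN') {
      return readLine(); // recursive retry: the stack grows without bound
    }
    throw e;
  }
}
```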


cadnce commented Jun 8, 2020

Yea that was my point exactly 👍

RomainMuller added a commit to aws/jsii that referenced this issue Jun 8, 2020
…eadLine

When using node `>= 12.17`, `EAGAIN` errors consistently occur when
trying to read from `stdin` while there is no available data. The retry
mechanism for this was to recursively call back into `SyncStdio.readLine`,
which could eventually lead to a "Maximum call stack size exceeded" error
(possibly hidden from the consumer, and later causing a "Stream
closed" error).

This changes the retry mechanism so that it operates in a `while`
loop instead of making a recursive call, completely avoiding the
growing-stack issue.

Fixes aws/aws-cdk#8288
Fixes aws/aws-cdk#5187
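
For illustration, a hedged sketch of the shape of that fix (not the actual diff; same simplified names as the sketch above):

```ts
// Post-fix shape (sketch): retry inside a while loop instead of recursing,
// so waiting for input uses constant stack space however long it takes.
function readLine(): string | undefined {
  while (true) {
    const buffer = Buffer.alloc(1);
    try {
      if (fs.readSync(STDIN_FD, buffer, 0, 1, null) === 0) {
        return undefined; // EOF
      }
      // ...accumulate bytes until '\n' (elided)...
      return buffer.toString('utf-8');
    } catch (e) {
      if ((e as NodeJS.ErrnoException).code !== 'EAGAIN') {
        throw e;
      }
      // EAGAIN: no data yet; loop around and try again
    }
  }
}
```
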
RomainMuller (Contributor) commented

The fun part is that the Maximum call stack size exceeded issue triggers faster from your clj execution because it retries on that EAGAIN very fast while waiting for input (human input is a bit slower than what a non-interactive program would feed it).

It's possible (probable?) that your regular program also hit that, but never surfaced it to you (you'd have had to run with JSII_DEBUG=1 and/or JSII_VERBOSE=1, I think).

RomainMuller (Contributor) commented

Also, I found myself unable to write a simple reproduction of this behavior of Node's fs.readSync throwing EAGAIN, which threw me off. At this point I don't know whether it is caused by a weird interaction with how the JDK provisions the pipe for the subprocess, or whether something happens earlier in the kernel process that triggers this behavior in Node.
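
For context, a standalone probe of the kind that fails to reproduce it (assumed setup, not from the thread; when stdin is a regular blocking pipe this call simply blocks instead of throwing, which is consistent with the suspicion that something about how the parent process provisions the descriptor is needed to trigger EAGAIN):

```ts
import * as fs from 'fs';

// Probe: synchronously read one byte from stdin (fd 0) and report whether
// the call returns data or throws (e.g. with code EAGAIN).
const buf = Buffer.alloc(1);
try {
  const n = fs.readSync(0, buf, 0, 1, null);
  console.log(`readSync returned ${n} byte(s)`);
} catch (e) {
  console.log(`readSync threw: ${(e as NodeJS.ErrnoException).code ?? (e as Error).message}`);
}
```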


cadnce commented Jun 8, 2020

Interesting, thanks. I'll try again with those environment variables and give you my results (I also found that on macOS it didn't give the same EAGAIN socket behaviour as on a Debian VM).

I couldn't reproduce this using the Node/TypeScript CDK because it doesn't use the jsii runtime (obviously).

I should note I'm suspicious that something has changed in CDK space recently making this worse, because I've spotted at least 3 references to people with this same issue in the Gitter chat since this issue was opened (sorry, I don't have the links to hand).


cadnce commented Jun 8, 2020

Wow, you have actually already PR'd this. Amazing!

mergify bot pushed a commit to aws/jsii that referenced this issue Jun 11, 2020
…eadLine (#1717)


Fixes aws/aws-cdk#8288
Fixes aws/aws-cdk#5187
Fixes aws/aws-cdk#8397

