Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When spago run, if the target cli app manipulates console interface it may not terminate properly #1048

Closed
edwardw opened this issue Oct 7, 2023 · 34 comments

Comments

@edwardw
Copy link

edwardw commented Oct 7, 2023

Just use the example from purescript-node-readline repo:

module Main where

import Prelude

import Effect (Effect)
import Effect.Console (log)
import Node.EventEmitter (on_)
import Node.ReadLine (close, createConsoleInterface, lineH, noCompletion, prompt, setPrompt)

main :: Effect Unit
main = do
  interface <- createConsoleInterface noCompletion
  setPrompt "(type 'quit' to stop)\n> " interface
  prompt interface
  interface # on_ lineH \s ->
    if s == "quit" then close interface
    else do
      log $ "You typed: " <> s
      prompt interface

If spago bundle then node index.js, after typing quit the app terminates. But if spago run directly, it hangs. It terminates only after pressing ctrl-d. With spago run --verbose, it can be deduced that the hang must have happened here: src/Spago/Cmd.purs#L59.

This happened with spago version 0.93.9. I also checkout out this repo and built a fresh version from head. It also happened there.

@edwardw
Copy link
Author

edwardw commented Oct 7, 2023

Also dug a little bit around node documentation. Must be some reason that there exist three flavors of stream piping API stream.pipe, stream.pipeline and the experimental stream.compose. But better leave that to you node experts.

@JordanMartinez
Copy link
Contributor

Hm... This is probably a bug with node-execa. Spago is calling subprocess.stdin.shareParentProcessStdin, so the parent stdin is being shared with the subprocess stdin.

@f-f
Copy link
Member

f-f commented Oct 7, 2023

We're not using the latest execa version - @JordanMartinez do you think that "just" upgrading to latest might fix this?

@JordanMartinez
Copy link
Contributor

I'm not sure. I'll have to see what goes on in execa.

@JordanMartinez
Copy link
Contributor

So, after I updated Test/Prelude.purs' spago' to use spawn rather than exec, I added the following test. Note: the node-readline code below is v8.1.0, not v7.0.0 in the opening thread.:

    Spec.itOnly "can run interactive programs" \{ spago, spago' } -> do
      spago [ "init" ] >>= shouldBeSuccess
      spago [ "install", "node-readline" ] >>= shouldBeSuccess
      FS.writeTextFile (Path.concat [ "src", "Main.purs" ]) $ Array.intercalate "\n"
        [ "module Main where"
        , ""
        , "import Prelude"
        , ""
        , "import Effect (Effect)"
        , "import Effect.Console (log)"
        , "import Node.EventEmitter (on_)"
        , "import Node.ReadLine (close, createConsoleInterface, closeH, lineH, noCompletion, prompt, setPrompt)"
        , ""
        , "main :: Effect Unit"
        , "main = do"
        , "  interface <- createConsoleInterface noCompletion"
        , "  setPrompt \"(type 'quit' to stop)\\n> \" interface"
        , "  prompt interface"
        , "  interface # on_ closeH do"
        , "    log \"interface was closed\""
        , "  interface # on_ lineH \\s -> do"
        , "    log $ \"You typed: \" <> show s"
        , "    if s == \"quit\" then do"
        , "      log $ \"calling rl.close()\""
        , "      close interface"
        , "      log $ \"called rl.close()\""
        , "    else do"
        , "      log $ \"Looping\""
        , "      prompt interface"
        ]
      spagoProcess <- spago' StdinNewPipe [ "run" ]
      let delay200Ms = Aff.delay $ Milliseconds 200.0
      delay200Ms
      spagoProcess.stdin.writeUtf8 "foo\n"
      delay200Ms
      spagoProcess.stdin.writeUtf8 "bar\n"
      delay200Ms
      spagoProcess.stdin.writeUtf8 "quit\n"
      delay200Ms
      spagoProcess.stdin.writeUtf8End "quit\n"
      delay200Ms
      spagoProcess.result >>= shouldBeSuccess

which outputs the following:

✅ Selecting package to build: 7368613235362d376c5766502b59557a4c3459473333656f2b

Downloading dependencies...
Building...
[1 of 2] Compiling Main
[2 of 2] Compiling Test.Main
           Src   Lib   All
Warnings     0     0     0
Errors       0     0     0

✅ Build succeeded.

(type 'quit' to stop)
> You typed: "foo"
Looping
(type 'quit' to stop)
> You typed: "bar"
Looping
(type 'quit' to stop)
> You typed: "quit"
calling rl.close()
interface was closed
called rl.close()
You typed: "quit"
calling rl.close()
called rl.close()
STDOUT:
(type 'quit' to stop)
> You typed: "foo"
Looping
(type 'quit' to stop)
> You typed: "bar"
Looping
(type 'quit' to stop)
> You typed: "quit"
calling rl.close()
interface was closed
called rl.close()
You typed: "quit"
calling rl.close()
called rl.close()

I noticed that the program hangs if I don't call stream.stdin.end (or the writeUf8End part above). So, maybe we need to call stream.stdin.end after detecting that CTRL+D was pressed?

@JordanMartinez
Copy link
Contributor

JordanMartinez commented Oct 14, 2023

After more investigation (and re-reading the opening thread), the issue here seems to be in node-readline. For whatever reason (and as shown above), calling close interface isn't actually closing the interface.

Also, from the Node docs:

Calling rl.close() does not immediately stop other events (including 'line') from being emitted by the readline.Interface instance.

@JordanMartinez
Copy link
Contributor

This might be it. At the very bottom of readline.createInterface(options):

When creating a readline.Interface using stdin as input, the program will not terminate until it receives an EOF character. To exit without waiting for user input, call process.stdin.unref().

@f-f
Copy link
Member

f-f commented Oct 14, 2023

That makes sense, but I still don't get why it behaves differently when run directly with Node vs running it through Spago - it can't just be readline, it sounds like there is something that we are missing in how we set up the streams?

@JordanMartinez
Copy link
Contributor

Yeah, that's what I'm trying to figure out myself...

@JordanMartinez
Copy link
Contributor

JordanMartinez commented Oct 15, 2023

There's seemingly a difference between

Terminal
 |
 +- node run.js

and

Terminal
 |
 +- node spago.js
    |
    +- node run.js

@JordanMartinez
Copy link
Contributor

Regardless, the program written in the opening thread should not need to be changed in order for spago run to terminate once the first quit line is entered. If it works via bundling and node, then it should work via spago run.

So far, my experimentation has shown that the program only terminates if we end the child process' stdin. However, I don't see how spago could know when to end the child process stdin without knowing about the program being run itself.

AFAIK, the child process does not terminate because something is keeping the event loop alive, and that something appears to be the lack of an EOF sent to the child process' stdin in this unique situation where readline is using stdin as its input stream.

Maybe this changes if we run the child process directly via a shell? Otherwise, this might be one of those quirky Node bugs.

On a different note, I recall James Brock hitting something similar with node-streams. Something about stdin never actually emitting an end event. Unfortunately, I can't figure out where that was discussed.

@JordanMartinez
Copy link
Contributor

I'd suggest using one of these three node libraries to see what's causing the child process to stay alive, but they're all written via CJS modules.

@f-f
Copy link
Member

f-f commented Oct 15, 2023

cc @jamesdbrock hopefully this is the same issue you had 😄

@edwardw
Copy link
Author

edwardw commented Oct 15, 2023

Tried why-is-node-running:

"use strict"

import { createRequire } from 'module'
const require = createRequire(import.meta.url)
const whyRunningJS = require('why-is-node-running')

export const whyRunningLog = function() {
  whyRunningJS()
}

and

module WhyIsNodeRunning where

import Prelude
import Effect (Effect)

foreign import whyRunningLog :: Effect Unit

Inserting the line whyRunningLog into the original program after close interface, the result is a little bit surprising. If bundling then node index.js, it reports two TickObjects (what's that?):

(type 'quit' to stop)
> quit
There are 2 handle(s) keeping the process running

# TickObject
node:internal/async_hooks:202
node:internal/async_hooks:504
node:internal/process/task_queues:132
node:internal/streams/writable:477
node:internal/stream_base_commons:163
node:internal/stream_base_commons:151
node:internal/streams/writable:392

# TickObject
node:internal/async_hooks:202
node:internal/async_hooks:504
node:internal/process/task_queues:132
node:internal/bootstrap/switches/is_main_thread:259
node:internal/streams/readable:1023
node:internal/readline/interface:543
node:internal/readline/interface:529

and yet, it terminates normally. If spago run directly, there is only one TickObject and it hangs:

(type 'quit' to stop)
> quit
There are 1 handle(s) keeping the process running

# TickObject
node:internal/async_hooks:202
node:internal/async_hooks:504
node:internal/process/task_queues:132
node:internal/bootstrap/switches/is_main_thread:259
node:internal/streams/readable:1023
node:internal/readline/interface:543
node:internal/readline/interface:529

@f-f
Copy link
Member

f-f commented Oct 15, 2023

@edwardw I suspect it's the Node instance in the middle that it's hanging (this one that we launch to run the script), while the child node always terminates properly

@edwardw
Copy link
Author

edwardw commented Oct 15, 2023

@edwardw I suspect it's the Node instance in the middle that it's hanging (this one that we launch to run the script), while the child node always terminates properly

Makes sense, that explains that it also sees ctrl-d and then exits gracefully.

@f-f f-f added this to the spago-next alpha bugs milestone Oct 15, 2023
@JordanMartinez
Copy link
Contributor

JordanMartinez commented Oct 16, 2023

Maybe the hang is caused by execa's _.result waiting for something to finish, and so the resulting Aff never returns?

@f-f
Copy link
Member

f-f commented Oct 16, 2023

But the inner node instance should terminate when asked, no? So at that point the middle one (spawned with execa) should return too. It's as if we need a

cp.on('exit', (code) => {
    process.exit(code);
});

@JordanMartinez
Copy link
Contributor

My point is that Aff doesn't return until you tell it to. So, if I'm using Aff incorrectly at some point, the child process could have terminated but the corresponding Aff callback isn't called, thereby making the child process appear to hang.

Perhaps the example code should add more listeners to verify whether the child process is terminating or not. And we could add listeners to the child process itself to see what kind of events are being emitted.

@JordanMartinez
Copy link
Contributor

I added listeners to the child process, and can guarantee that it's not emitting an event after it initially spawns once I type 'quit'.

@JordanMartinez
Copy link
Contributor

Found the solution. We need to use inherit in the child process' stdio values when calling spago run. Skip the next paragraph to see how I figured that out.

However... In the execa version currently used in Spago, that's not possible. Moreover, if you updated execa and the Node libraries, we'd have to use the unsafe child process bindings as the safe child bindings assume the stdio value is ["pipe", "pipe", "pipe"]. I'll have to think about how that unsafeness should propagate up into execa.

I first thought we should verify that a node process (like spago) starting another node process (like the program) should or shouldn't terminate when we type 'quit'. This eliminates node-execa and Aff-usage as a possible source of issues.

I bundled the following program via spago to a file called index.js:

module Main where

import Prelude

import Effect (Effect)
import Effect.Console (log)
import Effect.Ref as Ref
import Node.EventEmitter (on_)
import Node.ReadLine (close, createConsoleInterface, closeH, lineH, noCompletion, prompt, setPrompt)

main :: Effect Unit
main = do
  interface <- createConsoleInterface noCompletion
  setPrompt "(type 'quit' to stop)\n> " interface
  prompt interface
  countRef <- Ref.new 0
  interface # on_ closeH do
    log "interface was closed"
  interface # on_ lineH \s -> do
    log $ "You typed: " <> show s
    count <- Ref.read countRef
    case s, count of
      "quit", 0 -> do
        Ref.write 1 countRef
        log $ "calling rl.close()"
        close interface
        log $ "called rl.close()"
      "quit", _ -> do
        log "This should not be printed"
      _, _ -> do
        log $ "Looping"
        prompt interface

I then wrote the following JS program called likeSpago.mjs:

import process from "node:process";
import cp from "node:child_process";

const stdioVal = "pipe";

const subprocess = cp.spawn("node", [ "index.js"], {
  stdio: [stdioVal, stdioVal, stdioVal]
});
subprocess.on("spawn", () => console.log("spawned"));
subprocess.on("exit", () => console.log("exited"));
subprocess.on("closed", () => console.log("closed"));

process.stdin.pipe(subprocess.stdin);
subprocess.stdout.pipe(process.stdout);

I then called node likeSpago.mjs. When stdioVal is pipe, we get the same behavior as above. But when it's inherit, whether or not we remove the stdin/stdout piping code at the bottom, node likeSpago.mjs terminates.

@f-f
Copy link
Member

f-f commented Oct 16, 2023

Man, that's some great digging 😄

I gather the next step is to patch execa to allow this without going the unsafe route?

@JordanMartinez
Copy link
Contributor

JordanMartinez commented Oct 16, 2023

Mm... I'm not sure. The 'unsafeness' is the fact that certain properties don't exist on the child process depending on what values get passed to stdio. Thus, either the type signature of ChildProcess.stdin is forall stdinValue. HasStdin stdinValue => ChildProcess stdinValue -> Readable () or ChildProcess -> Maybe (Readable ()).

Some of this I need to verify personally because I've assumed that writing content to a child process' stdin if the value for that stream was inherit will result in a runtime error (or silently discard the content, leaving the programmer confused as to why nothing was written).

@edwardw
Copy link
Author

edwardw commented Oct 16, 2023

What I found out yesterday is that as soon as one puts node process in a pipe, its stdin changes from Readable stream to a Socket: just compare node -p "process.stdin" and echo "" | node -p "process.stdin". And there is a related node ticket about no close event emitted when that happens: nodejs/node#25131. There is even a recently resurrected PR trying to fix that. Tried it but it doesn't really fix the problem for spago.

Could be relevant.

@edwardw
Copy link
Author

edwardw commented Oct 16, 2023

A workaround:

       close interface
+      Stream.destroy Process.stdin

A previous comment mentioned that there is a missing TickObject if spago run directly. That object ultimately leads to streams/writable.js#L760, which is why I propose this workaround. Not pretty, and I'm not sure how practical to apply it in a large code base, but at least it works in this toy example.

@edwardw
Copy link
Author

edwardw commented Oct 31, 2023

The "fix" is most unusual. I was trying to attack this again by adding this new test to node-execa package:

diff --git a/test/Test/Node/Library/Execa.purs b/test/Test/Node/Library/Execa.purs
index e508ff1..72d369c 100644
--- a/test/Test/Node/Library/Execa.purs
+++ b/test/Test/Node/Library/Execa.purs
@@ -88,6 +88,18 @@ spec = describe "execa" do
             result.timedOut `shouldEqual` true
           _ ->
             fail $ "Timeout should work: " <> show result
+  it "can run interactive programs" do
+    let
+      interactiveJSFile = Path.concat [ "test", "fixtures", "interactive.js" ]
+    spawned <- execa "node" [ interactiveJSFile ]
+      (_ { timeout = Just { milliseconds: Milliseconds 400.0, killSignal: stringSignal "SIGTERM" } })
+    spawned.stdin.writeUtf8 "node\n"
+    spawned.stdin.writeUtf8 "quit\n"
+    result <- spawned.getResult
+    case result.exit of
+      Normally 0 -> result.stdout `shouldContain` "You typed: node"
+      _ ->
+        fail $ (if result.timedOut then "Should not timeout: " else "Other error: ") <> show result
   describe "execaSync" do
     describe "`cat` tests" do
       it "input is file" do

and this js file:

const readline = require('node:readline')
const { stdin: input, stdout: output } = require('node:process')

const rl = readline.createInterface({ input, output })

rl.setPrompt('(type "quit" to stop)\n> ')
rl.prompt()

rl.on('line', (line) => {
  if (line === 'quit') {
    rl.close()
  } else {
    console.log(`You typed: ${line}`)
    rl.prompt()
  }
})

It failed as expected. BUT the following will succeed even being run piped:

...
const rl = readline.createInterface({ input, output })
rl.input._readableState.highWaterMark = 0
...

It has to do when running piped, the underlying Socket stream is buffered by default. It is very ugly hack, should we add it to the readline package?

@JordanMartinez
Copy link
Contributor

It is very ugly hack, should we add it to the readline package?

Probably not. We've already identified a non-hacky fix; why consider this hacky idea? I'm not seeing the advantage.

@edwardw
Copy link
Author

edwardw commented Oct 31, 2023

It is very ugly hack, should we add it to the readline package?

Probably not. We've already identified a non-hacky fix; why consider this hacky idea? I'm not seeing the advantage.

That's fair. But it at least shows that the purescript side is not necessary at fault here. Instead, the implementation detail of the node side leaks which leads to observable behavior difference from outside.

JordanMartinez added a commit to purescript-node/purescript-node-child-process that referenced this issue Nov 14, 2023
JordanMartinez added a commit to purescript-node/purescript-node-child-process that referenced this issue Nov 14, 2023
* Add note about stdin, pipe, and inherit

See purescript/spago#1048

* Drop unused functions

* Add variant of waitSpawned
@JordanMartinez
Copy link
Contributor

node-execa@5.0.0 now allows the stdio array to be configured. So, it allows us to use inherit when calling spago run rather than pipe.

@JordanMartinez
Copy link
Contributor

Blocked by #1122.

@f-f
Copy link
Member

f-f commented Jan 14, 2024

This should now be unblocked

@JordanMartinez
Copy link
Contributor

I think this already fixed actually now that we've merged #1129. And because it is interactive, I'm wondering how we'd even test this as the usage of inherit in a child process' stdin means I can't programmatically write to that stream via cp.stdin.write("foo").

There's already a test for this here: #1048 (comment). It's just a matter of figuring out how to add that to our test suite.

@f-f
Copy link
Member

f-f commented Jan 15, 2024

I just released 0.93.24 with all the latest patches and it doesn't seem to be fixed - the example in the first post still doesn't terminate when run with spago run

@JordanMartinez
Copy link
Contributor

JordanMartinez commented Jan 15, 2024

I just tried that out myself and it terminated fine.

$ npx spago run
Reading Spago workspace configuration...

✅ Selecting package to build: runbug

Downloading dependencies...
Building...
           Src   Lib   All
Warnings     0     0     0
Errors       0     0     0

✅ Build succeeded.

(type 'quit' to stop)
> hello1
You typed: hello1
(type 'quit' to stop)
> helo2
You typed: helo2
(type 'quit' to stop)
> quit
$ 

Versions used:

  • node: 20.10.0
  • spago: 0.93.24
  • purs: 0.15.14
  • OS: Linux

@f-f f-f closed this as completed Mar 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants