Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process API Redux #473

Closed
wants to merge 11 commits into from
Closed

Conversation

patricoferris
Copy link
Collaborator

An updated version of #330 built on top of #472 -- this is more to inform #472 and any other changes we want to make to the low-level process API. I'm not convinced by f7a982f just yet, it was more of an experiment than anything else.

I've also half-disabled the libuv backend just to get it out of the way.

Some bits to do:

  • Expose environment variables in Eio.Process
  • Actually work out the best way to support passing in flows that are not backed by FDs.

Copy link
Collaborator

@talex5 talex5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - looking good so far!

lib_eio/process.ml Outdated Show resolved Hide resolved
@@ -0,0 +1,31 @@
type status = Exited of int | Signaled of int | Stopped of int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a polymorphic variant (often the Stopped case isn't possible, e.g. for an exit status).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something like

type exit_status = [ `Exited of int | `Signaled of int ]

type status = [ exit_status | `Stopped of int ] 

be useful? If the Stopped case can sometimes happen in the exit case the exit_status method would still have to handle it. If it can never happen I suppose we can assert false it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it can never happen and we should assert false for now.

lib_eio/process.ml Outdated Show resolved Hide resolved

val spawn : sw:Switch.t -> #mgr -> ?cwd:Fs.dir Path.t -> stdin:#Flow.source -> stdout:#Flow.sink -> stderr:#Flow.sink -> string -> string list -> t
(** [spawn ~sw mgr ?cwd ~stdin ~stdout ~stderr cmd args] creates a new subprocess that is connected to the
switch [sw]. A process will be stopped when the switch is released.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be careful about the word "stopped" here. Normally a "stopped" process is one that has not exited but it just paused.

tests/process.md Outdated Show resolved Hide resolved
let buff = Buffer.create 128 in
Eio.Flow.copy flow (Eio.Flow.buffer_sink buff);
Buffer.contents buff
| _ -> failwith "Subprocess didn't exit cleanly!";;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably do with a helper function in lib_eio for checking the exit status. Process.await_exn or something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea -- do you think instead of the val exit_status we should provide val await : t -> exit_status and val await_exn : t -> unit ?

inherit Eio.Process.mgr

method spawn ~sw ?cwd ~(stdin : #Eio.Flow.source) ~(stdout : #Eio.Flow.sink) ~(stderr : #Eio.Flow.sink) prog args =
let chdir = Option.to_list cwd |> List.map (fun (_, s) -> Process.Fork_action.chdir s) in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't ignore the base directory here (s is relative to that). It needs to use fchdir in that case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I've given it a quick go in 9f2a5ed which seems to be working but wasn't sure if that's right (or the cleanest way) to do it

@talex5 talex5 mentioned this pull request Mar 27, 2023
5 tasks
@patricoferris
Copy link
Collaborator Author

Thanks for the review @talex5, I see the #472 is merged, I'll get round to the corrections and rebase on the weekend :))

@talex5
Copy link
Collaborator

talex5 commented Mar 30, 2023

That's great! I'll probably make an Eio 0.9 release some time next week, and it would be great to have this in. The eio_luv backend is only used on Windows now (which doesn't work with OCaml 5.0) so it's probably OK if it just raises an exception if you try to do something it doesn't support.

@avsm
Copy link
Contributor

avsm commented Mar 31, 2023

The number of users of Eio/Windows is probably zero, given the lack of a stable OCaml 5.x release for it. Perhaps we should just remove it for Eio 0.9 and ease the path for this PR, and focus all efforts on getting the IOCP backend (#125) over the line?

(** [spawn ~sw mgr ?cwd ~stdin ~stdout ~stderr cmd args] creates a new subprocess that is connected to the
switch [sw]. A process will be stopped when the switch is released.

You must provide a standard input and outputs that are backed by file descriptors and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You must provide a standard input and outputs that are backed by file descriptors and
[stdin], [stderr] and [stdout] must be backed by file descriptors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, hopefully with the pipe changes I added in d553598 this can be completely removed as we check if there's an FD and if not we pipe into the sources and sinks (that code could do with a review I think)

talex5 and others added 3 commits April 1, 2023 13:15
Using `clone3` directly on Linux is difficult as there is no glibc
wrapper for it.
@patricoferris patricoferris marked this pull request as ready for review April 1, 2023 13:57
@patricoferris
Copy link
Collaborator Author

Perhaps we should just remove it for Eio 0.9 and ease the path for this PR, and focus all efforts on getting the IOCP backend (#125) over the line?

This sounds sensible. It should probably be removed in a separate PR, for now in f402423 I've brought it back but just stubbed out process support.

@patricoferris
Copy link
Collaborator Author

5afdbd4 added Eio.Stdenv.stdio to allow getting stdin, stdout, stderr together in one call. It changed the exit_status name to await and await_exn (perhaps this should be just wait and wait_exn).

@patricoferris
Copy link
Collaborator Author

The Fiber.fork calls in Process.spawn weren't quite right. In d6eec9f I removed them and this seems to fix the issues, the only concern I have is that the call to Process.spawn might yield now depending on what kind of stdin/stdout you pass to the call, is that okay ?

Copy link
Collaborator

@talex5 talex5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! We also need support for environment variables (at the moment, it always clears the environment). I think it's fine to make passing an environment optional and default to inheriting the current one.

Comment on lines +905 to +909
let echo =
Option.get @@ Eio_unix.resolve_program ~paths:[ "/usr/bin"; "/bin" ] "echo"
in
Eio.Switch.run @@ fun sw ->
let child = Eio.Process.spawn proc_mgr ~sw ~stdin ~stdout ~stderr echo [ "echo"; "hello" ] in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the low-level API should mimic the platform's native API, this high-level API is where we can add convenience features to make things easier to use.

For example, I think it should handle looking up the path for you. It could have an optional ?executable argument to take the path explicitly, but default to doing the sensible thing with argv[0] (i.e. look up implicit paths in $PATH). Note that this example wouldn't work on e.g. NixOS, where echo is at /run/current-system/sw/bin/echo.

Suggested change
let echo =
Option.get @@ Eio_unix.resolve_program ~paths:[ "/usr/bin"; "/bin" ] "echo"
in
Eio.Switch.run @@ fun sw ->
let child = Eio.Process.spawn proc_mgr ~sw ~stdin ~stdout ~stderr echo [ "echo"; "hello" ] in
Eio.Switch.run @@ fun sw ->
let child = Eio.Process.spawn proc_mgr ~sw ~stdout ["echo"; "hello"] in

stdin can default to /dev/null I think (we don't want to encourage passing it if it's not needed), and thinking about it further I'm OK with stderr being passed through by default. We already let all Eio code output traceln messages to stderr.

Not sure what's best for stdout. Making it optional and passing it though seems OK, as does making it a required argument. I think defaulting to /dev/null would be bad though as you wouldn't get any error indicating why it didn't work (unlike redirecting stdin, where you get end-of-file).

Comment on lines +920 to +929
let buffer = Buffer.create 4 in
let stdin, _, stderr = Eio.Stdenv.stdio env in
let stdout = Eio.Flow.buffer_sink buffer in
let echo =
Option.get @@ Eio_unix.resolve_program ~paths:[ "/usr/bin"; "/bin" ] "echo"
in
Eio.Switch.run @@ fun sw ->
let child = Eio.Process.spawn proc_mgr ~sw ~stdin ~stdout ~stderr echo [ "echo"; "hello" ] in
Eio.Process.await_exn child;
Buffer.contents buffer;;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point we should add some other convenience functions like popen for cases like this, e.g.

Suggested change
let buffer = Buffer.create 4 in
let stdin, _, stderr = Eio.Stdenv.stdio env in
let stdout = Eio.Flow.buffer_sink buffer in
let echo =
Option.get @@ Eio_unix.resolve_program ~paths:[ "/usr/bin"; "/bin" ] "echo"
in
Eio.Switch.run @@ fun sw ->
let child = Eio.Process.spawn proc_mgr ~sw ~stdin ~stdout ~stderr echo [ "echo"; "hello" ] in
Eio.Process.await_exn child;
Buffer.contents buffer;;
Eio.Process.popen ["echo"; "hello"]

@@ -1743,3 +1782,4 @@ Some background about the effects system can be found in:
[Eio.Mutex]: https://ocaml-multicore.github.io/eio/eio/Eio/Mutex/index.html
[Eio.Semaphore]: https://ocaml-multicore.github.io/eio/eio/Eio/Semaphore/index.html
[Eio.Condition]: https://ocaml-multicore.github.io/eio/eio/Eio/Condition/index.html
[Eio.Process]: https://ocaml-multicore.github.io/eio/eio/Eio/Process/index.html
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this will be a broken link until the next release. Might be OK, or we could link to the source code until then.

@@ -47,7 +49,9 @@ module Stdenv = struct
let stdin (t : <stdin : #Flow.source; ..>) = t#stdin
let stdout (t : <stdout : #Flow.sink; ..>) = t#stdout
let stderr (t : <stderr : #Flow.sink; ..>) = t#stderr
let stdio (t : <stdin : #Flow.source; stdout: #Flow.sink; stderr : #Flow.sink; ..>) = t#stdin, t#stdout, t#stderr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very convinced about this. I suspect that providing defaults for the standard streams would make this unnecessary.

@@ -0,0 +1,31 @@
type status = Exited of int | Signaled of int | Stopped of int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it can never happen and we should assert false for now.

(** This functions waits for the subprocess to exit and then reports the status. *)

val await_exn : #t -> unit
(** Like {! await} except an exception is raised if the status is not [Exited 0]. *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a shame we won't have the failing command in the error message.

We could have a Process.run for the common case of spawn followed by await. Then it would be easy to add run_exn and have it include the command in the error.

Comment on lines +1 to +3
; (mdx
; (package eio_luv)
; (deps (package eio_luv)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental change?

let with_actions cwd fn = match cwd with
| Some ((dir, path) : Eio.Fs.dir Eio.Path.t) -> (
match Eio.Generic.probe dir Fs.Posix_dir with
| None -> fn actions
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the cwd isn't a native directory then it should raise an exception instead of ignoring it.

process (Process.spawn ~sw actions)
in
Option.iter (fun stdin_w ->
Eio.Flow.copy stdin stdin_w;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to fork off a another fiber to handle this (I see you had it that way earlier and then removed it for some reason).

Failing to fork will work OK for short messages, where the kernel will buffer them, but will deadlock for longer ones. It's also a problem for interactive programs (e.g. hooked up to GUI). Imagine the program wants to output a long set of terms and conditions and then ask for confirmation. Once the stdout pipe is full, it will stop and wait for us to read it, but we're still here waiting for the user to finish sending the input (which they won't do because they haven't seen any output yet).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very quickly, it seemed that there was a race between the process exiting and the standard input/output being read/written. This very reliably fails in the old fork version:

open Eio
open Eio.Std

let run env =
  let buf = Buffer.create 128 in
  Switch.run @@ fun sw ->
  let stdout = Flow.buffer_sink buf in
  let child =
    Process.spawn ~sw env#process_mgr ~stdin:env#stdin ~stdout ~stderr:env#stderr "/bin/echo" [ "echo"; "Hello" ]
  in
  Process.await_exn child;
  let contents = Buffer.contents buf in
  if contents = "Hello\n" then () else failwith contents

let () =
  Eio_main.run @@ fun env ->
  for i = 0 to 1000 do
    Eio.traceln "Process number %i" i;
    run env
  done

But your comment is right!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you need to wait for stdout to be copied to the buffer. Just waiting for the process to exit isn't sufficient. This works (but is easy to get wrong):

let run env =
  let buf = Buffer.create 128 in
  Switch.run (fun sw ->
      let stdout = Flow.buffer_sink buf in
      let child =
        Process.spawn ~sw env#process_mgr ~stdin:env#stdin ~stdout ~stderr:env#stderr "/bin/echo" [ "echo"; "Hello" ]
      in
      Process.await_exn child;
    );
  let contents = Buffer.contents buf in
  if contents = "Hello\n" then () else failwith contents

Perhaps we should just expose the pipes and let the user do the reading, as Lwt does? It's a bit of a pain with the types (which is why Lwt needs so many functions). Maybe GADTs would help (so if you pass ~stdout:Pipe then you can do Process.stdout child to get the flow).

Then we can add convenience functions on top (like popen) that collect the output for you, rather than hard-coding that behaviour in each backend.

https://docs.python.org/3/library/subprocess.html might be a good source of ideas.

- : string = "Hello, world\n"
```

Changing directory
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a test for the confined case too.

@talex5
Copy link
Collaborator

talex5 commented Apr 5, 2023

The Fiber.fork calls in Process.spawn weren't quite right.

What was the problem?

The number of users of Eio/Windows is probably zero, given the lack of a stable OCaml 5.x release for it. Perhaps we should just remove it for Eio 0.9 and ease the path for this PR, and focus all efforts on getting the IOCP backend (#125) over the line?

I'm planning to remove eio_luv right after the 0.9 release. Keeping it in 0.9 might make things slightly easier for people who used it directly and want to move to eio_posix, or who want to compare the performance between them.

@talex5
Copy link
Collaborator

talex5 commented Apr 9, 2023

I had a go at some more changes in my branch (at https://github.com/talex5/eio/tree/proc-redux). I think there's enough API design work still to do here that it's better to release Eio 0.9 without a cross-platform process API. Then we can spend a few weeks trying different APIs and get something merged for Eio 0.10 instead.

@talex5
Copy link
Collaborator

talex5 commented May 3, 2023

Rebased and updated as #499.

@patricoferris
Copy link
Collaborator Author

Thanks! I'll review the process PR later today :)) Closing in favour of it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants