Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add subprocess support #330

Closed
wants to merge 13 commits into from

Conversation

patricoferris
Copy link
Collaborator

This PR is primarily for generating a discussion around how best to support spawning subprocesses in Eio. In particular for issue #126. The current implementation makes the following decisions.

  • Child I/O by default is mapped to /dev/null
  • There's a lot of file-descriptor mangling related to child I/O which definitely needs double-checked.

One unfortunate side-effect is that adding subprocesses bypasses all capability-style security iiuc as someone could easily cd .. out of the current working directory of the process. Perhaps processes should live in a separate package?

Note: if it ever lands, it would probably be good to use io-uring-spawn: https://www.phoronix.com/news/Linux-LPC2022-io_uring_spawn

@c-cube
Copy link
Contributor

c-cube commented Sep 29, 2022

I think processes shouldn't live in another package :-). They're a cornerstone of any unix-y IO library, and having them tightly integrated with Flow is really nice!

If you worry about capability security, maybe this capability should be easy to disable (in an irreversible way)? But I personally think capabilities in OCaml are more about avoiding mistakes, than true security, since there's basically no sandboxing possible.

Copy link
Collaborator

@talex5 talex5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Child I/O by default is mapped to /dev/null

Might be better to make these non-optional and force the caller to decide what to do. Sending errors to /dev/null by default sounds frustrating.

One unfortunate side-effect is that adding subprocesses bypasses all capability-style security iiuc as someone could easily cd .. out of the current working directory of the process.

This is OK; it just means that the process manager is a powerful capability and you need to audit code that gets it carefully. Normally, you'd wrap it quickly in something more limited (e.g. that runs a single trusted command, or runs it inside bubblewrap, etc).

You could take a Path.t for the cwd. That would allow running a sub-process in a directory that wasn't in the parent's default mount namespace (and would fail if it wasn't backed by an FD).

An API like that would also be necessary on FreeBSD in cap_enter mode, since (presumably) there you wouldn't have access to chdir (only to fchdir).

I suggest renaming Process.t to Process.mgr, and Process.process to Process.t.

BTW, I've been idly wondering if we can (ab)use format strings for shell commands. e.g.

# Process.expand "rm -- %s" "My file";;
["rm"; "--"; "My file"]

Might be a bit of fun to get that working! Might also confuse people, though.

lib_eio_linux/eio_linux.ml Outdated Show resolved Hide resolved
lib_eio_linux/eio_linux.ml Outdated Show resolved Hide resolved
let process = object
inherit Eio.Process.t

(* TODO: Is Sys.chdir cwd domain-safe, ? *)
Copy link
Collaborator

@talex5 talex5 Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, and neither is forcing a lazy value. You'll probably need some C code for this bit.

lib_eio_linux/eio_linux.ml Show resolved Hide resolved
Copy link
Contributor

@anmonteiro anmonteiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments that jumped at me while reading the PR. Thanks for your work, @patricoferris

lib_eio/eio.mli Outdated Show resolved Hide resolved
lib_eio/process.mli Outdated Show resolved Hide resolved
lib_eio/process.mli Outdated Show resolved Hide resolved
lib_eio/process.mli Outdated Show resolved Hide resolved
@patricoferris
Copy link
Collaborator Author

Thanks @anmonteiro, all seemed sensible and I added the changes in b02b37b

I think I've addressed most comments there is just the relevant C code for being able to change directories for processes in the presence of multiple domains, I'll look into this soon.

lib_eio/eio.mli Outdated Show resolved Hide resolved
val pid : #t -> int
(** The process ID *)

val status : #t -> status Promise.t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val status : #t -> status Promise.t
val await_exit : #t -> status

(thinking about it further, the promise is only needed internally)

Is it even possible to get Stopped here? It looks like that can only happen if we're acting as strace, which I guess we'll never support anyway (or use a different API if we did).

You must provide a standard input and outputs that are backed by file descriptors and
[cwd] will optionally change the current working directory of the process.*)

val spawn_detached : #mgr -> ?cwd:Fs.dir Path.t -> stdin:Flow.source -> stdout:Flow.sink -> stderr:Flow.sink -> string -> string list -> t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the intended use of this? Is it the same as a double-fork?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was what was needed for this point in the original issue:

Child processes should be attached to a switch by default to avoid leaking them, but need a way to spawn detached sub-processes too.

Perhaps I misunderstood? Looking at https://stackoverflow.com/questions/881388/what-is-the-reason-for-performing-a-double-fork-when-creating-a-daemon it looks like I have, perhaps this could be added in a follow-up PR?

lib_eio/unix/eio_unix.ml Outdated Show resolved Hide resolved
method needs_close = needs_close
method set_needs_close b = needs_close <- b
method status =
let p, r = Promise.create () in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to create the promise (and do the wait) outside of status, which should just await the promise that already exists. Otherwise we'll try to reap the child once for each time the user calls status, whereas we actually need to do it exactly once.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Thanks, done in 4a37efb

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That only creates the promise outside. It's still doing the wait once for each call. It should just be:

method await_exit = Promise.await exit_status

let get_fd_or_err flow =
match get_fd_opt flow with
| Some fd ->
Unix.dup ~cloexec:false (FD.to_unix `Peek fd)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting an FD to be not close-on-exec needs to be done in the child process (in C). Otherwise, another domain might fork a process at the same time and inherit it.

let process = pid_to_process close pid in
let cleanup () =
(* Catch if the process is already finished when trying to stop it. *)
(try process#stop with Unix.Unix_error (Unix.ESRCH, _, _) -> ());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using pidfd_open(2) would likely solve these races, and would avoid the need for a systhread to do the waitpid.

lib_eio_linux/eio_linux.ml Show resolved Hide resolved


method spawn_detached ?cwd ~stdin ~stdout ~stderr cmd args =
let cwd = Option.map snd cwd in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all needs to be moved to C, at which point you can fchdir instead.


CAMLextern int caml_convert_signal_number(int);

CAMLprim value caml_signal_to_posix_signal(value v_signo)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have made that clearer in the code sorry. The problem is that the OCaml signal numbers don't match the POSIX ones (or at least the ones here: https://man7.org/linux/man-pages/man7/signal.7.html). That's why earlier commits were failing with:

Full test results in `/src/_build/default/lib_eio_linux/tests/_build/_tests/eio_linux'.
--
355 | Test Successful in 0.003s. 5 tests run.
356 | File "tests/process.md", line 1, characters 0-0:
357 | diff --git a/_build/default/tests/process.md b/_build/default/tests/.mdx/process.md.corrected
358 | index ade46dd..1eb6ef1 100644
359 | --- a/_build/default/tests/process.md
360 | +++ b/_build/default/tests/.mdx/process.md.corrected
361 | @@ -44,7 +44,7 @@ Stopping a subprocess works and checking the status waits and reports correctly
362 | let t = spawn ~sw "sleep" [ "sleep"; "10" ] in
363 | Process.stop t;
364 | Promise.await (Process.status t);;
365 | -- : Process.status = Eio.Process.Signaled 9
366 | +- : Process.status = Eio.Process.Signaled (-7)
367 | ```
368 |  
369 | A switch will stop a process when it is released.
370 | @@ -59,7 +59,7 @@ A switch will stop a process when it is released.
371 | in
372 | run ();
373 | Promise.await @@ Process.status (Option.get !proc);;
374 | -- : Process.status = Eio.Process.Signaled 9
375 | +- : Process.status = Eio.Process.Signaled (-7)

as I ran the tests using Luv (which doesn't use OCaml signal numbers). So I was converting them to be the same. Do you think there's a better approach?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a standard set of numbers. Even the Linux man-page you linked uses different numbers for different architectures. Seems tempting to use a polymorphic variant here instead (e.g. Eio.Process.Signaled `TERM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had started this when I was playing with signals:
https://github.com/haesbaert/eio/blob/signal/lib_eio/include/discover.ml
Basically exports the SIGFOO and also tries to compile a program with the non standard SIGBARS that get exported as an optional. It might make sense to point out also that the Stdlib signal calls work with both the ocaml values (the negative ones) and the "real ones".

@talex5
Copy link
Collaborator

talex5 commented Oct 27, 2022

One way to make this PR more manageable might be to first split out a PR adding subprocess support to Eio_linux.Low_level (or to eio-luv), then add the cross-platform support in a later PR.

@patricoferris
Copy link
Collaborator Author

Indeed, that seems like a better approach. Thanks for all the help and reviews so far, I'll open a new linux/luv low-level PR soon!

@bobot
Copy link

bobot commented Mar 13, 2023

(For later, getting information about the time spend by the process (user-time, wall-clock) is also an important and difficult information to get in a cross-platform way.)

@patricoferris
Copy link
Collaborator Author

Closing in favour of #473

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants