Description
Problem
Playground supports proc_open() through a custom function called js_open_process(), added in #596. Instead of deferring to OS process opening functions, like PHP does by default, it calls a user-defined callback that spawns a new process and returns a ChildProcess object.
js_open_process then writes any stdin data using cp.stdin.write(dataStr); and captures the output using cp.stdout.on('data', function (data) { ... }). The stdout event listener is asynchronous and may receive data after 1ms or 100ms; we don't really know.
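To make the timing issue concrete, here's a minimal Node.js sketch. It is not the js_open_process code; the spawnEcho helper and the choice of a Node child that echoes stdin back are assumptions made for illustration. It shows that right after the write returns, the 'data' listener has not run yet:

```javascript
const { spawn } = require('node:child_process');

// Hypothetical helper: spawns a child that copies its stdin to its stdout.
// Using the current Node binary as the child is an assumption made so the
// sketch does not depend on any particular shell utility being installed.
function spawnEcho(input) {
  const cp = spawn(process.execPath, ['-e', 'process.stdin.pipe(process.stdout)']);
  const chunks = [];
  cp.on('error', (err) => console.error('spawn failed:', err.message));
  cp.stdout.on('data', (data) => chunks.push(data.toString()));
  cp.stdin.write(input);
  cp.stdin.end();
  return { cp, chunks };
}

const { cp, chunks } = spawnEcho('Hello world!\n');

// Synchronous code keeps running; no 'data' event has been delivered yet.
console.log('chunks right after write:', chunks.length);

// The output only becomes visible once control returns to the event loop.
cp.on('close', () => console.log('output after close:', JSON.stringify(chunks.join(''))));
```

Synchronous C code that reads the pipe immediately after writing is in the same position as the first console.log: the data may exist in the child, but nothing has delivered it yet.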
What does native PHP do?
Let's consider the following script:
<?php
$descriptorspec = array(
    0 => array("pipe", "r"), // stdin
    1 => array("pipe", "w"), // stdout
    2 => array("pipe", "w")  // stderr
);
$process = proc_open('less', $descriptorspec, $pipes);
if (is_resource($process)) {
    fwrite($pipes[0], "Hello world!\n");
    fclose($pipes[0]);
    echo stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($process);
}
It will output "Hello world!\n". However, there are some nuances.
stream_get_contents expects the input pipe to close. If we comment out fclose($pipes[0]); in the script above, it will hang indefinitely. I'm not sure where that behavior comes from, but both _php_stream_fill_read_buffer and php_stream_eof sound like interesting candidates.
fread() reads whatever is available and returns. If we replace stream_get_contents() with fread($pipes[1], 1024); and comment out the fclose() call, the script will finish and output Hello world!\n.
If I adjust the proc_open() call to be proc_open('sleep 1; less'), then the fread() call takes more than 1 second, which tells me something in it is blocking after all.
Similarly:
proc_open('sleep 1; echo "Hi"; sleep 1; echo "There";', /* ...args */);
// This call takes around 1s and outputs "Hi":
echo fread($pipes[1], 5);
// This call takes around 1s and outputs "There":
echo fread($pipes[1], 5);
This makes it seem like fread waits for any output, regardless of the buffer size. In this case, we could yield back to the event loop in _php_stream_read() and wait until the process fd has some data available.
If, however, I call stream_set_blocking($pipes[1], 0);, then the fread() call returns instantly. For plain streams, that call is translated to flags |= O_NONBLOCK; fcntl(fd, F_SETFL, flags). Cool! We're getting somewhere!
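The read behaviors observed above can be summarized in a toy model. This is a sketch only: FakePipe and its status strings are made up for illustration, it is not how PHP or Emscripten implement pipes, and the errno value 6 is an assumption.

```javascript
// Hypothetical in-memory model of a pipe's read end.
const EWOULDBLOCK = 6; // assumed errno value, used for illustration only

class FakePipe {
  constructor() {
    this.buffer = '';
    this.writerClosed = false;
    this.nonBlocking = false; // what stream_set_blocking($pipe, 0) would flip
  }

  write(data) { this.buffer += data; }
  closeWriter() { this.writerClosed = true; }

  // Mirrors the observed fread() behavior: return whatever is available,
  // up to `length`, without waiting for the buffer to fill.
  read(length) {
    if (this.buffer.length > 0) {
      const data = this.buffer.slice(0, length);
      this.buffer = this.buffer.slice(length);
      return { status: 'data', data };
    }
    if (this.writerClosed) return { status: 'eof', data: '' };
    if (this.nonBlocking) return { status: 'error', errno: EWOULDBLOCK, data: '' };
    // A blocking read on an empty, still-open pipe is where native PHP waits
    // and where Playground would need to yield to the event loop.
    return { status: 'would-wait', data: '' };
  }
}

const pipe = new FakePipe();
console.log(pipe.read(1024).status);  // empty + blocking
pipe.nonBlocking = true;
console.log(pipe.read(1024).status);  // empty + non-blocking
pipe.write('Hello world!\n');
console.log(JSON.stringify(pipe.read(5).data));
console.log(JSON.stringify(pipe.read(1024).data));
pipe.closeWriter();
console.log(pipe.read(1024).status);  // empty + writer closed
```

The only state that needs asynchronous help is 'would-wait'; every other case can be answered synchronously.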
Here are some more resources:
https://bugs.php.net/bug.php?id=47918
I have encountered a number of applications that will cause PHP hang on
fread() until the process closes (regardless of whether or not the buffer has filled).
I have to disappoint here, the anonymous pipes are plain file descriptors
Possible Solutions
A proper fix would yield back to the event loop in the same place where PHP waits for data.
Idea 1 – Async-compatible libc implementation
Playground currently patches PHP with custom implementations of functions like select(2) in an attempt to make the synchronous C code work in an asynchronous JavaScript runtime. These patches target PHP, whereas in reality their goal is to replace blocking syscalls with async-compatible syscalls. Instead of PHP, we should be patching the syscalls library.
To illustrate the issue, here's the wasm_select implementation:
EMSCRIPTEN_KEEPALIVE int wasm_select(int max_fd, fd_set * read_fds, fd_set * write_fds, fd_set * except_fds, struct timeval * timeouttv) {
    emscripten_sleep(0); // always yield to JS event loop
    int timeoutms = php_tvtoto(timeouttv);
    int n = 0;
    for (int i = 0; i < max_fd; i++)
    {
        if (FD_ISSET(i, read_fds)) {
            n += wasm_poll_socket(i, POLLIN | POLLOUT, timeoutms);
        } else if (FD_ISSET(i, write_fds)) {
            n += wasm_poll_socket(i, POLLOUT, timeoutms);
        }
    }
    return n;
}
If the relevant syscalls knew how to wait for asynchronous events, we wouldn't need that at all.
Syscalls are handled by the Emscripten-provided library_syscall.js file.
Functions like __syscall__newselect or __syscall_poll could return Asyncify.handleSleep() whenever required, handle the asynchronous data flow, and call wakeUp() whenever a regular OS would.
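Here's a rough sketch of what such a poll-like function might look like. It does not use Emscripten at all: FakeStream and pollReadable are hypothetical names, and the wakeUp parameter only stands in for the wakeUp() callback mentioned above. The point is the shape of the control flow: answer immediately when data is already there, otherwise resume on whichever comes first, data or timeout.

```javascript
// Hypothetical stand-in for an fd whose data arrives from JavaScript events.
class FakeStream {
  constructor() {
    this.chunks = [];
    this.listeners = [];
  }
  hasData() { return this.chunks.length > 0; }
  push(chunk) {
    this.chunks.push(chunk);
    const pending = this.listeners;
    this.listeners = [];
    pending.forEach((listener) => listener());
  }
  onceReadable(listener) { this.listeners.push(listener); }
}

// Calls wakeUp(1) when the stream is readable, or wakeUp(0) on timeout,
// loosely following poll(2) return-value conventions for a single fd.
function pollReadable(stream, timeoutMs, wakeUp) {
  if (stream.hasData()) {
    wakeUp(1); // nothing to wait for
    return;
  }
  let settled = false;
  const timer = setTimeout(() => {
    if (settled) return;
    settled = true;
    wakeUp(0);
  }, timeoutMs);
  stream.onceReadable(() => {
    if (settled) return;
    settled = true;
    clearTimeout(timer);
    wakeUp(1);
  });
}

// Usage: data shows up 10ms later, well before the 1000ms timeout.
const stream = new FakeStream();
pollReadable(stream, 1000, (ready) => console.log('ready fds:', ready));
setTimeout(() => stream.push('Hi'), 10);
```

In a real patch the body would live inside the syscall implementation and the suspension would go through Asyncify, so PHP itself would not need to change.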
Idea 2 – Handle EWOULDBLOCK in _fd_read
I patched _fd_read as follows:
function _fd_read(fd, iov, iovcnt, pnum) {
    try {
        var stream = SYSCALLS.getStreamFromFD(fd);
        // console.log({ stream });
        var num = doReadv(stream, iov, iovcnt);
        HEAPU32[pnum >> 2] = num;
        return 0;
    } catch (e) {
        console.error(e);
        console.trace();
        if (typeof FS == 'undefined' || !(e.name === 'ErrnoError')) throw e;
        return e.errno;
    }
}
And got an error with errno=6. It seems like in Emscripten that means EWOULDBLOCK, which makes sense. I found this related Emscripten issue: pipe() doesn't create a blocking pipe.
It seems like we could detect that error and return Asyncify.handleSleep() instead.
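A sketch of that detect-and-retry flow, written as plain callbacks so it runs outside Emscripten. Everything here is hypothetical: readWithRetry, readOnce (standing in for the doReadv call), waitForData (standing in for whatever signals that the child produced output), and the assumption that errno 6 is the value to retry on.

```javascript
const EAGAIN = 6; // errno observed above; assumed to mean EAGAIN/EWOULDBLOCK

// readOnce():            returns bytes read, or throws an object with `errno`
// waitForData(resume):   calls resume() once the fd may have data
// done(errno, bytesRead): receives the final result
function readWithRetry(readOnce, waitForData, done) {
  let bytesRead;
  try {
    bytesRead = readOnce();
  } catch (e) {
    if (e && e.errno === EAGAIN) {
      // Instead of surfacing the error to PHP, suspend and try again later.
      waitForData(() => readWithRetry(readOnce, waitForData, done));
    } else if (e && typeof e.errno === 'number') {
      done(e.errno, 0); // any other errno is reported as-is
    } else {
      throw e; // not a filesystem error, keep the original behavior
    }
    return;
  }
  done(0, bytesRead);
}

// Usage: the first read fails with EAGAIN, the retry succeeds.
let attempts = 0;
const readOnce = () => {
  attempts += 1;
  if (attempts === 1) throw { errno: EAGAIN };
  return 13;
};
const waitForData = (resume) => setTimeout(resume, 10);
readWithRetry(readOnce, waitForData, (errno, bytesRead) => {
  console.log({ errno, bytesRead, attempts });
});
```

One open question this sketch does not answer is whether the retry should also apply to streams that PHP explicitly set to non-blocking, where returning the error immediately is the expected behavior.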