support console pool to stream output of long-running commands #68
On a semi-related note, I have a couple of long-running commands that I'd prefer to see status output on (things like invoking a build in a separate project to get its build outputs, and invoking package managers that may take time to download+install things). What kept me from using console with ninja is the fact that it also serializes access, so the build becomes less parallel if the output is shown from multiple commands. I presume that the serialization was done to avoid interleaved output from multiple commands? Would a pool like console that allowed such interleaving (line buffered) make sense?

As n2 shows each running command on a separate line, one alternative to streaming the lines with scrolling would be to have the last line output by a given command shown to the right of the running command. Even if it were only updated once a second, that would give some indication of how things are progressing. But I guess it would not be helpful in cases where a long-running task printed errors and then printed other text.

Not an essential feature in my case anyway. n2 showing what tasks are currently running has already been a big improvement over ninja, where some task had stalled but NINJA_STATUS was showing some other command that happened to end afterwards. |
Yeah, my thinking here is that we could maybe show the most recent N lines from commands as they are running (and then their full output when they complete). The main question is whether it can be done in a way that isn’t visually overwhelming. |
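As a rough illustration of the "most recent N lines" idea, here is a minimal sketch of a bounded line buffer. `RecentLines` and its methods are hypothetical names made up for this example; nothing here is taken from n2's source.

```rust
use std::collections::VecDeque;

/// Keeps only the most recent `max` lines of a command's output.
/// Hypothetical helper for illustration; not part of n2.
struct RecentLines {
    max: usize,
    lines: VecDeque<String>,
}

impl RecentLines {
    fn new(max: usize) -> Self {
        RecentLines { max, lines: VecDeque::with_capacity(max) }
    }

    /// Records one complete line, dropping the oldest line once `max` is reached.
    fn push(&mut self, line: &str) {
        if self.max == 0 {
            return; // Nothing is retained when the limit is zero.
        }
        if self.lines.len() == self.max {
            self.lines.pop_front();
        }
        self.lines.push_back(line.to_string());
    }

    /// The retained lines, oldest first.
    fn snapshot(&self) -> Vec<&str> {
        self.lines.iter().map(|s| s.as_str()).collect()
    }
}
```

A progress display could call `push` for every line a task prints and redraw from `snapshot`, which keeps the amount of text on screen bounded no matter how chatty the command is.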
The console pool in ninja has to run serially because stdin access is also possible, I think. Mixing output of multiple commands or truncating it does not work well with some tools that overwrite lines in-place, like ninja/n2.
If stdin access is ignored and only process output (stdout, stderr) is sufficient I think it's possible to have a nice linear log, a command at a time. IMHO interleaving or truncating the outputs of the console pool would make it less useful. Related reading "redo, buildroot, and serializing parallel logs": |
I guess there are a couple of different use cases and trade-offs here. The one I was focusing on was seeing the most up-to-date output from a command, to e.g. see if it has stalled. If output were delayed until completion so it could be displayed linearly, this case wouldn't be covered so well. For commands where you wanted the output non-interleaved, that's already the status-quo, isn't it? My thinking was that there'd be a way to opt-in to immediate output for a command, while otherwise keeping things as they are. If the full output from the command were buffered, it could be reprinted in full if the command failed, which would make debugging issues easier (or alternatively, logged to a file in the failure case) |
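The "buffer everything, reprint only on failure" idea from the comment above can be sketched as follows. This is an assumption-laden illustration: `output_if_failed` is a hypothetical name, the command is run through `sh -c`, and stdout and stderr are captured separately and then concatenated, which loses their relative ordering.

```rust
use std::io;
use std::process::Command;

/// Runs `script` via `sh -c`, buffering everything it prints.
/// Returns `None` when the command succeeds (nothing to show) and
/// `Some(output)` when it fails, so the caller can reprint it or log it to a file.
/// Hypothetical helper for illustration; not an n2 function.
fn output_if_failed(script: &str) -> io::Result<Option<String>> {
    let out = Command::new("sh").arg("-c").arg(script).output()?;
    if out.status.success() {
        return Ok(None);
    }
    // Assumption: stdout and stderr are captured separately and concatenated.
    // A real build tool would more likely share one pipe to preserve ordering.
    let mut text = String::from_utf8_lossy(&out.stdout).into_owned();
    text.push_str(&String::from_utf8_lossy(&out.stderr));
    Ok(Some(text))
}
```

With this shape, a successful `output_if_failed("echo fine")` yields nothing to print, while a failing command hands back its whole buffered output for debugging.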
Thought about this more, thanks for the comments here. I think the Ninja behavior is:
I had thought I might try something clever around running console commands in parallel but I think existing ninja projects are likely too sensitive to work with it. In particular I think I don't want to get into pseudotty behavior. (I think the one tweak I will attempt is for non-console commands, while they are running, I might display their most recent line of output in the temporary progress.) |
While a task is running, we gather the last line of its (buffered) output and display it in the fancy progress. Part of #68.
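For readers curious what "gather the last line of its buffered output" might involve, here is a minimal sketch. `last_line` is a hypothetical function written for this example and is not the code from the commit; it assumes output is split on newlines only, so tools that redraw a line with a bare carriage return are not handled specially.

```rust
/// Returns the last non-empty line of a task's buffered output, if any.
/// Hypothetical helper for illustration; not the code from the commit above.
fn last_line(buf: &[u8]) -> Option<String> {
    // Task output is not guaranteed to be valid UTF-8, so decode lossily.
    let text = String::from_utf8_lossy(buf);
    let found = text
        .lines()
        .rev()
        .map(|line| line.trim_end())
        .find(|line| !line.is_empty())
        .map(|line| line.to_string());
    found
}
```

Skipping trailing blank lines means a command that ends its output with an empty line still shows its last meaningful message.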
Looks like this: https://asciinema.org/a/kZGBQLft3LdZwqSJRblZZShmx (The subcommand here prints the literal string |
For adding better insight into long-running subtasks, this is a nice improvement!

(Attached video: recording.mp4)

Normally, I run commands with a wrapper that hides all their output unless there's an error. To test out this new behaviour, I turned that off, but now the full output of the command is printed as it completes, even in the success case, which makes the build output somewhat messy. It would be great to have some sort of option like a build variable that could suppress the aggregated output in the success case, as while I love the extra insight during build, I'd miss the clean output if I had to decide on only one of the two.

One other minor thing: when a build step invokes n2 in a subproject, n2 prints the descriptions of each step without any leading indent, so it becomes a bit hard to distinguish between the currently-running step in the outer invocation, and the step that's running in the inner invocation - for example ts:congrats is an inner step here:

Would one or two spaces before the last output line, truncating the last one or two characters, make sense? |
Good idea, done. |
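The indent-and-truncate suggestion can be sketched like this. `indent_and_truncate` is a hypothetical name, and the sketch assumes one `char` occupies one terminal column, which does not hold for wide or combining characters; it is not the change that was actually made in n2.

```rust
/// Formats a task's latest output line for a progress display that is
/// `width` columns wide: prefix it with `indent` spaces and cut off
/// whatever no longer fits.
/// Hypothetical helper for illustration; not n2's implementation.
fn indent_and_truncate(line: &str, indent: usize, width: usize) -> String {
    // Columns left for the text once the indent has been accounted for.
    let available = width.saturating_sub(indent);
    let mut out = " ".repeat(indent.min(width));
    // Assumption: one `char` equals one terminal column.
    out.extend(line.chars().take(available));
    out
}
```

For example, with a 10-column display and a 2-space indent, `ts:congrats` is shown as two spaces followed by `ts:congr`.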
You're in a fairly unique situation in that most people only have .ninja files generated by a system they don't control. I like the idea though, let me think about it. (I recall in Ninja we had a lot of circular discussions around different verbosity-related command-line flags to either show task output or not, show task command lines or not, etc.) |
I was eager to try out the new functionality, so I've hacked in some options in a local branch. Only the Pretty progress path has been updated:

One curious thing I noticed after being able to see the last line of output is that sometimes there's a fairly long period between when the command has finished, and when it disappears from the list of tasks. For example, 'build:minilints' in the following video. Can you think of a reason why that might be happening?

(Attached video: recording.mp4) |
I've been noticing this happen fairly frequently - some commands, whether they have any output or not, appear to run for much longer than they should, and don't appear in |
Do the stuck tasks ever finish? Two guesses are:
To judge if it's the second one you might be able to look at (via like
* Not a race in the threadsy sense, but rather a logic error related to the order of events. |
(I previously wrote that I might have been reusing task ids and getting the bookkeeping wrong, but that was a misread of the code. n2 uses unique types for each of the various "id" integers flying around so I think it's unlikely I'm mixing these up, but ...) |
The tasks do eventually complete, and it seems to coincide with the completion of some other long-running task. As far as I can tell, the tasks are no longer in the process list (well, top - I'll check ps next time) after their last output, so it just seems to be the status not updating. It happens with a moderate degree of frequency, but is sporadic - the next time it happens I'll see if I can get it to trigger more reproducibly. |
I think this was a real bug. Its symptom would be that after a task produced output, if nothing else happened, we might wait a while before updating the UI. From looking at #68, but won't fix it.
Ninja does this. Not sure if it helps with #68.
I pushed 3 small fixes in this area that might help, but I'm not sure about any of them. I think if the bug is related to the process lurking around too long, it would likely be lurking in some sort of hung (like waiting to write to a pipe) or zombie state, so it wouldn't show up at the top of |
Thanks, I tried again with the latest changes. Looking at ps when I see a task that has printed its last line but appears stuck, I see one or more zombie processes:

dae 2319878 0.0 0.0 0 0 pts/1 Z+ 08:08 0:00 [sh]

I added some logging to run_command() to try to get better insight into what's going on, and it appears the read() call hangs for a while after the process has printed its final output:
I've figured out a way to reproduce this fairly consistently, so if you have any suggestions on things I should try, please let me know. |
Could it be that the pipes for one spawned process are leaking into other spawned processes? That would explain why a command seems to unblock when another happens to complete, and with the following change, the issue seems to disappear:
|
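To make the hypothesis above concrete: a read on a pipe only sees end-of-file once every copy of the write end has been closed, so any other process that holds a copy keeps the reader waiting even after the original command has exited. The sketch below reproduces that symptom with the standard library only. It does not leak a descriptor the way n2 did; instead, as a stand-in, the shell script deliberately starts a background `sleep` that inherits stdout. `time_exit_vs_eof` is a hypothetical name, and the sketch assumes a POSIX `sh` is available.

```rust
use std::io::{self, Read};
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Runs `script` via `sh -c` with stdout piped and returns
/// (time until the direct child exited, time until the pipe reached EOF).
/// Hypothetical helper for illustration; not code from n2.
fn time_exit_vs_eof(script: &str) -> io::Result<(Duration, Duration)> {
    let start = Instant::now();
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(script)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .spawn()?;
    let mut stdout = child.stdout.take().expect("stdout was piped");

    // Waiting before reading is only safe here because the output is tiny;
    // a child that fills the pipe buffer would block and never exit.
    child.wait()?;
    let exited = start.elapsed();

    // EOF arrives only after every holder of the write end has closed it,
    // including any process that inherited it from the shell.
    let mut output = Vec::new();
    stdout.read_to_end(&mut output)?;
    let eof = start.elapsed();

    Ok((exited, eof))
}
```

Calling `time_exit_vs_eof("echo done; sleep 1 &")` should show the shell exiting almost immediately while end-of-file on the pipe is delayed until the background `sleep` finishes, which matches the "stuck until some other task completes" behaviour described in this thread.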
We want to open all our fds using CLOEXEC so they're not inherited by child processes. All Rust builtin APIs do this already, but we used libc::pipe() for our pipe. To fix this, we use pipe2() on non-Mac (which allows passing the flag) and Mac-specific POSIX_SPAWN_CLOEXEC_DEFAULT otherwise. Fixes leak found in #68.
Great find! I dug a bit and wrote what I think is a proper fix in 12652a1, hopefully. |
At least on this Mac, running the build in tests/manual/prints I would see the processes inheriting stuff like
and after the patch they only have pipe fds 0/1/2. |
Thanks for looking into it; I can confirm it's all working smoothly now, and CLOEXEC is much more elegant than closing 1000 FDs. |
Every time I run a build, being able to see the latest line of output makes me smile :-) It's such a nice improvement. Have you had any thoughts yet on whether output-suppressing options might be suitable for inclusion / what they might look like? |
Sorry for dropping this thread! I think we have a few different ideas mixed in here and I wanna make sure I address all of them.
|
In llvm there are a lot of build targets like check-lld which run a large suite of lit tests. It's useful to see the progress of these commands since they can be very long running, and if something fails in the middle you might want to ctrl-c out of the build and triage that specific failure rather than wait for them all to be done.