-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve display of Unix wait statuses, notably giving names for signals #82974
Conversation
Proper Unix terminology is "exit status" (vs "wait status"). "exit code" is imprecise and therefore unclear. This is a user-visible change to the behaviour of all Rust programs which run Unix subprocesses. Hopefully no-one is matching against the exit code string, except perhaps in tests. Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
We are going to make this more complicated, so it makes sense not to expect to fall through the return value from a single call in each branch. No functional change. Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
We are going to implement signal description which is not entirely predictable. No functional change. Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
I'm going to want signal_describe in a moment for better reporting of signals in ExitStatus. API design principles dictate that this inner functionality should be exposed in its own right, rather than being available only via ExitStatus's Display impl. Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Print the signal description, and name, if we can. Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
Split this one out. It's special. No functional change if the tests pass. Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
r? @m-ou-se (rust-highfive has picked a reviewer for you, use r? to override) |
/// } | ||
/// ``` | ||
#[unstable(feature = "unix_signal_strings", issue = "none")] | ||
pub fn signal_abbrev(sig: i32) -> Option<&'static str> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming nit: avoid abbreviating words.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose this name because I was copying the glibc name for the same thing (sigabbrev
). I don't think calling it "abbrevation" is right. BSD calls it the signame
so name
would be a possibility but it seems rather generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is there a rule against abbreviating things? Ref
ptr
alloc
iter
mut
vec
env
args
.... SocketAddr
Dir
.... etc. etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is not a rule per se, but my understanding is that the preference tends to lean towards names that are not abbreviated as far as the recent additions to the standard library API surface go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my understanding is that the preference tends to lean towards names that are not abbreviated as far as the recent additions to the standard library API surface go.
Fair enough. I think, though, that in this case the abbreviation is definitely justified both by the prior art in glibc (which feels very natural to me as an old Unix hand) and by the clumsiness of the unabbreviated word "abbreviation" :-).
If this is a blocker for this MR then I can of course respray this corner of the bikeshed. In any case, thanks for your attention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By way of hopefully sidestepping the bikeshed about abbreviations, I think it'd be fair to call this the signal_name
. I don't think we need to call it an "abbreviation" at all. (Yes, TERM
is itself an abbreviation, but it's the actual name of the signal.) And it makes sense to say we're mapping a number to a "name" here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... I think it'd be fair to call this the
signal_name
. I don't think we need to call it an "abbreviation" at all. (Yes,TERM
is itself an abbreviation, but it's the actual name of the signal.) And it makes sense to say we're mapping a number to a "name" here.
name
is fine by me, so I will do that. (And yes the BSDs even have this in the sys_signame
array so that is precedent.)
By way of hopefully sidestepping the bikeshed about abbreviations, ...
:-).
This implementation makes me wonder if it wouldn't be easier and simpler to match signum {
libc::SIGSEGV => "Segmentation fault (SIGSEGV)",
libc::SIGKILL => "Killed (SIGKILL)",
...
} Is there any reason this wouldn't be an option? |
Simonas Kazlauskas writes ("Re: [rust-lang/rust] Improve display of Unix wait statuses, notably giving names for signals (#82974)"):
This implementation makes me wonder if it wouldn't be easier and simpler to
match the signal numbers as defined in libc and map them to our own
descriptions, as such:
match signum {
libc::SIGSEGV => "Segmentation fault (SIGSEGV)",
libc::SIGKILL => "Killed (SIGKILL)",
...
}
Is there any reason this wouldn't be an option?
While the lower signal numbers are conventional, the higher ones are
not. We would have to maintain a table for every Unix we port to.
Ian.
…--
Ian Jackson ***@***.***> These opinions are my own.
Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.
|
Co-authored-by: Tavian Barnes <tavianator@tavianator.com>
Co-authored-by: Tavian Barnes <tavianator@tavianator.com>
Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments inline. Overall, the concept of this seems reasonable, and having clearer information displayed about ExitStatus
should make diagnosing things easier.
let got = if let Some(sigfoo_np) = self.sigfoo_np.get() { | ||
sigfoo_np(sig) | ||
} else if let Some(use_sys_foolist) = | ||
// if let chaining would be nice here! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice in many places, but we don't normally leave these kinds of comments in the source. Once we have if let
chaining, we're very likely to update many bits of code over time.
impl Lookup for ViaStrSignal { | ||
// musl's strsignal is threadsafe: it just looks up in a static array (that we don't have | ||
// access to in any other way). I have spoken to Rich Felker (musl upstream author) on IRC | ||
// and confirmed that musl's strsignal is very unlikely to ever become non-thread-safe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable to rely on, but it would be helpful to have this documented somewhere other than an ephemeral second-hand conversation, if possible. If the author of musl feels comfortable stating this, perhaps it could go in the musl documentation somewhere?
(Alternatively, perhaps musl could provide sigdescr_np
as a symbol alias for strsignal
? That'd be a longer wait, though. Documentation we could link to would suffice.)
It looks like glibc doesn't do this because it generates strings for unknown and real-time signals on the fly, in addition to being locale-dependent.
It looks like musl has fixed strings for all the real-time signals. Is musl's handling of real-time and unknown signals included in the above commitment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that having this commitment documented would be better. But as far as I can tell musl does not have the kind of comprehensive documentation where this kind of thing would go. Otherwise I would have tried that already. Also AFAICT there isn't an issue tracker. I guess I could post to the mailing list. Would that satsify this concern?
I doubt very much that musl upstream would want to provide sigdescr_np
. They focus heavily on standards conformance and sigdescr_np
is an API glibc simply made up (understandably so, but, still).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like musl has fixed strings for all the real-time signals. Is musl's handling of real-time and unknown signals included in the above commitment?
My conversation was about the return value from strsignal
and did not distinguish the behaviour depending on the argument values. Such a distinction wouldn't make sense as an API commitment since the caller couldn't rely on the threadsafety.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A public mailing list post with an ack from the maintainer would be helpful, yes.
And I didn't want to make any assumptions about the scope of that informal commitment; I could have imagined something like "if you pass a known signal number we'll guarantee to return a static, but we might change the behavior for unknown signal numbers in the future". That's still a commitment we could rely on, as long as we were careful to only pass known signal numbers. Hence trying to confirm the scope of the commitment.
fn lookup(&self, sig: i32) -> Option<&'static str>; | ||
} | ||
|
||
unsafe fn ptr_to_maybe_str(p: *const c_char) -> Option<&'static str> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally, functions returning Option
use opt
or option
in their name, not maybe
. (The only instances of to_maybe
in the Rust codebase related to a compiler-internal type named MaybeOwned
.)
/// Currently there are two deviations from the return value of `strsignal`: | ||
/// | ||
/// * `signal_describe` does not translate the string according to the locale; C `strsignal` | ||
/// usually does. Unfortunately a translated version is not currently available. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of Rust's standard library functions currently include any kind of translation; this function is not unusual in that regard. (The first sentence, stating that it isn't translated, is helpful; the "Unfortunately" could be attached to many other functions, and is a known limitation of std.)
impl Lookup for Unspported { | ||
fn lookup(&self, sig: i32) -> Option<&'static str> { | ||
extern "C" fn strsignal(sig: c_int) -> *const *const c_char; | ||
ptr_to_maybe_str(strsignal(sig)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming I'm following this correctly: If this is unsupported, shouldn't it just return None
rather than duplicating? Also, wouldn't this result in duplicated output (the second time with a SIG
prepended)?
@@ -526,20 +527,22 @@ impl From<c_int> for ExitStatus { | |||
impl fmt::Display for ExitStatus { | |||
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { | |||
if let Some(code) = self.code() { | |||
write!(f, "exit code: {}", code) | |||
write!(f, "exit status: {}", code)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this PR should change this particular bit of the output. We currently refer to this as code
and provide a .code()
function to retrieve it, so using the same consistent terminology seems preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is controversial I will split it out into a different MR. (The "consistent terminology" from the stdlib names is simply wrong for Unix, as I have recently documented properly.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not trying to make a case for or against that terminology, just that I don't think that change should be in this PR.
Regarding the change itself, my only concern is not introducing a source of confusion by being inconsistent between documentation and existing function names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see why you have this concern. I agree that the conversation would be better in a different MR, so yes, I will do that. Thanks. (I'll "resolve" this review comment with that other MR when it exists.)
@ijackson wrote:
The numbers aren't all conventional, but libc already handles the number-to-name mapping for every target, so The only thing we'd have to ensure is that we don't have an entry for any It's also not the end of the world if this starts out giving better output on glibc and musl (which is already the case) as long as it's functional everywhere and can be extended easily on additional platforms. Given all of that, I think it's worth at least considering the simpler solution. While it'd be a little bit of maintenance work when someone comes up with a new UNIX target (which doesn't happen especially often), I think that might be less painful than using
We could maintain one table, with some |
Co-authored-by: Josh Triplett <josh@joshtriplett.org>
Co-authored-by: Josh Triplett <josh@joshtriplett.org>
Hi. Thanks for the review. The other comments are mostly bug reports in my series and I'll fix them.
This is in relation to the suggestion of putting a signal table into Rust stdlib. Let me consider this in some more detail then: I think the suggestion is to avoid the use of One result would be that Rust would print different signal descriptions to the platform's standard C library. Since signal descriptions are already rather gnomic, and the precise descriptions are known to the administrators of the system in question, I think this is extremely undesirable. Also, on some platforms, depending on the completeness of Rust's table, Rust would be able to format only a subset of signal numbers. It would also be possible to introduce bugs in Rust's stdlib affecting individual signal numbers. Or, to put it another way, it would be really difficult to get good test coverage for the resulting implementation.
I also have an extremely strong philosophical objection to hardcoding a list of signals and their descriptions in the Rust stdlib. The usual approach to making programs portable to a variety of Unix variants, where no sufficient API is universally available, is to identify API variants that are commonly available and use some kind of platform configuration to select the right implementation. Rust stdlib doesn't seem to do conditional compilation for this; rather it takes the the
I'm not sure what you are referring to. Do you think there is something wrong with
So there shouldn't be any future compatibility issues with them. |
glibc sometimes uses its symbol versioning to deprecate, change, or tweak such functions, maintaining compatibility with old programs while preventing new ones from using it. (glibc doesn't always guarantee 100% source compatibility, just 100% binary compatibility.) We've had issues with that kind of symbol versioning in the past, because we try to maintain compatibility with a broad spectrum of glibc versions and not just the version we build with. Hence the added complexity of I'm not suggesting we can't do that, or that using such functions is always a problem, just that they can be slightly higher friction to use and maintain, and that we've had related issues in the past. That seemed like a good reason to entertain the notion of maintaining a table ourselves, and compare the two approaches. Whichever path we go, we'll end up updating things for new targets in order to get better descriptions; there's no portable function we can call that gives us the signal descriptions, so if a new platform arises that provides such descriptions, we'd need to update this code anyway.
I think if we were to do this that we'd want to use the standard descriptions; we certainly wouldn't want to gratuitously reword them. I wouldn't expect those descriptions to vary wildly across platforms, at least not in a way likely to cause confusion for an admin. (We might, for instance, end up using the glibc descriptions on musl targets, but I don't think anyone would be surprised to see the glibc descriptions where they'd otherwise get just a signal name.) And as you observed, it'd be rather surprising to see something other than "Terminated" as the description for
We're not just building a program; we're building a standard library that will, itself, serve as a portability layer across targets. That changes the balance somewhat. (And we don't have "a variety of Unix variants", we have a fairly small number that seems unlikely to see rapid increases; as long as it's clear what to change when adding a new target that doesn't seem likely to be a problem.) It's reasonable to evaluate what approach to portability will be easiest for us to maintain in the future (what will see the least changes, and the least complex changes when it does need changing), as well as what will give people the most user-friendly behavior. Sometimes we call functions from the underlying libraries; sometimes we implement things ourselves if using an underlying library function is more trouble than it's worth and the implementation is straightforward, or if the underlying library functions don't give the behavior we want. Also, in this case, it doesn't seem like there's any useful function we can call on some targets. On musl, for instance, this PR doesn't provide signal descriptions, because there's no library function we can call; we'll only ever get signal names. Likewise for uclibc. An advantage of providing the descriptions ourselves is that they'll always be available. It seems useful to make sure those descriptions are available even if the underlying C library doesn't have them. And if we want to make those descriptions available everywhere, we might as well use those descriptions even if the C library does have them, since that means calling less unsafe code and having less code to maintain. If we're calling the underlying C library functions, it should be because we don't want to maintain such a table at all, and we're willing to just not have signal descriptions on platforms without those functions available. Whichever path we go, we'll end up updating code for new targets in order to get better descriptions; there's no fully portable function we can call that gives us the signal descriptions, so if a new platform arises that provides such descriptions, we'd need to update this code anyway, either to call the same functions on more targets or to add more targets. I also think it's exceedingly likely that new UNIX implementations will use very similar signals and descriptions, which means I think on balance we'll have relatively little work to do to add a new UNIX target. (Also note that we have to update the SummaryIf we call the libc functions: [+] We don't have to maintain a table ourselves, with If we maintain a table of names and descriptions ourselves: [+] We get consistent results (both signal names and signal descriptions) even if the underlying libc doesn't provide such functions or provide useful descriptions. I don't think that's a clear-cut win for either approach; I think this is a set of tradeoffs that libs will have to evaluate. |
Summarising the situation is useful but I don't entirely agree with your characterisation. So here is my summary: Calling the libc APIs
Making our own table
Non-issues
|
It occurred to me to ask this: evidently, since people are suggesting it for Personally I think the |
The primary constraint motivating the use of (Rust does not completely solve this problem, though it would be nice if it did. But Rust tries to make it better where possible.) See #23628 for one bit of that history. |
Two things: First, I want to be clear that I'm not looking to advocate one or the other approach here; I just want to suggest considering both, and the tradeoffs between them. Whichever way the balance of those tradeoffs go, I'm happy to go with. I don't think there's value in arguing this further; there's plenty of information here to base a decision on. Second, I want to follow up on a specific point from my previous post, and suggest a third option, which I previously hadn't considered; I don't want to make this a false dichotomy between just two options, and I think I incorrectly did so. I mentioned that an advantage of providing the descriptions ourselves is that they'll always be available. It seems useful to make sure those descriptions are available even if the underlying C library doesn't have them. But we don't actually have to do better-than-the-C-library everywhere, and in particular, there's no requirement to do so on every platform, since this primarily exists to enhance error reporting. We can make individual tradeoffs between maintenance vs improved functionality. So, as a third option, which I'm finding myself favoring: we could call into the platform C library, using the code from this PR, and on specific popular targets that don't provide this information, we could additionally provide a list ourselves. (Concretely, we could invite folks who want better error reporting on musl or uclibc to submit a patch adding such a list.) This has the advantage of always giving information at least as good as the platform C library can provide; it additionally means programs using a less common C library will fit in better with platform conventions (e.g. descriptions), getting the same clear error output whether they use glibc or musl or uclibc. And it means that providing a list on a target just makes the baseline better, and we need not maintain such a list for every target. Personally, I do see the advantages of calling into the platform C library for this. I'm also fine if there's better error reporting on glibc than on musl/uclibc (insofar as not providing descriptions because the C library doesn't). But I'd also be happy to see a follow-up PR from musl/uclibc folks extending this to add a simple Given that, the approach I'd currently favor would be to accept this PR (with the planned changes), and invite follow-up additions for musl/uclibc and any other target that doesn't have a way of providing both names and descriptions.
Agreed. Though, as noted earlier, there might be value in the
Porting Rust to a new kernel is a great deal of work, but I agree that this PR wouldn't add any. :)
FWIW, I don't expect this to be likely, just possible. It's come up a few times, but more in areas of the C library that have higher flux; I'd expect less churn for this kind of function.
I believe we can and should do this in either case, though as noted above I think we should go with a third option. In particular, I think there's value in testing the full range of signal numbers (if we can get that programmatically from libc), whichever approach we go with.
This is not a concrete, actionable item, nor does it describe the nature of any specific problem this approach would cause. I'm assuming that the remainder of your summary covers the concrete concerns (all of which are valid). To be clear, I'm not currently trying to argue that one of these approaches is better than the other. I'm suggesting that the tradeoffs between these alternatives are worth serious consideration.
Each UNIX-compatible kernel, yes. That doesn't seem likely to be an especially long or rapidly-growing list, and we already have many other things in This does incur future maintenance effort, which I'm acknowledging, and which I don't want to either understate or overstate.
This will only happen if the platform provides such descriptions and we don't match those descriptions, which we can fix easily enough. Also, this is apparently not a problem for programs using musl as their libc, which doesn't provide those descriptions. (I'm talking about cases where a program uses musl on a platform where most programs use glibc.)
👍
We can write a comprehensive test case easily enough, if the
I didn't suggest that we can't use unsafe. I'm suggesting that we do take "amount of unsafe code we have to maintain" into account as a tradeoff. There are multiple potential correct solutions here, with tradeoffs in maintenance, behavior, and other factors. One of many factors we might consider is "how much unsafe code does this need".
Agreed, that was my intended point.
Entirely acknowledged, and I tagged each such point as such. I mentioned them for completeness only. |
For library additions/changes, we usually don't require an RFC, since the discussion is easily had on a PR directly. However, the discussion on this PR so far is only about the implementation details. I'd first like to focus on the question whether we want this functionality in the standard library, before discussing how it's implemented. This PR adds three (unstable) functions to the standard library: pub fn signal_describe(sig: i32) -> String;
pub fn signal_string_raw(sig: i32) -> Option<&'static str>;
pub fn signal_abbrev(sig: i32) -> Option<&'static str>; And makes changes to one implementation, which would directly be stable: impl fmt::Display for ExitStatus; Even though they're useful in some cases, I don't think the proposed We also don't expose The improved Display implementation for ExitStatus however does seem like an appropriate addition to With that in mind, now back to the implementation in this PR: I simply don't think it's worth it to add this much complexity just for nicer formatting of ExitStatus. If we are providing those |
I agree with @m-ou-se that the Regarding the implementation: While it would be nice to get signal names from libc, the standard API for this ( I had a look at other languages (Go, Swift) and they all roll their own tables rather than dealing with the mess of libc APIs. |
Thanks for your reviews.
I personally don't feel like implementing that. So I will close this MR. |
The motivations here are:
Segmentation fault (SIGSEGV)
) rather than numbers (egsignal 11
)WIFEXITED
to describe the value as theexit status
These two changes are behavioural changes to the existing APIs so insta-stable. But I hope they won't be controversial.
Providing signal names requires adding a facility (exposed as unstable in this MR) to do the number to string conversion. The implementation of that (and indeed the exposed API) is not entirely straightforward because of the lack of
strsignal_r
and non-portability ofsys_siglist
andsys_signame
.The test cases for
ExitStatus as Display
need to be updated to contain the new strings. Signal strings are very widely shared across Unices so I am hoping that this will work everywhere. However, it is quite possible that this test case will need adjustment on some platform. Additionally, I have not been able to test this myself on anything with a very recent glibc (2.32 or later), for which this code needs special handling. So it might be worth running this through@bors try
to spot any issues.