-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop probing for statx unless necessary #106661
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) soon. Please see the contribution instructions for more information. |
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
This comment has been minimized.
This comment has been minimized.
This is failing CI because this miri test rust/src/tools/miri/tests/pass-dep/shims/libc-fs-with-isolation.rs Lines 24 to 28 in 8ecaad8
checks the last OS error instead of the actual error returned by the metadata call. I'm not a team member, but that test can probably be changed in this PR.
|
thanks, I did not get an opportunity to look at the failing test yet. Looking at how the test runs into it, I suspect the safest way forward is to restore errno from the "real" call though. Any thoughts on this? From grepping around looks like sys::os::set_errno will do the trick. If patching the test is preferred, I have to say I don't know how to do make it test against EACCESS -- is the original errno recoverable from the value returned by fs::metadata? I'm very new to rust :) |
As per the documentation,
The error is recoverable through // test `stat`
let err = fs::metadata("foo.txt").unwrap_err();
assert_eq!(err.kind(), ErrorKind::PermissionDenied);
// check that it is the right kind of `PermissionDenied`
assert_eq!(err.raw_os_error(), Some(libc::EACCES));
Welcome and thank you for your contribution 😄! |
Ok. Since you effectively wrote the patch, can you submit it as a separate PR or send it somewhere so that I can apply it on my branch? Then I would I push both commits here while preserving your authorship of the change. Alternatively if you don't care I can steal it and give you credit in the commit message. I'm just trying to sort it out without stepping on any toes here. |
Stealing it is fine, but thanks for asking 😉. Otherwise, you can use co-authored-by for attribution by adding this line to the commit message:
|
The test relied on Error::last_os_error() coming from the stat call on the passed file, but there is no guarantee this will be the case. Instead extract errno from the error returned by the routine. Patch de facto written by joboet. Co-authored-by: joboet <jonasboettiger@icloud.com>
As is the current toy program: fn main() -> std::io::Result<()> { use std::fs; let metadata = fs::metadata("foo.txt")?; assert!(!metadata.is_dir()); Ok(()) } ... observed under strace will issue: [snip] statx(0, NULL, AT_STATX_SYNC_AS_STAT, STATX_ALL, NULL) = -1 EFAULT (Bad address) statx(AT_FDCWD, "foo.txt", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=0, ...}) = 0 While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether. Note this is still suboptimal as there still will be programs issuing it, but bulk of the problem is removed. Tested by forbidding the syscall for the binary and observing it correctly falls back to newfstatat. While here tidy up the commentary, in particular by denoting some problems with the current approach.
The Miri subtree was changed cc @rust-lang/miri |
// check that it is the right kind of `PermissionDenied` | ||
assert_eq!(Error::last_os_error().raw_os_error(), Some(libc::EACCES)); | ||
assert_eq!(err.raw_os_error(), Some(libc::EACCES)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this change. However, the raw_os_error
docs do indicate that this does return the error from the previous stdlib call, so isn't it strange that this changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errno
is changed to EFAULT
because the statx
state is checked after the initial call returned an error. Since EFAULT
indicates that statx
is available, the error returned is the one by the first statx
call, which has a different error code (like EACCES
).
This check happens once during the entire runtime of the program. Why is that so bad? (I'm not fundamentally opposed to this PR, but the description does not give sufficient motivation IMO. It explains why we can avoid the check for some cases, but does not even attempt to explain why we should.) |
Quite frankly I thought this change would be a no-brainer -- for one syscalls are not exactly cheap these days on, most notably on amd64. What perhaps I should have mentioned is that I intend to post another PR in the area, this time making the change to only do the probe if EPERM got received. As indicated by comments I added in the code, stock routine has a bug which I temporarily kept and it boils down to considering the syscall unusable even when it in fact works. For example statx may transiently return ENOMEM due to memory pressure (statx -> getname_flags -> __getname ; see https://elixir.bootlin.com/linux/latest/source/fs/namei.c#L129) and with the current detection method (!= EFAULT) it will be incorrrectly considered not present. I consider going from slurping all errors to just EPERM a separate logical step, if only for bisectability. I can stack it as another change in this PR if preferred. |
Yeah that's definitely a good catch!
We're talking about a single syscall per program execution here, it is hard for me to imagine how that could be perf-relevant. OTOH the cost in extra code here is also small. So I was just wondering if there is something I missed here. Thanks for clarifying! Anyway I'm not a libs expert so we will see what the libs reviewer says. |
@bors r+ I don't think the "performance" benefit is worthwhile enough by itself to motivate this, but adding the FIXMEs and enum seems like a good set of changes to me. Thanks! |
Consider perhaps if someone was writing stat(1), as in this group of people, who are writing stat(1): https://github.com/uutils/coreutils#utilities. In such a program (and many others there), the change may in fact be material. |
Do you have a benchmark? I highly doubt a single syscall is relevant in a case like that. It will be dwarfed by all the other syscalls that need to happen plus the cost of setting up and tearing down the process.
|
…iaskrgr Rollup of 8 pull requests Successful merges: - rust-lang#105526 (libcore: make result of iter::from_generator Clone) - rust-lang#106563 (Fix `unused_braces` on generic const expr macro call) - rust-lang#106661 (Stop probing for statx unless necessary) - rust-lang#106820 (Deprioritize fulfillment errors that come from expansions.) - rust-lang#106828 (rustdoc: remove `docblock` class from notable trait popover) - rust-lang#106849 (Allocate one less vec while parsing arrays) - rust-lang#106855 (rustdoc: few small cleanups) - rust-lang#106860 (Remove various double spaces in the libraries.) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
if STATX_SAVED_STATE.load(Ordering::Relaxed) == STATX_STATE::Present as u8 { | ||
return Some(Err(err)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not quite correct to rely on this because syscall filters can be installed during program runtime. So the behavior can change. At that point ENOSYS could bubble up to user code because the flag now says all errors should be bubbled up.
If ENOSYS and EPERM can be reliably distinguished from other errors then updating the atomic every time such an error occurs is better, as we do for splice and sendfile
If EPERM can also convey a real error then the pre-probing behavior that existed before this PR is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a real issue here.
Detection only showed up because some deployments got rust utilizing statx running on kernels which did not have it yet or otherwise did not have configuration updated to cope with it, resulting in EPERM from seccomp.
To my understanding statx is the way to stat going forward and as such disabling it on purpose is a misconfiguration of the system, on par with disabling a syscall like open.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any sane seccomp filters are deny-by-default, which means any filter list that doesn't explicitly opt into statx
can end up blocking the call.
Combine that with the fact that people run outdated software (which can contain equally outdated filter lists) it is possible that you end up with a situation where stat is allowed but statx isn't.
And what happens in the wild is not really relevant for correctness. For correctness it only matters that syscall availability can change at runtime.
People do all kinds of crazy stuff and we need to be defensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case I have to stress that with and without the patch the rust binary will ultimately detect statx (un)availabilty the same way, whether the kernel does not support statx to begin with or seccomp is configured to disable it, as long as the latter happened prior to running detection code.
I have to concede it is possible someone has outdated seccomp filters, which happen to be disabled when a rust binary starts execing and only get enabled later. In such a case errors from seccomp will indeed be returned. But this was already the case with the code prior to me patching it (read: no regression on that front).
I also feel inclined to note the sendfile example does not "update the atomic every time". The code speculatively executes the syscall, presumably to avoid paying specifically for detection, and the atomic store is a one time ordeal per exec -- any new calls past that fail to get there. Well, one may nitpick a multithreaded program can have several stores, but the point stands.
If anything my patch moved statx handling closer to that spirit, but in contrast it still does not handle such a failure at an arbitrary moment.
Now, legally encountered permission problems return EACCESS. The kernel has some hairy code to support all of this and I definitely would not stack my neck on EPERM being only returned by seccomp.
All that said, as mentioned in another comment, I plan to post a followup to fix the pre-existing bugs I mentioned in FIXME comments. If it does not get hairy, I may as well throw an extra check for seccomp if EPERM is seen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, totally agreed, it's an edge-case. But seccomp filters getting installed during the runtime of a program after doing some setup work is definitely a thing that happens in daemons that lock themselves down.
All that said, as mentioned in another comment, I plan to post a followup to fix the pre-existing bugs
👍
If availability changes at runtime, pre-checking would have TOCTOU issues though, right?
|
It's ok to precheck as a performance optimization to avoid doing an unnecessary call. But it's still necessary to handle the error and a fallback on the we-think-it's-available path when statx becomes unavailable |
As is the current toy program:
fn main() -> std::io::Result<()> {
use std::fs;
}
... observed under strace will issue:
[snip]
statx(0, NULL, AT_STATX_SYNC_AS_STAT, STATX_ALL, NULL) = -1 EFAULT (Bad address) statx(AT_FDCWD, "foo.txt", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=0, ...}) = 0
While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether.
Note this is still suboptimal as there still will be programs issuing it, but bulk of the problem is removed.
Tested by forbidding the syscall for the binary and observing it correctly falls back to newfstatat.
While here tidy up the commentary, in particular by denoting some problems with the current approach.