Stop probing for statx unless necessary #106661

mjguzik · 2023-01-10T02:26:21Z

As is the current toy program:
fn main() -> std::io::Result<()> {
use std::fs;

let metadata = fs::metadata("foo.txt")?;

assert!(!metadata.is_dir());
Ok(())

}

... observed under strace will issue:
[snip]
statx(0, NULL, AT_STATX_SYNC_AS_STAT, STATX_ALL, NULL) = -1 EFAULT (Bad address) statx(AT_FDCWD, "foo.txt", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=0, ...}) = 0

While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether.

Note this is still suboptimal as there still will be programs issuing it, but bulk of the problem is removed.

Tested by forbidding the syscall for the binary and observing it correctly falls back to newfstatat.

While here tidy up the commentary, in particular by denoting some problems with the current approach.

rustbot · 2023-01-10T02:26:28Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) soon.

Please see the contribution instructions for more information.

rustbot · 2023-01-10T02:26:31Z

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

Stabilizing library features
Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
Changing public documentation in ways that create new stability guarantees
Changing observable runtime behavior of library APIs

joboet · 2023-01-11T14:35:15Z

This is failing CI because this miri test

rust/src/tools/miri/tests/pass-dep/shims/libc-fs-with-isolation.rs

Lines 24 to 28 in 8ecaad8

    
               // test `stat` 
        
               assert_eq!(fs::metadata("foo.txt").unwrap_err().kind(), ErrorKind::PermissionDenied); 
        
               // check that it is the right kind of `PermissionDenied` 
        
               assert_eq!(Error::last_os_error().raw_os_error(), Some(libc::EACCES)); 
        
           }

checks the last OS error instead of the actual error returned by the metadata call. I'm not a team member, but that test can probably be changed in this PR.

mjguzik · 2023-01-11T15:08:11Z

This is failing CI because this miri test

rust/src/tools/miri/tests/pass-dep/shims/libc-fs-with-isolation.rs

Lines 24 to 28 in 8ecaad8

// test `stat`

assert_eq!(fs::metadata("foo.txt").unwrap_err().kind(), ErrorKind::PermissionDenied);

// check that it is the right kind of `PermissionDenied`

assert_eq!(Error::last_os_error().raw_os_error(), Some(libc::EACCES));

}

checks the last OS error instead of the actual error returned by the metadata call. I'm not a team member, but that test can probably be changed in this PR.

thanks, I did not get an opportunity to look at the failing test yet.

Looking at how the test runs into it, I suspect the safest way forward is to restore errno from the "real" call though. Any thoughts on this? From grepping around looks like sys::os::set_errno will do the trick.

If patching the test is preferred, I have to say I don't know how to do make it test against EACCESS -- is the original errno recoverable from the value returned by fs::metadata?

I'm very new to rust :)

joboet · 2023-01-11T15:58:49Z

Looking at how the test runs into it, I suspect the safest way forward is to restore errno from the "real" call though. Any thoughts on this? From grepping around looks like sys::os::set_errno will do the trick.

As per the documentation, errno values are not stable, so we do not need to restore it here, I think.

If patching the test is preferred, I have to say I don't know how to do make it test against EACCESS -- is the original errno recoverable from the value returned by fs::metadata?

The error is recoverable through raw_os_error, but that should be called on the error returned by fs::metadata instead of last_os_error. So the test should look like:

// test `stat`
let err = fs::metadata("foo.txt").unwrap_err();
assert_eq!(err.kind(), ErrorKind::PermissionDenied); 
// check that it is the right kind of `PermissionDenied`
assert_eq!(err.raw_os_error(), Some(libc::EACCES));

I'm very new to rust :)

Welcome and thank you for your contribution 😄!

mjguzik · 2023-01-11T16:16:22Z

Looking at how the test runs into it, I suspect the safest way forward is to restore errno from the "real" call though. Any thoughts on this? From grepping around looks like sys::os::set_errno will do the trick.

As per the documentation, errno values are not stable, so we do not need to restore it here, I think.

If patching the test is preferred, I have to say I don't know how to do make it test against EACCESS -- is the original errno recoverable from the value returned by fs::metadata?

The error is recoverable through raw_os_error, but that should be called on the error returned by fs::metadata instead of last_os_error. So the test should look like:
// test `stat`
let err = fs::metadata("foo.txt").unwrap_err();
assert_eq!(err.kind(), ErrorKind::PermissionDenied); 
// check that it is the right kind of `PermissionDenied`
assert_eq!(err.raw_os_error(), Some(libc::EACCES)); 

Ok.

Since you effectively wrote the patch, can you submit it as a separate PR or send it somewhere so that I can apply it on my branch? Then I would I push both commits here while preserving your authorship of the change.

Alternatively if you don't care I can steal it and give you credit in the commit message.

I'm just trying to sort it out without stepping on any toes here.

joboet · 2023-01-11T16:40:41Z

Ok.

Since you effectively wrote the patch, can you submit it as a separate PR or send it somewhere so that I can apply it on my branch? Then I would I push both commits here while preserving your authorship of the change.

Alternatively if you don't care I can steal it and give you credit in the commit message.

I'm just trying to sort it out without stepping on any toes here.

Stealing it is fine, but thanks for asking 😉. Otherwise, you can use co-authored-by for attribution by adding this line to the commit message:

Co-authored-by: joboet <REDACTED>

The test relied on Error::last_os_error() coming from the stat call on the passed file, but there is no guarantee this will be the case. Instead extract errno from the error returned by the routine. Patch de facto written by joboet. Co-authored-by: joboet <jonasboettiger@icloud.com>

As is the current toy program: fn main() -> std::io::Result<()> { use std::fs; let metadata = fs::metadata("foo.txt")?; assert!(!metadata.is_dir()); Ok(()) } ... observed under strace will issue: [snip] statx(0, NULL, AT_STATX_SYNC_AS_STAT, STATX_ALL, NULL) = -1 EFAULT (Bad address) statx(AT_FDCWD, "foo.txt", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_ALL|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFREG|0644, stx_size=0, ...}) = 0 While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether. Note this is still suboptimal as there still will be programs issuing it, but bulk of the problem is removed. Tested by forbidding the syscall for the binary and observing it correctly falls back to newfstatat. While here tidy up the commentary, in particular by denoting some problems with the current approach.

rustbot · 2023-01-11T17:10:20Z

The Miri subtree was changed

cc @rust-lang/miri

RalfJung · 2023-01-12T10:58:46Z

src/tools/miri/tests/pass-dep/shims/libc-fs-with-isolation.rs

    // check that it is the right kind of `PermissionDenied`
-    assert_eq!(Error::last_os_error().raw_os_error(), Some(libc::EACCES));
+    assert_eq!(err.raw_os_error(), Some(libc::EACCES));


I'm fine with this change. However, the raw_os_error docs do indicate that this does return the error from the previous stdlib call, so isn't it strange that this changed?

errno is changed to EFAULT because the statx state is checked after the initial call returned an error. Since EFAULT indicates that statx is available, the error returned is the one by the first statx call, which has a different error code (like EACCES).

RalfJung · 2023-01-12T11:01:12Z

While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether.

This check happens once during the entire runtime of the program. Why is that so bad?

(I'm not fundamentally opposed to this PR, but the description does not give sufficient motivation IMO. It explains why we can avoid the check for some cases, but does not even attempt to explain why we should.)

mjguzik · 2023-01-12T15:16:15Z

While statx is not necessarily always present, checking for it can be delayed to the first error condition. Said condition may very well never happen, in which case the check got avoided altogether.

This check happens once during the entire runtime of the program. Why is that so bad?

(I'm not fundamentally opposed to this PR, but the description does not give sufficient motivation IMO. It explains why we can avoid the check for some cases, but does not even attempt to explain why we should.)

Quite frankly I thought this change would be a no-brainer -- for one syscalls are not exactly cheap these days on, most notably on amd64.

What perhaps I should have mentioned is that I intend to post another PR in the area, this time making the change to only do the probe if EPERM got received.

As indicated by comments I added in the code, stock routine has a bug which I temporarily kept and it boils down to considering the syscall unusable even when it in fact works. For example statx may transiently return ENOMEM due to memory pressure (statx -> getname_flags -> __getname ; see https://elixir.bootlin.com/linux/latest/source/fs/namei.c#L129) and with the current detection method (!= EFAULT) it will be incorrrectly considered not present.

I consider going from slurping all errors to just EPERM a separate logical step, if only for bisectability. I can stack it as another change in this PR if preferred.

RalfJung · 2023-01-12T15:38:26Z

As indicated by comments I added in the code, stock routine has a bug

Yeah that's definitely a good catch!

for one syscalls are not exactly cheap these days on, most notably on amd64.

We're talking about a single syscall per program execution here, it is hard for me to imagine how that could be perf-relevant. OTOH the cost in extra code here is also small. So I was just wondering if there is something I missed here. Thanks for clarifying! Anyway I'm not a libs expert so we will see what the libs reviewer says.

Mark-Simulacrum · 2023-01-14T02:30:22Z

@bors r+

I don't think the "performance" benefit is worthwhile enough by itself to motivate this, but adding the FIXMEs and enum seems like a good set of changes to me. Thanks!

bors · 2023-01-14T02:30:24Z

📌 Commit b49aa8d has been approved by Mark-Simulacrum

It is now in the queue for this repository.

raggi · 2023-01-14T04:42:07Z

I don't think the "performance" benefit is worthwhile enough by itself to motivate this, but adding the FIXMEs and enum seems like a good set of changes to me. Thanks!

Consider perhaps if someone was writing stat(1), as in this group of people, who are writing stat(1): https://github.com/uutils/coreutils#utilities. In such a program (and many others there), the change may in fact be material.

RalfJung · 2023-01-14T09:25:40Z

Do you have a benchmark? I highly doubt a single syscall is relevant in a case like that. It will be dwarfed by all the other syscalls that need to happen plus the cost of setting up and tearing down the process.

…iaskrgr Rollup of 8 pull requests Successful merges: - rust-lang#105526 (libcore: make result of iter::from_generator Clone) - rust-lang#106563 (Fix `unused_braces` on generic const expr macro call) - rust-lang#106661 (Stop probing for statx unless necessary) - rust-lang#106820 (Deprioritize fulfillment errors that come from expansions.) - rust-lang#106828 (rustdoc: remove `docblock` class from notable trait popover) - rust-lang#106849 (Allocate one less vec while parsing arrays) - rust-lang#106855 (rustdoc: few small cleanups) - rust-lang#106860 (Remove various double spaces in the libraries.) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup

the8472 · 2023-01-15T11:07:40Z

library/std/src/sys/unix/fs.rs

+            if STATX_SAVED_STATE.load(Ordering::Relaxed) == STATX_STATE::Present as u8 {
+                return Some(Err(err));
+            }


It is not quite correct to rely on this because syscall filters can be installed during program runtime. So the behavior can change. At that point ENOSYS could bubble up to user code because the flag now says all errors should be bubbled up.

If ENOSYS and EPERM can be reliably distinguished from other errors then updating the atomic every time such an error occurs is better, as we do for splice and sendfile
If EPERM can also convey a real error then the pre-probing behavior that existed before this PR is correct.

I don't think there is a real issue here.

Detection only showed up because some deployments got rust utilizing statx running on kernels which did not have it yet or otherwise did not have configuration updated to cope with it, resulting in EPERM from seccomp.

To my understanding statx is the way to stat going forward and as such disabling it on purpose is a misconfiguration of the system, on par with disabling a syscall like open.

Any sane seccomp filters are deny-by-default, which means any filter list that doesn't explicitly opt into statx can end up blocking the call.
Combine that with the fact that people run outdated software (which can contain equally outdated filter lists) it is possible that you end up with a situation where stat is allowed but statx isn't.

And what happens in the wild is not really relevant for correctness. For correctness it only matters that syscall availability can change at runtime.

People do all kinds of crazy stuff and we need to be defensive.

Just in case I have to stress that with and without the patch the rust binary will ultimately detect statx (un)availabilty the same way, whether the kernel does not support statx to begin with or seccomp is configured to disable it, as long as the latter happened prior to running detection code.

I have to concede it is possible someone has outdated seccomp filters, which happen to be disabled when a rust binary starts execing and only get enabled later. In such a case errors from seccomp will indeed be returned. But this was already the case with the code prior to me patching it (read: no regression on that front).

I also feel inclined to note the sendfile example does not "update the atomic every time". The code speculatively executes the syscall, presumably to avoid paying specifically for detection, and the atomic store is a one time ordeal per exec -- any new calls past that fail to get there. Well, one may nitpick a multithreaded program can have several stores, but the point stands.

If anything my patch moved statx handling closer to that spirit, but in contrast it still does not handle such a failure at an arbitrary moment.

Now, legally encountered permission problems return EACCESS. The kernel has some hairy code to support all of this and I definitely would not stack my neck on EPERM being only returned by seccomp.

All that said, as mentioned in another comment, I plan to post a followup to fix the pre-existing bugs I mentioned in FIXME comments. If it does not get hairy, I may as well throw an extra check for seccomp if EPERM is seen.

Yes, totally agreed, it's an edge-case. But seccomp filters getting installed during the runtime of a program after doing some setup work is definitely a thing that happens in daemons that lock themselves down.

All that said, as mentioned in another comment, I plan to post a followup to fix the pre-existing bugs

👍

RalfJung · 2023-01-15T12:49:17Z

If availability changes at runtime, pre-checking would have TOCTOU issues though, right?

the8472 · 2023-01-15T12:56:07Z

It's ok to precheck as a performance optimization to avoid doing an unnecessary call. But it's still necessary to handle the error and a fallback on the we-think-it's-available path when statx becomes unavailable

rustbot assigned Mark-Simulacrum Jan 10, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 10, 2023

mjguzik force-pushed the linux_statx branch from a2f1e7e to 14c7623 Compare January 10, 2023 05:37

This comment has been minimized.

Sign in to view

mjguzik added 2 commits January 11, 2023 17:09

mjguzik force-pushed the linux_statx branch from 14c7623 to b49aa8d Compare January 11, 2023 17:10

RalfJung reviewed Jan 12, 2023

View reviewed changes

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 14, 2023

matthiaskrgr mentioned this pull request Jan 14, 2023

Rollup of 8 pull requests #106866

Merged

bors merged commit 4313471 into rust-lang:master Jan 15, 2023

rustbot added this to the 1.68.0 milestone Jan 15, 2023

the8472 reviewed Jan 15, 2023

View reviewed changes

Stop probing for statx unless necessary #106661

Stop probing for statx unless necessary #106661

Uh oh!

Conversation

mjguzik commented Jan 10, 2023

Uh oh!

rustbot commented Jan 10, 2023

Uh oh!

rustbot commented Jan 10, 2023

Uh oh!

This comment has been minimized.

joboet commented Jan 11, 2023

Uh oh!

mjguzik commented Jan 11, 2023

Uh oh!

joboet commented Jan 11, 2023

Uh oh!

mjguzik commented Jan 11, 2023

Uh oh!

joboet commented Jan 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rustbot commented Jan 11, 2023

Uh oh!

RalfJung Jan 12, 2023

Choose a reason for hiding this comment

Uh oh!

joboet Jan 12, 2023

Choose a reason for hiding this comment

Uh oh!

RalfJung commented Jan 12, 2023

Uh oh!

mjguzik commented Jan 12, 2023

Uh oh!

RalfJung commented Jan 12, 2023

Uh oh!

Mark-Simulacrum commented Jan 14, 2023

Uh oh!

bors commented Jan 14, 2023

Uh oh!

raggi commented Jan 14, 2023

Uh oh!

RalfJung commented Jan 14, 2023 via email

Uh oh!

the8472 Jan 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mjguzik Jan 15, 2023

Choose a reason for hiding this comment

Uh oh!

the8472 Jan 15, 2023

Choose a reason for hiding this comment

Uh oh!

mjguzik Jan 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

the8472 Jan 15, 2023

Choose a reason for hiding this comment

Uh oh!

RalfJung commented Jan 15, 2023 via email

Uh oh!

the8472 commented Jan 15, 2023

Uh oh!

Uh oh!

joboet commented Jan 11, 2023 •

edited

Loading

the8472 Jan 15, 2023 •

edited

Loading

mjguzik Jan 15, 2023 •

edited

Loading