Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved documentation of Path::exists #80979

Closed
wants to merge 6 commits into from

Conversation

Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Jan 13, 2021

This improves the documentation of Path::exists in several ways:

  • Clarifies that the method only returns true if the syscall succeeds
  • Doesn't make it look like there has to be something wrong with the
    directory for it to fail.
  • Adds "Caveats" section to make possible dangers more visible
  • Mentions race conditions which, while known by experienced programmers,
    is more welcoming towards newbies
  • Mentions that access() may be sometimes more appropriate
  • Adds an actual example of proper error handling so that users are not
    detracted from it by having to come up with complicated code.

I hope this makes sense. I'm non-native speaker, feel free to grammar nazi or suggest different wording. :)

I'm willing to file PRs for these further changes, but I expect that at least some of them require an RFC. Please let me know.

  • Add fn try_exists(&self) -> io::Result<bool>; - containing code from the latter example in this change.
  • Add accessible(&self) -> io::Result<()>; that calls access() or faccessat on Unix (need to decide which is better) and whatever is appropriate on Windows (does anyone know?) Return value should be bikesheded (perhaps io::Result<bool>).
  • Add fn metadata_accessible(&self) -> bool which does exactly what exists does today, just has a clearer name (feel free to suggest a better name). Probably should be accompanied by the next one
  • Deprecate Path::exists in favor of one of the previous ones

See internals thread for more discussion about what's wrong with Path::exists.

This improves the documentation of `Path::exists` in several ways:

* Clarifies that the method only returns true if the syscall succeeds
* Doesn't make it look like there has to be something wrong with the
  directory for it to fail.
* Adds "Caveats" section to make possible dangers more visible
* Mentions race conditions which, while known by experienced programmers,
  is more welcoming towards newbies
* Mentions that `access()` may be sometimes more appropriate
* Adds an actual example of proper error handling so that users are not
  detracted from it by having to come up with complicated code.
@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 13, 2021
@@ -2386,13 +2386,26 @@ impl Path {
fs::read_dir(self)
}

/// Returns `true` if the path points at an existing entity.
/// Returns `true` if the path points at an existing entity and syscall succeeds.
Copy link
Member

@nagisa nagisa Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'm not sure if wording this in terms of syscalls is the best way to approach this, not to mention that it is not entirely clear what kind of syscall is being referenced here. This also, in some ways, leaks the implementation details which aren't necessarily true on all targets or in the future.

There's already some elaboration on additional cases where false can be returned in the next paragraph. I think that may be a better place for this note. For example:

This function will traverse symbolic links to query information about the destination file. In the case where the information cannot be queried successfully (e.g. there are broken symbolic links) this will return false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I also felt it's not perfect but didn't come up with anything else. Now that I think of it again, maybe "and no internal failure occurs"?

This also, in some ways, leaks the implementation details which aren't necessarily true on all targets or in the future.

I don't think it's physically possible to check existence of a file without some kind of syscall and would be highly surprised if there was one guaranteed to never fail.

There's already some elaboration on additional cases where false can be returned in the next paragraph.

I believe it'd be valuable to have at least some small head up in the summary, because the method looks simpler than it actually is. Someone could just read the description and signature, think "ah, that's obvious" and go write the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @nagisa's comment and would prefer if this were not worded specifically around "if the syscall succeeds", but rather in terms of the state of the underlying system which std is providing the caller an API over.

Maybe like:

Returns true if we can tell that the path points at an existing entity. False if it doesn't or if we can't tell.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That wording sounds good.

///
/// This function will traverse symbolic links to query information about the
/// destination file. In case of broken symbolic links this will return `false`.
///
/// If you cannot access the directory containing the file, e.g., because of a
/// permission error, this will return `false`.
/// # Caveats
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't generally call out the caveats as a separate section elsewhere in the documentation of the standard library, as it is very often implied (and similarly, the paragraph about traversing the symbolic links could also be considered a caveat, but is outside of this section).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So just simple bold?

the paragraph about traversing the symbolic links could also be considered a caveat, but is outside of this section

I actually don't find this behavior surprising at all as all other applications behave exactly same. Willing to reconsider if there's a bunch of people who find it surprising.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am on board with a Caveats heading here, it seems appropriate.

@@ -2405,6 +2418,18 @@ impl Path {
///
/// This is a convenience function that coerces errors to false. If you want to
/// check errors, call [`fs::metadata`].
///
/// An example of proper error handling.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it perhaps make more sense to just link readers to metadata which will have its own examples?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. My intention was to provide a simple snippet people could copy-paste to do the job properly. So I'd agree to moving the example to metadata. I'm not entirely sure people will find it when linked from exists so maybe add a simple note but maybe that's excessive.

@Mark-Simulacrum
Copy link
Member

r? @dtolnay - I think this needs someone with a T-libs hat, perhaps some discussion.

I am sympathetic to exists returning false when really the answer is unknown being a problem in some cases, but I think most of the time my uses of exists (which are similar, I believe, to those in rust-analyzer as noted by @matklad here) do not really care for this other case, and handling it would lead to more verbose code at little benefit.

@Kixunil
Copy link
Contributor Author

Kixunil commented Jan 16, 2021

handling it would lead to more verbose code at little benefit.

I'm not convinced. I've spent insane amount of time debugging various software which would be much shorter if the messages were more reasonable. Maybe I should run everything against strace instead of trusting other developers but I'm not sure we want to live in society like this.

If this is used often, then one could create a helper method that just logs the error into the global logger and behaves as exists in other ways.

@Mark-Simulacrum
Copy link
Member

I don't usually error on non-exists, though, there I would agree - but checking for existence is usually more of a matter of "does the file this program created exist". I think for the use case of erroring if the file does not exist, it seems excellent to use something more along the lines of the access function or fs::metadata directly.

@Kixunil
Copy link
Contributor Author

Kixunil commented Jan 19, 2021

That makes sense. I still believe that logging a warning in case of a suspicious situation like directory inaccessible can be valuable in many cases but if most people think it's not worth deprecation I can understand that.

@Kixunil
Copy link
Contributor Author

Kixunil commented Feb 6, 2021

Now that #81822 is open, I suggest resolving that one first. I was thinking that we could link the implementation of unstable try_exists() in the docs instead of copy-pasting the same code but I admit it may be unusual.

Kixunil added a commit to Kixunil/rust that referenced this pull request Feb 6, 2021
This method is similar to the existing `exists()` method, except it
doesn't silently ignore the errors, leading to less error-prone code.

This change intentionally does NOT touch the documentation of `exists()`
nor recommend people to use this method while it's unstable.
Such changes are reserved for stabilization to prevent confusing people.

Apart from that it avoids conflicts with rust-lang#80979.
@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 23, 2021
@JohnCSimon
Copy link
Member

triage:
this seems to be not ready for review - still being worked on so I'm changing to waiting-on-author

@Kixunil
Copy link
Contributor Author

Kixunil commented Feb 23, 2021

@JohnCSimon I asked a few questions which were not answered yet, so I'm not sure how to continue. However, I don't mind waiting until #81822 gets resolved.

@crlf0710 crlf0710 added S-blocked Status: Blocked on something else such as an RFC or other implementation work. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 12, 2021
@crlf0710
Copy link
Member

Triage: Switched this to S-blocked on #81822. @Kixunil Maybe it's a better idea to ask questions on T-Libs stream on official Zulip instance though.

JohnTitor added a commit to JohnTitor/rust that referenced this pull request Mar 16, 2021
Added `try_exists()` method to `std::path::Path`

This method is similar to the existing `exists()` method, except it
doesn't silently ignore the errors, leading to less error-prone code.

This change intentionally does NOT touch the documentation of `exists()`
nor recommend people to use this method while it's unstable.
Such changes are reserved for stabilization to prevent confusing people.

Apart from that it avoids conflicts with rust-lang#80979.

`@joshtriplett` requested this PR in [internals discussion](https://internals.rust-lang.org/t/the-api-of-path-exists-encourages-broken-code/13817/25?u=kixunil)
@dtolnay dtolnay assigned nagisa and unassigned dtolnay Mar 30, 2021
Comment on lines 2422 to 2431
/// An example of proper error handling.
/// ```no_run
/// use std::path::Path;
/// use std::io::ErrorKind;
/// use std::fs::metadata;
/// let exists = metadata(Path::new("does_not_exist.txt"))
/// .map(|_| true)
/// .or_else(|error| if error.kind() == ErrorKind::NotFound { Ok(false) } else { Err(error) } )
/// .expect("failed to check existence of a file");
/// assert!(!exists);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be clearer with the 3 cases separated as such, rather than having to reason through these not-a-great-fit combinators.

use std::fs;
use std::io::ErrorKind;

let exists = match fs::metadata("does_not_exist.txt") {
    Ok(_) => true,
    Err(err) if err.kind() == ErrorKind::NotFound => false,
    Err(err) => panic!("failed to check existence of file: {}", err),
};

assert!(!exists);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separately, it strikes me as odd to refer to a panic (expect) as "proper error handling". Could this be reworded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, same change was made in the new try_exists() method. I'm not sure if/how the documentation should be changed taking into account there's a method in nightly. I guess referring to it from stable is not usual, right?

I assumed panic is OK in this context because of the assert! below, but changing to return Err(err) would make sense.

library/std/src/path.rs Outdated Show resolved Hide resolved
@@ -2386,13 +2386,26 @@ impl Path {
fs::read_dir(self)
}

/// Returns `true` if the path points at an existing entity.
/// Returns `true` if the path points at an existing entity and syscall succeeds.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @nagisa's comment and would prefer if this were not worded specifically around "if the syscall succeeds", but rather in terms of the state of the underlying system which std is providing the caller an API over.

Maybe like:

Returns true if we can tell that the path points at an existing entity. False if it doesn't or if we can't tell.

Kixunil and others added 2 commits March 30, 2021 11:14
Co-authored-by: David Tolnay <dtolnay@gmail.com>
This makes the error handling example cleaner and rewords things.
@rust-log-analyzer

This comment has been minimized.

Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the failing CI job.

---- src/path.rs - path::Path::exists (line 2485) stdout ----
error[E0308]: mismatched types
  --> src/path.rs:2492:24
   |
10 |     Err(err) => return Err(err),
   |                        ^^^^^^^^ expected `()`, found enum `Result`
   = note: expected unit type `()`
                   found enum `Result<_, std::io::Error>`

library/std/src/path.rs Show resolved Hide resolved
library/std/src/path.rs Outdated Show resolved Hide resolved
Co-authored-by: David Tolnay <dtolnay@gmail.com>
@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Jun 5, 2021

☔ The latest upstream changes (presumably #86032) made this pull request unmergeable. Please resolve the merge conflicts.

@Dylan-DPC-zz
Copy link

the pr this was blocked on is merged.

@Kixunil if you can update this pr according to the changes requested we can push this forward. Thanks

@Dylan-DPC-zz Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-blocked Status: Blocked on something else such as an RFC or other implementation work. labels Aug 26, 2021
@the8472 the8472 added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Sep 8, 2021
@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 27, 2021
@JohnCSimon
Copy link
Member

ping again from triage:
@Kixunil - Can you address the merge conflicts?

@JohnCSimon JohnCSimon added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Oct 19, 2021
@JohnCSimon
Copy link
Member

Triage:
@Kixunil I'm closing this for inactivity, please feel free to reopen when you are ready to continue.

@JohnCSimon JohnCSimon closed this Oct 19, 2021
@dtolnay dtolnay self-assigned this Mar 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.