-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracking Issue for Path::file_prefix #86319
Comments
How about a corresponding file_suffix? That completes the set. file_stem/extension and file_prefix/file_suffix.
|
@schwern that is indeed a nice idea. I guess it would be best to get the go ahead from someone in the libs team? I don't know how these tracking issues work. The PR for this feature was merged a while ago, but I have no idea if I'm supposed to do something with this issue? @dtolnay as you were the reviewer/merger of that PR are you able to answer this? |
Any progress on this? It's a very useful API when working with files like tarballs. |
Good point ! |
One of my concerns is that a I'd suggest adding this to the unresolved questions, if possible. |
While it might be impossible to have a 100% correct implementation in my opinion it would already be helpful to simply have all rust crates behave in a known manner. E.g., there are now hundreds of rust crates out there dealing with files, e.g., matching filenames, all behaving a little different. It is impossible to know how a rust crate that you're including will behave. Even having a method that might change over time and for which it is documented how it behaves in certain corner cases would be very, very useful. Compare it to the |
It's not actually relatively rare. We went over this in voice chat. I gave you multiple reasons why someone would want to do this. In practice,
It's very popular with javascript (where there is no magic byte and everything is js or ts) The current suggestion to get the functionality is, .file_name().and_then(OsStr::to_str).map_or(false, |name| name.ends_with(".tar.gz")) Which I think is a bit convoluted. I am down for doing this other ways, but I don't think something like this should is "rare" as far as path manipulation goes, or that it should require converting to-and-from
We could potentially find better ways to do all of this though. Like simply returning an iterator for
The bottom line here is that when it comes to the file name, people will logically think of splitting on Something about that API of returning an Iterator just seems simpler than arguing about whether an "extension" in |
There might've been some misunderstandings here. Let's take this file name .and_then(OsStr::to_str).map_or(false, |name| name.ends_with(".tar.gz")) IMO this is not optimal. Maybe we could add methods |
I can see the utility in better processing for
I believe the above would make such that
Gets parsed as
But it's probably still not sufficient because if the filename was instead
I agree with the |
I think this is probably the "correct" way to approach this. With such an iterator method we can easily implement any/all flavours of splitting up file names. In borrowing from python's pathlib (which is very ergonomic), it may be as simple as implementing something akin to Using the running example of the corner case file name use std::path::Path;
let path = Path::new("path/to/File Version 3.5.4 (Copy).tar.gz");
let mut exts = path.suffixes()
assert_eq!(path.file_prefix(), Some("File Version 3"));
assert_eq!(exts.next(), Some(".5"));
assert_eq!(exts.next(), Some(".4 (Copy)"));
assert_eq!(exts.next(), Some(".tar"));
assert_eq!(exts.next(), Some(".gz"));
assert_eq!(exts.next(), None); Although, I can also understand the argument that the following is more "complete" use std::path::Path;
let path = Path::new("path/to/File Version 3.5.4 (Copy).tar.gz");
let mut parts = path.file_parts()
assert_eq!(parts.next(), Some("File Version 3"));
assert_eq!(parts.next(), Some(".5"));
assert_eq!(parts.next(), Some(".4 (Copy)"));
assert_eq!(parts.next(), Some(".tar"));
assert_eq!(parts.next(), Some(".gz"));
assert_eq!(parts.next(), None); |
FYI, here's my attempt to implement let mut exts = Path::new("foo.tar.gz").extensions().unwrap();
assert_eq!(exts.next(), Some(OsStr::new("gz")));
assert_eq!(exts.next(), Some(OsStr::new("tar")));
assert_eq!(exts.next(), None);
assert_eq!(exts.visited(), Some(OsStr::new("tar.gz"));
assert_eq!(exts.remaining(), "foo"); With this implementation, we have |
Correct me if I'm wrong, but I believe that having a method that extracts the part of file name before or after the first dot is not a good idea. By splitting a file name at the first dot, one asserts that they won't have dots in the non-extension part of file name. However, there are actually a bunch of densely-dotted file names in the wild that go against this design, for example:
IMO, none of these file names benefits much from a split at the first dot unless it is guaranteed that we deny dots in the non-extension part of a file name. I'd prefer to go with the iterator solution suggested above and I'm willing to open a PR for this. |
Agreed, that's why I named the original suggestion
|
@EvanCarroll That might be a solution, but I think there are some disadvantages to it: The For this reason, a forward iterator seems not very practical. Also, I'm afraid it is not allowed to do a This is why I implemented
If it is really needed to do what
And if it is sometimes useful to return the extensions with preceding dots, we could have:
|
You are right, none of those listed file names benefit from a split at the first dot. But there are many more file names "in the wild" that do. Solution: anyone working with those file names uses some other method to extract the parts they want... |
Instead of creating new methods with new semantics that will not treat files with dots on parts of the filename not related to the extension, why not extend the current ones like: let path = Path::new("foo 1.2.3.tar.gz");
assert_eq!(path.file_stem_multi(2), Some("foo 1.2.3"));
assert_eq!(path.extension_multi(2), Some("tar.gz")); This will only have a different contract than the current ones, where:
is changed to;
Where Note: named with a better suffix than |
@thomcc apologies for pinging you directly but I couldn't find a tag for the library team anywhere. |
Opening a stabilization PR is trivial, you just need to change the That being said, I have to agree with the concerns that the algorithm here can be a bit bikeshed-y and this could be better served by pattern functions on Opening a stabilization PR to start libs-api discussion probably isn't a bad idea in any case. |
@tgross35 according to the stabilization docs you linked the first step is
Are you a member of said team? If I try and use that identifier in this comment it doesn't come up with anything... |
I am not, but team members can always be found in the relevant files on the teams repo https://github.com/rust-lang/team/blob/master/teams/libs-api.toml. Pinging team members isn't really the best way to get something on the radar though, so opening the stabilization PR first and requesting libs-api review is usually easier for trivial things like this (FCP just happens there). If you have an idea for an iterator-based API it may be worth proposing that too as an alternative. New API proposals go through the ACP process, which is an issue template at https://github.com/rust-lang/libs-team/issues. (also feel free to drop in on Zulip if you have any more specific questions) |
Thanks very much for you help @tgross35. I'll open the PR and we can discuss the FCP there and whether this implementation or an iterator API is preferable. |
Shall we stabilize this? @rfcbot merge |
Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
Concern: If we're already doing the work to parse the suffix we might as well return both a suffix and prefix, since they're just borrows. Possible names:
|
Do you mean adding another function with that functionality? Because the function being tracked here, I do agree that having such function(s) would be useful though. |
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. This will be merged soon. |
Feature gate:
#![feature(path_file_prefix)]
This is a tracking issue for an implementation of the method
std::path::Path::file_prefix
. It is effectively a "left" variant of the existingfile_stem
method.Public API
Steps / History
Unresolved Questions
The text was updated successfully, but these errors were encountered: