Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correcting Path::components on non-"Unix" platforms #52331

Open
jackpot51 opened this issue Jul 13, 2018 · 24 comments
Open

Correcting Path::components on non-"Unix" platforms #52331

jackpot51 opened this issue Jul 13, 2018 · 24 comments
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` C-bug Category: This is a bug. O-nintendo3ds Target: ninTENdooooo O-UEFI UEFI O-windows Operating system: Windows T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@jackpot51
Copy link
Contributor

jackpot51 commented Jul 13, 2018

Path::components is incorrect on Redox. I would like to develop the solution
here: #51537.

The following is a description of the problem.

Suppose you have the following path:

file:/home/user/foo/bar.txt

You split the path into components using Path::components

Path::new("file:/home/user/foo/bar.txt").components().collect::<Vec<()>>()

In Linux, this would be equivalent to the following:

vec![
    Component::Normal("file:"),
    Component::Normal("home"),
    Component::Normal("user"),
    Component::Normal("foo"),
    Component::Normal("bar.txt"),
]

Joining the components with the current directory would give you a path such as
this:

./file:/home/user/foo/bar.txt

In Redox, we want to be able to get from the original path to components back to
the original path without any modifications. Here are examples of this usage of
Path::components:

https://github.com/uutils/coreutils/search?q=components

In Redox, we have the following options for the file: component:

  1. Component::Normal("file:")
  2. Component::Normal("file:") followed by Component::RootDir
  3. Component::Prefix(Prefix::Verbatim("file:"))
  4. Component::Prefix(Prefix::Scheme("file:"))

Option 1

Component::Normal("file:")

The path mentioned above would end up being the following after being rebuilt
from its components:

./file:/home/user/foo/bar.txt

This is the old way of doing things. It not only makes Path::components
useless on Redox. Canonicalizing a path will always add a scheme like file:
to the beginning, so it is likely that path handling will be incorrect.
Absolute paths would always be interpreted as relative.

❌ This option is unusable for Redox.

Option 2

Component::Normal("file:") followed by Component::RootDir

This would preserve the original meaning of the path, such that it could be
rebuilt from its components as follows:

file:/home/user/foo/bar.txt

However, this may require a large amount of work to handle, as it seems likely
that code exists that only checks the first component for being
Component::RootDir or Component::Prefix in order to identify an absolute
path.

The documentation for Prefix provides one such example, which has likely
inspired similar usage:

https://doc.rust-lang.org/std/path/enum.Prefix.html#examples

❌ This option would likely break the expectations of many consumers of the
Prefix enum.

Option 3

Component::Prefix(Prefix::Verbatim("file:"))

This option preserves the original meaning of the path, a rebuilt path would be
this:

file:/home/user/foo/bar.txt

This, however, is overloading a variant meant to be used on Windows, for a path
looking like this:

\\?\home\user\foo\bar.txt

This means different code will be needed when running on Redox to correctly
parse paths to components and turn components into paths.

✔️ This does leave the enum untouched, while allowing for the
correct parsing of paths on Redox. The only downside is a possible lack of
clarity due to overloading the meaning of the Prefix::Verbatim variant.

Option 4

Component::Prefix(Prefix::Scheme("file:"))

This option also preserves the original meaning of the path, a rebuilt path
would be this:

file:/home/user/foo/bar.txt

This is the most clear option, having separate code to handle specifically the
Redox scheme abstraction.

This has the downside of changing a stable enum, Prefix. There is, however,
the possibility of having the extra enum variant be
#[cfg(target_os = "redox")], so as to preserve the Prefix enum on other
platforms.

✔️ This option could be used to have correct path parsing
without affecting the stability of the Prefix enum on non-Redox platforms,
if a cfg attribute is used.

Conclusion

Potentially the Prefix enum would be done different after a major version bump,
perhaps using extension traits that are platform specific. I would think that
there would be an opaque Prefix struct, and something like .into_os_prefix()
would provide you with an os-specific enum to match against.

For the time being, options 3 and 4 seem to be possible, with some caveats, and
would allow code using stable Rust to quickly do-the-right-thing on Redox.

@kennytm kennytm added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-bug Category: This is a bug. O-redox Operating system: Redox, https://www.redox-os.org/ labels Jul 14, 2018
@kennytm
Copy link
Member

kennytm commented Jul 14, 2018

I would think that there would be an opaque Prefix struct, and something like .into_os_prefix() would provide you with an os-specific enum to match against.

The Prefix variant of the Component enum contains the opaque struct PrefixComponent, not the enum Prefix.

This means a backward-compatible option 5 is available:

impl<'a> PrefixComponent<'a> {
    /// On Windows, returns the parsed prefix data.
    ///
    /// On other platforms, the returned value is unspecified.
    #[cfg_attr(
        target_os = "redox",
        rustc_deprecated(
            since = "1.29.0", 
            reason = "The prefix component is meaningless on Redox. Use `.scheme()` instead",
        ),
    )]
    pub fn kind(&self) -> Prefix<'a> { ... }
}

mod redox {
    pub trait PrefixComponentExt<'a> {
        // Obtains the scheme of the Redox path (URL).
        fn scheme(&self) -> &'a OsStr;
    }

    impl<'a> PrefixComponentExt<'a> for PrefixComponent<'a> {
        fn scheme(&self) -> &'a OsStr { self.as_os_str() }
    }
}

@jackpot51
Copy link
Contributor Author

What should the return value of PrefixComponent::kind be on Redox?

@4lDO2
Copy link

4lDO2 commented Jul 13, 2020

Regarding option 5, shouldn't kind() ideally be deprecated on all platforms, with an extension trait for Windows instead? Even though there's probably a lot of software depending on the kind method, isn't it more optimal for it to be platform specific like everything else that's filesystem related.std::os::fs::symlink for example, which arguably is one of the more common unix syscalls, was made platform specific while the older soft_link function was deprecated.

@4lDO2
Copy link

4lDO2 commented Jul 13, 2020

@jackpot51 Maybe DeviceNS? Not that there is any enum variant that fits the purpose correctly at all on Redox, but based on my limited experience with file systems on Windows, AFAIK windows uses "devices" (IoCreateDevice) for filesystems, which would probably apply more than the alternatives, for a Redox scheme.

That said, the return value would not make any sense at all, but at least it would make parsing and serialization to and from Prefix consistent, even though it won't be the same format as real schemes.

@ian-h-chamberlain
Copy link
Contributor

@rustbot label +I-needs-decision

I hope this is the right way to ask for some input from the libs-api team. This issue has stagnated for a while without a clear path forward, and it would be nice if we had a way to support multiple platforms that use a path prefix scheme like this other than just Windows.

@rustbot rustbot added the I-needs-decision Issue: In need of a decision. label Apr 4, 2022
@m-ou-se m-ou-se added I-libs-api-nominated Nominated for discussion during a libs-api team meeting. and removed I-needs-decision Issue: In need of a decision. labels Apr 13, 2022
@joshtriplett
Copy link
Member

One proposed design:

  • Make kind panic
  • Add new methods for windows_kind and redox_kind, which return an Option of a non-exhaustive enum.
  • Possibly mark kind as deprecated, if that wouldn't produce too many warnings in the ecosystem. (If it would, mark it as deprecated_in_future.)

@joshtriplett
Copy link
Member

Also, we're going to need someone willing to do the design and implementation work here.

@jackpot51
Copy link
Contributor Author

@joshtriplett I will implement whichever design has the highest chance of being upstreamed.

@m-ou-se m-ou-se removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label May 11, 2022
@jackpot51
Copy link
Contributor Author

@joshtriplett can you please help to get upstream consensus on which option from the original post would be best and I will implement it?

@joshtriplett
Copy link
Member

@jackpot51 I don't think attempting to use the existing Prefix will work; we can't add new variants, and none of the existing variants really fit (and will break expectations of software written for Windows).

Given that, that leaves us with:

  • Make kind panic on non-Windows platforms.
  • Add a redox_kind method to PrefixComponent, returning Option<RedoxPrefix>, which should be a non_exhaustive enum.
  • Add a windows_kind method as well. That should either return Option<WindowsPrefix> as another non_exhaustive enum, or Option<Prefix> for compatibility with the existing type if we're more confident we'll never want another variant on Windows. (Advice from the Windows team would be helpful here. I would guess we want the latter, since the existing prefixes include extensible variants.)
  • On some timeline, mark kind as deprecated or deprecated_in_future.

Windows folks: should we pay the transition cost to allow for new path prefixes, or is that exceedingly unlikely given that the existing prefixes include extensible variants?
@rustbot ping windows

@rust-lang/libs-api: Does the above plan sound reasonable?
@rfcbot merge

(This is orthogonal to questions about how we should handle non-native path types in the future.)

@rfcbot
Copy link

rfcbot commented Jul 29, 2022

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Jul 29, 2022
@rustbot rustbot added the O-windows Operating system: Windows label Jul 29, 2022
@rustbot
Copy link
Collaborator

rustbot commented Jul 29, 2022

Hey Windows Group! This bug has been identified as a good "Windows candidate".
In case it's useful, here are some instructions for tackling these sorts of
bugs. Maybe take a look?
Thanks! <3

cc @arlosi @ChrisDenton @danielframpton @gdr-at-ms @kennykerr @luqmana @lzybkr @nico-abram @retep998 @rylev @sivadeilra @wesleywiser

@rfcbot rfcbot added the disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. label Jul 29, 2022
@jackpot51
Copy link
Contributor Author

@jackpot51 I don't think attempting to use the existing Prefix will work; we can't add new variants, and none of the existing variants really fit (and will break expectations of software written for Windows).

Given that, that leaves us with:

* Make `kind` panic on non-Windows platforms.

* Add a `redox_kind` method to `PrefixComponent`, returning `Option<RedoxPrefix>`, which should be a non_exhaustive enum.

* Add a `windows_kind` method as well. That should either return `Option<WindowsPrefix>` as another non_exhaustive enum, or `Option<Prefix>` for compatibility with the existing type if we're more confident we'll never want another variant on Windows. (Advice from the Windows team would be helpful here. I would _guess_ we want the latter, since the existing prefixes include extensible variants.)

* On some timeline, mark `kind` as `deprecated` or `deprecated_in_future`.

Windows folks: should we pay the transition cost to allow for new path prefixes, or is that exceedingly unlikely given that the existing prefixes include extensible variants? @rustbot ping windows

@rust-lang/libs-api: Does the above plan sound reasonable? @rfcbot merge

(This is orthogonal to questions about how we should handle non-native path types in the future.)

If deprecating kind is on the table, maybe move this to an os-specific trait like PathExt?

@ian-h-chamberlain
Copy link
Contributor

If deprecating kind is on the table, maybe move this to an os-specific trait like PathExt?

I would like to express support for this as an option, because I think it's conceivable there would be use cases for other OSes expanding on the API in the future (for example, the 3DS horizon OS is another place where we'd like something similar). It doesn't seem ideal in my opinion to add a new {os}_kind() method for every new OS that wants something like this, but an extension trait could allow specific targets to add such functionality as needed?

@jackpot51
Copy link
Contributor Author

jackpot51 commented Dec 22, 2023

I will be trying to solve this using the recommendations above. It has been a while, but we are still running into issues with path parsing on Redox and solving this issue would fix most of them.

@jackpot51
Copy link
Contributor Author

We've decided to provide, and use by default, a path format that works without needing prefixes. scheme_name:path/to/file can now be represented as /scheme/scheme_name/path/to/file.

@rfcbot rfcbot removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Jan 31, 2024
@kennytm
Copy link
Member

kennytm commented Jan 31, 2024

apparently even with Redox switching to the Unix path format, this is still a problem for 3DS Horizon OS (#52331 (comment))?

@jackpot51
Copy link
Contributor Author

That is correct.

@Ayush1325
Copy link
Contributor

This is also a problem for UEFI.

@workingjubilee workingjubilee changed the title Correcting Path::components on Redox Correcting Path::components on non-"Unix" platforms Oct 26, 2024
@workingjubilee workingjubilee added O-windows Operating system: Windows O-UEFI UEFI O-nintendo3ds Target: ninTENdooooo and removed O-windows Operating system: Windows O-redox Operating system: Redox, https://www.redox-os.org/ labels Oct 26, 2024
@Ayush1325
Copy link
Contributor

Oh, I was working on creating a new issue, but I guess since this has been reopened, I can put my proposal here.

UEFI Prefix Types

Shell Prefix

If UEFI shell is present, it creates mappings for all partitions/devices that implement SIMPLE_FILESYSTEM_PROTOCOL. It follows the following naming <MTD><HI><CSD> (eg: FS0:).

Device Path

EFI_DEVICE_PATH_PROTOCOL is the normal UEFI way to represent devicepaths. It also has unique conversion to and from UTF-16 strings. The text representation of path looks as follows: PciRoot(0x0)/Pci(0x1,0x1)/Ata(Secondary,Slave,0x0)/\run.efi

Proposal

I wanted to propose a 2-step plan to improve the handling of such paths, and creating the respective PRs.

Step 1

Create a private GenericPrefix opaque struct. This will be used to replace all the private APIs that currently rely on Prefix such as determining absolute/relative paths, etc. Initially, I was planning to use PrefixComponent for this. However, since PartialEq, Ord and Hash implementations on PrefixComponents are stable and rely on Prefix internally, any changes to that will probably cause problems.

This step will allow isolating Prefix and PrefixComponent to the std surface. Additionally, it will also allow most path operations to work for platforms like UEFI with no API-breaking changes.

Step 2

Start adding new public APIs for new prefix capabilities and deprecating Prefix. This is much more delicate the Step 1 and will require discussion when the time comes. Hopefully, step 1 will make the design of any such API much clearer.

Does this seem good? Or maybe there is a better way to go about it?

@workingjubilee workingjubilee added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Oct 26, 2024
@Amanieu Amanieu added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Nov 2, 2024
@m-ou-se
Copy link
Member

m-ou-se commented Nov 18, 2024

@Ayush1325 We discussed this in a recent libs-api meeting. We're not sure what the right path forward is with Path::components(), but we're willing to try out your proposal as an unstable feature to see where we end up. You're welcome to open a tracking issue for it and send PRs to make your changes. (As long as they don't affect stable Rust, of course.)

Separately from that, I have some thoughts about another solution, which I will post in a moment.

@m-ou-se
Copy link
Member

m-ou-se commented Nov 18, 2024

I have been wondering for quite a while if a (better) solution could be to fully deprecate Path::components(). A year or two ago, I did a search through GitHub Code Search for all uses of Path::components() in the wild, and noticed that (at the time), effectively 100% of the usages were all ad-hoc (sometimes buggy) implementations of common operations like finding the common prefix of two paths, iterating over parent directories (to find e.g. a Cargo.toml file), path canonicalization without file system access (removing .. and . components), etc.

It gave me the impression that by adding a few of those methods to Path (which we should probably do regardless), there might not be much of a reason to keep maintaining Path::components(), allowing us to deprecate it, and/or to worry less about its behaviour on new platforms.

(After all, if you think about it, Path::components() is quite a weird API. It's an iterator that produces elements of which a certain variant can only ever appear as the first item, even though the signature/types would allow for something invalid like a Prefix to appear in the middle. If designed from scratch today, I wonder if we'd go for (Option<Prefix>, Iterator<Target=&str>) rather than Iterator<Target=Component>, which seems more fitting to me.)

@m-ou-se m-ou-se removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Nov 19, 2024
@Ayush1325
Copy link
Contributor

@m-ou-se I agree, it should be deprecated. The problem is that Path internally also uses Component internally to check absolute paths, which in turn affects how join and other path manipulation APIs work. So those also need to be updated in a way that does not depend on Component but does not break the current behavior.

@retep998
Copy link
Member

A year or two ago, I did a search through GitHub Code Search for all uses of Path::components() in the wild, and noticed that (at the time), effectively 100% of the usages were all ad-hoc (sometimes buggy) implementations of common operations like finding the common prefix of two paths, iterating over parent directories (to find e.g. a Cargo.toml file), path canonicalization without file system access (removing .. and . components), etc.

We absolutely should come up with a solid list of these usages, and figure out how to design a path API that is designed to actually handle these use cases solidly, instead of just providing a bunch of lower level access and expecting people to solve basic problems every time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` C-bug Category: This is a bug. O-nintendo3ds Target: ninTENdooooo O-UEFI UEFI O-windows Operating system: Windows T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests