Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitHubPath not working after 0.1.0 #154

Closed
juftin opened this issue Oct 12, 2023 · 3 comments · Fixed by juftin/universal_pathlib#1
Closed

GitHubPath not working after 0.1.0 #154

juftin opened this issue Oct 12, 2023 · 3 comments · Fixed by juftin/universal_pathlib#1
Labels
bug 🐛 Something isn't working compatibility 🤝 Compatibility with stdlib pathlib enhancement 🚀 New feature or request
Milestone

Comments

@juftin
Copy link
Contributor

juftin commented Oct 12, 2023

I've got a couple projects, browsr and textual-universal-directorytree, that leverage universal-pathlib for a TUI file browser. I started working on updating dependencies yesterday and noticed an incompatibility with the new version of universal-pathlib: my custom implementation of UPath for fsspec's GithubFileSystem stopped working after the release of 0.1.0. In particular, the method is_dir() is what broke on instances of the GitHubPath:

The following code, I'm calling it example.py, will correctly know that upath is a directory prior to 0.1.0 but will get this wrong after:

import upath
from upath import UPath, registry
from upath.core import _FSSpecAccessor


class GitHubPath(UPath):
    """
    GitHubPath supporting the fsspec.GitHubFileSystem
    """

    _default_accessor = _FSSpecAccessor

    @property
    def path(self) -> str:
        """
        Paths get their leading slash stripped
        """
        return super().path.strip("/")

    @property
    def name(self) -> str:
        """
        Override the name for top level repo
        """
        if self.path == "":
            org = self._accessor._fs.org
            repo = self._accessor._fs.repo
            sha = self._accessor._fs.storage_options["sha"]
            github_name = f"{org}:{repo}@{sha}"
            return github_name
        else:
            return super().name


if upath.__version__.startswith("0.1"):
    registry.register_implementation("github", GitHubPath)
else:
    registry._registry.known_implementations[
        "github"
    ] = "example.GitHubPath"

if __name__ == "__main__":
    print(upath.__version__)

    github = upath.UPath("github://fsspec:universal_pathlib@main")
    contents = {item.name: item for item in github.iterdir()}
    source_code = contents["upath"]

    print(source_code)
    print(source_code.is_dir())

0.0.23 Output:

0.0.23
github://fsspec:universal_pathlib@main/upath
True

0.1.3 Output:

0.1.3
github://fsspec:universal_pathlib@main/upath
False

I know that GitHub isn't a supported upath implementation yet so I understand if you're unable to support this issue, I'd also like to get it added if that's something you're interested in. I'm happy to share code or submit a PR where I can - right now I'm just stuck on trying to figure out exactly what changed.

@juftin
Copy link
Contributor Author

juftin commented Oct 12, 2023

Looking at this closer I think I've got the issue figured out, there were leading slashes being passed down to the GithubFilesystem that it didn't like. The following code looks to have resolved the issue without the need to customize the GitHubPath further:

import upath
from upath import UPath, registry
from upath.core import _FSSpecAccessor


class _GitHubAccessor(_FSSpecAccessor):
    """
    FSSpec Accessor for GitHub
    """

    def _format_path(self, path: UPath) -> str:
        """
        Remove the leading slash from the path
        """
        return path._path.strip("/")


class GitHubPath(UPath):
    """
    GitHubPath supporting the fsspec.GitHubFileSystem
    """

    _default_accessor = _GitHubAccessor

With that being said, are you interested in a GitHub implementation for upath and are there any issues you see with the example in this comment? I'm happy to contribute if so. Thanks again for the fantastic library

@ap--
Copy link
Collaborator

ap-- commented Oct 12, 2023

Hi @juftin

Thanks for reporting the issue!
We're keeping up with the changes in pathlib, and will probably reach a stable interface with python 3.13 that can then be back-ported. In the meantime we're trying the best to not break external UPath subclasses, but can't fully guarantee.

That being said, the best way to ensure your custom implementation will keep on working is to add it to universal_pathlib 😃 So yes, it would be great if you could make a PR.

I wrote some instructions here on how to add new implementations: #117 (comment)

In addition to your suggested changes above, you would need to add a test that derives from the BaseTests class and does the setup. I would probably try to use the universal_pathlib repo for the tests.

Let me know if you need any further advice. If you get stuck, feel free to open a draft PR for discussion.

Cheers,
Andreas

@ap-- ap-- added enhancement 🚀 New feature or request compatibility 🤝 Compatibility with stdlib pathlib bug 🐛 Something isn't working labels Oct 12, 2023
@juftin juftin reopened this Oct 12, 2023
@juftin juftin mentioned this issue Oct 12, 2023
@ap-- ap-- added this to the v0.2 milestone Feb 8, 2024
@ap--
Copy link
Collaborator

ap-- commented Feb 8, 2024

Closed via 8314a65

@ap-- ap-- closed this as completed Feb 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 Something isn't working compatibility 🤝 Compatibility with stdlib pathlib enhancement 🚀 New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants