Change UPath.__new__ behavior #125
I think this looks good. Thanks for putting in the time to make all the tests work!
Is this code that you wrote or is this coming from the original pathlib implementation?
It's all code from CPython's test.support and test.support.os_helper modules that is required to run the pathlib tests. There are some small changes in there to handle minor refactoring that happened in those modules between 3.8 and 3.12, so that we only need to vendor one version.
We should think about how we want to roll this out since this is a major breaking change.
I agree. The actual impact in this case is very minor though, since code that expects old behavior (i.e. But I think it would be nice to move In any case: poetry projects with "^0.0.24" caret version restrictions are prevented from upgrading.
This is true. However, for example, we have code that distinguishes between local paths and remote paths by checking I agree that this should get a bump in the minor version.
I'm leaning towards a minor bump. Do you have other features or refactorings lined up?
I see, that makes sense. Having read more and more of the discussions on the cpython issue tracker, discuss.python.org and the recent PRs regarding pathlib I now believe that the correct way to determine local paths would be to check if
Minor version increase sounds good to me 😃
And ideally we would add support for 3.12 before the actual release. But it dropped Anyways, I'll prepare the
Wow great stuff! Thank you guys for your hard work!
class LocalPath(UPath):
What is LocalPath for? Should WindowsUPath and PosixUPath subclass LocalPath?
LocalPath handles the "file://" uri style LocalPaths.
The minor difference between the two is:
- LocalPath provides local file access through fsspec.
- WindowsUPath and PosixUPath via pathlib.
For your usecase do you just need to determine if a path is local, or if the user provided a non-uri local path?
And yes, WindowsUPath and PosixUPath could derive from LocalPath.
> For your usecase do you just need to determine if a path is local, or if the user provided a non-uri local path?

For local paths, I'd rather not use fsspec. So, I want a way of determining whether I have local paths (essentially pathlib Paths). It doesn't matter to me whether the user omitted the protocol or put in file://.
We don't cover that use case right now. And I think it didn't work this way previously either.
In <=0.0.24:
- with UPath("file:///local/file.txt") you received a UPath instance that managed file access through fsspec
- with UPath("/local/file.txt") you received a PosixPath instance.
In >=0.1.0:
- with UPath("file:///local/file.txt") you receive a LocalPath instance that manages file access through fsspec
- with UPath("/local/file.txt") you receive a PosixUPath instance.
workaround
A current workaround to force pathlib handling might be:

    from upath import UPath
    from upath.implementations.local import LocalPath, PosixUPath, WindowsUPath

    obj = UPath(...)
    if isinstance(obj, LocalPath):
        # workaround to force pathlib handling for "file://" uris
        obj = UPath(obj.path)
        assert isinstance(obj, (PosixUPath, WindowsUPath))
    if isinstance(obj, (PosixUPath, WindowsUPath)):
        ...  # local.
    else:
        ...  # cloud, etc.

but this isn't elegant at all.
options
We'd need a way to determine if file URIs should be handled via fsspec or not. Maybe this should actually be an option that should go both ways?

    from enum import Enum

    class LocalFileBackend(str, Enum):
        PATHLIB = "pathlib"
        FSSPEC = "fsspec"
        AUTO = "auto"

    # --- via a new constructor kwarg ---
    # default: plain paths use pathlib, file URIs use fsspec
    obj_a = UPath("/local/file.txt")
    obj_b = UPath("file:///local/file.txt")
    assert type(obj_a) is PosixUPath and type(obj_b) is LocalPath
    # force pathlib for both
    obj_a = UPath("/local/file.txt", localfile_backend="pathlib")
    obj_b = UPath("file:///local/file.txt", localfile_backend="pathlib")
    assert type(obj_a) is PosixUPath and type(obj_b) is PosixUPath
    # force fsspec for both
    obj_a = UPath("/local/file.txt", localfile_backend="fsspec")
    obj_b = UPath("file:///local/file.txt", localfile_backend="fsspec")
    assert type(obj_a) is LocalPath and type(obj_b) is LocalPath

    # --- via a utility function ---
    def ensure_local(pth: UPath, backend: LocalFileBackend) -> PosixUPath | WindowsUPath | LocalPath:
        """raise ValueError if not local, ensure backend if local"""
        ...

After writing, I would almost prefer an ensure_local helper function. But we can go either route.
What do you think?
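To make the dispatch rule behind both options concrete, here is a minimal stdlib-only sketch. It works on plain strings rather than UPath objects, and `resolve_local_backend` is a hypothetical name used only for illustration; it is not part of universal_pathlib. It assumes that AUTO keeps the behavior described above (plain paths go to pathlib, file URIs go to fsspec) and that anything with another scheme counts as non-local.

```python
from enum import Enum
from urllib.parse import urlsplit


class LocalFileBackend(str, Enum):
    PATHLIB = "pathlib"
    FSSPEC = "fsspec"
    AUTO = "auto"


def resolve_local_backend(urlpath, backend=LocalFileBackend.AUTO):
    """Hypothetical helper: decide which backend would handle a local path.

    Raises ValueError if `urlpath` does not look local.
    """
    backend = LocalFileBackend(backend)  # accepts "pathlib" as well as the enum member
    scheme = urlsplit(urlpath).scheme
    if len(scheme) <= 1:
        # no scheme, or a single letter that is most likely a Windows drive
        natural = LocalFileBackend.PATHLIB
    elif scheme == "file":
        natural = LocalFileBackend.FSSPEC
    else:
        raise ValueError(f"not a local path: {urlpath!r}")
    # AUTO keeps the natural choice, an explicit backend overrides it
    return natural if backend is LocalFileBackend.AUTO else backend
```

A real `ensure_local` would apply the same decision to a UPath instance and then return an object of the matching class.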
Interesting. Probably, I didn't run into this issue because I stripped the file:// prefix.
Tbh for my purposes, isinstance(obj, (PosixUPath, WindowsUPath)) is fine. If it were isinstance(obj, LocalUPath) it would be easier. But no big deal.
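For reference, stripping the file:// prefix before constructing a path could look roughly like the sketch below. `strip_file_scheme` is a hypothetical helper name, the code uses only the standard library, and it assumes POSIX-style file URIs without a host part (a host in the URI is ignored).

```python
from urllib.parse import unquote, urlsplit


def strip_file_scheme(urlpath: str) -> str:
    """Return the plain path for a file:// URI, or the input unchanged."""
    parts = urlsplit(urlpath)
    if parts.scheme == "file":
        # decode percent-escapes such as %20 in the path component
        return unquote(parts.path)
    return urlpath
```

Passing the result to UPath() would then take the plain-path route described earlier in the thread, so the isinstance check against PosixUPath and WindowsUPath covers both spellings.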
This PR changes the way UPath.__new__ works for local paths. Closes #90.
Currently we return pathlib.PosixPath or pathlib.WindowsPath instances dependent on the OS. This caused issues for users, because the UPath class is not guaranteed to return an instance or subclass instance of itself. For more details, see: #90
To ensure universal_pathlib's UPath can be used as a drop-in for pathlib.Path instances, this PR now implements the following:
Three new UPath subclasses are introduced:
- upath.implementations.local.PosixUPath (subclass of pathlib.PosixPath)
- upath.implementations.local.WindowsUPath (subclass of pathlib.WindowsPath)
- upath.implementations.local.LocalPath
The first two are returned by UPath.__new__ whenever a pathlib-compatible local path is provided to UPath(). They are extensively tested against the official CPython test suite for pathlib, to ensure compatibility with their respective Python version's pathlib equivalent.
The third class is returned whenever a file URI path is provided to UPath(). Local file access is then handled through fsspec.
This PR should also resolve pydantic typing issues #112.
Cheers,
Andreas 😃