This would make it possible to implement a "PurePath" equivalent for path manipulation that does not require the backend-specific dependencies to be installed.
For known_implementations we can vendor the two required methods (_strip_protocol and _get_kwargs_from_urls) for each protocol.
That would allow, for example, all kinds of path operations on s3 without s3fs being installed.
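A minimal sketch of the vendoring idea, assuming a simplified s3 URL format. The helper names (`strip_protocol_s3`, `get_kwargs_from_url_s3`, `VENDORED`, `pure_parts`) are hypothetical stand-ins for illustration and are not copies of the s3fs or upath code:

```python
from typing import Callable, Dict, Tuple
from urllib.parse import urlsplit


def strip_protocol_s3(url: str) -> str:
    """Hypothetical vendored stand-in: 's3://bucket/key' -> 'bucket/key'."""
    for prefix in ("s3://", "s3a://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
            break
    return url.rstrip("/")


def get_kwargs_from_url_s3(url: str) -> Dict[str, str]:
    """Hypothetical vendored stand-in: this simplified sketch extracts no options."""
    return {}


# Registry of vendored helpers, keyed by protocol (assumed layout).
VENDORED: Dict[str, Tuple[Callable[[str], str], Callable[[str], Dict[str, str]]]] = {
    "s3": (strip_protocol_s3, get_kwargs_from_url_s3),
}


def pure_parts(url: str) -> Tuple[str, ...]:
    """Split a URL into path parts using only the vendored helpers."""
    protocol = urlsplit(url).scheme
    strip, _ = VENDORED[protocol]
    return tuple(strip(url).split("/"))


parts = pure_parts("s3://my-bucket/data/file.csv")
```

Nothing here imports fsspec or s3fs, so this kind of string handling would keep working in an environment where the backend is not installed.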
I think only operations that actually do something on the fs should invoke the backend. String operations like is_relative_to, stripping protocols, or just str(upath) should not. This is equivalent to how pathlib currently handles it.
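For comparison, this is how the standard library's pure paths behave. The bucket and key names are made up for illustration, and `is_relative_to` requires Python 3.9 or later:

```python
from pathlib import PurePosixPath

# Pure paths never touch the filesystem; everything below is string manipulation.
key = PurePosixPath("/bucket/data/2024/file.csv")
base = PurePosixPath("/bucket/data")

is_inside = key.is_relative_to(base)  # lexical check only, no I/O
relative = key.relative_to(base)      # computed from the path parts
as_text = str(key)                    # plain string conversion
```

None of these calls check whether the path exists, which is the behavior being asked for when no filesystem operation is involved.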
In Airflow's case, the DAG processor would need to have every backend present and instantiate it, and backends often do additional eager loading. That is something we really want to avoid, as it can add significant compute time to simple operations.
For now we derive ObjectStoragePath from CloudPath to get strict object storage behavior (buckets, keys, etc.). That means file would also behave the same way (I'm not sure whether it does now with upath > 0.2.1). We might add additional ones if we can make them behave the same way, so users get a uniform and predictable experience.
Summarizing: I think upath should not forward anything to the backend for non-filesystem operations if _protocol_dispatch is False.