Stop inheriting from pyarrow.filesystem for pyarrow>=2.0 #411
@@ -68,7 +69,10 @@ def __call__(cls, *args, **kwargs):
# TODO: it should be possible to disable this
import pyarrow as pa

up = pa.filesystem.DaskFileSystem
if LooseVersion(pa.__version__) < LooseVersion("2.0"):
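A minimal sketch of the version gate in this hunk, assuming the intent is to keep the legacy base class only for pyarrow older than 2.0. The names `major_version` and `choose_bases` are hypothetical, and a plain integer comparison on the major version stands in for `LooseVersion`; this is not the code from the PR.

```python
def major_version(version_string):
    """Return the leading integer of a version string such as '0.17.1'."""
    return int(version_string.split(".")[0])


def choose_bases(pyarrow_version, legacy_base):
    """Pick the base classes for a filesystem class.

    pyarrow_version: version string, or None if pyarrow is not installed.
    legacy_base: the legacy class to inherit from on old pyarrow versions.
    """
    if pyarrow_version is not None and major_version(pyarrow_version) < 2:
        # Old pyarrow: keep inheriting so existing isinstance checks still pass.
        return (legacy_base,)
    # pyarrow >= 2.0 or not installed: no legacy base class.
    return (object,)
```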
Would it be better to use an else clause for this, rather than putting it in the try block, so that the code in the try block is minimized?
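For illustration, the try/except/else shape being suggested might look like the sketch below. The helper name `load_optional` is hypothetical; the point is only that the `try` block contains nothing but the import, and follow-up work goes in `else`.

```python
import importlib


def load_optional(name):
    """Return the named module, or None if it cannot be imported."""
    try:
        # Only the statement that may raise ImportError is guarded.
        module = importlib.import_module(name)
    except ImportError:
        return None
    else:
        # Runs only when the import succeeded. An error raised here
        # is not swallowed by the except clause above.
        return module
```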
Great to see this. It would be nice not to import pyarrow at all, to improve fsspec import times. Can the version check be done without importing? This works, but needs distutils:
import pkg_resources
pkg_resources.get_distribution('pyarrow').version
Can you please link?
The pyarrow PR is here: apache/arrow#8149
In my development environment, I get an error for this:
Maybe that doesn't work with editable installs? (as in another environment it does work) Based on a quick non-scientific test (
Apparently importlib.metadata does this, but that only exists in py38. A backport exists, and is really fast:
import importlib_metadata
importlib_metadata.distribution('pyarrow').version
Does this work with your dev env?
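A hedged sketch of the lookup described here, assuming Python 3.8+ for the standard-library module and the `importlib_metadata` backport on older versions. The function name `installed_version` is hypothetical. As the thread notes, metadata lookups may fail for some development installs, which is why the function returns None instead of raising.

```python
try:
    from importlib import metadata as importlib_metadata  # Python 3.8+
except ImportError:
    import importlib_metadata  # backport, assumed installed on older Pythons


def installed_version(package):
    """Return the installed version string without importing the package.

    Returns None when no distribution metadata is found for the name.
    """
    try:
        return importlib_metadata.version(package)
    except importlib_metadata.PackageNotFoundError:
        return None
```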
Nope, that gives
Now, I could also add another layer of try/except, and first try the pkg_resources way, and only if that fails, still try an actual import. Then at least for the majority that has a normal pyarrow install and not a dev install, it will work faster?
Asking the anaconda crowd, someone must have a nice way. This would be a nice little function in utils, rather than all the logic in this import/don't import block.
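One possible shape for such a utility, following the layering proposed above: try a metadata lookup first, and fall back to an actual import only if that fails. The name `get_package_version` is hypothetical, and `importlib.metadata` is used here in place of pkg_resources; this is a sketch, not the function that ended up in fsspec.

```python
import importlib


def get_package_version(name):
    """Best-effort version lookup for a package, cheapest strategy first."""
    # Step 1: metadata lookup, which does not import the package itself.
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        version = None  # Python < 3.8 without the backport
    if version is not None:
        try:
            return version(name)
        except PackageNotFoundError:
            pass  # e.g. some development installs expose no metadata

    # Step 2: fall back to a real import and read __version__ if present.
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)
```

The slow path is only reached when the metadata lookup fails, so a normal install never pays the import cost.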
I am only getting regex suggestions, so I would say: Make a function in utils which, for a given package name, in order
There's no reason you can't make importlib_metadata an optional dependency, as described in #5 of https://packaging.python.org/guides/single-sourcing-package-version/. Then you never need to try pkg_resources or the import.
If it's optional, it might not be there - so importing fsspec would fail unless it's a hard dependency.
Sorry, I meant "conditional" rather than "optional".
I am uncertain, as fsspec currently has no dependencies at all.
Could that potentially be left for a separate PR?
OK fine, I'll try to get around to doing that myself.
Yeah, I don't really know if the and we actually use it in one place in pyarrow (to create a new directory): https://github.com/apache/arrow/blob/7a532edeabc6f30838e5a53dfef35f37fdf99737/python/pyarrow/parquet.py#L1718-L1723 From checking the other methods, this
OK, I'll take your word for it :)
Good idea, I only ran the pyarrow parquet tests with it (but those have only limited tests with fsspec filesystems), will run the dask parquet tests as well
I intend to release soon, so please let me know how the dask tests go.
Tests seem to pass
"Seem" !!
To be honest, I would be more confident I was correctly testing it if there was a failure to fix .. ;-)
See discussion in #295
Starting with pyarrow 2.0, the legacy pyarrow.filesystem FileSystems will be officially deprecated. As preparation, I think that fsspec can also stop inheriting from it.
Almost all methods on the base class from pyarrow are actually overridden / implemented in fsspec, so no longer inheriting from it shouldn't have a big impact, except for isinstance checks. I am doing some changes in pyarrow to allow fsspec filesystems instead of doing a strict isinstance.
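To illustrate why dropping the base class matters only for isinstance checks, here is a small sketch contrasting a strict isinstance test with a check on the methods an object provides. All names (`LegacyBase`, `DuckFileSystem`, `looks_like_filesystem`) and the required method list are hypothetical stand-ins, not the actual pyarrow or fsspec code.

```python
class LegacyBase:
    """Hypothetical stand-in for a legacy filesystem base class."""


class DuckFileSystem:
    """Does not inherit from LegacyBase, but offers the same methods."""

    def open(self, path, mode="rb"):
        raise NotImplementedError

    def ls(self, path):
        return []


def looks_like_filesystem(obj, required=("open", "ls")):
    """Accept any object exposing the required callables.

    A strict isinstance(obj, LegacyBase) check would reject
    DuckFileSystem, while this check accepts it.
    """
    return all(callable(getattr(obj, name, None)) for name in required)
```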