Skip to content

ARROW-9645: [Python] Deprecate pyarrow.filesystem in favor of pyarrow.fs #8149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jorisvandenbossche
Copy link
Member

@jorisvandenbossche jorisvandenbossche commented Sep 9, 2020

No description provided.

@github-actions
Copy link

github-actions bot commented Sep 9, 2020

@jorisvandenbossche
Copy link
Member Author

The main remaining items is pa.hdfs.connect(). It feels a bit annoying to deprecate this again, after we let people migrate from HdfsClient to connect (eg https://stackoverflow.com/questions/47400987/hdfs-connect-vs-hdfsclient-in-pyarrow).

I was thinking to keep pa.hdfs.connect, but letting it return the new filesystem? But since the returned object has a different API, that would also break a lot of code ..

@jorisvandenbossche jorisvandenbossche force-pushed the ARROW-9645-deprecate-legacy-filesystems branch 2 times, most recently from fc5b7fd to 8c57769 Compare September 30, 2020 16:43
@pitrou
Copy link
Member

pitrou commented Sep 30, 2020

I think we should deprecate pa.hdfs.connect. We cannot return something else from it...

@jorisvandenbossche
Copy link
Member Author

We could in theory also add a deprecation in each method on the legacy HadoopFileSystem, so that any use outside of passing it to one of our own functions accepting a filesystem gets deprecated, but not the construction of the filesystem object itself (and then we can eventually return a different object, and for now convert it internally).

But, even if we do the effort for that, we are still left with this pyarrow.hdfs namespace which only has the connect() function long term ..

@jorisvandenbossche jorisvandenbossche force-pushed the ARROW-9645-deprecate-legacy-filesystems branch from d45a0fe to dfdc622 Compare October 7, 2020 08:52
@jorisvandenbossche
Copy link
Member Author

@github-actions crossbow submit -g integration

@github-actions
Copy link

github-actions bot commented Oct 7, 2020

Revision: dfdc622

Submitted crossbow builds: ursa-labs/crossbow @ actions-607

Task Status
test-conda-python-3.6-pandas-0.23 Github Actions
test-conda-python-3.7-dask-latest Github Actions
test-conda-python-3.7-hdfs-2.9.2 Github Actions
test-conda-python-3.7-kartothek-latest Github Actions
test-conda-python-3.7-kartothek-master Github Actions
test-conda-python-3.7-pandas-latest Github Actions
test-conda-python-3.7-pandas-master Github Actions
test-conda-python-3.7-spark-branch-3.0 Github Actions
test-conda-python-3.7-turbodbc-latest Github Actions
test-conda-python-3.7-turbodbc-master Github Actions
test-conda-python-3.8-dask-master Github Actions
test-conda-python-3.8-jpype Github Actions
test-conda-python-3.8-pandas-latest Github Actions
test-conda-python-3.8-spark-master Github Actions

@jorisvandenbossche
Copy link
Member Author

@pitrou @kszucs more feedback here?

return _LocalFileSystem
elif name == "HadoopFileSystem":
_warnings.warn(_msg.format("HadoopFileSystem", "HadoopFileSystem"),
DeprecationWarning, stacklevel=2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use FutureWarning elsewhere.


if _sys.version_info >= (3, 7):
def __getattr__(name):
if name == "localfs":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you want to write a loop and/or helper function to avoid all this copy/pasting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put the info in a dict, and simplified this

warnings.warn(
"pyarrow.filesystem.LocalFileSystem is deprecated as of 2.0.0, "
"please use pyarrow.fs.LocalFileSystem instead",
DeprecationWarning, stacklevel=2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put a helper in pyarrow.util? (and use FutureWarning?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a _deprecate_class in pyarrow.util, but that serves a different purpose (it aliases the old name to the new class, but with warning).

And if it is just a helper for raising the warning, I am not sure that is worth it (compared to using a template message that can be filled in).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved a template to pyarrow.util. Can also make it a function raising the warning if you prefer

return LocalFileSystem._get_instance()

try:
import fsspec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fsspec seems to take some time to import, we should probably lookup directly in sys.modules instead.

@pitrou pitrou closed this in 25d131b Oct 7, 2020
@jorisvandenbossche jorisvandenbossche deleted the ARROW-9645-deprecate-legacy-filesystems branch October 9, 2020 16:09
emkornfield pushed a commit to emkornfield/arrow that referenced this pull request Oct 16, 2020
Closes apache#8149 from jorisvandenbossche/ARROW-9645-deprecate-legacy-filesystems

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
Closes apache#8149 from jorisvandenbossche/ARROW-9645-deprecate-legacy-filesystems

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants