Handling of URL quoting (for file:// URLs) #1168
What happens if you do
>>> from fsspec.core import url_to_fs
>>> fs, urlpath = url_to_fs(str(p))
>>> urlpath
'/tmp/URL--https&c%%zenodo.org%record%68331'
>>> fs.stat(urlpath)
{'name': '/tmp/URL--https&c%%zenodo.org%record%68331',...}
OK, so
Can you please clarify what the generally recommended behavior would be? I have posted the simplest case of what I believed to be the key issue. However, I am having this issue in a use case where I am accessing specific members of archive files (that are either remote or local) via a chained "URL" that I pass to
This is all working great, except for this special case of
See fsspec/filesystem_spec#1168 Handle it by not generating `file://` URLs ourselves, but by passing naked platform paths in. This is subject to a trial on windows still.
It seems to me that the issue is with Path.as_uri(): don't use that, use str() instead. That way you don't get any quoting that you would then need to undo. Path is specific to local files anyway, so it is not much use in the context of fsspec (unlike universal_pathlib, which maybe does a good job of this).
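To make the difference concrete, here is a minimal sketch using only the standard library. It assumes a POSIX system (so that the path from this issue is absolute) and compares `str()` with pathlib's `as_uri()` method; the variable names are chosen for illustration only.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit

# The path from this issue. Assumption: POSIX, where this path is absolute.
p = Path("/tmp/URL--https&c%%zenodo.org%record%68331")

plain = str(p)      # naked platform path, no quoting applied
uri = p.as_uri()    # file:// URI; characters such as '%' and '&' are percent-encoded

# Undoing the quoting by hand recovers the original path.
recovered = unquote(urlsplit(uri).path)

print(plain)
print(uri)
print(recovered == plain)
```

Passing `plain` to fsspec avoids the round trip entirely, which is the approach suggested above.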
I have now changed my code to stop using For the record I'd like to mention that I believe that handling
From that statement I would conclude that (also) percent-encoding is to be expected in any file URL.
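As a rough illustration of what such handling could look like on the caller's side, the sketch below unquotes a `file://` URL with the standard library. `file_url_to_path` is a hypothetical helper, not part of fsspec, and it ignores any host component. The last line shows why unquoting is only safe when the input is known to be quoted: the raw path from this issue contains `%68`, which would be decoded to `h`.

```python
from urllib.parse import unquote, urlsplit


def file_url_to_path(url: str) -> str:
    """Hypothetical helper: turn a percent-encoded file:// URL into a local path."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    # Decode percent-escapes in the path component only.
    return unquote(parts.path)


quoted = "file:///tmp/URL--https%26c%25%25zenodo.org%25record%2568331"
raw = "/tmp/URL--https&c%%zenodo.org%record%68331"

print(file_url_to_path(quoted) == raw)
# Unquoting a path that was never quoted can silently change it.
print(unquote(raw) == raw)
```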
OK, food for thought. I'd rather not do anything about it for now, especially since typical posix tools (like bash) don't use such paths. In some cases, fsspec has to handle a large number of paths, so the expense of yet another encoding step is better avoided.
I have a local file at
/tmp/URL--https&c%%zenodo.org%record%68331
that I am trying to access via fsspec. This fails due to a missing unquoting step, as far as I can tell.
Here is a small demo to show the essence of the issue:
I was confused by the above behavior, as I expect the `urlpath` returned by `url_to_fs()` to match the needs of the simultaneously returned `fs` object. My expectation was that fsspec would assume all input URLs to be adequately quoted, and would internally do any unquoting necessary to match `urlpath` to the needs of `fs`.
Do you consider this behavior a bug?