Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URI query component is ignored when opening a file #92

Closed
joouha opened this issue Mar 30, 2023 · 5 comments · Fixed by #190
Closed

URI query component is ignored when opening a file #92

joouha opened this issue Mar 30, 2023 · 5 comments · Fixed by #190
Labels
bug 🐛 Something isn't working
Milestone

Comments

@joouha
Copy link
Contributor

joouha commented Mar 30, 2023

When opening a resource using universal pathlib, the query component of the URI is dropped, resulting in the wrong resource being opened.

For example, running the following:

>>> UPath("https://duckduckgo.com/?q=upath").read_text()

makes a GET request for https://duckduckgo.com/ without the query parameter, giving unexpected output.

@ap--
Copy link
Collaborator

ap-- commented Apr 2, 2023

Hi @joouha

Currently we throw away query and fragments here:

args_list.insert(0, parsed_url.path)

But since only the first argument to UPath is parsed, there's one way to construct the path right now.

>>> upath.UPath("http://example.com/a")
HTTPPath('http://example.com/a')

>>> upath.UPath("http://example.com/", "a?q=123")
HTTPPath('http://example.com/a?q=123')

but joining on this doesn't work correctly, since it just treats it like a filename.

I think the correct way to fix this is adding query and fragment in

def _format_parsed_parts(
if they are present.

And then there needs to be some code that updates query and fragments whenever we join paths.
(to do joins correctly this will also require fixing #88)

Would you like to work on a PR?
I'll only have time to work on this in about 2-3 weeks from now.

Cheers,
Andreas

@drernie
Copy link

drernie commented Jul 12, 2023

@ap-- I have an urgent use case for this.

We have a commercial project that tracks different versions of S3 objects using a query parameter::

"s3://udp-spec/manual/force/README.md?versionId=B0zNSMELW__87yfYtZcfcdBw3qxHkFhm"

I know s3.open can handle it:

fo_old_version = s3.open('versioned_bucket/object', version_id='SOMEVERSIONID')

The naive solution is to pass any query parameters as kwargs to open, but I don't know if that might cause problems for others.

Should I just create a PR against this issue solve my use case, then work with you to figure out how to generalize it?

Or would it be wiser to define a general mechanism for passing "open" parameters, and manually extract the query on my side first?

@ap--
Copy link
Collaborator

ap-- commented Jul 12, 2023

Hi @drernie,

The issue you are linking to seems to be in a private repository.

upath should already allow you to do this, if you are willing to handle the query parameter yourself:

from upath import UPath

pth = UPath("s3://udp-spec/manual/force/README.md", version_aware=True)
with pth.open(version_id="B0zNSMELW__87yfYtZcfcdBw3qxHkFhm") as f:
    data = f.read()

Let me know if that helps!

I'm currently working on a larger refactor to address a few other important issues, and would then address a common way for handling query parameters. If you'd like to contribute, I would ping you once that is ready.

Cheers,
Andreas

@drernie
Copy link

drernie commented Jul 14, 2023

Ouch. I just realized that
UPath("s3://udp-spec/manual/force/README.md?versionId=B0zNSMELW__87yfYtZcfcdBw3qxHkFhm")
drops the query string.

This seems to imply I have to manually track the version ID everywhere. Is that correct, or is there a way to attach attributes to the UPath?

@ap-- ap-- added the bug 🐛 Something isn't working label Aug 2, 2023
@drernie
Copy link

drernie commented Oct 2, 2023

Did this get somehow addressed in #125 ?

@ap-- ap-- added this to the v0.2.1 milestone Feb 15, 2024
@ap-- ap-- closed this as completed in #190 Feb 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐛 Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants