Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

base_url strips relative path #846

Closed
dmfigol opened this issue Mar 4, 2020 · 16 comments · Fixed by #1130
Closed

base_url strips relative path #846

dmfigol opened this issue Mar 4, 2020 · 16 comments · Fixed by #1130
Labels
bug Something isn't working user-experience Ensuring that users have a good experience using the library
Milestone

Comments

@dmfigol
Copy link

dmfigol commented Mar 4, 2020

Hi,
I've noticed that if I do something like:

http_client = https.AsyncClient(base_url="https://my-service/api/v1")

for further requests (e.g. /messages) /api/v1 will be stripped.
Is it expected behavior?
Tested on httpx 0.11.1
Thanks!

@StephenBrown2
Copy link
Contributor

StephenBrown2 commented Mar 4, 2020

The docs are missing a specific reference to that, but you need to make sure any subpaths you're using don't have a leading slash, else they will be interpreted as absolute paths. There's a PR to update the docs for that: #843

Personally I generally use path.lstrip("/") before I do http_client.verb(path, **kwargs) in my calls.

@florimondmanca
Copy link
Member

florimondmanca commented Mar 4, 2020

I’d personally like HTTPX to be smart enough to not strip any prefix path too, and allow using a leading / on subsequent requests. Mostly because that’s the UX that test clients in web frameworks have been acustoming folks to. I’d consider the current behavior a bug, somewhat.

@florimondmanca florimondmanca added the user-experience Ensuring that users have a good experience using the library label Mar 4, 2020
@tomchristie
Copy link
Member

@florimondmanca Yeah I think that's a reasonable observation actually. It's not that our .join(...) behaviour is broken, but that we shouldn't quite be using URL joinging semantics for clients with a base URL.

@StephenBrown2
Copy link
Contributor

But what of the case where someone wants to clear the prefix? Should they create a new client, or perhaps pass an absolute_path=True arg to the client.verb call?

@florimondmanca
Copy link
Member

@StephenBrown2 Note that that sounds like a different use case/feature to me (that we don’t even support in the current state of things).

@StephenBrown2
Copy link
Contributor

StephenBrown2 commented Mar 5, 2020

Just tested a client with a base_url will append without a leading slash, and replace with one, and was surprised that it was not what either of us were talking about:

Python 3.8.1 (default, Jan 10 2020, 09:36:37)
[Clang 11.0.0 (clang-1100.0.33.16)] on darwin
>>> import httpx
>>> http_client = httpx.Client(base_url="https://example.com/api/v1")
>>> append = http_client.get("endpoint")
>>> append.url
URL('https://example.com/api/endpoint')
>>> replace = http_client.get("/api/v2/new")
>>> replace.url
URL('https://example.com/api/v2/new')
>>> httpx.__version__
'0.11.1'

And confirmed in iPython for AsyncClient:

Python 3.8.1 (default, Jan 10 2020, 09:36:37)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.13.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import httpx
In [2]: async with httpx.AsyncClient(base_url="https://example.com/api/v1") as http_client:
   ...:     append = await http_client.get("endpoint")
   ...:     print(f"{append.url=}")
   ...:     replace = await http_client.get("/api/v2/new")
   ...:     print(f"{replace.url=}")
   ...: 
append.url=URL('https://example.com/api/endpoint')
replace.url=URL('https://example.com/api/v2/new')

where I expected append.url==URL('https://example.com/api/v1/endpoint'), replace.url was as expected. So, a bug? This is not the case if the base_url has a trailing slash.

In [5]: http_client_ts = httpx.Client(base_url="https://example.com/api/v1/")
   ...: append = http_client_ts.get("endpoint")
   ...: append.url
   ...:
Out[5]: URL('https://example.com/api/v1/endpoint')
In [6]: replace = http_client_ts.get("/api/v2/new")
   ...: replace.url
   ...:
Out[6]: URL('https://example.com/api/v2/new')

Here's the result table:

base=/api/v1/ base=/api/v1
path=/endpoint /endpoint /endpoint
path=endpoint /api/v1/endpoint /api/endpoint

So either we document this (which I think is the proper solution), or "fix" it in some way.

@florimondmanca
Copy link
Member

To be clear, the only usage I'd be expecting from base_url is…

>>> base_url = "https://example.com/api/v1"  # Trailing slash is irrelevant
>>> client = httpx.Client(base_url=base_url)
>>> response = client.get("/endpoint")  # Leading slash is *always* required
>>> response.url
URL("https://example.com/api/v1/endpoint")

(Right now this is wrongly yielding

URL("https://example.com/endpoint")

as you noted too.)

Formally: request.url = client.base_url.rstrip("/") + url — no need for any other smarts or advanced URL semantics.

From an UX perspective, this is the most sensible thing to do in my POV as it is an extension of the default case of using base_url="https://example.com" and eg .get("/api/v1/endpoint") (which we currently correctly implement).

@StephenBrown2
Copy link
Contributor

Personally, now that I see the results, I think that "correct" trumps "surprising" in this case. There are three results currently:

  • If you want to use a new base path, prepend the request path with a slash.
  • If you want to append path segments to the base_url, be sure the base_url ends with a slash and the request path does not begin with one.
  • If you want to replace the last segment of the path with a new one, don't put a slash at the end or beginning of the base_url or request path.

The only case that is "surprising" to me at first glance is the last one, while the first two make intuitive sense to me and are what I would want. Though the last one makes sense after consideration, I currently work around it in the way I said above, with:

client = httpx.Client(base_url="http://example.com/prefix/") # Use a trailing slash
client.verb(path.lstrip("/")) # Enforce no leading slash

This also works with no base_url prefix:

path = "/endpoint"
client = httpx.Client(base_url="http://example.com") # Use no trailing slash
request = client.get(path.lstrip("/")) # Enforce no leading slash
request.url == URL('http://example.com/endpoint')

@florimondmanca
Copy link
Member

I'm just saying we shouldn't even have to think about "what are the possible cases?". Set a base_url, it'll be prepended to all requested URL paths, with automatic handling of trailing slash madness. Wouldn't that be ideal? Also in line with what @dmfigol originally posted.

@dmfigol
Copy link
Author

dmfigol commented Mar 5, 2020

I'd expect to see just merge, no magic required:

  • base_url="https://my-service/api/v1" and .get("/messages") will result in https://my-service/api/v1/messages
  • base_url="https://my-service/api/v1/" and .get("messages") will result in https://my-service/api/v1/messages
  • base_url="https://my-service/api/v1/" and .get("https://my-other-service/api/v1/me") will result in https://my-other-service/api/v1/me

@StephenBrown2
Copy link
Contributor

StephenBrown2 commented Mar 5, 2020

So then to follow:

  • base_url="https://my-service/api/v1" and .get("v2/messages") will result in https://my-service/api/v1/v2/messages
  • base_url="https://my-service/api/v1" and .get("/api/v2/messages") will result in https://my-service/api/v1/api/v2/messages

requiring a v1_client = httpx.Client(base_url=".../v1") and v2_client = httpx.Client(base_url=".../v2") is probably an acceptable compromise.

@florimondmanca
Copy link
Member

@StephenBrown2

  • IMO cases when the requested URL doesn't start with either a scheme or / (as in your first example) should be forbidden. Or naively result in https://my-service/api/v1v2/messages.
  • Agreed, if we need multiple base_url it's a sign we are talking to multiple services and so we should use multiple separate clients.

@StephenBrown2
Copy link
Contributor

As long as those restrictions are documented with proper error messages (and the naive case prevented), I'm good with that.

@tomchristie tomchristie added this to the v1.0 milestone Jul 30, 2020
@tomchristie
Copy link
Member

Milestoning to review for 1.0, since there's a potential behavioural change for us to introduce here.

@james-emerton
Copy link

I found this change in behaviour quite surprising; I expected urljoin() semantics and consistency with how a browser treats relative paths. ie: For a given base_url with a path, relative urls beginning with a / should resolve to a path at the server root and those without should be appended to the base_url.

I don't expect that my opinion effect any change, but I don't feel the current behaviour is adequately documented.

@florimondmanca
Copy link
Member

I don't feel the current behaviour is adequately documented

True, and if we have an open issue tracking adding further docs for this behavior: #1139 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working user-experience Ensuring that users have a good experience using the library
Projects
None yet
5 participants