-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Local address support. #100
Conversation
The typing issue has been fixed upstream (python/typeshed#4133). |
Any comments on this approach? |
I'm a bit far away from discussions that happened in #88 but this looks sensible enough to me. :-) |
Another option while keeping the tuple form could be to mark the port there as |
Making the port an |
Hmm, a general design philosophy I've seen in Eg I'm not sure what the meaning of Which is why I was suggesting the |
Fair enough. If passing in a tuple from higher layers is ok, there's no reason for the port to be optional; callers can pass in 0. |
These tests are failing with an IPv6 address and asyncio. Not sure why yet.
There are some tests now, although it's hard to test too much in an automated fashion, since proper testing requires the system to have multiple routable addresses and working IPv6, and testing explicit source port numbers requires that the ports are unused. I've done manual testing for all of these. Setting local_addr requires setting family, as asyncio requires that. Maybe it would be better to use something like a (family, address, port) tuple, instead of separate fields for family and address. This could allow address=None, or require that address be set to the appropriate wildcard address for the family if the address isn't important. The bogus error flagged by mypy is now suppressed; hopefully the typeshed fix will be picked up at some point. I think this is ready for a more formal review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bwelling! 💯
Left some minor comments on the current implementation.
Setting local_addr requires setting family, as asyncio requires that. Maybe it would be better to use something like a (family, address, port) tuple, instead of separate fields for family and address. This could allow address=None, or require that address be set to the appropriate wildcard address for the family if the address isn't important.
I think it might be best to stay with the current approach, follow httpcore's "you know what you're doing" philosophy and leave ergonomics to HTTPX. I'd love to hear other's thoughts on this though.
httpcore/_async/connection_pool.py
Outdated
@@ -76,6 +76,9 @@ class AsyncConnectionPool(AsyncHTTPTransport): | |||
* **keepalive_expiry** - `Optional[float]` - The maximum time to allow | |||
before closing a keep-alive connection. | |||
* **http2** - `bool` - Enable HTTP/2 support. | |||
* **family** - `Optional[int]` - Address family to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
* **family** - `Optional[int]` - Address family to use | |
* **family** - `int` - Address family to use, defaults to 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the type of the family
parameter be socket.AddressFamily
rather than just int
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all of occurrences of 0
were replaced with socket.AF_UNSPEC
, then sure. I don't want to make this change yet, though, since if there's consensus to combine family and address into a single field, then there wouldn't be a need for either one, because the entire field would be None
.
tests/async_tests/test_interfaces.py
Outdated
async def test_http_request_local_addr(local_addr: str) -> None: | ||
family = socket.AF_INET6 if ':' in local_addr else socket.AF_INET | ||
async with httpcore.AsyncConnectionPool(family=family, | ||
local_addr=(local_addr, 0)) as http: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, I would've expected Black to reformat this 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must have forgotten to run the reformatting script before my last commit; it's done now.
tests/async_tests/test_interfaces.py
Outdated
@pytest.mark.asyncio | ||
# This doesn't run with trio, since trio doesn't support local_addr. | ||
async def test_http_request_local_addr(local_addr: str) -> None: | ||
family = socket.AF_INET6 if ':' in local_addr else socket.AF_INET |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems leftover from when there was more than one value in the parametrization, might be worth removing it altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test really should be trying more than one case, but when I tried testing IPv6, it failed due to the test system not supporting IPv6. I'd rather leave this to make it more clear that it should be parametrized, but if it's important, it can be removed.
tests/async_tests/test_interfaces.py
Outdated
@@ -204,6 +205,46 @@ async def test_http_proxy( | |||
assert reason == b"OK" | |||
|
|||
|
|||
@pytest.mark.parametrize("family", [socket.AF_INET]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth just setting it below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this originally tested AF_INET6 also, but the test machine doesn't support IPv6.
I agree with this, but I'm not sure which approach matches "you know what you're doing" more; having 2 linked parameters or 1 more complicated parameter? That is: a) Have separate family (int) and local_addr (tuple of address and port) parameters (as the current patch does) b) Have a single optional parameter (maybe called c) Have a single optional parameter containing (family, address, port), where all 3 fields must not be None (that is, address would be set to the appropriate wildcard address for the family if the specific address was unimportant but family was) I don't think there's a reason to support more than 1 of those choices, but I don't know which of them best matches the current philosophy. |
Is specifying the source port useful? I wasn't planning to support that in Trio, since it's kind of inherently incompatible with any kind of happy-eyeballs support... (And similarly, Twisted's version of this feature only lets you specify the source host, not the source port.) |
In the case where you care about happy-eyeballs support, probably not. But if you're specifying a single IPv4 or IPv6 source address (which this patch does), you explicitly don't, and being able to specify the source port is a pretty reasonable thing to do. While Twisted might not support a port, the stdlib socket and asyncio routines do, and curio at least appears to. |
Happy eyeballs is still useful if you're restricting to just IPv4 or just IPv6. Lots of website DNS names resolve to multiple IPv4/IPv6 addresses. E.g.:
I didn't ask if it was reasonable, I asked if it was useful :-). Supporting source ports isn't trivial: the exact desired semantics aren't obvious, and writing good tests is tricky. It can be done, obviously, but if you want us to do the work then it'd be nice to have more of a justification. I've looked for use cases, and I haven't found any yet that care about source ports. (See: python-trio/trio#275 (comment)) An interesting case is the requests-toolbelt |
After thinking about this some more, you're probably right. My use case here is DoH (dns-over-https) queries from dnspython, and dnspython does allow specifying source port. That's mostly historical in this case, though; while it's important for UDP in some cases, it's only really important in archaic situations for TCP, and probably not at all useful for HTTPS. Would it be better to remove source-port support from this PR altogether, or have the trio backend raise an exception if a nonzero port is specified? |
I guess that's a judgement call for Tom to make, but if it were my project I would definitely remove the support entirely until a compelling use case comes along. For this kind of infrastructure project, it's way easier to add features later than it is to claw back features you added in a burst of enthusiasm and then later ended up regretting. |
That sounds reasonable to me. If someone who's willing to merge these changes could make a decision on what the new parameters to AsyncConnectionPool should be, I'll make those changes, and fix all of the underlying code to do the same thing. |
Just to document the current state of things in trio on this: @njsmith merged a commit in lightning speed that (python-trio/trio#1644) that added I suppose that this means that the family parameter (if it is kept) should map to default This also creates an interesting edge-case: If both an address family and a local address are given and their kinds do not agree, what should happen. For backends that support both parameters, such a case will reliably cause an error down the line (as none of the resolved names will be compatible with the given socket type), but with previously mentioned implementation for trio the given local address would win instead. Maybe there should be some explicit error checking in place to catch this condition if both parameters are supplied? Or maybe its better to leave this edge-case explicitly undefined (ie: documented, but no special handling)? Some dev should probably decide this for completeness sake, although its just a minor thing… |
Just following up, is there anything else I need to do here? We need this functionality in dnspython for asynchronous DNS-over-HTTPS and HTTP/2 support. |
httpcore/_types.py
Outdated
@@ -9,3 +9,4 @@ | |||
URL = Tuple[bytes, bytes, Optional[int], bytes] | |||
Headers = List[Tuple[bytes, bytes]] | |||
TimeoutDict = Dict[str, Optional[float]] | |||
SocketAddress = Tuple[Union[str, bytes, bytearray], int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be reasonable to keep this as tightly constrained as possible?
Presumably Tuple[bytes, int]
is sufficient right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If source-port is removed, this type isn't needed at all, as the address could just be a bytes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, yup.
Just getting back into this...
100% yup. Having separate arguments seems to make sense to me, but it might make sense to keep this PR as tightly scoped as possible and just start with adding the local address support? |
I just pushed an update that removes address family and source-port support; this now just allows setting a local address, and passes it through. Note that this still doesn't support trio, as the trio change that @ntninja mentioned earlier hasn't been released yet. |
It's been about 3 weeks since I addressed the last set of comments, and nothing's happened since then. Is there anything I can do to get this merged? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fantastic stuff, and thanks for the reminder yup.
We'll make sure this is landing for our imminent httpcore
0.10 release, alongside httpx
0.14.
I've got some tiny suggestions, but can probably also pick them up myself if needed.
Thank you so much for this - it's a really great addition! 🎉
@@ -23,11 +23,13 @@ def __init__( | |||
http2: bool = False, | |||
ssl_context: SSLContext = None, | |||
socket: AsyncSocketStream = None, | |||
local_addr: bytes = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to check - Do we want local_addr
or source_addr
here?
What are trio, asyncio, and the sync stdlib using for their naming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
asyncio uses local_addr
, trio uses local_address
, and the stdlib socket module uses source_address
(and curio uses source_addr
). There's really no consistency.
httpcore/_async/connection_pool.py
Outdated
@@ -76,6 +76,7 @@ class AsyncConnectionPool(AsyncHTTPTransport): | |||
* **keepalive_expiry** - `Optional[float]` - The maximum time to allow | |||
before closing a keep-alive connection. | |||
* **http2** - `bool` - Enable HTTP/2 support. | |||
* **local_addr** - `Optional[bytes]` - Local address to connect from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Miniature nitpick. Let's use a full stop to match the other cases. "Local address to connect from." 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
httpcore/_backends/asyncio.py
Outdated
) -> SocketStream: | ||
host = hostname.decode("ascii") | ||
connect_timeout = timeout.get("connect") | ||
exc_map = {asyncio.TimeoutError: ConnectTimeout, OSError: ConnectError} | ||
with map_exceptions(exc_map): | ||
local_addrport = None | ||
if local_addr: | ||
local_addrport = (local_addr, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
local_addrport = None if local_addr is None else (local_addr, 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
httpcore/_backends/auto.py
Outdated
) -> AsyncSocketStream: | ||
return await self.backend.open_tcp_stream(hostname, port, ssl_context, timeout) | ||
return await self.backend.open_tcp_stream( | ||
hostname, port, ssl_context, timeout, local_addr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reckon we ought to use .open_tcp_stream(..., local_addr=local_addr)
here.
Just makes it super obvious visually that it's an optional extra.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
) -> AsyncSocketStream: | ||
if local_addr: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a comment that it is currently supported in trio
master, and should be expected from version 0.16.1 onwards?
Also, what will our implementation change look like once trio support does land here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm adding a comment.
I believe the code would just add something like local_address=local_addr
to the open_tcp_stream
call, although it's possible that there would be a type mismatch, as the local_addr
parameter is None
or a bytes
, and trio claims to want None
or a str
(although it just passes it into socket.bind
, which should accept a bytes
).
Referencing the commit in trio that adds their support for this, mostly just because they do a great job on docs, and we might want to follow their lead here. https://github.com/python-trio/trio/pull/1644/files |
Great stuff - I've made some tiny amendments in #134. As soon as that's reviewed we'll get it in, and look at rolling our 0.10 release. |
I might be missing something, but this was linked to by the httpx 0.14 release PR (encode/httpx#1083). These changes were merged into #134 and pulled up from there, but nothing was actually added to httpx 0.14, and there's nothing in that release PR or any other PR/issue indicating why not. |
@bwelling Indeed - that's because we don't actually need any code changes in httpx to support this. For "local_address=...", and for our planned "uds=..." support, we'll require users to drop down to the low-level I've drafted a PR with some docs notes that should hopefully clear this up... encode/httpx#1165 |
The doc referenced explains how to use it, but that seems like an especially fragile interface. If httpcore isn't designed to be user-friendly (which is fine), then requiring users to use it seems suboptimal. Having to create a separate transport adapter to set custom and somewhat uncommon options is a perfectly reasonable API design, but requiring users to also pass in a bunch of unrelated configuration options with no defaults is a recipe for disaster, as it means that users who want the standard client behavior, except for one thing, need to poll httpx over time to see if its defaults change, or if new fields are added or removed from the default set. Providing wrappers in httpx around |
We could certainly think about switching our policy on httpcore defaults, yup. It's not really been a very exposed issue until just now. |
Should I open an issue for that? |
Welcome to, yup. There's a bunch of stuff to think through. (Eg. What should our default SSLConfig be & how do we plan to keep things in sync between httpcore/httpx) |
Based on the discussion in #88 .
This isn't really complete yet. The open issues are (at least).
There's no testing. I haven't even done any manual testing; I mostly wanted to verify that the changes look like a good start.
The raised exceptions should be improved.
mypy reports a bogus error. See socket.create_connection source_address should be optional python/typeshed#4116
It would be nice to support passing a raw string as local_addr (if you just want to specify an address, and don't care about port). I haven't done that yet.