Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Host port should default to 80 if the Host header has been set #106

Closed
wkirschbaum opened this issue Mar 7, 2023 · 5 comments · Fixed by #228
Closed

Host port should default to 80 if the Host header has been set #106

wkirschbaum opened this issue Mar 7, 2023 · 5 comments · Fixed by #228

Comments

@wkirschbaum
Copy link

Assuming we want to keep compatibility with cowboy:

When the host header is set with cowboy without an explicit port, cowboy will default to port 80.

To reproduce:

curl -H "Host: foo.co.za" http://localhost:4000

Should set conn.port to 80

curl http://localhost:4000

Should set conn.port to 4000

and

curl -H "Host: foo.co.za:9999" http://localhost:4000

Should set conn.port to 9999

This is currently a issue with ueberauth when generating the callback_url.

@mtrudel
Copy link
Owner

mtrudel commented Mar 7, 2023

Bandit explicitly falls back to the port of the underlying transport if not set in the Host header (that is, it will be 4000 in all of the above examples). This is strictly more correct than Cowboy's blind assumption of port 80, and not something I'll entertain changing just for the sake of compatibility. Cowboy is wrong here, not us.

References:

  • do: {:ok, to_string(host), port || local_info[:port]}
  • test "derives port from underlying transport if no port specified in host header", context do
    client = SimpleHTTP1Client.tcp_client(context)
    SimpleHTTP1Client.send(client, "GET", "/echo_components", ["host: banana"])
    assert {:ok, "200 OK", _headers, body} = SimpleHTTP1Client.recv_reply(client)
    assert Jason.decode!(body)["port"] == context[:port]
    end
  • test "derives port from underlying transport if no port specified in host header", context do
    socket = SimpleH2Client.setup_connection(context)
    headers = [
    {":method", "GET"},
    {":path", "/echo_components"},
    {":scheme", "https"},
    {"host", "banana"}
    ]
    SimpleH2Client.send_headers(socket, 1, true, headers)
    assert SimpleH2Client.successful_response?(socket, 1, false)
    {:ok, 1, true, body} = SimpleH2Client.recv_body(socket)
    assert Jason.decode!(body)["port"] == context.port
    end

Do you have details on the ueberauth issue so we can see if it can be fixed there?

@mtrudel mtrudel closed this as completed Mar 7, 2023
@wkirschbaum
Copy link
Author

@mtrudel I glanced over the referenced RFC7540§8.1.2.3 section and don't see anything clearly stating the above ( i might have missed something). I do see the default port for scheme http is 80 and https is 443, but this does not seem to be respected by cowboy as it just defaults to 80, even if the x-forwarded-proto is set to https.

My concern was more around packages with the plug dependency potentially relying on this "issue". I will submit a patch for ueberauth and link this issue, as ueberauth blindly assumes https with port 80 should normalise to port 443.

@mtrudel
Copy link
Owner

mtrudel commented Mar 7, 2023

To my (at least somewhat learned) understanding, there isn't really an explicitly correct answer to this in the specs. I think the closest thing you're going to find to a normative answer is RFC9110§7.2.

A couple of other places weigh in:

  • RFC9112§3.2.3 suggests to degrade to the protocol's default port, but with the very important caveat that this is in the context of a proxied CONNECT request (ie: the semantics in this case are closer to those of a client interpreting an unadorned URI).
  • RFC9112§9 strongly implies that the client's decision as to how to interpret an unadorned URI is specifically implementation-dependent (really just a reinforcement of RFC9110§7.2)

So I don't think that the specs have a specific answer here. The more I think about this though, the less I think that this is really super important when it all comes down to it.

Ultimately, this comes down to what we want %Plug.Conn{port: ...} to mean.

If it's what the user thinks they're connecting to, then Cowboy defaulting to port 80 is probably the right thing (setting aside that this makes some assumptions about the client's intent that, as above, seem to be undefined).

If it's what the user is actually connected to, then we do the right thing today.

I believe the second of these options to be the more correct one. One of Bandit's fundamental goals is to codify a minimum of policy, and I see option 2 as furthering this goal more effectively. There is no way to not editorialize, at least somewhat, an answer to the first question, whereas there is a clear & objectively correct answer to the second. I'll take that option every time. If the user wants to figure out an answer to the first question, they can do so with full view of the inherent editorializing that goes into answering it. I don't think that's Bandit's job here.

Of course, all of the above is premised on the RFCs not being specifically normative on this point. If there's a clause somewhere that spells this behaviour out, I will happily tack that way (as I've said before, being correct is Bandit's literal prime directive).

@wkirschbaum
Copy link
Author

I just want to use bandit and ueberauth 😆 and do see why defaulting to 80 can be seen as a hack. Here is a draft PR for ueberauth which solves the issue: ueberauth/ueberauth#181.

Thanks for the detailed response and a great library.

@mtrudel
Copy link
Owner

mtrudel commented Sep 12, 2023

Of course, all of the above is premised on the RFCs not being specifically normative on this point. If there's a clause somewhere that spells this behaviour out, I will happily tack that way (as I've said before, being correct is Bandit's literal prime directive).

As @wkirschbaum thoughtfully lays out the sketch for here, I think the specs actually are normative on this:

  • RFC9110§4.3.1 spells out in as many words that the default port is to be used when building the origin for a request. This on its own is enough for me to reconsider the behaviour of absolute request URIs to use the protocol default instead of the transport (ie: this test should actually be asserting port 80).

  • RFC9110§7.2 says 'The "Host" header field in a request provides the host and port information from the target URI', which isn't the exact same thing as saying it encodes the host and port intended origin, but it's pretty close. However, shortly following this in §7.3.3, we have some wording that says to handle the 'target URI' with the heuristic described in RDC9110§4.3.2 (which itself directly refers back to §4.3.1 in the case of a missing port). To my mind, this connects the dots enough to say that 4.3.x ought to be the same heuristic used when interpreting the Host header as well.

All of which is to say that I'm now convinced that the RFCs have a normative stance on this, and we should change to accommodate this. Note that this means that Bandit will now have to adopt the notion that '%Plug.Conn{port: ...} [means] what the user thinks they're connecting to'.

I'll cut a Bandit PR in the next couple of days to adopt this behaviour. Specifically, the change will be to no longer consult the underlying transport when ports are not explicitly provided in the request URI or host headers, but rather to use the scheme's default. No other behaviour in terms of absolute URI vs host header precedence will change. I think this will resolve the issue in ueberauth/ueberauth#187.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants