net/http: enhanced ServeMux routing #61410
Comments
Really enthusiastic about this, seems like an excellent improvement to me!
Is the status of setting PathValues still as mentioned here: #60227 (comment) - omitted for minimal proposal? (I convinced myself something like a one-shot |
@AndrewHarrisSPU, yes, I deliberately omitted a way to set path values to be minimal. But I do see what you mean about having no way to set it. At the very least, tests should be able to set path values on a request without going through the matching process. I don't think that exposing the parsing logic should be done as a method on Request; I'm not sure it should be done at all. I would go with something simpler. Perhaps:
|
Could you expose the pro and con about adding |
@flibustenet, the pros of being able to set values are that it would be easier to write tests, and it would make it possible for middleware to transmit information via the path values. That second pro is also a con: once we allow setting, then we have introduced this bag of key-value pairs to |
As someone who's played around with various Go routing techniques, I really like this, and think it'll more or less eliminate the need for third-party routers, except for very specialized cases. I've tested it using the reference implementation in my go-routing suite, and it was very easy to use. I just copied the chi version and tweaked it a bit. Lack of regex matching like A few comments/questions:
|
In Caddy, we have the concept of request matchers. One thing I'm seeing that isn't covered by this proposal is how to route requests based on headers. For example, something commonly done is multiplexing websockets with regular HTTP traffic based on the presence of Another learning from Caddy is we decided to go with "path matching is exact by default" and we use Finally, there's an experimental https://developer.mozilla.org/en-US/docs/Web/API/URL_Pattern_API which has a lot of great ideas for matching URL patterns. I'd suggest considering some of the syntax from there, for example |
Does that apply for "..." wildcards as well? If so, would that mean the handler couldn't differentiate literal slashes from escaped slashes in "..." wildcard values? |
@muirdm: Yes, in a "..." wildcard you couldn't tell the difference between "a%2Fb" and "a/b". If a handler cared it would have to look at |
@francislavoie: Thanks for all that info. The Caddy routing framework seems very powerful. But I think routing by headers is out of scope for this proposal. The same goes for many of the features of Caddy routing and of the URL pattern API you link to. I think we expect that people will just write code for all those cases. By the usual Go standards, this proposal is already a big change. |
@benhoyt, thanks for playing around with this.
|
This proposal has been added to the active column of the proposals project |
I see two other usages of To can start with the standard router and switch to an improved one if needed, without changing all the calls that retrieve the PathValue in the handlers. @jba, you said that switching routers is a lot of work anyway, but I don't think so: I always begin with the standard router and only switch when needed, without much work, and we'll do that even more with this proposal. There is also the case where we want to route from one handler to another handler or middleware and adjust the PathValues. For example with a complex parsing It will prevent the temptation to make the mux overcomplicated when we can improve it easily by adding/setting PathValues. We already do this with Form. |
One small addition to this I would really like to see is the option to read back from a request which pattern it was matched with. We find this functionality in other frameworks really helpful for logging and metrics, and for resolving ambiguity in how a request reached a handler. Where it doesn't exist, we add it by using a middleware to add a request tag string to the context, but this causes a stutter wherever it's declared and pollutes logical concerns (routing) with cross-cutting ones (observability) in a way that makes otherwise-simple routing code harder to read and write. Currently this doesn't exist in net/http, but could sometimes be fudged if a handler separately knew the path prefix it was registered under. With this change the template could be even more different from the URL's Path, so being able to explicitly pull it out becomes more important. My proposed API would be only adding a Similar functionality exists already in gorilla/mux and chi. I realise this change is kind of orthogonal to the routing additions, although I think it is made more pressing by them. Happy to open this as a separate issue if that's preferred. |
@flibustenet, good point about |
@treuherz, I see the usefulness for retrieving the pattern, but it feels out of scope for this issue. I'd suggest a separate proposal. |
Regarding response codes, would a 405 response automatically include the "Allow" header with appropriate methods? |
What is the state of the request's method?
package http
func (*Request) PathValue(wildcardName string) string
I can see that in the reference implementation it's part of the ServeMux:
package muxpatterns
func (mux *ServeMux) PathValue(r *http.Request, name string) string
IMO, it should stay this way. Alternatively, if the method is still intended to be a request method, this proposal should make a commitment to allow setting the path values by 3rd party routers. |
It's part of If it stayed there then every handler would have to have some way to access its We are leaning to adding a Set method. As was pointed out, it wouldn't be the only bag of values in |
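The API section of the proposal lists both PathValue and SetPathValue on Request. As a rough sketch of the testing use case raised in this thread, assuming Go 1.22 or later, a test could call a handler directly and set the path value by hand. The handler `greet`, the helper `callWithPathValue`, and the wildcard name `name` are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// greet is a hypothetical handler that reads a wildcard named "name".
func greet(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "hello, %s", r.PathValue("name"))
}

// callWithPathValue invokes the handler directly, without any ServeMux
// matching, after setting the path value on the request by hand.
func callWithPathValue(name string) string {
	req := httptest.NewRequest("GET", "/ignored", nil)
	req.SetPathValue("name", name)
	rec := httptest.NewRecorder()
	greet(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(callWithPathValue("gopher"))
}
```

The same call is what a third-party router or middleware would use to hand values to handlers that read them with PathValue.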
@iamdlfl, apparently the spec for a 405 says we MUST do it, so we will. I'm just not sure what to put there if there is a matching pattern that has no method. I guess all the valid HTTP methods? |
Sounds good. And if someone doesn't define the optional method when registering the route, I don't think we'll have to worry about responding with a 405, right? If I'm understanding right, it will respond normally to any method type. |
@phenpessoa The burden of proof is typically on the side that claims that something matters: that a piece of software affects performance, that a drug affects the outcome of a disease, that adding a substance speeds up a reaction, and so on. The null hypothesis is always that nothing has any effect. The evidence here, besides what @jub0bs mentioned, is that no one has come forward with any evidence to the contrary for real systems. You will certainly find that router speed matters for a toy server that does very little work, but what about production servers? I've asked several times without any response. It sounds like you may have some real examples, though. Would you mind doing some performance measurements and posting them here? |
Double checking some behavior reported in Gophers slack. Given this program, should a PUT to |
The behaviour you're observing is in accordance with the spec:
mux.HandleFunc("/", /* omitted */)
mux.HandleFunc("GET /foo", /* omitted */)
Pattern A PUT request to |
👍 thanks for the double check. Here's what is said about 405:
Are we sure 405 should only be returned if no other pattern matches? That does seem to be different from at least chi. For example this program returns a 405 for a PUT to I see mentions of customizing 405 responses but best I can tell nothing specifically about the order of precedence for when to return it. Apologies if I've missed that. |
I believe this is WAI. By writing the pattern If the author wants |
So if my understanding is correct, in order to serve a custom 404 for any unmatched request, I can't use the catch-all "/" route. |
I guess the 404 handler would need to be at GET /? Does seem unfortunate if there’s not a good way to set the 404 handler and still have 405 messages. |
My understanding is that by setting the 404 handler to /, we lose the ability to handle 405 at all. All requests will end up at / even when we have a 'matching' route, albeit with a different method. |
Will this change impact r.Handle("/{id}/*", http.StripPrefix("/{id}", idHandler)) For what it's worth, this replacement for
// stripDynamicPrefix adjusts the URL Path of the request by dynamically
// stripping the specified prefix. Segments of the prefix written as
// {name} match any path segment.
// It uses the net/http, net/url, and strings packages.
func stripDynamicPrefix(prefix string, h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Split the request path and prefix into segments.
		pathSegments := strings.Split(r.URL.Path, "/")
		prefixSegments := strings.Split(prefix, "/")
		// A path with fewer segments than the prefix cannot match;
		// checking first also keeps the slice expression below in range.
		if len(pathSegments) < len(prefixSegments) {
			http.NotFound(w, r)
			return
		}
		for i, segment := range prefixSegments {
			// Skip dynamic segments.
			if strings.Contains(segment, "{") && strings.Contains(segment, "}") {
				continue
			}
			// Check for segment match.
			if segment != pathSegments[i] {
				http.NotFound(w, r)
				return
			}
		}
		newPath := "/" + strings.Join(pathSegments[len(prefixSegments):], "/")
		// Create a shallow copy of the request with the updated path.
		r2 := new(http.Request)
		*r2 = *r
		r2.URL = new(url.URL)
		*r2.URL = *r.URL
		r2.URL.Path = newPath
		h.ServeHTTP(w, r2)
	})
}
Additionally, I was surprised that this did not route:
I wanted to align my routes neatly, e.g.
|
I think it's a good sign that keeping the ServeMux rules minimal still allows flexibility on top. The minimal approach may more frequently need some translation - starting from how one would like to structure everything about route configuration, resulting in ServeMux registration calls - but is a good fit for static-plus-path-variables kinds of routing. |
Allowing spaces in the handle route string and understanding placeholders in StripPrefix both seem like pretty good ideas to me, but they should probably be new proposals in the issue tracker so people can debate whether there's some downside to them. I don't think either should be a blocker to Go 1.22. They can be add ons for Go 1.23. |
And add a missing code tag wrap elsewhere. Updates #61410 Updates #61422 Change-Id: I70a9c4ecaf4056af2e88d777b8db892a45dfcb9f Reviewed-on: https://go-review.googlesource.com/c/go/+/552195 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Dan Peterson <danp@danp.net> Reviewed-by: Jonathan Amsterdam <jba@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
„GET /users“ feels really awkward and now that it landed like this, I think it is a missed opportunity to get the API right. People were complaining about this in some comments with a good amount of positive reactions but still it landed with this awkward API. |
@alsms You should explain why you find the current API "awkward". In defence of the changes brought to net/http by Go 1.22:
|
10 October 2023: Updated to clarify escaping: both paths and patterns are unescaped segment by segment, not as a whole. We found during implementation that this gives the behavior we would expect.
7 August 2023: updated with two changes:
We propose to expand the standard HTTP mux's capabilities by adding two features: distinguishing requests based on HTTP method (GET, POST, ...) and supporting wildcards in the matched paths.
See the top post of this discussion for background and motivation.
Proposed Changes
Methods
A pattern can start with an optional method followed by a space, as in `GET /codesearch` or `GET codesearch.google.com/`. A pattern with a method is used only to match requests with that method, with one exception: the method GET also matches HEAD.
It is possible to have the same path pattern registered with different methods:
Wildcards
A pattern can include wildcard path elements of the form `{name}` or `{name...}`. For example, `/b/{bucket}/o/{objectname...}`. The name must be a valid Go identifier; that is, it must fully match the regular expression `[_\pL][_\pL\p{Nd}]*`.
These wildcards must be full path elements, meaning they must be preceded by a slash and followed by either a slash or the end of the string. For example, `/b_{bucket}` is not a valid pattern. Cases like these can be resolved by additional logic in the handler itself. Here, one can write `/{bucketlink}` and parse the actual bucket name from the value of `bucketlink`. Alternatively, using other routers will continue to be a good choice.
Normally a wildcard matches only a single path element, ending at the next literal slash (not %2F) in the request URL. If the `...` is present, then the wildcard matches the remainder of the URL path, including slashes. (Therefore it is invalid for a `...` wildcard to appear anywhere but at the end of a pattern.) Although wildcard matches occur against the escaped path, wildcard values are unescaped. For example, if a wildcard matches `a%2Fb`, its value is `a/b`.
There is one last, special wildcard: `{$}` matches only the end of the URL, allowing writing a pattern that ends in slash but does not match all extensions of that path. For example, the pattern `/{$}` matches the root page `/` but (unlike the pattern `/` today) does not match a request for `/anythingelse`.
Precedence
There is a single precedence rule: if two patterns overlap (have some requests in common), then the more specific pattern takes precedence. A pattern P1 is more specific than P2 if P1 matches a (strict) subset of P2’s requests; that is, if P2 matches all the requests of P1 and more. If neither is more specific, then the patterns conflict.
There is one exception to this rule, for backwards compatibility: if two patterns would otherwise conflict and one has a host while the other does not, then the pattern with the host takes precedence.
These Venn diagrams illustrate the relationships between two patterns P1 and P2 in terms of the requests they match:
Here are some examples where one pattern is more specific than another:
`example.com/` is more specific than `/` because the first matches only requests with host `example.com`, while the second matches any request.
`GET /` is more specific than `/` because the first matches only GET and HEAD requests while the second matches any request.
`HEAD /` is more specific than `GET /` because the first matches only HEAD requests while the second matches both GET and HEAD requests.
`/b/{bucket}/o/default` is more specific than `/b/{bucket}/o/{noun}` because the first matches only paths whose fourth element is the literal “default”, while in the second, the fourth element can be anything.
In contrast to the last example, the patterns `/b/{bucket}/{verb}/default` and `/b/{bucket}/o/{noun}` conflict with each other:
Both match `/b/k/o/default`.
The first matches `/b/k/a/default` while the second doesn’t.
The second matches `/b/k/o/n` while the first doesn’t.
Using specificity for matching is easy to describe and preserves the order-independence of the original ServeMux patterns. But it can be hard to see at a glance which of two patterns is the more specific, or why two patterns conflict. For that reason, the panic messages that are generated when conflicting patterns are registered will demonstrate the conflict by providing example paths, as in the previous paragraph.
The reference implementation for this proposal includes a `DescribeRelationship` method that explains how two patterns are related. That method is not a part of the proposal, but can help in understanding it. You can use it in the playground.
More Examples
This section illustrates the precedence rule for a complete set of routing patterns.
Say the following patterns are registered:
In the examples that follow, the host in the request is example.com and the method is GET unless otherwise specified.
API
To support this API, the net/http package adds two new methods to Request:
`PathValue` returns the part of the path associated with the wildcard in the matching pattern, or the empty string if there was no such wildcard in the matching pattern. (Note that a successful match can also be empty, for a "..." wildcard.)
`SetPathValue` sets the value of `name` to `value`, so that subsequent calls to `PathValue(name)` will return `value`.
Response Codes
If no pattern matches a request, ServeMux typically serves a 404 (Not Found). But if there is a pattern that matches with a different method, then it serves a 405 (Method Not Allowed) instead. This is not a breaking change, since patterns with methods did not previously exist.
Backwards Compatibility
As part of this proposal, we would change the way that `ServeMux` matches paths to use the escaped path (fixing #21955). That means that slashes and braces in an incoming URL would be escaped and so would not affect matching. We will provide the GODEBUG setting `httpmuxgo121=1` to enable the old behavior.
More precisely: both patterns and paths are unescaped segment by segment. For example, "/%2F/%61", whether it is a pattern or an incoming path to be matched, is treated as having two segments containing "/" and "a". This is a breaking change for both patterns, which were not unescaped at all, and paths, which were unescaped in their entirety.
Performance
There are two situations where questions of performance arise: matching requests, and detecting conflicts during registration.
The reference implementation for this proposal matches requests about as fast as the current ServeMux on Julien Schmidt’s static benchmark. Faster routers exist, but there doesn't seem to be a good reason to try to match their speed. Evidence that routing time is not important comes from gorilla/mux, which is still quite popular despite being unmaintained until recently, and about 30 times slower than the standard `ServeMux`.
Using the specificity precedence rule, detecting conflicts when a pattern is registered seems to require checking all previously registered patterns in general. This makes registering a set of patterns quadratic in the worst case. Indexing the patterns as they are registered can significantly speed up the common case. See this comment for details. We would like to collect examples of large pattern sets (in the thousands of patterns) so we can make sure our indexing scheme works well on them.