-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: clarify the use of "/" as a prefix_rewrite #3160
Conversation
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
@zuercher can you review this one? |
It's unfortunate that we can't distinguish between prefix_rewrite = "" and unset because the real problem is that you can't rewrite a prefix to the empty string. |
source/common/config/utility.cc
Outdated
std::string Utility::normalizePath(absl::string_view path) { | ||
std::vector<absl::string_view> segments = StringUtil::splitToken(path, "/", false); | ||
|
||
// Removes dot and double-dot segments https://tools.ietf.org/html/rfc3986#section-5.2.4. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The algorithm described at that URL does more than simply delete ".." segments.
I think if you're going to do more than handle the case where the user wants to remove a prefix from the path (which just amounts to dealing with "//" at the start of the rewritten path, then the normalization should handle ".." correctly.
In either case, the docs for the prefix_rewrite field should be updated to describe what happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zuercher I think I will remove that partial support for now.
A side question/clarification,
It's unfortunate that we can't distinguish between prefix_rewrite = "" and unset because the real problem is that you can't rewrite a prefix to the empty string.
Do you think it is better to allow users to set prefix_rewrite as "" rather than "/"?
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Can you also update api/envoy/api/v2/route/route.proto
with this new behavior?
source/common/config/utility.h
Outdated
@@ -263,6 +263,13 @@ class Utility { | |||
factoryForGrpcApiConfigSource(Grpc::AsyncClientManager& async_client_manager, | |||
const envoy::api::v2::core::ApiConfigSource& api_config_source, | |||
Stats::Scope& scope); | |||
|
|||
/** | |||
* Normalize URI path. Currently it only handles to remove repeated slashes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "Currently it only handles removing repeated slashes."
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
@zuercher I added an explanation on this as an example. If you think we need more please let me know. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from a functionality perspective, wondering if we can do better with perf?
source/common/config/utility.cc
Outdated
@@ -228,5 +228,11 @@ Grpc::AsyncClientFactoryPtr Utility::factoryForGrpcApiConfigSource( | |||
return async_client_manager.factoryForGrpcService(grpc_service, scope, false); | |||
} | |||
|
|||
std::string Utility::normalizePath(absl::string_view path) { | |||
std::vector<absl::string_view> segments = StringUtil::splitToken(path, "/", false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: const here and next line
source/common/config/utility.cc
Outdated
std::string Utility::normalizePath(absl::string_view path) { | ||
std::vector<absl::string_view> segments = StringUtil::splitToken(path, "/", false); | ||
bool has_trailing_slash = path.back() == '/'; | ||
return absl::StrCat("/", absl::StrJoin(segments, "/"), has_trailing_slash ? "/" : ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: I feel like this is a pretty heavy hammer from a perf perspective for solving a specific issue (duplicate '/' after rewrite). Could we detect the duplicate '/' and copy strings in a way in which this would perform better? cc @jmarantz who likes these types of things. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Probably most of the time the use-case is to have duplicates '/' at the beginning of the path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually I have functional concerns about this PR first.
source/common/config/utility.cc
Outdated
std::string Utility::normalizePath(absl::string_view path) { | ||
std::vector<absl::string_view> segments = StringUtil::splitToken(path, "/", false); | ||
bool has_trailing_slash = path.back() == '/'; | ||
return absl::StrCat("/", absl::StrJoin(segments, "/"), has_trailing_slash ? "/" : ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually I have functional concerns about this PR first.
test/common/config/utility_test.cc
Outdated
// Maintain the trailing slash. | ||
EXPECT_EQ("/ok/a/b/c/", Utility::normalizePath("/ok/a/b/c/")); | ||
EXPECT_EQ("/ok/a/b/c/", Utility::normalizePath("///ok/a/b/c///")); | ||
EXPECT_EQ("/ok/a/b/c/", Utility::normalizePath("///ok////a////b////c///")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think normalization should be this aggressive, messing with the slashes in the middle of the path.
I think the beginning of the path is a special case, and "scheme://origin/" should be treated the same as "scheme://origin". But I don't think that URL normalizers usually mess with anything after the "/".
The motivation force in the bug the fixes was really the behavior of replacing prefix "/foo" with "/". WDYT of handling this at the parsing of that command, where a trailing slash is appended to src or dest if they don't have them? Then the swap can be done fairly cleanly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, that's basically my thinking as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for overthinking it. What do you think of handling this “/“ specifically: treating that as replacing with empty string? https://github.com/dio/envoy/blob/fix-2956/source/common/router/config_impl.cc#L375
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
In the description you describe a scenario:
The reason that produces unexpected results is that the match string doesn't end with a slash and the redirect does. You could solve this problem by either adding a slash to the match-string or removing the slash from the redirect string. Which you choose depends on what you want the result of a URL "/ignore-this-maybe/foo" to be. Do you want that to map to "/-maybe/foo"? If the answer is you'd prefer that not trigger the matching logic at all, then you want to make the match more specific by appending a slash. |
Another alternative is to just change doc (and not code) to suggest that if you want to match a whole segment, then include the trailing slash in the match string. |
@jmarantz thanks for clearing that up. Sorry for not catching that earlier! Will keep the code and convert this PR to just contain doc and additional test input samples for that |
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
@jmarantz updated. If you have time please help to take a look. Thanks! Side note: for now to accomplish that common use-case (when you want to rewrite e.g. |
api/envoy/api/v2/route/route.proto
Outdated
// it rewrites :code:`/prefix` to :code:`/` and :code:`/prefix/foo` to :code:`/foo`. | ||
// it rewrites :code:`/prefix` to :code:`/`, but :code:`/prefix/ok` to :code:`//ok`, which | ||
// most of the time is not the intended result. However, it can be useful for the case of | ||
// replacing substring of a path segment. For example to rewrite :code:`/prefixok` to :code:`/ok`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/replacing substring/replacing a substring/
// prefix: "/prefix" | ||
// route: | ||
// prefix_rewrite: "/" | ||
// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the examples below will be clearer if the two prefix-substitutions above use different match strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of this is to tell that to achieve such behavior (rewriting by replacing a segment of a URL, in this case: "prefix" and attach the rest to the root path) is by declaring two match rules (with and without trailing "/", in this case "/prefix" and "/prefix/"). The below example tells that if we remove the rule with "match" value with trailing slash, it gives a different result (rewrites "/prefix/ok" to "//ok").
api/envoy/api/v2/route/route.proto
Outdated
// - match: | ||
// prefix: "/prefix" | ||
// redirect: | ||
// prefix_rewrite: "/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually don't have the whole context for this file, but I wonder if you could point to the first instance of this example rather than repeating it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me see
redirect->rewritePathHeader(headers); | ||
EXPECT_EQ("http://redirect.lyft.com/", redirect->newPath(headers)); | ||
} | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth some comments here to describe what how these two cases are different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will add some comments here.
api/envoy/api/v2/route/route.proto
Outdated
// | ||
// .. note:: | ||
// | ||
// Take a careful attention to the use of trailing slash in match prefix value. A common |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you write something like the following, you can delete the examples.
"Pay careful attention to the use of trailing slashes in the RouteMatch's prefix value. Stripping a prefix from a path requires multiple Routes to handle all cases. For example, rewriting /prefix to / and /prefix/etc to /etc cannot be done in a single Route."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @zuercher, I have updated this PR as suggested.
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
api/envoy/api/v2/route/route.proto
Outdated
// :ref:`route's match <envoy_api_field_route.Route.match>` prefix value. | ||
// Stripping a prefix from a path requires multiple Routes to handle all cases. For example, | ||
// rewriting */prefix* to */* and */prefix/etc* to */etc* | ||
// cannot be done in a single :ref:`Route <envoy_api_msg_route.Route>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we give an explicit example of what the user would configure? I think that would be helpful.
Also, Q: I lost track of the conversation but isn't there a targeted fix possible here where we just adjust/normalize the slash where the prefix ends and the rest of the route begins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we give an explicit example of what the user would configure? I think that would be helpful.
I added an example in my previous commit, https://github.com/dio/envoy/blob/d46c26e4ede389971fbfeca7bd2d2e48716ea2d4/api/envoy/api/v2/route/route.proto#L369-L393 if you think this is good enough I can add it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, Q: I lost track of the conversation but isn't there a targeted fix possible here where we just adjust/normalize the slash where the prefix ends and the rest of the route begins?
We have this case like the following:
{
Http::TestHeaderMapImpl headers =
genRedirectHeaders("redirect.lyft.com", "/ignore-this/use/", false, false);
const DirectResponseEntry* redirect = config.route(headers, 0)->directResponseEntry();
redirect->rewritePathHeader(headers);
EXPECT_EQ("http://redirect.lyft.com/use/", redirect->newPath(headers));
}
Where the config is:
- match: { prefix: "/ignore-this"}
redirect: { prefix_rewrite: "/" }
We decided to accommodate this behavior. Hence no code change for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah my thinking here is that slashes have no special role in the URL after the first 3. E.g.
scheme://origin/foo/bar
and
scheme://origin/foo//bar
are both different valid URLs and origin servers can interpret them differently. Moreover, proxies may see arbitrary slashes in URL parameters. If they originate from a browser they'd be escaped, but not if they originate from some other client.
You could normalize the match strings, but you only lose generality doing so, and it's conceivable someone may want a URL rewriter that strips off a prefix that does not necessarily end with a slash.
Authoritative spec is https://www.w3.org/Addressing/URL/url-spec.txt which has this (IMO) somewhat ambiguous statement:
The path is interpreted in a manner dependent on the scheme being
used. Generally, the reserved slash "/" character (ASCII 2F hex)
denotes a level in a hierarchical structure, the higher level part
to the left of the slash.
So in typical use (e.g. mapping to a file-system) might correspond to a file-system where normalization makes sense, that's not required by the spec and in practice I think it's better not to mess with URLs too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what I'm suggesting is to add an explicit example above in the docs where you talk about how it cannot be done in a single route rule. Can you list out the two rules that cover the example case so the user better understands?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattklein123 OK. I have it here: a4c60d5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @dio! Very clear.
Signed-off-by: Dhi Aurrahman <dio@rockybars.com>
docs: clarify the use of "/" as a prefix_rewrite This patch clarifies the use of "/" as a prefix_rewrite in route and redirect prefix rewriting. And also a note on the use of trailing slashes as match value. Risk Level: Low Testing: add more input samples to RedirectPrefixRewrite test. Docs Changes: Update route.proto doc regarding path_rewrite both for redirect and route. Release Notes: N/A Fixes envoyproxy#2956 Signed-off-by: Dhi Aurrahman <dio@rockybars.com> Signed-off-by: Rama <rama.rao@salesforce.com>
docs: clarify the use of "/" as a prefix_rewrite
This patch clarifies the use of "/" as a prefix_rewrite in
route
andredirect
prefix rewriting. And also a note on the use of trailing slashes asmatch
value.Risk Level: Low
Testing: add more input samples to RedirectPrefixRewrite test.
Docs Changes:
route.proto
doc regardingpath_rewrite
both forredirect
androute
.Release Notes:
N/A
Fixes #2956