Spring is inconsistent in the encoding/decoding of URLs [SPR-16860] #21399
Comments
Rossen Stoyanchev commented
Do you mean |
J. Pablo Fernández commented Yes, |
Rossen Stoyanchev commented Note that the treatment of "+" in Spring (e.g. RestTemplate, UriTemplate, UriComponents) is intentional. It was a change made in 5.0, see #19394, to comply with RFC 3986. There are options for having the "+" encoded, but I won't go into the details here because the documentation has recently been expanded and substantially revised to discuss the various scenarios and options available in 5.0. Please review the URI Links chapter. |
J. Pablo Fernández commented I'm still unsure where the boundaries of each thing are and I'm not an expert in Spring nor Java servlets and all that. What I know is that I created a Spring RestController that I'm trying to talk to Spring RestTemplate and it's impossible. |
Rossen Stoyanchev commented Basically "+" has meaning in a URL (space) but it's otherwise legal. A client can choose to encode it (suppress its meaning) or not. It's a choice, and not an automatic decision.
|
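To make the distinction concrete, here is a minimal sketch using only the JDK. It assumes the receiving side applies form-style decoding to query values, as java.net.URLDecoder does. The class and method names are illustrative and not part of Spring.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class PlusDecodingDemo {

    // Form-style decoding: "+" becomes a space, "%2B" becomes a literal "+".
    static String formDecode(String rawValue) {
        return URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(formDecode("foo+bar"));   // foo bar
        System.out.println(formDecode("foo%2Bbar")); // foo+bar
    }
}
```

A client that wants the server to see a literal "+" therefore has to send "%2B"; leaving "+" unencoded is only correct when a space is intended.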
J. Pablo Fernández commented Is plus-encoding-a-space part of the URI RFC? I think it's RFC3986. It's been a while since I read it all and I cannot find any mention of that (but maybe I'm missing something). |
Rossen Stoyanchev commented From an RFC 3986 perspective, "+" is a legal character. By default the RestTemplate leaves it as is. However if you want to encode everything in a URI variable that could have meaning, you can switch to a different encoding mode. See the part in the above referenced chapter that talks about customizing encoding options. |
Christophe Levesque commented Rossen Stoyanchev: sorry for reopening this issue but I don't think the behavior introduced with #19394 is correct. I understand and agree that a "+" is a valid character in a query param value, it will be interpreted as a space by pretty much any server (including Spring apps). For example:
While your change in f2e293aadf98cb64ba7ef3044b59f469efd21503 might fix a parsing issue when a + appears in query param, it also introduces a change that breaks the building of URLs. For example:
UriComponentsBuilder.fromHttpUrl("http://localhost:8080").queryParam("param1", "foo+bar").queryParam("param2", "foo&bar").toUriString();
Before your change this would produce the string:
http://localhost:8080?email1=foo%2Bbar@gmail.com&email2=foo%26bar@gmail.com
When sent to a server (including a Spring Boot server with Now after your change, the code above produces the string:
http://localhost:8080?param1=foo+bar&param2=foo%26bar
The first parameter was incorrectly encoded as a consequence of your change. Now the server will get The "+" sign in a query parameter should be encoded as %2B when calling toUriString(). We just upgraded from Spring Boot 1.5.x to 2.0 and this is causing major regressions. I have a simplistic fix for this which I'm not 100% sure is correct:
|
Rossen Stoyanchev commented Christophe Levesque, please take a look at the "URI Encoding" section in the reference. |
Christophe Levesque commented I did have a look at the reference and assume you are referring to this bit:
I guess here the "+" sign is not illegal but has a reserved meaning (it's decoded as a space). So I understand that the behavior is somewhat explained by the documentation. What is really unfortunate, is that the UriComponentsBuilder.fromHttpUrl(baseUrl).queryParam("foo", foo).toUriString() You could throw pretty much any query param value at it (with or without "+" sign or special character) and it would just work (the server receiving the request would get the param with the same value as what you passed to the Now, the new
A simple one-liner becomes an odyssey. :P Rossen Stoyanchev: I hope you'll consider addressing this! |
Brian Clozel commented Christophe Levesque, I understand your frustration - especially since this was working a way that matched your needs for quite a while. Looking at #19394, that's actually the issue violating that principle of least astonishment: We don't have many options here:
It's quite unfortunate, but you (and many others) were relying on a bug, or at least an implementation out of sync with the RFC. URL encoding is a very complex matter and we're trying to make things developer-friendly while maintaining RFC compliance. The
DefaultUriBuilderFactory factory = new DefaultUriBuilderFactory();
factory.setEncodingMode(DefaultUriBuilderFactory.EncodingMode.VALUES_ONLY);
URI uri = factory.uriString("https://spring.io/")
        .queryParam("query", "{query}")
        .build("spring+framework"); // https://spring.io/?query=spring%2Bframework
|
Rossen Stoyanchev commented I hear you about the odyssey .. To expand on what Brian said, the category of related encoding issues is broader. Besides "+" there are other characters that are common and legal but have reserved meaning, e.g. slash is legal in a path, so is semicolon, etc. The UriComponentsBuilder has always behaved the same way about those as it does for "+" now. The reference explains the approach to encoding, which for good or for bad is modeled after, and consistent with java.net.URI. We can't change how it works now but it shouldn't be an odyssey either. The reference explains switching to a different encoding mode, which for RestTemplate or WebClient is straightforward as there are built-in options to change the behavior. The challenge with UriComponentsBuilder is that it's used statically and there is no place to configure options. You need something like So at this point the main gap that remains is with direct use of UriComponentsBuilder. The only options I see there are some overloaded methods on UriComponents related to expanding and encoding, that would rely on a different way of encoding. Or explicit methods on UriComponentsBuilder itself to customize its behavior, which would have to be invoked every time to deviate from the default behavior. |
Mike Placentra commented Rossen pointed out to me in #21473 that UriUtils#encode(String, Charset) has the behavior we're looking for; that's a good workaround in cases where a URL builder isn't needed. That being said, in the time leading up to submitting that ticket and until now, I've been unsuccessful in convincing myself that having the encoder encode "+" as "%2B" by default would "break" compliance with RFC 3986. (This is separate from decoding "+" as a space by default – I agree that's bad.) I've looked through the RFC several times (basically each time I stub my toe on this issue in Spring Web) and it certainly gives us these points:
It goes far enough to ensure that 1) once a URI component is formed it is composed of only characters that make it distinguishable from other components in the URI, 2) a (percent) encoding mechanism is available for escaping other characters, and 3) URI normalizers (for comparing URIs) treat (potentially) implementation-specific delimiters as first-class delimiters. Otherwise, the RFC largely leaves the implementation-specific encoding of data and data structures in a URI component out-of-scope. As far as I can tell, there's no rule against overzealously encoding data "abcd" as "%61%62%63%64" while forming the URI, and likewise there's no rule against encoding data "+" as "%2B" while forming the URI. In fact, the RFC says this in section 2.2:
It could be argued, given that a vanilla Spring Web application relies on Tomcat to decode request URIs, and Tomcat has an implementation-specific purpose for "+" as a delimiter (it decodes it as a space, I suppose because it assumes the value is a phrase or prose and we're delimiting the words), and a URI generated by a Spring Web application is likely to be subsequently decoded by the same or another Spring Web application, data "+" must be encoded as "%2B" according to RFC 3986 because the data "would conflict with a reserved character's purpose as a delimiter". Also, I don't know how much weight it holds, but here's a W3C recommendation that says to encode "+" as "%2B". I couldn't imagine encoding "+" as "%2B" by default would cause grief for anyone because any URI parser will properly decode that back to a "+". It could be an issue if someone is designing their own implementation-specific delimiter system for the URI component (encoding complex data structures), but then I probably wouldn't use UriComponentsBuilder, or I would research its functionality first and discover that DefaultUriBuilderFactory is more configurable. |
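The claim that over-encoding is harmless can be checked with a small JDK-only sketch. It uses java.net.URI as the generic (non-form) parser; the class and method names are illustrative only.

```java
import java.net.URI;

class OverEncodingDemo {

    // Returns the query with percent-escapes decoded, as java.net.URI reports it.
    static String decodedQuery(String uri) {
        return URI.create(uri).getQuery();
    }

    public static void main(String[] args) {
        System.out.println(decodedQuery("http://example.org/?q=%61%62%63%64")); // q=abcd
        System.out.println(decodedQuery("http://example.org/?q=a%2Bb"));        // q=a+b
        System.out.println(decodedQuery("http://example.org/?q=a+b"));          // q=a+b
    }
}
```

A generic URI parser yields the same data for "%2B" and "+", so encoding the plus sign loses nothing there; the two only diverge once a form-style decoder treats "+" as a space.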
Christophe Levesque commented First off, thanks Rossen Stoyanchev, Brian Clozel and Mike Placentra for taking the time to reply! :) I think the current implementation is mixing validation and encoding of query params. I completely agree that #19394 was describing a buggy behavior: when calling Now the confusion comes with the
From my (limited) understanding, when it comes to the query parameters, passing
IMHO, when I cannot think of a use case where you would not want the "+" sign to be URL-encoded when calling |
Rossen Stoyanchev commented This isn't about breaking compliance with the RFC 3986 and it's not about encoding like URLEncoder (which encodes form data, not URIs). Simply, the approach to encoding is modeled after
System.out.println(new URI("http", "example.org", "/foo bar", null));
System.out.println(new URI("http", "example.org", "/foo+bar", null));
// http://example.org/foo%20bar
// http://example.org/foo+bar
The space is encoded because it's not legal. The "+" is not because it's legal, even if it has reserved meaning in form data as a space. It's possible to want either the effect of "+" or not in which case it has to be encoded. I don't think one is valid and the other is not. This is more about providing the options to express which you want. |
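The same behavior applies to the query component, which is where the thread's problem shows up. The following is a sketch using only the JDK; the host, path and class name are made up for illustration, and URLDecoder stands in for a server that form-decodes query strings.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class UriQueryPlusDemo {

    // Builds a URI with the five-argument constructor, which quotes illegal characters only.
    static URI build(String query) throws URISyntaxException {
        return new URI("http", "example.org", "/search", query, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        URI uri = build("q=a+b c");
        System.out.println(uri); // http://example.org/search?q=a+b%20c
        // A form-style decoder on the receiving side turns both "+" and "%20" into spaces.
        System.out.println(URLDecoder.decode(uri.getRawQuery(), StandardCharsets.UTF_8)); // q=a b c
    }
}
```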
Christophe Levesque commented
It's true but in practice it works and is used for both. If you want to pass a parameter safely in a query string and make sure the server receives what you sent, the easiest way is to encode it with
There are 2 constructors in the My point is that you are modelling Ironically, the JDK isn't even consistent with itself. Trying to create a URI with the single string constructor or the static URI.create("http://localhost:8080/foo bar?test=abc xyz") fails with:
Exception in thread "main" java.lang.IllegalArgumentException: Illegal character in path at index 25: http://localhost:8080/foo bar?test=abc xyz
at java.net.URI.create(URI.java:852)
at com.tripactions.core.util.UriComponentsBuilder.main(UriComponentsBuilder.java:70)
Caused by: java.net.URISyntaxException: Illegal character in path at index 25: http://localhost:8080/foo bar?test=abc xyz
at java.net.URI$Parser.fail(URI.java:2848)
at java.net.URI$Parser.checkChars(URI.java:3021)
at java.net.URI$Parser.parseHierarchical(URI.java:3105)
at java.net.URI$Parser.parse(URI.java:3053)
at java.net.URI.<init>(URI.java:588)
at java.net.URI.create(URI.java:850)
... 1 more
This would have been a much more reasonable behavior for the other URI constructors. As a workaround I've extended the If there is no way you can be convinced, feel free to close this ticket. :( |
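The inconsistency described in this comment can be reproduced with a short JDK-only sketch. The class and helper names are illustrative.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriCreateVsConstructorDemo {

    // True if the single-string factory rejects the input.
    static boolean rejectedByCreate(String s) {
        try {
            URI.create(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) throws URISyntaxException {
        System.out.println(rejectedByCreate("http://localhost:8080/foo bar?test=abc xyz")); // true
        // The multi-argument constructor quotes the spaces instead of failing.
        System.out.println(new URI("http", "localhost:8080", "/foo bar", "test=abc xyz", null));
        // http://localhost:8080/foo%20bar?test=abc%20xyz
    }
}
```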
Rossen Stoyanchev commented I'm not trying to defend how it works, to be honest. Only explaining how it was designed, a long time ago. The behavior you want is available when using the RestTemplate and the WebClient. What's missing is something when using UriComponentsBuilder directly. I will give it some further thought. This is what we would need to expose, and which you can use as a workaround:
String value = "A+B=C";
value = UriUtils.encode(value, StandardCharsets.UTF_8);
URI uri = UriComponentsBuilder.newInstance().queryParam("test", value).build(true).toUri();
|
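For readers without Spring on the classpath, the idea behind this workaround (encode the value strictly first, then build the URI from the already-encoded text) can be sketched with the JDK alone. The encodeStrict helper below is hypothetical and is not Spring's implementation; it assumes "strict" means percent-encoding every byte outside the RFC 3986 unreserved set.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;

class StrictEncodeDemo {

    // Percent-encodes every byte outside the RFC 3986 "unreserved" set
    // (ALPHA / DIGIT / "-" / "." / "_" / "~").
    static String encodeStrict(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                out.append((char) c);
            } else {
                out.append('%').append(String.format("%02X", c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String encoded = encodeStrict("A+B=C");
        System.out.println(encoded); // A%2BB%3DC
        // The value is already encoded, so it is appended as-is and not encoded again.
        URI uri = URI.create("http://localhost:8080/?test=" + encoded);
        System.out.println(uri);     // http://localhost:8080/?test=A%2BB%3DC
    }
}
```

The important design point is the same as with build(true) above: once a value has been encoded, the builder must be told not to encode it a second time, otherwise "%2B" would become "%252B".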
Rossen Stoyanchev commented Christophe Levesque, I've made some progress on providing a solution in UriComponents. Please follow #21577. I would appreciate feedback from as many as possible once the solution is available early next week. |
Christophe Levesque commented Thanks Rossen Stoyanchev! |
I also got this problem (originally from #14464), and my workaround is to use Apache URIBuilder instead: https://stackoverflow.com/a/50959115/122441
URIBuilder ub = new URIBuilder("http://example.com/query");
ub.setParameter("q", "foo+bar"); // foo%2Bbar
URI uri = ub.build();
I hope this is useful for other people having the same problem. |
J. Pablo Fernández opened SPR-16860 and commented
I have a Spring Boot server application and a client that uses RestTemplate to talk to it. I'm finding that RestTemplate doesn't encode plus signs ("+") while Spring decodes them as spaces.
For example, this snippet of code:
will make a request to
which means, inside Spring, the variable foo will contain "fo o" and the variable bar will contain "ba r". Using UriTemplate directly, it's easier to reproduce the generation of that URL:
which prints
Not even UriComponentsBuilder encodes the plus sign:
That prints:
all of which are incorrect, not encoding the plus sign.
I have posted questions about this on StackOverflow that further point to other bug reports and there's a lot of confusion around this subject:
https://stackoverflow.com/questions/50270372/why-is-spring-de-coding-the-plus-character-on-application-json-get-requests
https://stackoverflow.com/questions/50432395/whats-the-proper-way-to-escape-url-variables-with-springs-resttemplate-when-ca
Affects: 5.0.5
Issue Links:
0 votes, 6 watchers