Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid encoding of '?' in query parameter values by Encode.encodeQueryParam #41060

Closed
thomasdarimont opened this issue Jun 7, 2024 · 8 comments · Fixed by #41062
Closed
Labels
area/rest kind/bug Something isn't working
Milestone

Comments

@thomasdarimont
Copy link
Contributor

thomasdarimont commented Jun 7, 2024

Describe the bug

org.jboss.resteasy.reactive.common.util.Encode.encodeQueryParam(..) does not encode ? characters in query parameter values.

Expected behavior

A query parameter value like foo?a=b should be encoded as foo%3Fa%3Db by Encode.encodeQueryParam(..).

Actual behavior

A query parameter value like foo?a=b is encoded as foo?a%3Db by Encode.encodeQueryParam(..).
Note the unencoded ? which should be encoded as %3F.

How to Reproduce?

String rawUrlValue = "foo?a=b";
String encoded1 = Encode.encodeQueryParam(rawUrlValue); //foo?a%3Db (wrong)
String encoded2 = URLEncoder.encode(rawUrlValue, StandardCharsets.UTF_8); //foo%3Fa%3Db (correct)

Output of uname -a or ver

No response

Output of java -version

No response

Quarkus version or git rev

3.11.1 but also 3.8.5

Build tool (ie. output of mvnw --version or gradlew --version)

No response

Additional information

This affects all users of Encode, e.g. org.jboss.resteasy.reactive.common.jaxrs.UriBuilderImpl.

The following will produce the incorrectly (partially) encoded URL:

import org.jboss.resteasy.reactive.common.jaxrs.UriBuilderImpl;

class Scratch {
    public static void main(String[] args) {
        var uri = new UriBuilderImpl()
                .encode(true)
                .scheme("https")
                .host("somehost")
                .port(1234)
                .path("/path")
                .queryParam("client_id", "{client_id}")
                .build("https://somehost:1234/path?param=value");
        System.out.println(uri);
    }
}

Yields:

https://somehost:1234/path?client_id=https%3A%2F%2Fsomehost%3A1234%2Fpath?param%3Dvalue

Instead of:

https://somehost:1234/path?client_id=https%3A%2F%2Fsomehost%3A1234%2Fpath%3Fparam%3Dvalue

Just stumbled upon this because a Keycloak user reported strange behaviour when a SAML client_id looked like this https://somehost:1234/saml/consume?action=login (thank you legacy software...) and caused issues when the client ID was used as is in URLs generated by Keycloak through UriBuilderImpl.

Note: It would be great to have this backported to quarkus 3.8.x to be picked up by Keycloak.

@thomasdarimont thomasdarimont added the kind/bug Something isn't working label Jun 7, 2024
@quarkus-bot quarkus-bot bot added the area/rest label Jun 7, 2024
Copy link

quarkus-bot bot commented Jun 7, 2024

/cc @FroMage (resteasy-reactive), @geoand (resteasy-reactive), @stuartwdouglas (resteasy-reactive)

@thomasdarimont thomasdarimont changed the title Invalid encoding of '?' in query parameter values by org.jboss.resteasy.reactive.common.util.Encode Invalid encoding of '?' in query parameter values by Encode.encodeQueryParam Jun 7, 2024
thomasdarimont added a commit to thomasdarimont/quarkus that referenced this issue Jun 7, 2024
…ram(..)

Previously `?` in query parameter values where encoded as is which caused
invalid URL values. We now replace `?` characters in query parameter values with
`%3F`.

Fixes quarkusio#41060

Signed-off-by: Thomas Darimont <thomas.darimont@googlemail.com>
thomasdarimont added a commit to thomasdarimont/quarkus that referenced this issue Jun 7, 2024
…ram(..)

Previously `?` in query parameter values where encoded as is which caused
invalid URL values. We now replace `?` characters in query parameter values with
`%3F`.

Fixes quarkusio#41060

Signed-off-by: Thomas Darimont <thomas.darimont@googlemail.com>
gsmet pushed a commit to thomasdarimont/quarkus that referenced this issue Jun 7, 2024
…ram(..)

Previously `?` in query parameter values where encoded as is which caused
invalid URL values. We now replace `?` characters in query parameter values with
`%3F`.

Fixes quarkusio#41060

Signed-off-by: Thomas Darimont <thomas.darimont@googlemail.com>
geoand added a commit that referenced this issue Jun 8, 2024
…lid-query-param-encoding

Fix encoding of '?' in query parameter values by Encode.encodeQueryParam(..)
@quarkus-bot quarkus-bot bot added this to the 3.12 - main milestone Jun 8, 2024
@FroMage
Copy link
Member

FroMage commented Jun 10, 2024

I went and checked because I was sure I remembered that ? was allowed unescaped in a URL query part, and yes indeed it's not a reserved char within a query part: https://www.talisman.org/~erlkonig/misc/lunatech%5Ewhat-every-webdev-must-know-about-url-encoding/#Thereservedcharactersaredifferentforeachpart

This is pretty evident in https://datatracker.ietf.org/doc/html/rfc3986#section-3.4 although they have a note about compatibility with older software in some very specific cases.

So, not sure about this fix, TBH. This feels wrong.

@geoand
Copy link
Contributor

geoand commented Jun 10, 2024

What harm could it do?

@FroMage
Copy link
Member

FroMage commented Jun 10, 2024

Good question. I'm not sure. But in any case, this is probably hiding a bug elsewhere:

Just stumbled upon this because a Keycloak user reported strange behaviour when a SAML client_id looked like this https://somehost:1234/saml/consume?action=login (thank you legacy software...) and caused issues when the client ID was used as is in URLs generated by Keycloak through UriBuilderImpl.

I mean, probably there's one part in there that has a bug in query decoding, and probably Quarkus is not the only place that would generate such (valid) query parts, so it will keep failing for other valid URIs produced by non-Quarkus.

@thomasdarimont
Copy link
Contributor Author

thomasdarimont commented Jun 10, 2024

That's interesting, I was under the impression that Encode.encodeQueryParam(..) should produce the same result as the JDK URLEncoder.encode(...) method. Without that fix URLs which contain parameterized URIs like shown above will always be broken unless they are explicitly encoded before setting the value. However, this opens up the possibility for duplicate encoding problems.

Browser APIs also encode the ? in the URL, see:

encodeURIComponent("https://somehost:1234/path?param=value")

yields:
'https%3A%2F%2Fsomehost%3A1234%2Fpath%3Fparam%3Dvalue'

@gsmet gsmet modified the milestones: 3.12 - main, 3.11.2 Jun 10, 2024
@FroMage
Copy link
Member

FroMage commented Jun 10, 2024

That's interesting, I was under the impression that Encode.encodeQueryParam(..) should produce the same result as the JDK URLEncoder.encode(...) method

URLEncoder is not made for encoding URLs, contrary to what its name indicates, it's very confusing… https://www.talisman.org/~erlkonig/misc/lunatech%5Ewhat-every-webdev-must-know-about-url-encoding/#Donotuse%7B%7Bjava.net.URLEncoder%7D%7Dor%7B%7Bjava.net.URLDecoder%7D%7DforwholeURLs

Without that fix URLs which contain parameterized URIs like shown above will always be broken unless they are explicitly encoded before setting the value

I don't think that's true. I think the only thing it has to encode, when placing a value in a query part value are the # char (end of query part), and &= (query key/value separators) and % (percent encoder) and + possibly, I don't remember the details.

However, this opens up the possibility for duplicate encoding problems

Yeah, that's what I'm worried about. Incorrectly encoding things always ends up making decoding wrong somewhat, in ways that are hard to predict.

But really, you should check what code actually has an issue with ? unencoded in URI queries, because it will come up again from other sources.

@thomasdarimont
Copy link
Contributor Author

thomasdarimont commented Jun 10, 2024

Thanks for the references! However, in the examples I was not using java.net.URLEncoder to encode the full query but only the parameter value. Also the Browser WEB API function encodeURIComponent("https://somehost:1234/path?param=value") also encodes the ? to %3F.

Unfortunately, since this Encode class is used interally all over the place in Quarkus applications, or more specifically in Keycloak it is very hard to securly ensure the proper encoding in all the places without raising duplicate encoding issues.

gsmet pushed a commit to gsmet/quarkus that referenced this issue Jun 10, 2024
…ram(..)

Previously `?` in query parameter values where encoded as is which caused
invalid URL values. We now replace `?` characters in query parameter values with
`%3F`.

Fixes quarkusio#41060

Signed-off-by: Thomas Darimont <thomas.darimont@googlemail.com>
(cherry picked from commit f244de0)
@FroMage
Copy link
Member

FroMage commented Jun 10, 2024

Also the Browser WEB API function

Yeah, this encodes pretty much everything, even when not required.

I think encoding extra chars is fine:

  • http://localhost/?foo=? will decode as scheme=http host=localhost path=/ query-param(foo=?)
  • http://localhost/?foo=%3F will decode as scheme=http host=localhost path=/ query-param(foo=?) (every percent-encoded pair must be decoded, even when not required by the part's encoding)

However, it is definitely a bug to reject or not properly decode http://localhost/?foo=? because it's valid and other sources will produce it. So, extra encoding is probably fine. Bad decoding is not.

holly-cummins pushed a commit to holly-cummins/quarkus that referenced this issue Jul 31, 2024
…ram(..)

Previously `?` in query parameter values where encoded as is which caused
invalid URL values. We now replace `?` characters in query parameter values with
`%3F`.

Fixes quarkusio#41060

Signed-off-by: Thomas Darimont <thomas.darimont@googlemail.com>
@gsmet gsmet modified the milestones: 3.11.2, 3.8.6 Aug 7, 2024
gsmet pushed a commit to gsmet/quarkus that referenced this issue Aug 7, 2024
…ram(..)

Previously `?` in query parameter values where encoded as is which caused
invalid URL values. We now replace `?` characters in query parameter values with
`%3F`.

Fixes quarkusio#41060

Signed-off-by: Thomas Darimont <thomas.darimont@googlemail.com>
(cherry picked from commit f244de0)
danielsoro pushed a commit to danielsoro/quarkus that referenced this issue Sep 20, 2024
…ram(..)

Previously `?` in query parameter values where encoded as is which caused
invalid URL values. We now replace `?` characters in query parameter values with
`%3F`.

Fixes quarkusio#41060

Signed-off-by: Thomas Darimont <thomas.darimont@googlemail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/rest kind/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants