Skip to content

UriComponentsBuilder silently drops some Registry-based Naming Authorities #27774

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
fedorka opened this issue Dec 6, 2021 · 4 comments
Closed
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket)

Comments

@fedorka
Copy link

fedorka commented Dec 6, 2021

In spring-web 5.3.13, UriComponentsBuilder.fromUri / UriComponentsBuilder.uri silently drops some RFC 2396 3.2.1 Registry-based Naming Authorities even when they are accepted as RFC 3986 3.2 authorities.

    // I tested java 17.0.1-zulu and 8.0.312-zulu - with the same results.

    @Test
    public void testUriComponentBuilder() throws URISyntaxException {
        // Generic scheme to avoid any assumptions around HTTP
        String scheme = "scheme";

        // Matches
        // - RFC 2396 3.2.1. "reg_name" Registry-based Naming Authority
        // - RFC 3986 3.2.2 "host" Host
        // NOT an RFC 2396 3.2.2. "server" Server-based Naming Authority
        String authority = "rfc_2396_authority";


        String uriString = scheme + ":" + "//" + authority ;

        URI testUri = new URI(uriString);

        UriComponents uriComponentsFromUriString = UriComponentsBuilder.fromUriString(uriString).build(true);
        UriComponents uriComponentsFromUri = UriComponentsBuilder.fromUri(testUri).build(true);

        // Demonstrate that the URI is well formed according to RFC 2396 and parses as expected
        assertAll("testUriFromString should have a RFC 2396 Registry-based Naming Authority (3.2.1.)",
                () -> assertEquals(authority, testUriFromString.getAuthority()),
                () -> assertNull(testUriFromString.getUserInfo()),
                () -> assertNull(testUriFromString.getHost()),
                () -> assertEquals(-1, testUriFromString.getPort())
        );

        // Demonstrate that the URI is parseable under RFC 3986 from string form
        assertAll("uriComponentsFromUriString should be populated ",
                () -> assertEquals(authority, uriComponentsFromUriString.getHost())
        );

        // Demonstrate that the parsed URI is equivalent (according to java.net.URI)
        assertEquals(testUriFromString, uriComponentsFromUriString.toUri());

        // Unexpectedly fails as getHost() return a null even though testUri has an RFC 3986 host
        assertAll("uriComponentsFromUri should have populated host ",
                () -> assertEquals(authority, uriComponentsFromUri.getHost())
        );

    }

When java.net.URI encounters an RFC 2396 3.2.1 authority (identified as any correct authority not parseable under RFC 2396 3.2.2), it returns null values for the server-based naming authority components, causing UriComponentBuilder to assume they are unset and silently dropping the data. However, this data is available via URI.getAuthority().

I would expect that UriComponentBuilder would attempt to reparse the authority from Uri.getAuthority under the RFC 3986 3.2 specification.

The fact that these classes are written against different versions of the spec is obviously a complicating factor. The most straightforward option is to reparse the URI from scratch under the RFC 3986 spec, though I think that it would be preferable to only parse the authority component to limit the blast radius of this change. I am happy to contribute a PR, but would appreciate thumbs up on the approach before investing the time as this would substantially change existing behavior if UriComponentBuilder.uri is being used after the authority components have been set via another method.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Dec 6, 2021
@jhoeller jhoeller added the in: web Issues in web modules (web, webmvc, webflux, websocket) label Dec 7, 2021
@jhoeller jhoeller added this to the Triage Queue milestone Dec 7, 2021
@rstoyanchev rstoyanchev self-assigned this Dec 15, 2021
@rstoyanchev
Copy link
Contributor

rstoyanchev commented Dec 15, 2021

I'm not quite sure what your are proposing or what you mean with re-parsing.

If understand, when the source is a URI, we could mirror the properties of java.net.URI, adding a new getAuthority method, and returning null from getHost. I'm less clear on what happens when parsing from a String. Our regex seems to pick up the given authority as the host, but the host in that case should probably be null.

Overall, If the current spec doesn't support the concept of a non-server authority, I'm not sure we need to support it UriComponentsBuilder.

@rstoyanchev rstoyanchev added status: waiting-for-feedback We need additional information before we can continue and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Dec 15, 2021
@spring-projects-issues
Copy link
Collaborator

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

@spring-projects-issues spring-projects-issues added the status: feedback-reminder We've sent a reminder that we need additional information before we can continue label Dec 22, 2021
@fedorka
Copy link
Author

fedorka commented Dec 22, 2021

This is certainly complicated! Thank you for being patient while I try to explain my understanding.

  1. I believe that scheme://rfc_2396_authority is a valid RFC 3986 URI with host = "rfc_2396_authority" (which matches reg-name = *( unreserved / pct-encoded / sub-delims )).
  2. Given (1), I believe that UriComponentsBuilder::fromUriString handles this case correctly as it populates the host field of the builder (and ultimately the HierarchicalUriComponents).
  3. Also given (1), I believe that UriComponentsBuilder::uri handles this case incorrectly as it does not populate the host field.

My analysis is that the uri method assumes that RFC 2396 hosts are equivalent to RFC 3986 hosts, but that does not align with the RFCs. The RFC 2396 host (as defined in RFC 2396 3.2.2) is only valid in a subset of RFC 2396 authorities and so leaves the RFC 2396 3.2.1 case uncovered.

I attempted to demonstrate this through showing how fromUriString parses the URI, as that appears to be correctly aligned with RFC 3986 and I believes demonstrates the correct parsing of the URI under the newer spec.

As far as "re-parsing", I was simply suggesting a solution. If you agree that the current behavior is unexpected then I can write a PR and we can more directly discuss the possible solution.

adding a new getAuthority method
This would be a useful (to me) enhancement to UriComponents and I'd be happy to contribute it, but I feel it is decoupled from the above behavior.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue status: feedback-reminder We've sent a reminder that we need additional information before we can continue labels Dec 22, 2021
@rstoyanchev
Copy link
Contributor

It is surprising that URI#getHost returns null for what should be a valid host under RFC 3986 syntax, and that there is not a mention of it in the Javadoc. We could consider an improvement in UriComponentsBuilder#uri to catch this case and attempt to align with UriComponentsBuilder#fromUriString.

Closing this one but feel free to submit a PR and reference this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants