[Bug] IRI References with URL encoded '[' and ']' incorrectly fail validation #1050

@4naesthetic

Description

IRIs containing percent-encoded [ and ] characters incorrectly fail validation because of the following code in IriFormat and IriReference, introduced in #983:

String query = uri.getQuery();
if (query != null) {
    // [ and ] must be percent encoded
    if (query.indexOf('[') != -1 || query.indexOf(']') != -1) {
        return false;
    }
}

uri.getQuery() returns the decoded query string, so %5B and %5D have already been turned back into [ and ] by the time the check runs, and validation fails even though the brackets were correctly escaped. Using uri.getRawQuery() instead could potentially fix this, since it returns the query exactly as written. The original PR did not catch this because its tests only checked that unescaped brackets were rejected; none checked that escaped brackets pass.
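
To illustrate the difference, here is a minimal standalone sketch using only java.net.URI. The class name RawQueryDemo and the helper queryHasNoLiteralBrackets are hypothetical names made up for this example; they are not part of the library, and this is not necessarily how the fix should be written there.

```java
import java.net.URI;

public class RawQueryDemo {
    // Hypothetical helper (not library code): runs the bracket check against
    // the raw, still-encoded query instead of the decoded one.
    static boolean queryHasNoLiteralBrackets(URI uri) {
        String rawQuery = uri.getRawQuery();
        if (rawQuery == null) {
            return true; // no query component, nothing to check
        }
        return rawQuery.indexOf('[') == -1 && rawQuery.indexOf(']') == -1;
    }

    public static void main(String[] args) {
        URI uri = URI.create("https://test.com/assets/product.pdf?filter%5Btest%5D=1");

        // getQuery() decodes escaped octets, so the brackets reappear
        System.out.println(uri.getQuery());    // filter[test]=1

        // getRawQuery() leaves the percent-encoding untouched
        System.out.println(uri.getRawQuery()); // filter%5Btest%5D=1

        // The check on the raw query accepts the correctly escaped IRI
        System.out.println(queryHasNoLiteralBrackets(uri)); // true
    }
}
```

With the check applied to the raw query, the escaped form from the test case below would no longer be rejected, while a query containing literal brackets would still be.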

Additional test case that demonstrates the issue:

@Test
void queryWithEncodedBracketsShouldPass() {
    String schemaData = "{\r\n"
            + "  \"format\": \"iri-reference\"\r\n"
            + "}";

    SchemaValidatorsConfig config = new SchemaValidatorsConfig();
    config.setFormatAssertionsEnabled(true);
    JsonSchema schema = JsonSchemaFactory.getInstance(VersionFlag.V202012).getSchema(schemaData, config);
    Set<ValidationMessage> messages = schema.validate("\"https://test.com/assets/product.pdf?filter%5Btest%5D=1\"",
            InputFormat.JSON);
    assertTrue(messages.isEmpty()); // Fails
}

Test file references:
