Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CORE-3181] Schema Registry: Normalization #22519

Merged
merged 9 commits into from
Jul 26, 2024

Conversation

BenPope
Copy link
Member

@BenPope BenPope commented Jul 26, 2024

TODO: Add some more tests

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.2.x
  • v24.1.x
  • v23.3.x

Release Notes

Features

  • Schema Registry: Support normalize=true

Improvements

  • Schema Registry: Improve AVRO Normalization

@BenPope BenPope added the area/schema-registry Schema Registry service within Redpanda label Jul 26, 2024
@BenPope BenPope added this to the v24.2.1-final-RC milestone Jul 26, 2024
@BenPope BenPope requested a review from a team July 26, 2024 10:18
@BenPope BenPope self-assigned this Jul 26, 2024
@BenPope BenPope requested a review from a team as a code owner July 26, 2024 10:18
@BenPope BenPope requested review from oleiman and removed request for a team July 26, 2024 10:18
@BenPope BenPope requested a review from andijcr July 26, 2024 10:18
Copy link
Contributor

@andijcr andijcr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good so far

src/v/pandaproxy/schema_registry/types.h Show resolved Hide resolved
tests/rptest/tests/schema_registry_test.py Outdated Show resolved Hide resolved
Comment on lines +1444 to +1485
case rapidjson::Type::kArrayType: {
for (auto& v : val.GetArray()) {
sort(v);
}
break;
}
Copy link
Contributor

@andijcr andijcr Jul 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some arrays should be sorted i think. looking at draft7

        "allOf": { "$ref": "#/definitions/schemaArray" },
        "anyOf": { "$ref": "#/definitions/schemaArray" },
        "oneOf": { "$ref": "#/definitions/schemaArray" },
        "required": { "$ref": "#/definitions/stringArray" },
        "enum": {
            "type": "array",
            "items": true,
            "minItems": 1,
            "uniqueItems": true
        },
        "type": {
            "anyOf": [
                { "$ref": "#/definitions/simpleTypes" },
                {
                    "type": "array",
                    "items": { "$ref": "#/definitions/simpleTypes" },
                    "minItems": 1,
                    "uniqueItems": true
                }
            ]
        },

(note that "items" should not be sorted)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This'll have to wait.

src/v/pandaproxy/schema_registry/handlers.cc Show resolved Hide resolved
result_raw = self._get_schemas_ids_id(id=v1_id)
assert result_raw.status_code == requests.codes.ok
assert result_raw.json()['schema'] == dataset.schema_canonical

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you rebase to dev, I've added a test with a TODO for the confluent SR client for normalization that might be nice to uncomment/implement now: https://github.com/redpanda-data/redpanda/pull/22328/files

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will have to wait

BenPope added 5 commits July 26, 2024 18:08
Specifically, accept `True` by using case-insensitive comparison.

Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Copy link
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm so far

Comment on lines +796 to +797
"type": "boolean"
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: I think we're not in the habit of this generally, but can we include the default value here?

@BenPope BenPope force-pushed the schema_registry/normalize_schema branch from 280bc06 to fb4f5f0 Compare July 26, 2024 17:47
BenPope added 3 commits July 26, 2024 18:48
Fixes CORE-5704

TODO: Add tests

Signed-off-by: Ben Pope <ben@redpanda.com>
No functional changes

Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
@BenPope BenPope force-pushed the schema_registry/normalize_schema branch from fb4f5f0 to b0058a1 Compare July 26, 2024 17:52
@BenPope BenPope requested review from andijcr and oleiman July 26, 2024 17:53
@BenPope
Copy link
Member Author

BenPope commented Jul 26, 2024

Changes in force-push

  • rebase

Changes in force-push

  • More tests

@BenPope BenPope requested a review from pgellert July 26, 2024 17:56
Signed-off-by: Ben Pope <ben@redpanda.com>
@BenPope BenPope force-pushed the schema_registry/normalize_schema branch from b0058a1 to a453390 Compare July 26, 2024 18:15
Copy link
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@BenPope BenPope merged commit 0f4000b into redpanda-data:dev Jul 26, 2024
20 checks passed
@vbotbuildovich
Copy link
Collaborator

/backport v24.2.x

@vbotbuildovich
Copy link
Collaborator

/backport v24.1.x

@vbotbuildovich
Copy link
Collaborator

/backport v23.3.x

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v23.3.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-22519-v23.3.x-120 remotes/upstream/v23.3.x
git cherry-pick -x 66216133c351c2cb32955a7c2a93824249d2ec19 7a9fcfa8b2260d4b6542bb3d8e0553a2bf97c02f 26dfa333d7057cc088968bde314b6293e32db3b1 a2a24d7d243a08c76f5e8c128802aefe7185ee12 fabfbc5ae69ebbf90ebd354d5f262ec8ad71a429 0772f65c063fe0a541ff2bdb73859c574371b9c3 f06cbaf63d501a2a3d42b7ea3797812f6188edd7 86feb1392a1a99a138f7f2bb7fd6d72d3c8e33e9 a4533901edc18a8ade573403500fb984be6ea34b

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v24.2.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-22519-v24.2.x-193 remotes/upstream/v24.2.x
git cherry-pick -x 66216133c351c2cb32955a7c2a93824249d2ec19 7a9fcfa8b2260d4b6542bb3d8e0553a2bf97c02f 26dfa333d7057cc088968bde314b6293e32db3b1 a2a24d7d243a08c76f5e8c128802aefe7185ee12 fabfbc5ae69ebbf90ebd354d5f262ec8ad71a429 0772f65c063fe0a541ff2bdb73859c574371b9c3 f06cbaf63d501a2a3d42b7ea3797812f6188edd7 86feb1392a1a99a138f7f2bb7fd6d72d3c8e33e9 a4533901edc18a8ade573403500fb984be6ea34b

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to create a backport PR to v24.1.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-22519-v24.1.x-412 remotes/upstream/v24.1.x
git cherry-pick -x 66216133c351c2cb32955a7c2a93824249d2ec19 7a9fcfa8b2260d4b6542bb3d8e0553a2bf97c02f 26dfa333d7057cc088968bde314b6293e32db3b1 a2a24d7d243a08c76f5e8c128802aefe7185ee12 fabfbc5ae69ebbf90ebd354d5f262ec8ad71a429 0772f65c063fe0a541ff2bdb73859c574371b9c3 f06cbaf63d501a2a3d42b7ea3797812f6188edd7 86feb1392a1a99a138f7f2bb7fd6d72d3c8e33e9 a4533901edc18a8ade573403500fb984be6ea34b

Workflow run logs.

@BenPope
Copy link
Member Author

BenPope commented Jul 26, 2024

/backport v24.2.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/redpanda area/schema-registry Schema Registry service within Redpanda
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants