Include ArtifactType header in REST response to /ids/* requests #5209

Closed
wolfchimneyrock opened this issue Sep 19, 2024 · 6 comments

Comments

@wolfchimneyrock
Contributor

I'm trying to replicate the behavior of the confluent-kafka-python schema_registry_client. In its get_schema(schema_id) method, the returned object includes both the schema content string and its type.

Currently, the API calls /ids/contentIds/{id}, /ids/globalIds/{id}, and /ids/contentHashes/{hash} return the artifact version's content string, but give no indication of the artifact type.

Can you add a response header with the artifact's type? This change wouldn't impact other clients, but it would help to achieve feature parity with the Confluent Python client.
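For illustration, here is a rough client-side sketch of what such a header would enable. The header name `X-Registry-ArtifactType` is an assumption (nothing in this issue fixes a name), and `Schema`, `http_fetch`, and `get_schema` are hypothetical names for this example only.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Tuple
from urllib.request import urlopen

# Assumed header name; the issue does not specify one.
ARTIFACT_TYPE_HEADER = "X-Registry-ArtifactType"


@dataclass
class Schema:
    schema_str: str
    schema_type: Optional[str]  # None when the server sends no type header


Fetch = Callable[[str], Tuple[bytes, Mapping[str, str]]]


def http_fetch(url: str) -> Tuple[bytes, Mapping[str, str]]:
    """Fetch a URL and return (body, headers)."""
    with urlopen(url) as resp:
        return resp.read(), dict(resp.headers.items())


def get_schema(base_url: str, global_id: int, fetch: Fetch = http_fetch) -> Schema:
    """Fetch content by globalId and read the type from the (assumed) header."""
    body, headers = fetch(f"{base_url}/ids/globalIds/{global_id}")
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    return Schema(body.decode("utf-8"), lowered.get(ARTIFACT_TYPE_HEADER.lower()))
```

Here `base_url` would be the root of the registry's REST API on your deployment. The `fetch` parameter exists only so the function can be exercised without a running registry.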

@apicurio-bot

apicurio-bot bot commented Sep 19, 2024

Thank you for reporting an issue!

Pinging @andreaTP to respond or triage.

@EricWittmann
Member

This could be done fairly easily for the globalId but not easily for the other two. The reason is that a globalId is a pointer to a specific artifact version, so it's easy to get the artifactType in this case, because we know the artifact. But for contentId and contentHash, the lookup is artifact-agnostic and the relationship is many-to-one: the same content may be used by multiple versions across multiple artifacts, and those artifacts may have different types (though admittedly the normal case is that they would have the same type).
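A toy model of that relationship, as a sketch only: the `Version` record, the sample rows, and both lookup functions are hypothetical and do not reflect the registry's actual storage schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Version:
    global_id: int      # unique per artifact version
    content_id: int     # shared by every version with identical content
    artifact_id: str
    artifact_type: str  # declared by whoever registered the artifact


# Made-up sample data: three versions share content 100.
VERSIONS: List[Version] = [
    Version(1, 100, "orders-value", "AVRO"),
    Version(2, 100, "orders-copy", "AVRO"),
    Version(3, 100, "mislabeled", "JSON"),
    Version(4, 101, "users-value", "PROTOBUF"),
]


def type_for_global_id(global_id: int) -> str:
    """A globalId identifies exactly one version, so the type is unambiguous."""
    by_global: Dict[int, Version] = {v.global_id: v for v in VERSIONS}
    return by_global[global_id].artifact_type


def types_for_content_id(content_id: int) -> Set[str]:
    """A contentId may be referenced by many versions, possibly of different types."""
    return {v.artifact_type for v in VERSIONS if v.content_id == content_id}
```

With this data, `type_for_global_id(3)` gives a single answer, while `types_for_content_id(100)` yields two candidate types, which is the ambiguity described above.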

@carlesarnal @jsenko wdyt?

@carlesarnal
Member

carlesarnal commented Sep 23, 2024

We could add the content type for the other calls (contentId, contentHash) but that's, at best, just a hint, not really the artifact type. The real reason behind this is that we do not enforce validity rules by default (unlike other schema registries), so you might decide to register a JSON artifact and tell Apicurio Registry that it's e.g. Avro. This was a decision made to embrace flexibility.

There are two possible ideas here:

  • Discover the type from the content: For this to be possible, the artifact must be valid for its actual type (e.g. if it's Avro, it must be valid, with all the references being resolvable, etc.).
  • Introduce a new feature that locks the global validity rule in place, always enforcing it (from startup time), so we can always discover the type from the content, or even from one of the artifacts that reference that content (since the type would necessarily be the same for all of those artifacts). This is honestly how I would run Apicurio Registry in production anyway.
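To make the first idea concrete, here is a deliberately naive sketch of guessing a type from content. It is not how Apicurio Registry discovers types; `guess_type` and its rules are assumptions for illustration, and it only looks at JSON-encoded content.

```python
import json
from typing import Optional


def guess_type(content: str) -> Optional[str]:
    """Return a best-effort type hint, or None when the content is ambiguous."""
    try:
        doc = json.loads(content)
    except json.JSONDecodeError:
        # Non-JSON content (e.g. Protobuf or GraphQL) is out of scope here.
        return None
    if not isinstance(doc, dict):
        return None
    if "openapi" in doc:
        return "OPENAPI"
    if "asyncapi" in doc:
        return "ASYNCAPI"
    # Named Avro types carry both a "type" and a "name".
    if doc.get("type") in ("record", "enum", "fixed") and "name" in doc:
        return "AVRO"
    if "$schema" in doc or "properties" in doc:
        return "JSON"
    return None
```

Note that a document such as `{"type": "string"}` is a plausible Avro schema and a plausible JSON Schema at the same time, so the sketch returns `None` for it. That kind of overlap is why content-based discovery gives a hint rather than a guarantee unless validity is enforced.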

@EricWittmann
Member

EricWittmann commented Sep 23, 2024

I think if we want this feature then we simply implement it and assume the result will be correct as long as users of the registry don't screw up.

We can look up the artifact type of one of the artifacts referencing the content. We can then simply return that value cuz it will be right almost every time. The only time it won't be right is if somebody registered an artifact under the incorrect artifact type.
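A minimal sketch of that lookup, assuming the references can be read as rows of (content_id, artifact_id, declared_type). The `Row` shape and the function name are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Row = Tuple[int, str, str]  # (content_id, artifact_id, declared_type)


def type_hint_for_content(content_id: int, rows: Iterable[Row]) -> Optional[str]:
    """Return the declared type of the first artifact that references the content."""
    return next((t for cid, _aid, t in rows if cid == content_id), None)
```

The result is only as reliable as the declared types: if the first matching artifact was registered under the wrong type, that wrong type is what gets returned.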

@wolfchimneyrock
Contributor Author

I think adding the header for just the globalIds response is fine. The use case I have in mind is a Python program that can render any Kafka message as human-readable YAML without prior knowledge of the schema type. It will likely need the references=DEREFERENCED parameter that the globalId query provides, which means that in our organization we may have to restrict ourselves to using globalId in the id envelope anyway.

@EricWittmann
Member

@github-project-automation github-project-automation bot moved this from In Progress to Done in Registry 3.0 Nov 5, 2024