Avro 1.12 compatibility: schema references cause AvroTypeException Undefined schema #5320

Closed
creckord opened this issue Oct 8, 2024 · 4 comments


creckord commented Oct 8, 2024

Description

Registry Version: 2.5.9 (I don't think it depends on the registry server version; it also happens with client version 2.6.5)

We are using the Apache Avro Java library both as a direct dependency and through the apicurio-registry-serdes-avro-serde library with Spring and Kafka.

In light of CVE-2024-47561, we tried updating our Avro dependency from 1.11.3 to 1.12.0. We expected no breaking changes, given that this is a minor release and based on their changelog.

However, this caused a bunch of deserialization exceptions from the Apicurio Avro Serdes for schemas that contained an external type reference:

org.apache.avro.AvroTypeException: Undefined schema: org.example.MyReferencedEvent
    at org.apache.avro.util.SchemaResolver$ResolvingVisitor.visitNonTerminal(SchemaResolver.java:187)
    at org.apache.avro.util.Schemas.visitNonTerminal(Schemas.java:109)
    at org.apache.avro.util.Schemas.visit(Schemas.java:82)
    at org.apache.avro.ParseContext.lambda$ensureSchemasAreResolved$0(ParseContext.java:301)
    at java.util.LinkedHashMap$LinkedValues.forEach(LinkedHashMap.java:833)
    at org.apache.avro.ParseContext.ensureSchemasAreResolved(ParseContext.java:301)
    at org.apache.avro.ParseContext.resolve(ParseContext.java:324)
    at org.apache.avro.Schema$Parser.parse(Schema.java:1541)
    at org.apache.avro.Schema$Parser.parse(Schema.java:1515)
    at io.apicurio.registry.serde.avro.AvroSchemaUtils.parse(AvroSchemaUtils.java:69)
    at io.apicurio.registry.serde.avro.AvroSchemaParser.parseSchema(AvroSchemaParser.java:58)
    at io.apicurio.registry.serde.avro.AvroSchemaParser.parseSchema(AvroSchemaParser.java:37)
    at io.apicurio.registry.resolver.DefaultSchemaResolver.lambda$resolveSchemaByContentId$0(DefaultSchemaResolver.java:204)
    at io.apicurio.registry.resolver.ERCache.lambda$getValue$0(ERCache.java:201)
    at io.apicurio.registry.resolver.ERCache.retry(ERCache.java:254)
    at io.apicurio.registry.resolver.ERCache.getValue(ERCache.java:200)
    at io.apicurio.registry.resolver.ERCache.getByContentId(ERCache.java:180)
    at io.apicurio.registry.resolver.DefaultSchemaResolver.resolveSchemaByContentId(DefaultSchemaResolver.java:193)
    at io.apicurio.registry.resolver.DefaultSchemaResolver.resolveSchemaByArtifactReference(DefaultSchemaResolver.java:170)
    at io.apicurio.registry.serde.AbstractKafkaDeserializer.resolve(AbstractKafkaDeserializer.java:147)
    at io.apicurio.registry.serde.AbstractKafkaDeserializer.deserialize(AbstractKafkaDeserializer.java:104)
    at io.apicurio.registry.serde.AbstractKafkaDeserializer.deserialize(AbstractKafkaDeserializer.java:126)
    at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:73)
    at org.apache.kafka.clients.consumer.internals.CompletedFetch.parseRecord(CompletedFetch.java:321)
    at org.apache.kafka.clients.consumer.internals.CompletedFetch.fetchRecords(CompletedFetch.java:283)
    at org.apache.kafka.clients.consumer.internals.FetchCollector.fetchRecords(FetchCollector.java:168)
    at org.apache.kafka.clients.consumer.internals.FetchCollector.collectFetch(FetchCollector.java:134)
    at org.apache.kafka.clients.consumer.internals.Fetcher.collectFetch(Fetcher.java:145)
    at org.apache.kafka.clients.consumer.internals.LegacyKafkaConsumer.pollForFetches(LegacyKafkaConsumer.java:666)
    at org.apache.kafka.clients.consumer.internals.LegacyKafkaConsumer.poll(LegacyKafkaConsumer.java:617)
    at org.apache.kafka.clients.consumer.internals.LegacyKafkaConsumer.poll(LegacyKafkaConsumer.java:590)
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:874)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollConsumer(KafkaMessageListenerContainer.java:1625)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doPoll(KafkaMessageListenerContainer.java:1600)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1405)
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1296)

This happens for events looking like this:

{
  "namespace": "org.example",
  "name": "TheEvent",
  "type": "record",
  "fields": [
    {
      "name": "nestedEvent",
      "type": "org.example.MyReferencedEvent"
    },
    {
      "name": "other",
      "type": "boolean"
    }
  ]
}

The schema reference is recorded in the registry through Maven:

    <artifact>
        <groupId>default</groupId>
        <artifactId>TheEvent</artifactId>
        <version>${the-event.version}</version>
        <type>AVRO</type>
        <file>${project.basedir}/src/main/resources/avro/TheEvent.avsc</file>
        <ifExists>RETURN_OR_UPDATE</ifExists>
        <canonicalize>true</canonicalize>
        <references>
            <reference>
                <groupId>default</groupId>
                <artifactId>MyReferencedEvent</artifactId>
                <version>${referenced-event.version}</version>
                <type>AVRO</type>
                <ifExists>RETURN_OR_UPDATE</ifExists>
                <file>${project.basedir}/src/main/resources/avro/MyReferencedEvent.avsc</file>
                <name>org.example.MyReferencedEvent</name>
            </reference>
        </references>
    </artifact>

Nothing has changed in the schemas, the schema versions, or the reference between the working and the broken build. The only difference is the Avro library version.

Possibly also of note: the exact schema versions for both the received event and the contained reference object were available as generated classes on the classpath (generated with the avro-maven-plugin using Avro 1.12.0).

I am not sure if this is actually an apicurio-registry issue, or an Apache Avro issue, or just an expected incompatibility at the moment. But since the error comes from a call made by the Apicurio Serde, I'm trying my luck here, first.

Environment

Registry Server:
Apicurio Registry 2.5.9 in Kubernetes

Client:

  • Spring Boot 3.3.4
  • Java 21
  • apicurio-registry-serdes-avro-serde 2.5.9.Final
  • Kafka Client 3.7.1
  • Spring Kafka 3.2.1

apicurio-bot bot commented Oct 8, 2024

Thank you for reporting an issue!

Pinging @jsenko to respond or triage.


creckord commented Oct 8, 2024

I forgot to mention: we have now downgraded from Avro 1.12.0 to 1.11.4, which also fixes the CVE, and everything works again. I'm just interested in what kind of compatibility to expect here.
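
For readers in the same situation, the downgrade described above could be expressed in the consuming project's pom.xml roughly as follows. This is a minimal sketch, not the reporter's actual build file: it assumes Avro arrives as a direct or transitive dependency and that a dependencyManagement entry is an acceptable way to force the version.

```xml
<!-- Sketch only: force Avro to the 1.11.x line, which contains the fix for
     CVE-2024-47561 as of 1.11.4 according to the comment above.
     Placing this in dependencyManagement is an assumption about the build. -->
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.apache.avro</groupId>
            <artifactId>avro</artifactId>
            <version>1.11.4</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```

If the avro-maven-plugin is used to generate classes, its version would presumably need to be aligned separately, since plugin versions are not governed by dependencyManagement.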

@carlesarnal
Member

The reason for the exception is that, in some use cases, we were relying on the concrete exception type thrown when the Avro parser could not find a schema. Avro 1.12 changed that exception from SchemaParseException to the one you're getting (AvroTypeException). We've moved to Avro 1.11.4 in 2.6.5.Final, so you should be good to go there, and to Avro 1.12 in 3.0, so I would consider this problem fixed. Let me know if you think that's not the case.
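
As an illustration of the 2.x route mentioned above, a client could move to the 2.6.5.Final serde and let it bring in its own Avro version instead of overriding it. This is a hedged sketch: the groupId io.apicurio is an assumption that does not appear in this thread, and the rest of the build is not shown.

```xml
<!-- Sketch only: use the serde release that was moved to Avro 1.11.4.
     The groupId is assumed; the artifactId and version come from the thread.
     No explicit Avro 1.12.0 override should be present alongside this. -->
<dependency>
    <groupId>io.apicurio</groupId>
    <artifactId>apicurio-registry-serdes-avro-serde</artifactId>
    <version>2.6.5.Final</version>
</dependency>
```

Running mvn dependency:tree afterwards is one way to check which Avro version actually ends up on the classpath.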

@creckord
Author

Yes, that's fine. Thank you for the explanation.
