org.apache.avro.SchemaParseException: Can't redefine: test #182

Open
lockwobr opened this issue Feb 17, 2021 · 6 comments

Comments

@lockwobr

I'm having issues writing data with avro_rs and reading it with Apache Avro for Java. I was able to create an example that is close to what I am experiencing. My real schema is pretty complicated, so I have tried to boil it down to the problematic bits.

This code runs just fine, but when the output is read with avro-tools I get an error.

use avro_rs::{Codec, Reader, Schema, Writer, from_value, types::Record, Error};
use serde::{Deserialize, Serialize};
use std;

#[derive(Debug, Deserialize, Serialize)]
struct Test {
    a: i64,
    b: String,
    test: Test2,
}

#[derive(Debug, Deserialize, Serialize)]
struct Test2 {
    a: i64,
    b: String,
}


fn main() -> Result<(), Error> {
    let raw_schema = r#"
        {
            "type": "record",
            "name": "test",
            "fields": [
                {"name": "a", "type": "long", "default": 42},
                {"name": "b", "type": "string"},
                {"name": "test", "type": {
                    "type": "record",
                    "name": "test",
                    "fields": [
                        {"name": "a", "type": "long", "default": 42},
                        {"name": "b", "type": "string"}
                    ]
                }}
            ]
        }
    "#;

    let schema = Schema::parse_str(raw_schema)?;

    // println!("{:?}", schema);

    let mut writer = Writer::new(&schema, std::io::stdout());

    let test = Test {
        a: 27,
        b: "foo".to_owned(),
        test: Test2 {
            a: 23,
            b: "bar".to_owned(),
        }
    };

    writer.append_ser(test)?;
    writer.flush()?;

    Ok(())
}

❯ ./target/debug/example > avro.out
❯ java -jar ~/bin/avro-tools-1.10.1.jar tojson avro.out
21/02/17 10:38:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.avro.SchemaParseException: Can't redefine: test
        at org.apache.avro.Schema$Names.put(Schema.java:1542)
        at org.apache.avro.Schema$Names.add(Schema.java:1536)
        at org.apache.avro.Schema.parse(Schema.java:1655)
        at org.apache.avro.Schema.parse(Schema.java:1668)
        at org.apache.avro.Schema$Parser.parse(Schema.java:1425)
        at org.apache.avro.Schema$Parser.parse(Schema.java:1413)
        at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:131)
        at org.apache.avro.file.DataFileStream.<init>(DataFileStream.java:90)
        at org.apache.avro.tool.DataFileReadTool.run(DataFileReadTool.java:93)
        at org.apache.avro.tool.Main.run(Main.java:67)
        at org.apache.avro.tool.Main.main(Main.java:56)

It seems that Apache Avro performs a validation that avro_rs does not. I found this error while using parse_list to load a directory of schema files. I have a record type that is used more than once in a parent record type, and because avro_rs inlines the child schema at every use in the record, I get an error like the one above. When Apache Avro inlines child schemas in a parent, it defines the child record type only once and then refers to it by name in subsequent uses. My example shows more or less the same issue: the record type name "test" is defined twice, and avro_rs is OK with that, but Apache Avro is not.
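To make the difference concrete, here is a minimal sketch of the kind of check that rejects a second definition of the same name. It uses a toy schema type, not the avro_rs or Apache Avro data model; `ToySchema`, `check_names`, and `issue_schema` are hypothetical names invented for illustration, and the real Java check may differ in its details.

```rust
use std::collections::HashSet;

// Toy stand-in for an Avro schema; hypothetical, not the avro_rs data model.
#[derive(Debug, Clone)]
enum ToySchema {
    Long,
    Str,
    // Full definition of a named record: name plus (field name, field type).
    Record { name: String, fields: Vec<(String, ToySchema)> },
    // Reference by name to a record defined earlier.
    Ref(String),
}

// Walk the schema depth-first; fail on the first record name defined twice.
fn check_names(schema: &ToySchema, seen: &mut HashSet<String>) -> Result<(), String> {
    match schema {
        ToySchema::Long | ToySchema::Str => Ok(()),
        ToySchema::Ref(name) if seen.contains(name) => Ok(()),
        ToySchema::Ref(name) => Err(format!("Undefined name: {}", name)),
        ToySchema::Record { name, fields } => {
            if !seen.insert(name.clone()) {
                return Err(format!("Can't redefine: {}", name));
            }
            for (_, field_type) in fields {
                check_names(field_type, seen)?;
            }
            Ok(())
        }
    }
}

// Build the shape of the schema from the example above. The nested field is
// either a second full definition of "test" or a reference to it by name.
fn issue_schema(nested_as_ref: bool) -> ToySchema {
    let scalars = vec![
        ("a".to_string(), ToySchema::Long),
        ("b".to_string(), ToySchema::Str),
    ];
    let nested = if nested_as_ref {
        ToySchema::Ref("test".to_string())
    } else {
        ToySchema::Record { name: "test".to_string(), fields: scalars.clone() }
    };
    let mut fields = scalars;
    fields.push(("test".to_string(), nested));
    ToySchema::Record { name: "test".to_string(), fields }
}

fn main() {
    for nested_as_ref in [false, true] {
        match check_names(&issue_schema(nested_as_ref), &mut HashSet::new()) {
            Ok(()) => println!("nested_as_ref={}: ok", nested_as_ref),
            Err(e) => println!("nested_as_ref={}: {}", nested_as_ref, e),
        }
    }
}
```

In this sketch the inlined second definition of "test" is rejected, while a by-name reference to the already defined record passes.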

@martin-g

martin-g commented Mar 5, 2022

This issue is fixed in the apache_avro crate. There one can use schema references:

{
    "type": "record",
    "name": "test",
    "fields": [
        {"name": "a", "type": "long", "default": 42},
        {"name": "b", "type": "string"},
        {"name": "test", "type": "test"}
    ]
}

apache_avro is a fork/donation of this project to the Apache Avro project.
There is no official release of the crate yet, but it should be released soon with Avro 1.11.1!

@travisbrown

@martin-g Thanks for the pointer! There still seems to be an issue with "Can't redefine" errors, at least in some non-recursive cases. Take the following example:

fn main() {
    let schema = r#"
    {
      "name": "test.test",
      "type": "record",
      "fields": [
        {
          "name": "bar",
          "type": { "name": "test.foo", "type": "record", "fields": [{ "name": "id", "type": "long" }] }
        },
        { "name": "baz", "type": "test.foo" }
      ]
    }
    "#;

    let schema = apache_avro::schema::Schema::parse_str(&schema).unwrap();

    println!("{}", serde_json::to_string(&schema).unwrap());
}

This prints the following (the same thing happens if the test.foo definition is in a separate file):

$ target/release/avro-test | jq
{
  "type": "record",
  "name": "test.test",
  "fields": [
    {
      "name": "bar",
      "type": {
        "type": "record",
        "name": "test.foo",
        "fields": [
          {
            "name": "id",
            "type": "long"
          }
        ]
      }
    },
    {
      "name": "baz",
      "type": {
        "type": "record",
        "name": "test.foo",
        "fields": [
          {
            "name": "id",
            "type": "long"
          }
        ]
      }
    }
  ]
}

This output will cause the Java tooling to fail with an org.apache.avro.SchemaParseException: Can't redefine error like the one above, because test.foo is defined twice.
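For comparison, here is a rough sketch of a serializer that writes a named record in full only the first time it is seen and by name afterwards, which is the behaviour described for Apache Avro in the first comment. It is a toy model built on the standard library only, not the apache_avro implementation; `ToySchema`, `to_json`, and `example_schema` are hypothetical names.

```rust
use std::collections::HashSet;

// Toy stand-in for a resolved schema; hypothetical, not the apache_avro model.
#[derive(Debug, Clone)]
enum ToySchema {
    Long,
    Record { name: String, fields: Vec<(String, ToySchema)> },
}

// Serialize to JSON, emitting each named record in full once and as a bare
// name string on every later occurrence.
fn to_json(schema: &ToySchema, defined: &mut HashSet<String>) -> String {
    match schema {
        ToySchema::Long => "\"long\"".to_string(),
        ToySchema::Record { name, fields } => {
            // `insert` returns false when the name was already emitted.
            if !defined.insert(name.clone()) {
                return format!("\"{}\"", name);
            }
            let mut parts = Vec::new();
            for (field_name, field_type) in fields {
                let type_json = to_json(field_type, defined);
                parts.push(format!("{{\"name\":\"{}\",\"type\":{}}}", field_name, type_json));
            }
            format!(
                "{{\"type\":\"record\",\"name\":\"{}\",\"fields\":[{}]}}",
                name,
                parts.join(",")
            )
        }
    }
}

// Same shape as the example above: test.foo is used by both "bar" and "baz".
fn example_schema() -> ToySchema {
    let foo = ToySchema::Record {
        name: "test.foo".to_string(),
        fields: vec![("id".to_string(), ToySchema::Long)],
    };
    ToySchema::Record {
        name: "test.test".to_string(),
        fields: vec![("bar".to_string(), foo.clone()), ("baz".to_string(), foo)],
    }
}

fn main() {
    println!("{}", to_json(&example_schema(), &mut HashSet::new()));
}
```

With this approach the "baz" field is written as the string "test.foo" rather than a second copy of the record, so a reader that rejects redefinitions sees only one definition.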

@martin-g

martin-g commented Mar 6, 2022

@travisbrown I've logged https://issues.apache.org/jira/browse/AVRO-3433 for the issue!

@travisbrown

@martin-g Thanks very much!

@martin-g

martin-g commented Mar 6, 2022

@travisbrown Please try https://github.com/apache/avro/tree/avro-3433-preserve-schema-ref-in-json
There are some tests to update but the issue should be fixed!

@martin-g

martin-g commented Mar 6, 2022

apache/avro#1580 is ready for review!
