How to convert JSON to an Avro value with a specified schema? #154

Open
clojurians-org opened this issue Aug 1, 2020 · 4 comments


clojurians-org commented Aug 1, 2020

I want to implement the equivalent of this avro-tools command line:

avro-tools fromjson --schema-file ../../data-model/schema/dc.avsc ../../data-model/sample/dc.json > dc_schema.avro

I can use [to_avro_datum] to write an [Avro Value] to a binary file,
but I can't find a way to convert a [Json Value] to an [Avro Value] with a specified schema.

I can simply convert the [Json Value] to an [Avro Value], but its schema is not right:

let avro_val = avro_rs::to_value(json_val).unwrap();

Maybe [from_avro_datum] with a [JSON reader] could do it, but it is bound to a byte reader.

Collaborator

poros commented Aug 9, 2020

I would recommend using the Writer and Reader if you can. On their own, to_avro_datum and from_avro_datum are not properly compliant with the Avro spec if used in the wrong way. Nevertheless, they might be OK for your use case; I have never used that command line tool and know nothing about it, so I cannot say for sure.

The JSON you have might be complex and not straightforward to translate to Avro. Or there could be an incompatibility with the schema, or some feature missing in the library (unfortunately there are quite a few). Without more info, it is hard for me to help further, I am afraid...

Author

clojurians-org commented Aug 10, 2020 via email

Author

clojurians-org commented Aug 10, 2020

This is avro-tools code:
https://github.com/apache/avro/blob/master/lang/java/tools/src/main/java/org/apache/avro/tool/Main.java

It basically contains two levels of format: raw Avro fragments (low level), and the Avro container file with schema, blocks, and codec.
For message protocols, we often use fragments without the schema because that reduces the size. The JavaScript Avro library belongs to this category.
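
To illustrate what a schema-less fragment looks like on the wire, here is a minimal sketch in plain Rust (standard library only, not the avro-rs API). It assumes a hypothetical record with one `long` field and one `string` field; the function names are made up for the example. Per the Avro binary encoding, a `long` is written as a zig-zag variable-length integer, a `string` as its byte length followed by UTF-8 bytes, and a record as its fields in schema order with no names or type tags. A container file would additionally carry a header with the schema and a sync marker, which is where the size difference comes from.

```rust
/// Zig-zag plus variable-length encoding, as Avro uses for `int` and `long`.
fn encode_long(value: i64, out: &mut Vec<u8>) {
    // Zig-zag maps small negative and positive numbers to small unsigned ones.
    let mut n = ((value << 1) ^ (value >> 63)) as u64;
    // Emit 7 bits at a time, setting the high bit while more bytes follow.
    while n >= 0x80 {
        out.push(((n as u8) & 0x7F) | 0x80);
        n >>= 7;
    }
    out.push(n as u8);
}

/// A string is its length in bytes (as a long) followed by the UTF-8 bytes.
fn encode_string(value: &str, out: &mut Vec<u8>) {
    encode_long(value.len() as i64, out);
    out.extend_from_slice(value.as_bytes());
}

/// Hypothetical record {"id": long, "name": string}, encoded as a raw fragment.
/// The output contains no field names and no schema: the reader must already
/// know the schema to make sense of the bytes.
fn encode_fragment(id: i64, name: &str) -> Vec<u8> {
    let mut out = Vec::new();
    encode_long(id, &mut out);
    encode_string(name, &mut out);
    out
}
```

With this sketch, `encode_fragment(1, "dc")` produces the four bytes `02 04 64 63`, which cannot be decoded without knowing the field order and types in advance.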

Collaborator

poros commented Aug 12, 2020

I see, thanks for sharing more details. It does seem like a good use case for to_avro_datum.
Without the error and the input data it is a bit difficult to understand what is going on, but let me give you some pointers.

This is where the function you are calling is defined:

pub fn to_value<S: Serialize>(value: S) -> Result<Value, Error> {

As you can see, this is pretty low level, and it can fail if the Avro schema is not valid as per the Avro spec.

Another option is https://github.com/flavray/avro-rs/blob/master/src/types.rs#L231, which implements a very naive conversion from a JSON value to an Avro value. This could also fail if your Avro schema requires a more clever conversion.
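
To show why a schema-blind conversion can produce the wrong Avro types, here is a minimal sketch of a schema-directed conversion in plain Rust. Everything in it is hypothetical: `Json`, `Schema`, `Avro`, and `convert` are reduced stand-ins written for this example, not the avro-rs or serde_json types, and unions, enums, maps, and defaults are left out. The idea is that the schema, not the JSON value, decides the target type, so the JSON number `1` can become an `int`, a `long`, or a `double` depending on the field.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a parsed JSON document.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Hypothetical, heavily reduced schema model.
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    Null,
    Boolean,
    Int,
    Long,
    Double,
    String,
    Array(Box<Schema>),
    Record(Vec<(String, Schema)>),
}

/// Hypothetical Avro value model matching the schema above.
#[derive(Debug, Clone, PartialEq)]
enum Avro {
    Null,
    Boolean(bool),
    Int(i32),
    Long(i64),
    Double(f64),
    String(String),
    Array(Vec<Avro>),
    Record(Vec<(String, Avro)>),
}

/// Convert a JSON value by walking the schema and the value together.
fn convert(json: &Json, schema: &Schema) -> Result<Avro, String> {
    match (schema, json) {
        (Schema::Null, Json::Null) => Ok(Avro::Null),
        (Schema::Boolean, Json::Bool(b)) => Ok(Avro::Boolean(*b)),
        // Integral numbers only, and only if they fit in 32 bits.
        (Schema::Int, Json::Number(n))
            if n.fract() == 0.0 && *n >= f64::from(i32::MIN) && *n <= f64::from(i32::MAX) =>
        {
            Ok(Avro::Int(*n as i32))
        }
        (Schema::Long, Json::Number(n)) if n.fract() == 0.0 => Ok(Avro::Long(*n as i64)),
        (Schema::Double, Json::Number(n)) => Ok(Avro::Double(*n)),
        (Schema::String, Json::Str(s)) => Ok(Avro::String(s.clone())),
        (Schema::Array(item), Json::Array(xs)) => xs
            .iter()
            .map(|x| convert(x, item))
            .collect::<Result<Vec<_>, _>>()
            .map(Avro::Array),
        (Schema::Record(fields), Json::Object(obj)) => {
            // Emit fields in schema order; a missing field is an error here.
            let mut out = Vec::with_capacity(fields.len());
            for (name, field_schema) in fields {
                let value = obj
                    .get(name)
                    .ok_or_else(|| format!("missing field `{}`", name))?;
                out.push((name.clone(), convert(value, field_schema)?));
            }
            Ok(Avro::Record(out))
        }
        (s, j) => Err(format!("JSON value {:?} does not match schema {:?}", j, s)),
    }
}
```

In this sketch, a JSON object `{"id": 1, "score": 1}` converted against a record schema with `id: int` and `score: double` yields `Int(1)` and `Double(1.0)`, whereas a conversion that ignores the schema has to pick one numeric type for both fields.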

One more option is https://serde.rs/transcode.html which could help you out with the transcoding.
