Bypass generic data step and serialize directly to binary #160
Comments
@Chuckame, thanks for the comprehensive issue on this really important topic! I also had this in mind and think this is the way to go. I think it would be cool to get rid of the avro dependency and create an extension library that maps our types to avro types. This would open the door to making the core library multiplatform while still supporting users who rely on avro types.
Just as a little heads up, I've created a new branch https://github.com/avro-kotlin/avro4k/tree/way-to-multiplatform, which implements a PoC that directly encodes to avro binary without using the avro library. So far, everything looks very promising. I've created a small benchmark that compares the speed of Avro4k (old), Avro4kDirect (new), and Jackson. Here are the results:
I will publish the code for the benchmark soon. Open stuff for the branch:
So cool! Please wait for my refactoring to be merged; it should improve a lot of little things 👼, most notably the encoder/decoder architecture, which has been simplified.
Can you create the PR as a draft so we can easily open discussions? Done!
The sources of the benchmark used for the table above can now be found at https://github.com/avro-kotlin/kotlin-avro-benchmark.
Can you add a bit of polymorphism, using the sealed root type as the serializer descriptor for avro4k? E.g.
I don't quite get what you mean, sorry. Can you provide a PR?
@thake Just a general comment on the recent work on avro4k: we should talk about the changes we want to make in the codebase and synchronize. As I already said, I have reworked the whole codebase, but I can see that you are also moving a lot of things around (tests, classes). I'm just worried about rebases and duplicate work. Are you available in the next few days to meet (Discord, Zoom, Google Meet, ...) to clarify things and maybe draw up a roadmap?
#186 introduced
Released in v2.0.0
Currently, the library just interfaces as best it can with the official Apache Avro library, using GenericRecord and other GenericData types.
On the simplicity of the avro4k codebase: it looks easy, but it really is not, since the codebase does a lot of adaptation to keep GenericData happy with the generic structures we generate.
On the performance side, we check, convert, and adapt data to fit GenericData, and that adapted data is then checked and re-adapted again before being serialized to binary. Also, all of this happens at runtime, while kotlinx serialization is mostly prepared at build time (except for contextual serializers).
On the user side, every user who just wants the Avro format (and does not use the generated GenericRecords directly) has to call the Apache Avro library to serialize to binary or JSON, or use libraries that help with this.
On compatibility and spec conformance, we effectively cover the whole Avro spec without knowing how it works: we just pass the record from Avro.toRecord() to the Apache Avro library, and 🎉 it works.
On tests, we currently do not test compliance with the Avro format; we only compare the generated GenericRecords against other generic records.
And the last axis: Kotlin Multiplatform. Since a big part of the codebase and its mechanisms is tightly coupled to the Apache Avro Java library, the multiplatform dream seems hard to reach.
Now, how can we improve that? Let kotlinx serialization do what it does best and what it was created for: easily encoding any Kotlin object.
Have a look at the Avro Encoder methods: https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/io/Encoder.java
All the methods map exactly to kotlinx encoders, and the way of calling them fits the kotlinx workflow perfectly:
The other advantage is that there is absolutely no primitive autoboxing, and no more callback hell, thanks to using the Avro Encoder directly as the output.
@thake, any comment on this? You have been on this project for a long time. This will be a big refactoring, but it could be a game changer.
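To make the idea concrete, here is a minimal sketch of what writing straight to Avro binary could look like, using only the Kotlin/JVM standard library. It is not avro4k code and not the Apache Avro Encoder: `TinyAvroWriter`, `User`, and `encodeUser` are hypothetical names made up for illustration. It assumes the Avro binary rules for the types shown (int/long as zig-zag variable-length integers, string as a long length followed by UTF-8 bytes, boolean as a single byte, record fields written back to back in schema order).

```kotlin
import java.io.ByteArrayOutputStream

// Hypothetical minimal writer, for illustration only (not an avro4k or Apache Avro API).
class TinyAvroWriter(private val out: ByteArrayOutputStream = ByteArrayOutputStream()) {

    // Avro int/long: zig-zag encode, then emit 7 bits per byte, lowest group first.
    fun writeLong(value: Long) {
        var n = (value shl 1) xor (value shr 63)
        while ((n and 0x7FL.inv()) != 0L) {
            out.write(((n and 0x7FL) or 0x80L).toInt()) // continuation bit set
            n = n ushr 7
        }
        out.write(n.toInt())
    }

    fun writeInt(value: Int) = writeLong(value.toLong())

    // Avro boolean: a single byte, 0 or 1.
    fun writeBoolean(value: Boolean) = out.write(if (value) 1 else 0)

    // Avro string: byte length as a long, followed by the UTF-8 bytes.
    fun writeString(value: String) {
        val bytes = value.toByteArray(Charsets.UTF_8)
        writeLong(bytes.size.toLong())
        out.write(bytes, 0, bytes.size)
    }

    fun toByteArray(): ByteArray = out.toByteArray()
}

// Hypothetical record type; a real serializer would be generated by kotlinx serialization.
data class User(val name: String, val age: Int, val active: Boolean)

// A record is just its fields in schema order, with no intermediate GenericRecord.
fun encodeUser(user: User): ByteArray {
    val writer = TinyAvroWriter()
    writer.writeString(user.name)
    writer.writeInt(user.age)
    writer.writeBoolean(user.active)
    return writer.toByteArray()
}
```

With this sketch, `encodeUser(User("Al", 30, true))` produces the five bytes `04 41 6C 3C 01`. Every call takes a primitive and writes bytes immediately, so nothing is boxed or buffered in a generic structure along the way.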
Plan for making the changes: