-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AVRO-2307: [java] list of primitive #2389
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
* AVRO-2307: list of primitive
Hi @clesaec I looks like this change broke the Avro deserialization of arrays of items using logicalTypes in Quarkus. In our existing solution, we now get this error after bumping the Quarkus version to a version using 1.12.0.
|
@opwvhk, could you have a look to this, as I'm in vacation and no longer participate much in apache project. (nice to see 1.12.0 is available 🙂) |
if (o == null) { | ||
return; | ||
} | ||
this.add(location, o.floatValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This introduces a loss of precision, since it converts the Double
to a float
, and then casts back to double
. An example:
jshell> Double d = 40.001
d ==> 40.001
jshell> double d2 = d.floatValue()
d2 ==> 40.000999450683594
So for example, if you have a field in the schema that's an array of doubles, and you use the Avro JSON parser to read a record with that schema from a JSON object where the value of that field is [40.001]
, the value in the resulting object will be an array with the element 40.000999450683594
. This caused failures in unit tests that were using Precision.equals
to compare values because that difference is much bigger than Precision
's default epsilon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like Confluent Schema Registry may also be running into this regression
What is the purpose of the change
AVRO-2307 : sub task 1.
As described in JIRA
Another challenge we’ve come across is that lists of primitive types (floats in our case) are always boxed into Object Floats by Avro. In our case, this resulted in millions of Objects / second getting allocated, causing pathological GC pressure.
So, this PR aims to introduce list of primitives classes that save memory consomption when Generic data with array of boolean, int, long, float or double.
Verifying this change
New unit test class PrimitivesArraysTest is implemented + rest of code use it via GenericData.newArray method.
Documentation