Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AvroTypeException for some edge cases. #156

Closed
bashir2 opened this issue Apr 15, 2021 · 6 comments · Fixed by #157 or #202
Closed

AvroTypeException for some edge cases. #156

bashir2 opened this issue Apr 15, 2021 · 6 comments · Fixed by #157 or #202
Assignees
Labels
bug Something isn't working P0:immediate An issue to be handled ASAP

Comments

@bashir2
Copy link
Collaborator

bashir2 commented Apr 15, 2021

We don't know the root cause of this issue yet but @kimaina has reported the following exception in some special Observation resource cases:

org.apache.avro.AvroTypeException: Cannot encode decimal with scale 16 as scale 4 without rounding in field value in field quantity in field value

The full stack trace is here. This is coming from this line and as a temporary work-around, an exception handling is added around this line in PR #155. This will log the resource ID that causes this so it should be easy to find the exact resource that triggers this.

@bashir2 bashir2 added bug Something isn't working P1:must As issue that definitely needs to be implemented in near future. labels Apr 15, 2021
@bashir2 bashir2 added this to the AMPATH_Deployment milestone Apr 15, 2021
bashir2 added a commit that referenced this issue Apr 23, 2021
This is to reproduce the bug #156 in a unit-test environment.
@bashir2
Copy link
Collaborator Author

bashir2 commented Apr 23, 2021

I did not mean to close this (GitHub's misinterpretation of the PR comment).

@bashir2
Copy link
Collaborator Author

bashir2 commented Jul 15, 2021

The root cause of this bug seems to be here in Bunsen where the schema for a decimal is set to have precision 12 and scale 4. However, this limits the domain of values that can be represented (e.g., large numbers cannot be represented). Although the schema is set like this, but the actual value returned here is not guaranteed to have this precision/scale. So for example, for large numbers we don't get any errors when turning them into BigDecimal (with parse()) but if we try to serialize these, it will throw the exceptions in this function because a number with more than 12 significant digits cannot be represented by precision 12.

It seems to me that the choice of using Avro decimal type to represent FHIR decimal type needs to be changed.

@bashir2 bashir2 added P0:immediate An issue to be handled ASAP and removed P1:must As issue that definitely needs to be implemented in near future. labels Jul 15, 2021
@bashir2
Copy link
Collaborator Author

bashir2 commented Jul 15, 2021

Changing this to P0 since it is blocking #168 too.

@bashir2
Copy link
Collaborator Author

bashir2 commented Jul 16, 2021

Update: I made this tentative fix on the Bunsen side. With that change incorporated locally, I verified that this bug will be fixed in this commit.

@kimaina, is it possible for you to patch these two commits in your local environment and test the batch pipeline with your data?

@bashir2 bashir2 self-assigned this Jul 16, 2021
@kimaina
Copy link
Collaborator

kimaina commented Jul 16, 2021

@bashir2 thank you for working on this! Let me test this out then get back to you!

@kimaina
Copy link
Collaborator

kimaina commented Jul 27, 2021

Update: This is to confirm that this tentative fix on the Bunsen side and this commit.
fixes this issue! Thanks @bashir2 for fixing this. Let's finalize and close this ticket!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working P0:immediate An issue to be handled ASAP
Projects
None yet
2 participants