PARQUET-385 PARQUET-379: Fixes strict schema merging #315
Not quite sure whether we should also deprecate the |
Since originalType is an Enum, it seems we can just do:
if (getOriginalType() != toMerge.getOriginalType()) {
  reportSchemaMergeError(toMerge);
}
which covers:
- if both are null, the condition evaluates to false
- if one is null and the other is not null, the condition evaluates to true
- if both are not null, the condition evaluates to true if they take different values, and to false otherwise.
@liancheng what do you think?
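To make the three cases above concrete, here is a minimal standalone sketch. It assumes a hypothetical two-constant `OriginalType` enum and a hypothetical helper `differs`; neither is the parquet-mr code. It relies only on the fact that `!=` on enum references is null-safe, because each enum constant is a singleton.

```java
public class EnumCompareSketch {
    // Hypothetical stand-in for Parquet's OriginalType enum; only two constants are listed.
    enum OriginalType { UTF8, DECIMAL }

    // True when the two annotations disagree, including the null vs. non-null case.
    // Reference comparison is enough for enums and never throws on null.
    static boolean differs(OriginalType a, OriginalType b) {
        return a != b;
    }

    public static void main(String[] args) {
        System.out.println(differs(null, null));                              // both absent
        System.out.println(differs(OriginalType.UTF8, null));                 // one absent
        System.out.println(differs(OriginalType.UTF8, OriginalType.UTF8));    // same value
        System.out.println(differs(OriginalType.UTF8, OriginalType.DECIMAL)); // different values
    }
}
```

Only the second and fourth calls report a mismatch, which is the behavior strict merging wants.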
Thanks! Having been in the Scala world for so long, I really need to re-learn Java :)
Having examined all existing 20 OriginalTypes, and 2 OriginalTypes to be added, I think this PR works well. So:

+1

@liancheng can you remove the square brackets in your pr title? That trips up the merge script.

@julienledem Done.
This PR fixes strict mode schema merging. To merge two `PrimitiveType` `t1` and `t2`, they must satisfy the following conditions:

1. `t1` and `t2` have the same primitive type name
2. `t1` and `t2` either
   - don't have original type, or
   - have the same original type
3. If `t1` and `t2` are both `FIXED_LEN_BYTE_ARRAY`, they should have the same length

Also, merged schema now preserves original name if there's any.

Author: Cheng Lian <lian@databricks.com>

Closes apache#315 from liancheng/fix-strict-schema-merge and squashes the following commits:

a29138c [Cheng Lian] Addresses PR comment
1ac804e [Cheng Lian] Fixes strict schema merging
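The three merge conditions in the description can be sketched as a single predicate. This is an illustration under stated assumptions, not the parquet-mr implementation: `Prim` and `canMergeStrict` are hypothetical names, and type names and original types are modeled as plain strings (with `null` meaning "no original type") to keep the example self-contained.

```java
import java.util.Objects;

public class StrictMergeSketch {
    // Hypothetical, simplified stand-in for a Parquet primitive type.
    static final class Prim {
        final String typeName;     // e.g. "INT32", "FIXED_LEN_BYTE_ARRAY"
        final String originalType; // null when the type has no original type
        final int length;          // only meaningful for FIXED_LEN_BYTE_ARRAY

        Prim(String typeName, String originalType, int length) {
            this.typeName = typeName;
            this.originalType = originalType;
            this.length = length;
        }
    }

    // Returns true only when all three strict-mode conditions hold.
    static boolean canMergeStrict(Prim t1, Prim t2) {
        // 1. Same primitive type name.
        if (!t1.typeName.equals(t2.typeName)) return false;
        // 2. Both lack an original type, or both have the same one.
        if (!Objects.equals(t1.originalType, t2.originalType)) return false;
        // 3. FIXED_LEN_BYTE_ARRAY types must agree on length.
        if ("FIXED_LEN_BYTE_ARRAY".equals(t1.typeName) && t1.length != t2.length) return false;
        return true;
    }

    public static void main(String[] args) {
        Prim plainInt = new Prim("INT32", null, 0);
        Prim utf8 = new Prim("BINARY", "UTF8", 0);
        Prim rawBinary = new Prim("BINARY", null, 0);
        System.out.println(canMergeStrict(plainInt, new Prim("INT32", null, 0)));
        System.out.println(canMergeStrict(utf8, rawBinary));
    }
}
```

In this sketch, two un-annotated `INT32` fields merge, while a `UTF8`-annotated `BINARY` and a plain `BINARY` do not, since only one of them carries an original type.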