PARQUET-346: Minor fixes for PARQUET-350, PARQUET-348, PARQUET-346, PARQUET-345 #252
Conversation
FYI, as far as I can tell, this implementation is faster than:
    try {
      final int id = enumLookup.get(value);
    } catch (NullPointerException e) {
      throw new ParquetDecodingException(...)
    }
I thought the above might be better because it gives the branch predictor a hint of sorts that the 'normal' case is for this not to throw, but I ran a small benchmark and the if statement seems to be faster.
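For readers comparing the two approaches, here is a minimal, self-contained sketch of both variants. It is not the code from this PR: `EnumLookupSketch`, `DecodingException` (a stand-in for `ParquetDecodingException`), the `String` key type, and the sample entries are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class EnumLookupSketch {
    // Hypothetical stand-in for ParquetDecodingException.
    static class DecodingException extends RuntimeException {
        DecodingException(String message) {
            super(message);
        }
    }

    // Assumed shape of the lookup table: enum name -> enum id.
    private final Map<String, Integer> enumLookup = new HashMap<>();

    EnumLookupSketch() {
        enumLookup.put("RED", 0);
        enumLookup.put("GREEN", 1);
    }

    // Variant 1: explicit null check (the "if statement" approach).
    int lookupWithNullCheck(String value) {
        Integer id = enumLookup.get(value);
        if (id == null) {
            throw new DecodingException("Unrecognized enum value: " + value);
        }
        return id;
    }

    // Variant 2: rely on auto-unboxing a null Integer to raise an NPE.
    int lookupWithCatch(String value) {
        try {
            final int id = enumLookup.get(value);
            return id;
        } catch (NullPointerException e) {
            throw new DecodingException("Unrecognized enum value: " + value);
        }
    }

    public static void main(String[] args) {
        EnumLookupSketch sketch = new EnumLookupSketch();
        System.out.println(sketch.lookupWithNullCheck("GREEN"));
        try {
            sketch.lookupWithNullCheck("BLUE");
        } catch (DecodingException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Both variants report the same error for an unknown value; they differ only in how the missing entry is detected.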
Maybe add a comment stating that the struct being passed in should be the file schema stored in JSON, and that the isUnion method will not be called...
Basically, just document why it's separated from the normal convert code path, in case we forget it in the future.
Minor comment, LGTM
@tsdeng updated with comments, and cleaned up usage of ThriftSchemaConverter
+1
Thanks!
…ARQUET-345
PARQUET-346: ThriftSchemaConverter throws for unknown struct or union type
This is triggered when passing a StructType that comes from old file metadata
PARQUET-350: ThriftRecordConverter throws NPE for unrecognized enum values
This is just some better error reporting.
PARQUET-348: shouldIgnoreStatistics too noisy
This is just a case of way over logging something, to the point that it makes the logs unreadable
PARQUET-345: ThriftMetaData toString() should not try to load class reflectively
This is a case where the error reporting itself crashes, which results in the real error message getting lost
Author: Alex Levenson <alexlevenson@twitter.com>
Closes apache#252 from isnotinvain/alexlevenson/various-fixes and squashes the following commits:
9b5cb0e [Alex Levenson] Add comments, cleanup some minor use of ThriftSchemaConverter
376343e [Alex Levenson] Fix test
d9d5dad [Alex Levenson] add license headers
e26dc0c [Alex Levenson] Add tests
8d9dde0 [Alex Levenson] Fixes for PARQUET-350, PARQUET-348, PARQUET-346, PARQUET-345
PARQUET-346:
ThriftSchemaConverter throws for unknown struct or union type
This is triggered when passing a StructType that comes from old file metadata
PARQUET-350:
ThriftRecordConverter throws NPE for unrecognized enum values
This is just some better error reporting.
PARQUET-348:
shouldIgnoreStatistics too noisy
This is just a case of way over logging something, to the point that it makes the logs unreadable
PARQUET-345:
ThriftMetaData toString() should not try to load class reflectively
This is a case where the error reporting itself crashes, which results in the real error message getting lost
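
To illustrate the PARQUET-345 item, here is a rough sketch of a metadata holder whose toString() only prints the stored class name and never resolves the class, so building an error message cannot itself fail with a class-loading error. This is not the actual ThriftMetaData code: `MetaDataSketch`, its fields, and `getThriftClass` as written here are hypothetical names chosen for the example.

```java
class MetaDataSketch {
    // The class name as recorded in file metadata; it may not exist on this classpath.
    private final String className;
    // Resolved lazily, only when a caller actually needs the class.
    private Class<?> cachedClass;

    MetaDataSketch(String className) {
        this.className = className;
    }

    // Loads the class on demand; this is the only place that can fail on a missing class.
    Class<?> getThriftClass() {
        if (cachedClass == null) {
            try {
                cachedClass = Class.forName(className);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("Could not load class " + className, e);
            }
        }
        return cachedClass;
    }

    // Uses only the stored name, so it is safe to call while reporting another error.
    @Override
    public String toString() {
        return "MetaDataSketch(className: " + className + ")";
    }
}
```

With this shape, logging or concatenating the object into an exception message stays safe even when the named class is absent; only an explicit call to getThriftClass() can surface the class-loading failure.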