@isnotinvain
Contributor

PARQUET-346:
ThriftSchemaConverter throws for unknown struct or union type
This is triggered when passing a StructType that comes from old file metadata

PARQUET-350:
ThriftRecordConverter throws NPE for unrecognized enum values
This is just some better error reporting.

PARQUET-348:
shouldIgnoreStatistics too noisy
This is just a case of way over-logging something, to the point that it makes the logs unreadable

PARQUET-345:
ThriftMetaData toString() should not try to load class reflectively
This is a case where the error reporting itself crashes, which results in the real error message getting lost
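
To illustrate the PARQUET-345 point, here is a minimal sketch of a `toString()` that cannot fail because it only prints the stored class name and defers reflective loading to an explicit accessor. `MetaDataSketch`, its field names, and `getThriftClass()` as written here are hypothetical stand-ins for illustration, not the actual `ThriftMetaData` code from this PR.

```java
import java.util.Objects;

// Hypothetical stand-in for a metadata holder; not the real ThriftMetaData.
final class MetaDataSketch {
    // Class name exactly as recorded in the file metadata.
    private final String className;
    // Resolved lazily, and only when a caller explicitly asks for it.
    private Class<?> loaded;

    MetaDataSketch(String className) {
        this.className = Objects.requireNonNull(className, "className");
    }

    // Reflective loading lives here, where a failure is an expected outcome.
    Class<?> getThriftClass() {
        if (loaded == null) {
            try {
                loaded = Class.forName(className);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException("Could not load class " + className, e);
            }
        }
        return loaded;
    }

    // toString() touches only the stored name, so it is safe to call while
    // building an error message even if the class is not on the classpath.
    @Override
    public String toString() {
        return "MetaDataSketch(className: " + className + ")";
    }
}
```

With this shape, an error path that formats the metadata into its message cannot itself throw a class-loading exception and mask the original failure.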

Contributor Author

FYI, as far as I can tell, this implementation is faster than:

try {
  final int id = enumLookup.get(value);
} catch (NullPointerException e) {
  throw new ParquetDecodingException(...)
}

I thought the above might be better, since it effectively hints to the branch predictor that the 'normal' case is for this not to throw. But I ran a small benchmark, and the if statement seems to be faster.
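
For readers following along, the two alternatives being compared can be sketched roughly as below. This is an assumption-laden illustration, not the code in the PR: `EnumLookupSketch`, the `String` key type, the sample enum values, and the local `DecodingException` (standing in for `ParquetDecodingException`) are all hypothetical. The catch-based variant works only because unboxing a `null` `Integer` into an `int` throws `NullPointerException`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the two lookup styles discussed above.
final class EnumLookupSketch {
    // Local stand-in for ParquetDecodingException, to keep the sketch self-contained.
    static final class DecodingException extends RuntimeException {
        DecodingException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private final Map<String, Integer> enumLookup = new HashMap<>();

    EnumLookupSketch() {
        // Sample values, assumed purely for illustration.
        enumLookup.put("RED", 0);
        enumLookup.put("GREEN", 1);
    }

    // Variant 1: explicit null check (the "if statement" approach).
    int lookupWithNullCheck(String value) {
        Integer id = enumLookup.get(value);
        if (id == null) {
            throw new DecodingException(
                "Unrecognized enum value: " + value + ", known values: " + enumLookup.keySet(), null);
        }
        return id;
    }

    // Variant 2: rely on the NullPointerException raised by auto-unboxing null.
    int lookupWithCatch(String value) {
        try {
            final int id = enumLookup.get(value); // unboxing null throws NPE
            return id;
        } catch (NullPointerException e) {
            throw new DecodingException("Unrecognized enum value: " + value, e);
        }
    }
}
```

Both variants report the same failure for an unknown value; the null check also avoids treating an unrelated `NullPointerException` as an unrecognized enum.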

Contributor

Maybe add a comment stating that the struct being passed in should be the file schema stored in JSON, and that the isUnion method will not be called.
Basically, just document why it's separated from the normal convert code path, in case we forget in the future.

@tsdeng
Contributor

tsdeng commented Jul 30, 2015

Minor comment, LGTM

@isnotinvain
Contributor Author

@tsdeng updated with comments, and cleaned up usage of ThriftSchemaConverter

@tsdeng
Contributor

tsdeng commented Jul 31, 2015

+1

@isnotinvain
Contributor Author

Thanks!

@asfgit asfgit closed this in b86f68e Jul 31, 2015
rdblue pushed a commit to rdblue/parquet-mr that referenced this pull request Jul 13, 2016
…ARQUET-345

Author: Alex Levenson <alexlevenson@twitter.com>

Closes apache#252 from isnotinvain/alexlevenson/various-fixes and squashes the following commits:

9b5cb0e [Alex Levenson] Add comments, cleanup some minor use of ThriftSchemaConverter
376343e [Alex Levenson] Fix test
d9d5dad [Alex Levenson] add license headers
e26dc0c [Alex Levenson] Add tests
8d9dde0 [Alex Levenson] Fixes for PARQUET-350, PARQUET-348, PARQUET-346, PARQUET-345
rdblue added a commit to rdblue/parquet-mr that referenced this pull request Jan 6, 2017
PARQUET-348:
shouldIgnoreStatistics too noisy
This is just a case of way over-logging something, to the point that it makes the logs unreadable
