PARQUET-77: ByteBuffer use in read and write paths #267
Closed
102 commits
686d598 Use ByteBuf-based api to read magic. (gerashegalov)
2d32f49 Reading file metadata using zero-copy API (gerashegalov)
df1ad93 Reading chunk using zero-copy API (dsy88)
53500d4 Add ByteBufferInputStream and modify Chunk to consume ByteBuffer instead (dsy88)
36aba13 Read from ByteBuffer instead of ByteArray to avoid unnecessary array … (dsy88)
7ac1df5 Using Writable Channel to replace write to OutputStream one by one. (dsy88)
4f399aa Add original readIntLittleEndian function to keep compatible with pre… (dsy88)
970fc8b Add a Hadoop compatible layer to abstract away the zero copy API and old (dsy88)
47b177d Move CompatibilityUtil to parquet.hadoop.util. (dsy88)
01c2ae5 Implement FSDISTransport in Compatible layer. (dsy88)
a7bcfbb Make BytePacker consume ByteBuffer directly. (dsy88)
26dc879 disable enforcer to pass build. (dsy88)
016e89c remove some unncessary codes. (dsy88)
912cbaf fix a bug in equals in ByteBuffer Binary with offset and length (dsy88)
861e541 enable enforcer check. (dsy88)
8be638a Address tsdeng's comments (dsy88)
0d22908 merging with master (adeneche)
5bc8774 Update Snappy Codec to implement DirectDecompressionCodec interface (parthchandra)
7bc2a4d Make a copy of Min and Max values for BinaryStatistics so that direct… (parthchandra)
2c2b183 Remove Zero Copy read path while reading footers (parthchandra)
8143174 update pig.version to build with Hadoop 2 jars (adeneche)
2187697 Update Binary to make a copy of data for initial statistics. (jacques-n)
35b10af Use ByteBuffers in the Write path. Allow callers to pass in an alloca… (parthchandra)
e488924 after merge code cleanup (adeneche)
a6389db Make constructor for PrimitiveType that takes decimalMetadata public. (StevenMPhillips)
98b99ea Revert readFooter to not use ZeroCopy path. (parthchandra)
48cceef Fix allocation in DictionaryValuesWriter (StevenMPhillips)
6943536 fixing bug related to testDictionaryError_419 (adeneche)
e1df3b9 disabled enforcer and changed version to -drill (adeneche)
51cf2f1 cherry pick pull#188 (rdblue)
c98ec2a bumped version to 1.6.0rc3-drill-r0.1 (adeneche)
173aa25 Set max preferred slab size to 16mb (jacques-n)
4a9dd28 update pom version (jacques-n)
9f22bd7 Make CodecFactory pluggable (jacques-n)
9bbc269 Update to 1.6.0rc3-drill-r0.3 (jacques-n)
1bfa3a0 Merge branch 'master' into 1.6.0rc3-drill-r0.3-merge (jaltekruse)
2b8328b I all of the tests are now passing after the merge. (jaltekruse)
864b011 Simplifying how buffer allocators are passed when creating ValuesWrit… (jaltekruse)
45cadee Cleaning up code in Binary after merge. (jaltekruse)
ab54c4e Moving classes out of the old packages. (jaltekruse)
1f4f504 WIP - addressing review comments (jaltekruse)
7e252f3 WIP - addressing review comments (jaltekruse)
23ad48e WIP - addressing review comments (jaltekruse)
829af6f WIP - getting rid of unnecessary copies in Binary.java (jaltekruse)
fddd4af WIP - removing copies from the ByteBufferBasedBinary equals, compareT… (jaltekruse)
d40706b Get rid of unnecessary calls to Bytebuffer.wrap(byte[]), as an interf… (jaltekruse)
35d8386 Move call to getBytes() on dictionaryPages to remove the need to cach… (jaltekruse)
705b864 Rename CapacityByteArrayOutputStream to CapacityByteBufferOutputStrea… (jaltekruse)
ebae775 Fix issue reading page data into an off-heap ByteBuffer (jaltekruse)
1971fc5 Fixes made while debugging drill unit tests (jaltekruse)
86317b0 Address review comments, make field in immutable ParquetProperties ob… (jaltekruse)
104a1d1 Remove test requiring a hard-coded binary file. This was actually a b… (jaltekruse)
fec4242 Address review comments - factoring out code in tests (jaltekruse)
6959db7 addressing review comments, avoiding unnecessary copies when creating… (jaltekruse)
29cc747 Factor out common code (jaltekruse)
8c6e4a9 Addressing review comments, moving code out of generated class into a… (jaltekruse)
0098b1c Remove unused method (jaltekruse)
f0e31ec revert small formatting and renaming changes, TODO make sure these re… (jaltekruse)
9dccb94 Add new method to turn BytesInput into an InputStream. (jaltekruse)
b1040a8 Remove code used to debug a test that was failing after the initial m… (jaltekruse)
e79684e Review comments - fixing use of ParquetProperties and removing unused… (jaltekruse)
9fb65dd Rename method to get a dictionary page to clarify that the dictionary… (jaltekruse)
ad58bbe Addressing small review comments, unused imports, doc cleanup, etc. (jaltekruse)
d4819b4 remove methods now unneccesary as same implementation has been moved … (jaltekruse)
fdb689c Remove unnecessary copy writing a Binary to an OutputStream if it is … (jaltekruse)
a793be8 Add closeQuietly method to convert checked IOExceptions from classle… (jaltekruse)
2e95915 Addressing minor review comments, comments out code, star import, for… (jaltekruse)
4c3195e Turn back on SemVer (jaltekruse)
d5536b6 Restore original name of CapacityByteArrayOutputStream to keep compat… (jaltekruse)
f217e6a Restore old interfaces (jaltekruse)
8f66e43 Create utility methods to transform checked exceptions to unchecked w… (jaltekruse)
2f1a6c7 Consolidate a little more code (jaltekruse)
da1b52a Moving classes into parquet from Drill. (jaltekruse)
0496350 Add unit test for direct codec factory. (jaltekruse)
862eb13 Fix usage of old constructor in Thrift module that caused a compilati… (jaltekruse)
8ff878a Addressing review comments (jaltekruse)
b7a6457 fix license leader
d332ca7 Add test for UnsignedVarIntBytesInput
f8e5988 Added javadocs, removed unused code in DirectCodecFactory
b4266fb Add license header to new class
ae58486 Changing argument lists that previously included both an allocator an… (jaltekruse)
b8f54c2 Add a unit test for ByteBufferBackedBinary. (jaltekruse)
c305984 Adding back code generation for method to take a byte array as well a… (jaltekruse)
659230f Remove second version of the class ByteBufferBytesInput that was nest… (jaltekruse)
e7f7f7f WIP - removing unneeded generics form CodecFactories (jaltekruse)
3945674 Switch to using the DirectCodecFactory everywhere, one test is failin… (jaltekruse)
5869156 Move fallback classes from HeapCodecFactory to the DirectCodecFactory (jaltekruse)
1a47767 Address review comments (jaltekruse)
192c717 Fix error message (jaltekruse)
df7fd9c Limit access to classes and methods used for reflection based access … (jaltekruse)
40714a4 Move pageSize to the constructor of codecfactory rather than the meth… (jaltekruse)
a8d2dc1 Address review comments. (jaltekruse)
d6501b1 Thought I had fixed this double deallocation earlier, guess the chang… (jaltekruse)
57491a2 Delete older version of test file, all of these tests look to be cove… (jaltekruse)
10b5ba3 Remove unneeded TODO (jaltekruse)
723701c Adding isDirect interface to ByteBufferAllocator to add a restriction… (jaltekruse)
a44fdba Fix logging and restrict access to classes inside of CodecFactory. (jaltekruse)
bd7aa97 Remove unused imports, one of which has been moved to package private… (jaltekruse)
269daef Make CodecFactory public (jaltekruse)
96e19a8 Properly set the byte buffer position when reading out of a filesyste… (jaltekruse)
58340d8 Fix CompatibilityUtil, primary issue was a small error in the package… (jaltekruse)
56316d0 An exception out of the read method doesn't necessarily mean somethin… (jaltekruse)
Let's not multiply public constructors. Just make one private constructor with all fields.
You can use a pattern that is not a builder but follows a similar idea.
We should probably have one parameterless public constructor and deprecate the others.
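The snippet that originally accompanied this comment is not preserved here. As a rough sketch of the idea it describes (one private constructor that assigns every field, one parameterless public constructor for the defaults, and `with...` methods that return modified copies), something like the following could work. The class name `WriterSettings`, its two fields, and the default values are hypothetical and are not taken from the PR.

```java
// Hypothetical class, fields, and defaults, for illustration only; not code from this PR.
final class WriterSettings {
  private final int pageSize;
  private final boolean dictionaryEnabled;

  /** The single public constructor: every field starts at its default value. */
  public WriterSettings() {
    this(1024 * 1024, true);
  }

  /** The only constructor that assigns fields; private so the public surface stays small. */
  private WriterSettings(int pageSize, boolean dictionaryEnabled) {
    this.pageSize = pageSize;
    this.dictionaryEnabled = dictionaryEnabled;
  }

  /** Returns a copy with a different page size; the receiver is left unchanged. */
  public WriterSettings withPageSize(int pageSize) {
    return new WriterSettings(pageSize, dictionaryEnabled);
  }

  /** Returns a copy with dictionary encoding switched on or off. */
  public WriterSettings withDictionaryEnabled(boolean dictionaryEnabled) {
    return new WriterSettings(pageSize, dictionaryEnabled);
  }

  public int getPageSize() {
    return pageSize;
  }

  public boolean isDictionaryEnabled() {
    return dictionaryEnabled;
  }
}
```

A caller would write `new WriterSettings().withPageSize(4096)`, so adding a field later means adding one `with...` method rather than another public constructor overload.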
If we are going to adopt a pattern, I think it would be best to go all the way to a builder, which would let us enforce required and optional parameters properly. To do this with the pattern you suggest, we would also need a validateSetup() method that has to be called after all of the properties are set. The compiler could not enforce that call, which would make the approach brittle going forward.
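To make the builder argument concrete, here is a minimal sketch, assuming a hypothetical `ReaderOptions` class with one required parameter (`codecName`) and one optional parameter (`pageSize`); none of these names come from the PR. The required value is an argument of the factory method, so the compiler rejects a call that omits it, and validation lives in `build()`, which is the only way to obtain an instance and therefore cannot be skipped.

```java
import java.util.Objects;

// Hypothetical class, fields, and defaults, for illustration only; not code from this PR.
final class ReaderOptions {
  private final String codecName; // required
  private final int pageSize;     // optional, has a default

  private ReaderOptions(Builder builder) {
    this.codecName = builder.codecName;
    this.pageSize = builder.pageSize;
  }

  /** Required parameters are arguments here, so omitting one is a compile error. */
  public static Builder builder(String codecName) {
    return new Builder(codecName);
  }

  public String getCodecName() {
    return codecName;
  }

  public int getPageSize() {
    return pageSize;
  }

  static final class Builder {
    private final String codecName;
    private int pageSize = 1024 * 1024;

    private Builder(String codecName) {
      this.codecName = Objects.requireNonNull(codecName, "codecName");
    }

    public Builder withPageSize(int pageSize) {
      this.pageSize = pageSize;
      return this;
    }

    /** Validation runs here; there is no separate method the caller could forget to call. */
    public ReaderOptions build() {
      if (pageSize <= 0) {
        throw new IllegalArgumentException("pageSize must be positive: " + pageSize);
      }
      return new ReaderOptions(this);
    }
  }
}
```

Usage would look like `ReaderOptions.builder("snappy").withPageSize(4096).build()`. Compared with a separate validateSetup() step, the check is tied to object creation instead of relying on caller discipline.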
Let's follow up in a later PR.