HBASE-27264 Add options to consider compressed size when delimiting blocks during hfile writes #4675
Conversation
@@ -886,6 +905,27 @@ void ensureBlockReady() throws IOException {
      finishBlock();
    }

    public boolean shouldFinishBlock() throws IOException {
Is there a reason to have this logic here vs in HFileWriterImpl with the rest of the shouldFinishBlock logic?
This method involves dealing with some block specifics, like compression, the block content byte array buffer, and what to do with the compressed size when deciding what should be a block limit. Moving it to HFileWriterImpl would spill some block-specific variables and logic into the file writer logic. It just feels to me that putting it here is more cohesive.
@@ -319,6 +323,9 @@ protected void checkBlockBoundary() throws IOException {
      shouldFinishBlock = blockWriter.encodedBlockSizeWritten() >= hFileContext.getBlocksize()
        || blockWriter.blockSizeWritten() >= hFileContext.getBlocksize();
    }
    if (blockWriter.isSizeLimitCompressed()) {
Let's change this to if (blockWriter.isSizeLimitCompressed() && !shouldFinishBlock)? Just noting your comment on the calculation of the compression ratio in the other file: we could further avoid that cost if we already know we need to finish the block for other reasons.
We actually want to enter here if shouldFinishBlock is true, because that means the raw/encoded uncompressed size is larger than BLOCK_SIZE. But we don't know if the compressed size is smaller than BLOCK_SIZE, so we'll call blockWriter.shouldFinishBlock() to find that out. And in the case where shouldFinishBlock is false, we'll not actually calculate the compressed size inside blockWriter.shouldFinishBlock(), because we don't go beyond this point.
Mostly LGTM, just one question about combining the additional blockWriter.shouldFinishBlock() check with the original shouldFinishBlock logic.
@@ -319,6 +323,9 @@ protected void checkBlockBoundary() throws IOException {
      shouldFinishBlock = blockWriter.encodedBlockSizeWritten() >= hFileContext.getBlocksize()
        || blockWriter.blockSizeWritten() >= hFileContext.getBlocksize();
    }
    if (blockWriter.isSizeLimitCompressed()) {
      shouldFinishBlock &= blockWriter.shouldFinishBlock();
Related to the above comment by @bbeaudreault: in what situation would shouldFinishBlock = true and blockWriter.shouldFinishBlock() = false? Is that possible?
Yes, shouldFinishBlock could be true at this point, because so far we have only checked the "raw" uncompressed size or the encoded uncompressed size against BLOCK_SIZE. It is possible that these sizes are higher than BLOCK_SIZE, but the compressed size might still be less than BLOCK_SIZE.
I think this is a useful feature. When implementing an in-house LSM-tree based storage system, I used to make use of the compression rates of the already written blocks to predict the compression rate of the next block, to determine whether we should finish a block. I think the approach here is also OK: compressing once when we reach the default block size to predict the compression rate.

Maybe we could introduce something like a plugin to predict the compression rate? The default implementation would just always return 1, and we could introduce different algorithms to predict the compression rate for the current block. What do you guys think?

For me, the option names are a bit confusing... What does 'size.limit.compressed' mean? Thanks.
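A rough sketch of the plugin idea suggested above, just to illustrate the shape it could take (all names here are illustrative assumptions, not the interface that eventually landed in the PR):

  /** Predicts the compression rate of the block currently being written. */
  public interface CompressionRatePredictor {
    /** Feed the observed sizes of the block that was just finished. */
    void updateLatestBlockSizes(int uncompressedSize, int compressedSize);

    /** Predicted ratio of uncompressed to compressed size for the next block. */
    double predictCompressionRate();
  }

  /** Default implementation: always predicts a rate of 1, i.e. no compression benefit. */
  class IdentityCompressionRatePredictor implements CompressionRatePredictor {
    @Override
    public void updateLatestBlockSizes(int uncompressedSize, int compressedSize) {
      // stateless: the prediction never changes
    }

    @Override
    public double predictCompressionRate() {
      return 1.0;
    }
  }

With a rate of 1, block boundaries would behave exactly as they do today; other implementations could plug in the previous-block or compress-once heuristics discussed in this thread.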
I thought about it originally, but then concluded this feature made more sense as an on/off (off by default) behaviour. Do you see other variations in between?

The properties were indeed a bit confusing. I have tried to rename them now; let me know if they are more intuitive.
LGTM
Sorry, I think I misunderstood it previously. What if we leave this current PR as is, as an initial approach for using compressed sizes to delimit blocks, and then work on the pluggable solution in a separate ticket?
If we expose the configs now, it will be a pain to switch to a pluggable solution later, because we cannot simply remove these configs without a deprecation cycle... I still prefer that we make a pluggable solution first, and then the first implementation could be the algorithm in this PR.
OK, I have just pushed a new commit where the algorithm for determining the compression rate and adjusting the block size limit is pluggable.
@@ -248,6 +254,8 @@ static class Header {
    static final byte[] DUMMY_HEADER_NO_CHECKSUM =
      new byte[HConstants.HFILEBLOCK_HEADER_SIZE_NO_CHECKSUM];

    public static final String MAX_BLOCK_SIZE_UNCOMPRESSED = "hbase.block.max.size.uncompressed";
Can we move this and its logic to the predicate itself?
    // In order to avoid excessive compression size calculations, we do it only once when
    // the uncompressed size has reached BLOCKSIZE. We then use this compression size to
    // calculate the compression rate, and adjust the block size limit by this ratio.
    if (adjustedBlockSize == 0 || uncompressedBlockSize >= adjustedBlockSize) {
We are recomputing the compression size every time we are even a little over the previously calculated adjusted block size, which could happen for every block and would result in that computation all the time?
Well, only whenever we are closing the current block, based on the adjusted block size value from the previous block. It doesn't add much overhead, because we are already closing the block regardless. The main problem here is that we don't know yet whether we are only a little over, because so far we have been checking only the uncompressed size.
    int compressedSize = EncodedDataBlock.getCompressedSize(context.getCompression(),
      context.getCompression().getCompressor(), contents.getBuffer(), 0,
      contents.size());
    adjustedBlockSize = uncompressedBlockSize / compressedSize;
    adjustedBlockSize *= context.getBlocksize();
I was expecting the algorithm to be:
- Let the first block get closed with the regular adjustedBlockSize == block_size.
- Have a method in this predicate that is called on finishBlock and uses the statistics of the just-closed block to determine the compression ratio and store the adjusted size.
- Use the above compression ratio/adjusted size to determine the boundaries of the next block.

This way, we keep adjusting the ratio for the next block based on the previous block, without doing any extra compression (see the sketch below).
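A minimal, self-contained sketch of that flow (the class and method names are made up for illustration; they are not the PR's actual classes):

  public class PreviousBlockRatioSketch {
    private final int configuredBlockSize;
    private int adjustedLimit; // limit applied to the block currently being written

    public PreviousBlockRatioSketch(int configuredBlockSize) {
      this.configuredBlockSize = configuredBlockSize;
      this.adjustedLimit = configuredBlockSize; // first block: no history, use the configured size
    }

    /** Boundary check while writing: a plain comparison, no extra compression pass. */
    public boolean shouldFinishBlock(int uncompressedBytesWritten) {
      return uncompressedBytesWritten >= adjustedLimit;
    }

    /** Called when a block is finished, with the sizes observed after compression. */
    public void onBlockFinished(int uncompressedSize, int compressedSize) {
      double compressionRatio = (double) uncompressedSize / compressedSize;
      adjustedLimit = (int) (configuredBlockSize * compressionRatio);
    }

    public static void main(String[] args) {
      PreviousBlockRatioSketch p = new PreviousBlockRatioSketch(64 * 1024);
      // The first block closes at the regular 64KB; suppose it compressed 4:1.
      p.onBlockFinished(64 * 1024, 16 * 1024);
      // The next block may now accumulate roughly 256KB of uncompressed data.
      System.out.println(p.shouldFinishBlock(200 * 1024)); // false
      System.out.println(p.shouldFinishBlock(300 * 1024)); // true
    }
  }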
I have pushed a new commit with this approach.
A few comments on the current algorithm.
I have left a few comments that you can address before committing. +1, so that you don't need to wait for me to commit it.
    if (uncompressedBlockSize >= adjustedBlockSize) {
      adjustedBlockSize = context.getBlocksize() * compressionRatio;
    }
I'll remove the (uncompressedBlockSize >= adjustedBlockSize) check so that we adjust the size based on the previous block's compression every time, by calculating adjustedBlockSize in updateLatestBlockSizes itself and only returning it from this method.
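A tiny sketch of that simplification, with assumed names rather than the exact PR interface: the limit is recomputed only when a block finishes, and the boundary-check path just returns the stored value.

  class AdjustedLimitSketch {
    private double compressionRatio = 1.0;
    private int adjustedBlockSize;

    /** Called once per finished block; does all the recalculation. */
    void updateLatestBlockSizes(int configuredBlockSize, int uncompressed, int compressed) {
      compressionRatio = (double) uncompressed / compressed;
      adjustedBlockSize = (int) (configuredBlockSize * compressionRatio);
    }

    /** Called on the block-boundary path; no recalculation, just return the stored limit. */
    int calculateCompressionSizeLimit() {
      return adjustedBlockSize;
    }
  }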
    public boolean shouldFinishBlock() throws IOException {
      // int uncompressedBlockSize = blockSizeWritten();
      int uncompressedBlockSize = baosInMemory.size();
      if (uncompressedBlockSize >= fileContext.getBlocksize()) {
        if (uncompressedBlockSize < maxSizeUnCompressed) {
          int adjustedBlockSize = compressedSizePredicator.
            calculateCompressionSizeLimit(fileContext, uncompressedBlockSize);
          return uncompressedBlockSize >= adjustedBlockSize;
        }
        return true;
      }
      return false;
    }
I'll also move this to the Predicator and generalize this method:
Suggested change (replacing the shouldFinishBlock() method quoted above with a generalized check):

    public boolean checkBoundariesWithPredicate() throws IOException {
      if (predicator == null) {
        throw new IllegalArgumentException("Expected at least the default BoundariesCheckPredicate");
      }
      return predicator.shouldFinishBlock(fileContext, uncompressedBlockSize);
    }
    */
    @Override
    public void updateLatestBlockSizes(int uncompressed, int compressed) {
      compressionRatio = uncompressed/compressed;
nit:

    - compressionRatio = uncompressed/compressed;
    + compressionFactor = uncompressed/compressed;
    System.out.println(">>>> " + block.getUncompressedSizeWithoutHeader());
    System.out.println(">>>> " + blockCount);
remove System.out
…locks during hfile writes (#4675) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Signed-off-by: Ankit Singhal <ankit@apache.org>
…locks during hfile writes (apache#4675) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Signed-off-by: Ankit Singhal <ankit@apache.org> Change-Id: I0d969c83f9bb6a2a8d24e60e6b76b412b659e2b0
Here we propose two additional properties, "hbase.block.size.limit.compressed" and "hbase.block.size.max.compressed", that would allow considering the compressed size (if compression is in use) when delimiting blocks during hfile writing. When compression is enabled, certain datasets can have very high compression efficiency, so the default 64KB block size and 10GB max file size can lead to hfiles with a very large number of blocks.

In this proposal, "hbase.block.size.limit.compressed" is a boolean flag that switches to the compressed size for delimiting blocks, and "hbase.block.size.max.compressed" is an int with the limit, in bytes, for the compressed block size, in order to avoid very large uncompressed blocks (defaulting to 320KB).
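As a hedged illustration (using the property names as proposed in this description; they were renamed later during review), the options could be set in configuration roughly like this:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;

  public class CompressedBlockLimitExample {
    public static void main(String[] args) {
      Configuration conf = HBaseConfiguration.create();
      // Switch block delimiting to use the compressed size (off by default).
      conf.setBoolean("hbase.block.size.limit.compressed", true);
      // Cap, in bytes, applied when the compressed check is in use (proposed default: 320KB).
      conf.setInt("hbase.block.size.max.compressed", 320 * 1024);
    }
  }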