HBASE-28805: Chunked persistence of backing map for persistent bucket cache. #6183
Conversation
💔 -1 overall
This message was automatically generated.
byte[] convertToBytes(long value) {
  byte[] bytes = new byte[Long.BYTES];
  int length = bytes.length;
  for (int i = 0; i < length; i++) {
    bytes[length - i - 1] = (byte) (value & 0xFF);
    value >>= 8;
  }
  return bytes;
}

long convertToLong(byte[] bytes) {
  long value = 0;
  for (byte b : bytes) {
    value = (value << 8) + (b & 0xFF);
  }
  return value;
}
There are already Bytes.toBytes, Bytes.toLong for this.
ok, let me use those.
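For reference, a minimal sketch of the suggested replacement: org.apache.hadoop.hbase.util.Bytes already provides fixed-width, big-endian conversions between long and byte[], so the hand-rolled helpers above are unnecessary.

```java
import org.apache.hadoop.hbase.util.Bytes;

public class LongBytesExample {
  public static void main(String[] args) {
    long chunkSize = 10_000L;
    byte[] encoded = Bytes.toBytes(chunkSize); // 8 bytes, big-endian, same as convertToBytes above
    long decoded = Bytes.toLong(encoded);      // same as convertToLong above
    System.out.println(decoded == chunkSize);  // true
  }
}
```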
Two outdated review comments on hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java were marked resolved.
int blockCount = 0;
int chunkCount = 0;

BucketCacheProtos.BackingMap.Builder builder = BucketCacheProtos.BackingMap.newBuilder();
Why are we creating proto builder here? Can't this go to BucketProtoUtils?
ack!
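A rough sketch of what moving the proto building into BucketProtoUtils could look like, per the reviewer's suggestion. The BackingMapEntry sub-message and the per-key/per-value toPB converters are assumptions of this sketch, not necessarily the shape of the final patch:

```java
// Sketch only: assumes BucketCacheProtos.BackingMapEntry and key/value toPB
// converters exist in BucketProtoUtils; exact names may differ in the real code.
static BucketCacheProtos.BackingMap toPB(Map<BlockCacheKey, BucketEntry> chunk) {
  BucketCacheProtos.BackingMap.Builder builder = BucketCacheProtos.BackingMap.newBuilder();
  for (Map.Entry<BlockCacheKey, BucketEntry> entry : chunk.entrySet()) {
    builder.addEntry(BucketCacheProtos.BackingMapEntry.newBuilder()
      .setKey(toPB(entry.getKey()))      // BlockCacheKey -> proto (assumed converter)
      .setValue(toPB(entry.getValue()))  // BucketEntry -> proto (assumed converter)
      .build());
  }
  return builder.build();
}
```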
🎊 +1 overall
This message was automatically generated.
… cache. The persistent bucket cache feature relies on persisting the backing map to a file, and the protobuf APIs are used to serialise the backing map and its related structures into that file. An asynchronous thread periodically flushes the contents of the backing map to the persistence file. The protobuf library limits protobuf messages to 2GB; if the backing map grows beyond 2GB, an unexpected exception is thrown in the asynchronous thread and the persister thread stops. This causes the persistence file to go out of sync with the actual bucket cache, so the bucket cache shrinks to a smaller size after a cache restart and checksum errors are also reported. This Jira tracks chunking the backing map during persistence so that every protobuf message stays under 2GB. Change-Id: I8623ad2eaf1d1a56f96bc3120b14e5229ae55c42
Is it possible to move all the serialization/deserialization logic to BucketProtoUtils?
    .setChecksum(ByteString
      .copyFrom(((PersistentIOEngine) cache.ioEngine).calculateChecksum(cache.getAlgorithm())))
    .build();
}

-private static BucketCacheProtos.BackingMap toPB(Map<BlockCacheKey, BucketEntry> backingMap) {
+static void toPB(BucketCache cache, FileOutputStream fos, long chunkSize) throws IOException {
nit: We should rename it to serializeAsPB.
ack
fos.write(PB_MAGIC_V2);
long numChunks = backingMap.size() / persistenceChunkSize;
if (backingMap.size() % persistenceChunkSize != 0) {
  numChunks += 1;
}

LOG.debug("persistToFile: before persisting backing map size: {}, "
  + "fullycachedFiles size: {}, chunkSize: {}, numberofChunks: {}",
  backingMap.size(), fullyCachedFiles.size(), persistenceChunkSize, numChunks);

fos.write(Bytes.toBytes(persistenceChunkSize));
fos.write(Bytes.toBytes(numChunks));
Since we are already passing the output stream to BucketProtoUtils, who's now also responsible for serialization, can we just move everything there?
ack
ack
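For context on the header layout written above (PB_MAGIC_V2, then the chunk size and chunk count as 8-byte longs), here is a minimal, hedged sketch of what the matching read path could look like. This is not the retrieval code from the patch, and the assumption that each chunk is stored as a length-delimited BackingMap message is mine:

```java
// Sketch of parsing the header written by persistToFile above.
// Assumes java.io.DataInputStream, java.util.Arrays and the generated
// BucketCacheProtos classes are on the classpath.
static void retrieveChunkedBackingMap(FileInputStream fis) throws IOException {
  DataInputStream in = new DataInputStream(fis);
  byte[] magic = new byte[PB_MAGIC_V2.length];
  in.readFully(magic);
  if (!Arrays.equals(magic, PB_MAGIC_V2)) {
    throw new IOException("Unexpected magic in bucket cache persistence file");
  }
  // DataInput reads big-endian longs, matching Bytes.toBytes(long) on the write side.
  long chunkSize = in.readLong(); // available for validation/logging
  long numChunks = in.readLong();
  for (long i = 0; i < numChunks; i++) {
    BucketCacheProtos.BackingMap chunk = BucketCacheProtos.BackingMap.parseDelimitedFrom(in);
    // ... rebuild the in-memory backing map from this chunk's entries ...
  }
}
```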
💔 -1 overall
This message was automatically generated.
Change-Id: I3eaddcef94a711e63eeefd47063f074d0a57c984
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
Change-Id: I082a83ef841125eed00d3ccd660bd2beb95dc4f9
Force-pushed from 00963f0 to f853dbc.
🎊 +1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
LGTM, +1.
… cache. (apache#6183) HBASE-28805: Chunked persistence of backing map for persistent bucket cache. Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org> Change-Id: I72efe71040a8970f87a8b2a08c4c4ac1ceef74bc
The persistent bucket cache feature relies on persisting the backing map to a file. The protobuf APIs are used to serialise the backing map and its related structures into that file.
An asynchronous thread periodically flushes the contents of the backing map to the persistence file.
The protobuf library limits protobuf messages to 2GB. If the backing map grows beyond 2GB, an unexpected exception is thrown in the asynchronous thread and the persister thread stops. This causes the persistence file to go out of sync with the actual bucket cache, so the bucket cache shrinks to a smaller size after a cache restart, and checksum errors are also reported.
This Jira tracks chunking the backing map during persistence so that every protobuf message stays under 2GB.
Change-Id: I8623ad2eaf1d1a56f96bc3120b14e5229ae55c42
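To make the chunking idea concrete, here is a minimal, hedged sketch of writing the backing map in bounded chunks so that no single protobuf message approaches the 2GB limit. The toPB(chunk) helper and the use of writeDelimitedTo are assumptions of this sketch, not necessarily what the committed patch does:

```java
// Illustrative sketch only: split the backing map into chunks of at most
// chunkSize entries and serialize each chunk as its own protobuf message.
// Assumes a toPB(Map<BlockCacheKey, BucketEntry>) helper exists.
static void persistChunked(Map<BlockCacheKey, BucketEntry> backingMap, FileOutputStream fos,
    long chunkSize) throws IOException {
  Map<BlockCacheKey, BucketEntry> chunk = new HashMap<>();
  for (Map.Entry<BlockCacheKey, BucketEntry> entry : backingMap.entrySet()) {
    chunk.put(entry.getKey(), entry.getValue());
    if (chunk.size() >= chunkSize) {
      toPB(chunk).writeDelimitedTo(fos); // one length-delimited message per chunk
      chunk.clear();
    }
  }
  if (!chunk.isEmpty()) {
    toPB(chunk).writeDelimitedTo(fos); // final partial chunk
  }
}
```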