HBASE-22802 Avoid temp ByteBuffer allocation in FileIOEngine#read #479
Conversation
💔 -1 overall
This message was automatically generated.
 */
@FunctionalInterface
interface ChannelReader {
  int read(ReadableByteChannel channel, ByteBuffer buf, long offset) throws IOException;
The ReadableByteChannel cannot accept an offset position in read? It looks strange that we provide an offset argument here...
Any good idea to abstract this? FileChannel needs the offset, so we provide it as an argument.
I have no better idea now; maybe we can keep the current way :-)
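To illustrate the design being discussed: the `ChannelReader` interface is from the patch, but the two reader constants, the class name, and the demo harness below are illustrative stand-ins, not HBase code. The sketch shows why the offset argument is needed even though `ReadableByteChannel` itself has no positional read: the `FileChannel` implementation uses it for a pread-style read, while a plain sequential channel simply ignores it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelReaderSketch {

  // The interface from the patch: the offset only matters for FileChannel.
  @FunctionalInterface
  interface ChannelReader {
    int read(ReadableByteChannel channel, ByteBuffer buf, long offset) throws IOException;
  }

  // FileChannel honors the offset (pread-style positional read).
  static final ChannelReader FILE_READER =
      (channel, buf, offset) -> ((FileChannel) channel).read(buf, offset);

  // A plain ReadableByteChannel has no positional read, so the offset is ignored.
  static final ChannelReader SEQUENTIAL_READER =
      (channel, buf, offset) -> channel.read(buf);

  // Demo: read 5 bytes at offset 6 of a temp file containing "hello world".
  static String demo() throws IOException {
    Path tmp = Files.createTempFile("channel-reader", ".bin");
    try {
      Files.write(tmp, "hello world".getBytes(StandardCharsets.UTF_8));
      try (FileChannel fc = FileChannel.open(tmp, StandardOpenOption.READ)) {
        ByteBuffer buf = ByteBuffer.allocate(5);
        FILE_READER.read(fc, buf, 6L);
        buf.flip();
        return StandardCharsets.UTF_8.decode(buf).toString();
      }
    } finally {
      Files.delete(tmp);
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(demo()); // prints "world"
  }
}
```

The cast inside `FILE_READER` is what makes the single functional interface workable for both channel kinds, at the cost of the slightly odd-looking offset parameter the reviewers noted.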
// Read max possible into the current BB
int len = channelRead(channel, this.curItem);
if (len > 0)
int len = read(channel, this.curItem, offset, reader);
Seems we can also create an iterator to read & fill the ByteBuff? Similar to the BufferIterator in ByteBufferArray; I think that would be clearer... Can be a follow-up issue?
hbase-common/src/main/java/org/apache/hadoop/hbase/nio/MultiByteBuff.java
@@ -304,23 +303,23 @@ void refreshFileConnection(int accessFileNum, IOException ioe) throws IOExceptio
  }

  private interface FileAccessor {
    int access(FileChannel fileChannel, ByteBuffer byteBuffer, long accessOffset)
    int access(FileChannel fileChannel, ByteBuff byteBuffer, long accessOffset)
byteBuffer -> byteBuff or buff ?
        throws IOException;
  }

  private static class FileReadAccessor implements FileAccessor {
    @Override
    public int access(FileChannel fileChannel, ByteBuffer byteBuffer,
    public int access(FileChannel fileChannel, ByteBuff byteBuffer,
ditto
    }
  }

  private static class FileWriteAccessor implements FileAccessor {
    @Override
    public int access(FileChannel fileChannel, ByteBuffer byteBuffer,
    public int access(FileChannel fileChannel, ByteBuff byteBuffer,
ditto
...erver/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileScannerImplReferenceCount.java
@@ -689,7 +691,7 @@ private void freeEntireBuckets(int completelyFreeBucketsNeeded) {
  // this set is small around O(Handler Count) unless something else is wrong
  Set<Integer> inUseBuckets = new HashSet<>();
  backingMap.forEach((k, be) -> {
    if (be.isRpcRef()) {
    if (ioEngine.usesSharedMemory() && be.isRpcRef()) {
The javadoc says:
/**
 * This method will find the buckets that are minimally occupied
 * and are not reference counted and will free them completely
 * without any constraint on the access times of the elements,
 * and as a process will completely free at most the number of buckets
 * passed, sometimes it might not due to changing refCounts
 *
 * @param completelyFreeBucketsNeeded number of buckets to free
 **/
That means the freeing process considers only the usage percentage, regardless of the accessCounter, so we would free an entire bucket even if some RPC is still reading it in the exclusive IOEngine case? That seems incorrect?
makes sense
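The guarded scan being discussed could be modeled as follows. This is an illustrative simulation, not HBase code: `Entry` is a hypothetical stand-in for `BucketEntry`, and the boolean flag stands in for `IOEngine#usesSharedMemory()`. It shows the patched condition: RPC references pin a bucket only when the engine shares memory with in-flight RPC handlers, which is exactly the behavior the thread later debates for exclusive engines.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FreeBucketsSketch {

  // Illustrative stand-in for BucketEntry: a bucket index plus an
  // "is some RPC still referencing this" flag.
  record Entry(int bucketIndex, boolean rpcRef) {}

  // Mirrors the patched guard in BucketCache#freeEntireBuckets: for a
  // shared-memory engine, RPC-referenced buckets are collected as in-use
  // and skipped; for an exclusive engine every RPC reads its own copy,
  // so no bucket is considered pinned.
  static Set<Integer> inUseBuckets(Map<String, Entry> backingMap, boolean usesSharedMemory) {
    Set<Integer> inUse = new HashSet<>();
    backingMap.forEach((k, be) -> {
      if (usesSharedMemory && be.rpcRef()) {
        inUse.add(be.bucketIndex());
      }
    });
    return inUse;
  }
}
```

With `usesSharedMemory = false`, the returned set is empty regardless of `rpcRef`, which is the point raised above: an exclusive-engine bucket with in-flight reads would not be protected by this set alone.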
@@ -502,8 +502,10 @@ public Cacheable getBlock(BlockCacheKey key, boolean caching, boolean repeat,
  // block will use the refCnt of bucketEntry, which means if two HFileBlock mapping to
  // the same BucketEntry, then all of the three will share the same refCnt.
  Cacheable cachedBlock = ioEngine.read(bucketEntry);
  // RPC start to reference, so retain here.
  cachedBlock.retain();
  if (ioEngine.usesSharedMemory()) {
Could we consider the patch from another side? Say the BucketEntry#refCnt still means how many reference paths there are from the RPC handlers & BucketCache... But the Cacheable#refCnt will always be 1 while the RPC is still handling. Once the cells are shipped to the client, both the Cacheable#refCnt & BucketEntry#refCnt will decrease; actually the Cacheable will de-allocate the memory from ByteBuffAllocator.
In this way, the BucketEntry#refCnt still has the same meaning as for the SharedFileIOEngine in BucketCache. I mean, in BucketCache we don't need to care whether the IOEngine is shared or exclusive.
let me think how to achieve this
hbase-common/src/main/java/org/apache/hadoop/hbase/nio/MultiByteBuff.java
private Iterator<ByteBuffer> buffsIterator = new Iterator<ByteBuffer>() {
  @Override
  public boolean hasNext() {
    return curItemIndex <= limitedItemIndex && items[curItemIndex].hasRemaining();
One concern here; assume the following steps:
- currentByteBuffer = it#next();
- it#hasNext() must return true;
- we write into the current ByteBuffer & fill it completely, so its hasRemaining() becomes false...
- it#hasNext() will now return false because currBuff.hasRemaining() is false.
So for the same position, it#hasNext() returns different values. In a general sense, if we don't call next(), then it#hasNext() should always return the same value, I mean?
Maybe we can remove the && items[curItemIndex].hasRemaining() here and let the outside logic handle it?
How about changing it like this:
curItemIndex < limitedItemIndex || (curItemIndex == limitedItemIndex && items[curItemIndex].hasRemaining())
@@ -60,4 +64,30 @@ protected final void deallocate() {
  public final ReferenceCounted touch(Object hint) {
    throw new UnsupportedOperationException();
  }

  public void setInnerRefCnt(Optional<RefCnt> refCnt) {
The RefCnt is a general reference-count class here... The refCnt with innerRefCnt is mainly used for BucketEntry? Could we create a subclass for BucketEntry only? I mean, having the general ByteBuff use the customized one would be resource-consuming...
Looks better now. Please address the newly added comments. Thanks.
    return this.refCnt;
  }

  public void setRefCnt(RefCnt refCnt) {
I think setRefCnt for a given ByteBuff is dangerous, because its previous recycler will be lost; so better not to expose this method as public? We had one similar issue before, see:
/**
* In theory, the upstream should never construct an ByteBuff by passing an given refCnt, so
* please don't use this public method in other place. Make the method public here because the
* BucketEntry#wrapAsCacheable in hbase-server module will use its own refCnt and ByteBuffers from
* IOEngine to composite an HFileBlock's ByteBuff, we didn't find a better way so keep the public
* way here.
*/
public static ByteBuff wrap(ByteBuffer[] buffers, RefCnt refCnt) {
I have not found a better way yet; any suggestions?
hbase-common/src/main/java/org/apache/hadoop/hbase/nio/CompositeRefCnt.java
  @Override
  public boolean release() {
    boolean innerRes = true;
Just do the following (not compiled, but should be similar):
return super.release() && innerRefCnt.map(RefCnt::release).orElse(true);
  @Override
  public ReferenceCounted retain() {
    if (innerRefCnt.isPresent()) {
innerRefCnt.map(innerRefCnt::retain).orElse(super.retain);
hbase-common/src/main/java/org/apache/hadoop/hbase/nio/MultiByteBuff.java
  }

  @Override
  @edu.umd.cs.findbugs.annotations.SuppressWarnings(value="IT_NO_SUCH_ELEMENT",
I think this can throw a NoSuchElementException if curItemIndex exceeds the limit?
@chenxu14 seems there are still some checkstyle issues? Please see: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-479/55/artifact/out/diff-javadoc-javadoc-hbase-common.txt ...
Sorry for the delay; very glad if you can help improve it.
 * the upstream should not use this public method in other place, or the previous recycler
 * will be lost.
 */
public void shareRefCnt(RefCnt refCnt, boolean replace) {
In the case of file-based BC, we now read the cached data into pooled BBs. Every read RPC acquires its own BBs and reads into them; ideally no sharing of BBs across readers happens at all. But here we seem to share the ref count of the BC (file-based here) with the RPCs. A little strange to digest.
Sorry, some conversations with @openinx are missing here.
We do this in order not to violate the LRU policy when BucketCache#freeEntireBuckets is executed.
  dstBuffer.rewind();
  return be.wrapAsCacheable(new ByteBuffer[] { dstBuffer });
  dstBuff.rewind();
  return be.wrapAsCacheable(dstBuff);
Reading this wrapAsCacheable method: it seems we have only one BucketEntry type now for both shared and exclusive entry types, and it uses the refCount associated with it (within BC scope); is that what makes this kind of composite-count mechanism necessary?
yep, exclusive HFileBlock will use CompositeRefCnt#innerRefCnt to share with BucketEntry
IMHO this is an unwanted sharing and ref count check.
 * BucketCache#freeEntireBuckets is called, will not violate the LRU policy.
 * <p>
 * And it has its own refCnt & Recycler, Once the cells shipped to client, then both the
 * Cacheable#refCnt & BucketEntry#refCnt will be decreased. when Cacheable's refCnt decrease
This means that when, say, 2 read RPCs access a block from the file-based BC, that entry can NOT get evicted unless both RPCs are over?
yep, as mentioned above
I think it is good enough to commit now; let me commit this patch if there is no objection.
Raised a concern regarding the usage of RPC ref counts even for evictions when it is a FileIOE and an exclusive cache. IMHO we should really see how to avoid this.
Can we ignore LRU in this case (BucketCache#freeEntireBuckets)? I think HBASE-16630 has already done so. In this way, there is no refCnt sharing between the exclusive HFileBlock and the BucketEntry, and a lot of temporary object creation (CompositeRefCnt) can be omitted.
I think @chenxu14 has tried to accomplish this in an earlier patch version (say, just ignore the RPC reference in freeEntireBuckets for the exclusive IO engine case). What I was concerned about before is: we would be freeing referenced blocks in an exclusive IO engine, which seems to violate the LRU policy. Reconsidering the case: for an exclusive IO engine, freeing the memory area in the bucket allocator should be OK. Compared to a buggy CompositeRefCnt (not saying the current version has bugs, but it is a more complex implementation and easy to introduce new bugs into), sacrificing some LRU behavior in rare cases should also be acceptable. I mean we need to balance between them; maybe we can still try the patch without CompositeRefCnt. @anoopsjohn FYI.
Will add another PR to replace this.