-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-24962 Optimize BufferNode Lock #2343
base: master
Are you sure you want to change the base?
Conversation
1130b79
to
db49131
Compare
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left few comments, also we should be careful touching some sensitive area here. Do you have any specific test results to share?
Thanks for the patch.
@@ -298,6 +299,8 @@ default boolean storeInDispatchedQueue() { | |||
// ============================================================================================ | |||
private final class TimeoutExecutorThread extends Thread { | |||
private final DelayQueue<DelayedWithTimeout> queue = new DelayQueue<DelayedWithTimeout>(); | |||
private final ConcurrentHashMap<DelayedWithTimeout, DelayedWithTimeout> pendingBufferNode = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please use ConcurrentMap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?why use concurrentMap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general rule, it's good to handle implementation by using interface to avoid the pain of changing all references if we wish to change the implementation. I understand, we don't want to change implementation here, but it's always great to refer to implementation using interface unless we want to deliberately use implementor methods which are not overridden from interface.
Hence, just like why we would want to use Map<Key,Val> map=new HashMap<>
, let's also use:
private final ConcurrentMap<DelayedWithTimeout, DelayedWithTimeout> pendingBufferNode = new ConcurrentHashMap<>();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx @virajjasani
yes, but i think it is okay to use either, RemoteProcedureDispatcher doesn't change often
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
Outdated
Show resolved
Hide resolved
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
Outdated
Show resolved
Hide resolved
private final Set<RemoteProcedure> dispatchedOperations = new HashSet<>(); | ||
private Set<RemoteProcedure> operations = Sets.newConcurrentHashSet(); | ||
private final Set<RemoteProcedure> dispatchedOperations = Sets.newConcurrentHashSet(); | ||
private final Object lock = new Object(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this looks promising but add(RemoteProcedure operation)
and abortOperationsInQueue()
will now contend with each other. Do we expect both to synchronize on the same object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, but abortOperationsInQueue does't happen very often
((BufferNode) task).dispatch(); | ||
} | ||
} | ||
} | ||
|
||
public void putIfAbsent(BufferNode bufferNode) { | ||
if (pendingBufferNode.putIfAbsent(bufferNode, bufferNode) == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to add same object as key and value, why are we using ConcurrentHashMap
instead of Sets.newConcurrentHashSet()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use atomically putIfAbsent.
@@ -370,39 +383,47 @@ public TRemote getKey() { | |||
} | |||
|
|||
@Override | |||
public synchronized void add(final RemoteProcedure operation) { | |||
if (this.operations == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure we are matching this condition with timeoutExecutor.putIfAbsent(this);
?
Based on putIfAbsent()
method, it seems only first time when entry is added for new key in map i.e when concurrent map's thread-safe version of putIfAbsent()
returns null, we execute this block but is there any scenario that might not adhere to this if (this.operations == null)
condition by any chance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TimeoutExecutorThread#run will remove task from pendingBufferNode, so after run, task will not be included in the pendingBufferNode
else {
pendingBufferNode.remove(task);
((BufferNode) task).dispatch();
}
@@ -357,8 +369,9 @@ public void awaitTermination() { | |||
*/ | |||
protected final class BufferNode extends DelayedContainerWithTimestamp<TRemote> | |||
implements RemoteNode<TEnv, TRemote> { | |||
private Set<RemoteProcedure> operations; | |||
private final Set<RemoteProcedure> dispatchedOperations = new HashSet<>(); | |||
private Set<RemoteProcedure> operations = Sets.newConcurrentHashSet(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RemoteProcedure
is parameterized class and it is used in raw format. Although not for this PR, but we should really change this notion of using the class as parameterized in this entire class.
e.g
private Set<RemoteProcedure<TEnv, TRemote>> operations ...
public void add(final RemoteProcedure<TEnv, TRemote> operation) ...
protected abstract void remoteDispatch(TRemote key, Set<RemoteProcedure<TEnv, TRemote>> operations);
Nothing urgent, can be taken up in separate patch to well distinguish the commit purpose.
db49131
to
aeed6f5
Compare
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have numbers to show this a better approach and/or a writeup on the approach and why it better? On cursory glance, we are adding task accounting in a concurrent map -- not cheap -- and then synchronizing on a data member lock instead of synchronizing on the method (though the synchronize on lock extends over the full method call in at least two places so effectively the same price is being paid). Thanks @cuibo01
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
No description provided.