Ensure read-lock is not continuously held on a section while iterating over concurrent maps #9787
Conversation
…g over concurrent maps
```java
} finally {
    if (acquiredReadLock) {
        storedKey = (K) table[bucket];
        storedValue = (V) table[bucket + 1];
        unlockRead(stamp);
```
try-finally for unlock? (Same comment to all unlock locations)
Sure, I think in all places here it should be guaranteed not to throw, but it makes sense to do it as general practice.
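The pattern the reviewer is asking for can be sketched as follows. This is an illustrative stand-alone example, not Pulsar's actual `ConcurrentLongHashMap` code: the class name, `table` layout, and `readSlot` method are hypothetical, but the try/finally shape around `StampedLock.unlockRead` is the point being discussed.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical sketch: wrap the critical section in try/finally so the
// read stamp is always released, even if the body throws.
public class TryFinallyUnlock {
    private final StampedLock lock = new StampedLock();
    private final Object[] table = new Object[2];

    Object readSlot(int bucket) {
        long stamp = lock.readLock();
        try {
            return table[bucket];       // may throw (e.g. bad index)
        } finally {
            lock.unlockRead(stamp);     // guaranteed to run
        }
    }

    public static void main(String[] args) {
        TryFinallyUnlock m = new TryFinallyUnlock();
        m.table[0] = "v";
        System.out.println(m.readSlot(0)); // prints "v"
        try {
            m.readSlot(99);                // throws, but still unlocks
        } catch (ArrayIndexOutOfBoundsException e) {
            // the finally block released the lock, so reads still work
        }
        System.out.println(m.readSlot(0)); // prints "v"
    }
}
```

Without the finally block, the exception path would leave the stamp unreleased and any later writer would block forever.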
Nice! Can't wait to try it out.
Sorry. I came too late to the party. +1
…g over concurrent maps (apache#9787) * Ensure read-lock is not continuously held on a section while iterating over concurrent maps * Added try/finally
btw. I happened to come across #8877 which was a fix to a deadlock.
…g over concurrent maps (apache#9787) * Ensure read-lock is not continuously held on a section while iterating over concurrent maps * Added try/finally (cherry picked from commit 4c369c9)
### Motivation In several places in the code, when iterating over the custom hashmaps, we first take a copy of the map. This was done whenever the iteration could end up modifying the map, since a non-reentrant mutex was held during the iteration and any modification would lead to a deadlock. Since the behavior was changed in #9787 to not hold the section mutex during the iteration, there's no more need to make a copy of the maps.
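The before/after contrast described in that motivation can be sketched as below. This uses `ConcurrentHashMap` as an illustrative stand-in for the custom maps; the keys and values are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the two iteration patterns: the old defensive copy versus
// iterating in place once no lock is held during the callback.
public class CopyVsDirect {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        // Before: snapshot the values first, so that mutating the map
        // from the loop body cannot deadlock on a section lock.
        List<Integer> copy = new ArrayList<>(map.values());
        for (Integer v : copy) {
            if (v == 1) map.remove("a"); // safe: we iterate the copy
        }

        // After: iterate directly; the callback runs without any lock
        // held, so it may mutate the map itself.
        map.forEach((k, v) -> {
            if (v == 2) map.remove(k);
        });

        System.out.println(map.size()); // prints "0"
    }
}
```

The copy costs an allocation and a full traversal on every iteration, which is the overhead the change removes.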
Fixes #618 ### Motivation See #618 (comment) for the deadlock analysis. ### Modifications - Use `ConcurrentHashMap` instead of `ConcurrentLongHashMap`. Though this bug may already be fixed in apache/pulsar#9787, the `ConcurrentHashMap` from the Java standard library is more reliable. The possible performance enhancement brought by `ConcurrentLongHashMap` still needs to be proved. - Use an `AtomicBoolean` as `KafkaTopicConsumerManager`'s state instead of a read-write lock, to avoid the `close()` method blocking while trying to acquire the write lock. - Run a single cursor expire task instead of one task per channel; since #404 changed `consumerTopicManagers` to a static field, there's no reason to run a task for each connection.
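The `AtomicBoolean`-as-state idea mentioned in those modifications can be sketched as follows. The class and method names here are hypothetical, not `KafkaTopicConsumerManager`'s actual API; the point is that `compareAndSet` makes close-once semantics lock-free, so `close()` can never block behind in-flight readers the way a write-lock acquisition can.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical close-guard: close() flips a flag atomically instead of
// blocking on a write lock, so it cannot deadlock and is idempotent.
public class CloseGuard {
    private final AtomicBoolean closed = new AtomicBoolean(false);

    public boolean close() {
        // Only the first caller wins and performs the actual cleanup.
        return closed.compareAndSet(false, true);
    }

    public void doWork() {
        if (closed.get()) {
            throw new IllegalStateException("already closed");
        }
        // ... normal operation ...
    }

    public static void main(String[] args) {
        CloseGuard g = new CloseGuard();
        g.doWork();
        System.out.println(g.close()); // prints "true"  (first close wins)
        System.out.println(g.close()); // prints "false" (idempotent)
    }
}
```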
hi @merlimat @eolivelli @lhotari:
I don't quite understand why the enhanced for loop is replaced with `for (int i = 0...)` in this PR. As far as I know, for array iteration the enhanced for loop is equivalent to `for (int i = 0...)`. Is there any other purpose? Looking forward to your reply,
thanks a lot
Motivation
As discussed in #9764, the fact that we're potentially holding a read-lock while scanning through a section of the map has several implications, chief among them that any callback which modifies the map while the non-reentrant lock is held can deadlock.
Instead of holding the lock throughout the scan of the section, we should release the read lock before calling the processing function, going back into optimistic-read mode.
This will not add any overhead (in terms of volatile reads) compared to the current implementation, but it will avoid all the possible deadlock traps, since we will never be holding the lock while calling the user code.
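The strategy above can be sketched with `StampedLock`'s optimistic-read API. This is a minimal illustrative map, not Pulsar's actual `ConcurrentLongHashMap` section code: the `SectionScan` class, its flat `[key, value, ...]` table, and the `put` helper are assumptions for the example. What it demonstrates is the key point of the PR: each slot is read under an optimistic stamp (upgrading to a real read lock only if validation fails), and no lock is held by the time the processing function runs.

```java
import java.util.concurrent.locks.StampedLock;
import java.util.function.BiConsumer;

// Minimal sketch of the iteration strategy: copy each slot out under an
// optimistic (or briefly held) read lock, then invoke user code lock-free.
public class SectionScan<K, V> {
    private final StampedLock lock = new StampedLock();
    private final Object[] table = new Object[8]; // [key, value, key, value, ...]

    public void put(int bucket, K key, V value) {
        long stamp = lock.writeLock();
        try {
            table[bucket] = key;
            table[bucket + 1] = value;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    @SuppressWarnings("unchecked")
    public void forEach(BiConsumer<K, V> processor) {
        for (int bucket = 0; bucket < table.length; bucket += 2) {
            long stamp = lock.tryOptimisticRead();
            K storedKey = (K) table[bucket];
            V storedValue = (V) table[bucket + 1];
            if (!lock.validate(stamp)) {
                // A writer raced us: re-read the slot under a real read
                // lock, releasing it again before calling the processor.
                stamp = lock.readLock();
                try {
                    storedKey = (K) table[bucket];
                    storedValue = (V) table[bucket + 1];
                } finally {
                    lock.unlockRead(stamp);
                }
            }
            // No lock is held here, so the processor may safely call back
            // into the map (e.g. put/remove) without risking a deadlock.
            if (storedKey != null) {
                processor.accept(storedKey, storedValue);
            }
        }
    }

    public static void main(String[] args) {
        SectionScan<String, Integer> map = new SectionScan<>();
        map.put(0, "a", 1);
        map.put(2, "b", 2);
        StringBuilder sb = new StringBuilder();
        map.forEach((k, v) -> sb.append(k).append(v));
        System.out.println(sb); // prints "a1b2"
    }
}
```

The optimistic fast path costs only a couple of volatile reads per slot (the stamp acquisition and validation), which is why this matches the current implementation's overhead while removing the deadlock window.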