Batch enqueue/dequeue for bqueue #14121
This is missing a signed-off-by.
There are many test failures in the tests-functional-ubuntu (20.04) logs. I know that the test suite has been flaky in the CI infrastructure lately (and the buildbot being down is not helping), but this looks like the result of a regression at a glance.
@ahrens The commit should be squashed (you will want to force push the result) and given signed-off-by. I am waiting on the results of the buildbot before doing a deeper look at the logic, although I like how small and well documented the commit is.
I wonder if you have any benchmark for this patch?
@ryao Would you like to take another look at this? I've made the suggested changes to spelling and comments.
I would be happy to take another look at it. I should be able to look on Friday.
I had something come up on Friday that kept me from looking at this then. Anyway, I have looked at it now.
The short version is that the logic looks solid to me, but my remarks about naming a variable `deqing` being unkind to bilingual reviewers have not been addressed. I would strongly prefer it if someone ran `sed -e s/deqing/dequeuing/g -e s/enqing/enqueuing/g` on this patch before we merged it. Other than that, I see no reason not to merge it. It is a very nice technical improvement.
The long version is the following attempt to elaborate on why I dislike the mnemonic `deqing` so much. First, I fully understand that the meaning of `deqing` is immediately obvious to any native English speaker who has no knowledge of Chinese Pinyin, such that others might think "why is he making such a big deal over it?". Having figured out what `deqing` meant when I made my original comments, I got over it fairly quickly this time and was able to read the code properly. The first time I saw it, though, it took longer than I am willing to admit for me to realize what it meant, because my brain had interpreted it not as English but as Chinese Pinyin.
Unfortunately, `de` is the Pinyin for some of the most widely used Chinese characters, which have a frequency similar to `the` in English (despite Chinese not having grammatical articles). In addition, both `qing` and `en` are valid Pinyin, and Chinese loanwords in English are often written using Pinyin, so seeing `deqing` in the context that this patch used it is highly likely to trigger an "it is romanized Chinese" reaction in the minds of people familiar with Chinese Pinyin.
With most Romanizations of foreign languages, upon failing to understand the Romanization as being part of that language, most would fall back to trying to interpret it as another language. However, Chinese Pinyin is highly ambiguous, with up to 16 theoretical different possible interpretations when considering tonality (minus the often omitted neutral tone) and even more when considering context sensitivity. I recall once seeing a dictionary define a word in Chinese to have 97 different possible interpretations, dependent on context. The result is that when someone reads Chinese Pinyin without tone marks or context, the natural reaction is to treat it as the output of a hash function, which strips it of any potential meaning.
The lexical similarity of `enqing` to `deqing` would make one correctly assume that it is related, but since the misidentification of `deqing` as Pinyin meant it was treated as the output of a hash function, it is natural to treat `enqing` as the output of a hash function too. That meant that my initial read of the code was one where I did not have the benefit of the mnemonics, and the code was substantially more difficult to understand than it would have been had I understood the mnemonic.
The result is that this mnemonic is very unkind to bilingual individuals.
@ryao Apologies, I made the naming change 2 months ago but didn't properly upload it here. I've updated the PR now. Let me know if there are any places that I missed.
The patches need to be squashed, but otherwise, this looks good.
The Blocking Queue (bqueue) code is used by zfs send/receive to send messages between the various threads. It uses a shared linked list, which is locked whenever we enqueue or dequeue. For workloads which process many blocks per second, the locking on the shared list can be quite expensive.

This commit changes the bqueue logic to have 3 linked lists:

1. An enqueuing list, which is used only by the (single) enqueuing thread, and thus needs no locks.
2. A shared list, with an associated lock.
3. A dequeuing list, which is used only by the (single) dequeuing thread, and thus needs no locks.

The entire enqueuing list can be moved to the shared list in constant time, and the entire shared list can be moved to the dequeuing list in constant time. These operations only happen when the `fill_fraction` is reached, or on an explicit flush request. Therefore, the lock only needs to be acquired infrequently.

The API already allows for dequeuing to block until an explicit flush, so callers don't need to be changed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes openzfs#14121
New Features
- Block cloning (#13392)
- Linux container support (#14070, #14097, #12263)
- Scrub error log (#12812, #12355)
- BLAKE3 checksums (#12918)
- Corrective "zfs receive"
- Vdev and zpool user properties

Performance
- Fully adaptive ARC (#14359)
- SHA2 checksums (#13741)
- Edon-R checksums (#13618)
- Zstd early abort (#13244)
- Prefetch improvements (#14603, #14516, #14402, #14243, #13452)
- General optimization (#14121, #14123, #14039, #13680, #13613, #13606, #13576, #13553, #12789, #14925, #14948)

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Motivation and Context
The Blocking Queue (bqueue) code is used by zfs send/receive to send messages between the various threads. It uses a shared linked list, which is locked whenever we enqueue or dequeue. For workloads which process many blocks per second, the locking on the shared list can be quite expensive.
Description
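This commit changes the bqueue logic to have 3 linked lists:

1. An enqueuing list, which is used only by the (single) enqueuing thread, and thus needs no locks.
2. A shared list, with an associated lock.
3. A dequeuing list, which is used only by the (single) dequeuing thread, and thus needs no locks.

The entire enqueuing list can be moved to the shared list in constant time, and the entire shared list can be moved to the dequeuing list in constant time. These operations only happen when the `fill_fraction` is reached, or on an explicit flush request. Therefore, the lock only needs to be acquired infrequently.

The API already allows for dequeuing to block until an explicit flush, so callers don't need to be changed.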
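
To make the three-list idea concrete, here is a minimal sketch in C. It is not the OpenZFS bqueue code: every name (`batch_queue_t`, `bq_enqueue`, `bq_flush`, `bq_dequeue`, `list_splice`) is hypothetical, it uses POSIX threads rather than the ZFS locking primitives, it replaces `fill_fraction` with a plain item-count threshold (`batch`), and it leaves out the size limit that makes a real blocking queue block the enqueuing side. It only illustrates how a private list per side plus a constant-time splice keeps the lock off the per-item path.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the OpenZFS bqueue structures. */
typedef struct node {
	struct node *next;
	int value;
} node_t;

/* Singly linked list with a tail pointer, so a whole-list splice is O(1). */
typedef struct list {
	node_t *head;
	node_t *tail;
	size_t count;
} list_t;

typedef struct batch_queue {
	list_t enq;              /* touched only by the single enqueuing thread */
	list_t deq;              /* touched only by the single dequeuing thread */
	list_t shared;           /* protected by lock */
	pthread_mutex_t lock;
	pthread_cond_t nonempty; /* signalled when shared receives items */
	size_t batch;            /* assumed stand-in for fill_fraction */
} batch_queue_t;

static void list_append(list_t *l, node_t *n) {
	n->next = NULL;
	if (l->tail != NULL)
		l->tail->next = n;
	else
		l->head = n;
	l->tail = n;
	l->count++;
}

/* Move all of src to the end of dst in constant time; src becomes empty. */
static void list_splice(list_t *dst, list_t *src) {
	if (src->head == NULL)
		return;
	if (dst->tail != NULL)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
	dst->count += src->count;
	src->head = src->tail = NULL;
	src->count = 0;
}

static void bq_init(batch_queue_t *q, size_t batch) {
	q->enq = q->deq = q->shared = (list_t){ NULL, NULL, 0 };
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->nonempty, NULL);
	q->batch = batch;
}

/* Enqueuing thread: publish the private list to the shared list. */
static void bq_flush(batch_queue_t *q) {
	if (q->enq.head == NULL)
		return;
	pthread_mutex_lock(&q->lock);
	list_splice(&q->shared, &q->enq);
	pthread_cond_signal(&q->nonempty);
	pthread_mutex_unlock(&q->lock);
}

/* Enqueuing thread: lock-free append; takes the lock once per batch. */
static void bq_enqueue(batch_queue_t *q, node_t *n) {
	list_append(&q->enq, n);
	if (q->enq.count >= q->batch)
		bq_flush(q);
}

/*
 * Dequeuing thread: pop from the private list, refilling it from the
 * shared list only when it runs dry. Blocks until the enqueuing side
 * publishes a batch or flushes explicitly.
 */
static node_t *bq_dequeue(batch_queue_t *q) {
	if (q->deq.head == NULL) {
		pthread_mutex_lock(&q->lock);
		while (q->shared.head == NULL)
			pthread_cond_wait(&q->nonempty, &q->lock);
		list_splice(&q->deq, &q->shared);
		pthread_mutex_unlock(&q->lock);
	}
	node_t *n = q->deq.head;
	q->deq.head = n->next;
	if (q->deq.head == NULL)
		q->deq.tail = NULL;
	q->deq.count--;
	return n;
}
```

In this sketch a producer that never reaches the threshold must call `bq_flush` before the consumer can make progress, which mirrors the point in the description that dequeuing may block until an explicit flush.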
How Has This Been Tested?
ZTS