Batch enqueue/dequeue for bqueue #14121

ahrens · 2022-11-01T17:15:39Z

Motivation and Context

The Blocking Queue (bqueue) code is used by zfs send/receive to send messages between the various threads. It uses a shared linked list, which is locked whenever we enqueue or dequeue. For workloads which process many blocks per second, the locking on the shared list can be quite expensive.

Description

This commit changes the bqueue logic to have 3 linked lists:

An enquing list, which is used only by the (single) enquing thread, and thus needs no locks.
A shared list, with an associated lock.
A dequing list, which is used only by the (single) dequing thread, and thus needs no locks.

The entire enquing list can be moved to the shared list in constant time, and the entire shared list can be moved to the dequing list in constant time. These operations only happen when the fill_fraction is reached, or on an explicit flush request. Therefore, the lock only needs to be acquired infrequently.

The API already allows for dequing to block until an explicit flush, so callers don't need to be changed.

How Has This Been Tested?

ZTS

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Performance enhancement (non-breaking change which improves efficiency)
Code cleanup (non-breaking change which makes code smaller or more readable)
Breaking change (fix or feature that would cause existing functionality to change)
Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
Documentation (a change to man pages or other documentation)

Checklist:

My code follows the OpenZFS code style requirements.
I have updated the documentation accordingly.
I have read the contributing document.
I have added tests to cover my changes.
I have run the ZFS Test Suite with this change applied.
All commit messages are properly formatted and contain Signed-off-by.

ryao

This is missing a signed-off-by.

module/zfs/bqueue.c

ryao · 2022-11-02T04:49:31Z

There are many test failures in the tests-functional-ubuntu (20.04) logs. I know that the test suite has been flaky in the CI infrastructure lately (and the buildbot being down is not helping), but this looks like the result of a regression at a glance.

ryao · 2022-11-02T16:18:47Z

@ahrens The commit should be squashed (you will want to force push the result) and given signed-off-by.

I am waiting on the results of the buildbot before doing a deeper look at the logic, although I like how small and well documented the commit is.

gmelikov · 2022-11-02T16:46:43Z

I wonder if you have any benchmark for this patch?

ahrens · 2022-11-03T04:09:45Z

After re-running the tests, the 20.04 tests pass, and the 18.04 test seem to be timing out on other PR's as well (e.g. #14131 this run).

include/sys/bqueue.h

module/zfs/bqueue.c

ahrens · 2023-01-04T22:24:37Z

@ryao Would you like to take another look at this? I've made the suggested changes to spelling and comments.

ryao · 2023-01-05T03:38:07Z

@ryao Would you like to take another look at this? I've made the suggested changes to spelling and comments.

I would be happy to take another look at it. I should be able to look on Friday.

ryao

I had something come up on Friday that kept me from looking at this then. Anyway, I have looked at it now.

The short version is is the logic looks solid to me, but my remarks about naming a variable deqing being unkind to bilingual reviewers have not been addressed. I would strongly prefer it if someone ran sed -e s/deqing/dequeuing/g -e s/enqing/enqueuing/g on this patch before we merged it. Other than that, I see no reason not to merge it. It is a very nice technical improvement.

The long version is the following attempt to elaborate on why I dislike the mneumonic deqing so much. First, I fully understand that the meaning of deqing is immediately obvious to any native English speaker that has no knowledge of Chinese Pinyin, such that others might think "why is he making such a big deal over it?". Having figured out what deqing meant when I made my original comments, I got over it fairly quickly this time to be able to read the code properly, but the first time I saw it, it took longer than I am willing to admit for me to realize what it meant, because my brain had not interpreted it as English, but as Chinese Pinyin.

Unfortunately, de is the pinyin for some of the most widely used Chinese characters, which have a frequency similar to the in English (despite chinese not having grammatical articles). In addition, both qing and en are valid Pinyin; and Chinese loanwords in English are often written using Pinyin, so seeing deqing in the context that this patch used it is highly likely to trigger an "it is romanized Chinese" reaction in the minds of people familiar with Chinese Pinyin.

With most Romanizations of foreign languages, upon failing to understand the Romanization as being part of that language, most would fallback to trying to interpret it as another language. However, Chinese Pinyin is highly ambiguous with up to 16 theoretical different possible interpretations when considering tonality (minus the often omitted neutral tone) and even more when considering context sensitivity. I recall once seeing a dictionary define a word in chinese to have 97 different possible interpretations, dependent on context. The result is that when someone reads Chinese Pinyin, without tone marks or context, the natural reaction is to treat it as the output of a hash function, which strips it of any potential meaning.

The lexical similarity of enqing to deqing would make one correctly assume that it is related, but since the misindentification of deqing as Pinyin meant it was treated as the output of a hash function, it is natural to treat enqing as the output of a hash function too. That meant that my initial read of the code was one where I did not have the benefit of the mnemonics, and the code was substantially more difficult to understand than it would have been had I understood the mnemonic.

The result is that this mnemonic is very unkind to bilingual individuals.

ahrens · 2023-01-09T17:13:28Z

@ryao Apologies, I made the naming change 2 months ago but didn't properly upload it here. I've updated the PR now. Let me know if there are any places that I missed.

ryao

The patches need to be squashed, but otherwise, this looks good,

The Blocking Queue (bqueue) code is used by zfs send/receive to send messages between the various threads. It uses a shared linked list, which is locked whenever we enqueue or dequeue. For workloads which process many blocks per second, the locking on the shared list can be quite expensive. This commit changes the bqueue logic to have 3 linked lists: 1. An enquing list, which is used only by the (single) enquing thread, and thus needs no locks. 2. A shared list, with an associated lock. 3. A dequing list, which is used only by the (single) dequing thread, and thus needs no locks. The entire enquing list can be moved to the shared list in constant time, and the entire shared list can be moved to the dequing list in constant time. These operations only happen when the `fill_fraction` is reached, or on an explicit flush request. Therefore, the lock only needs to be acquired infrequently. The API already allows for dequing to block until an explicit flush, so callers don't need to be changed. Signed-off-by: Matthew Ahrens <mahrens@delphix.com>

The Blocking Queue (bqueue) code is used by zfs send/receive to send messages between the various threads. It uses a shared linked list, which is locked whenever we enqueue or dequeue. For workloads which process many blocks per second, the locking on the shared list can be quite expensive. This commit changes the bqueue logic to have 3 linked lists: 1. An enquing list, which is used only by the (single) enquing thread, and thus needs no locks. 2. A shared list, with an associated lock. 3. A dequing list, which is used only by the (single) dequing thread, and thus needs no locks. The entire enquing list can be moved to the shared list in constant time, and the entire shared list can be moved to the dequing list in constant time. These operations only happen when the `fill_fraction` is reached, or on an explicit flush request. Therefore, the lock only needs to be acquired infrequently. The API already allows for dequing to block until an explicit flush, so callers don't need to be changed. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Wilson <george.wilson@delphix.com> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Signed-off-by: Matthew Ahrens <mahrens@delphix.com> Closes openzfs#14121

New Features - Block cloning (#13392) - Linux container support (#14070, #14097, #12263) - Scrub error log (#12812, #12355) - BLAKE3 checksums (#12918) - Corrective "zfs receive" - Vdev and zpool user properties Performance - Fully adaptive ARC (#14359) - SHA2 checksums (#13741) - Edon-R checksums (#13618) - Zstd early abort (#13244) - Prefetch improvements (#14603, #14516, #14402, #14243, #13452) - General optimization (#14121, #14123, #14039, #13680, #13613, #13606, #13576, #13553, #12789, #14925, #14948) Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

ahrens requested review from grwilson and pcd1193182 November 1, 2022 17:15

ahrens added Type: Performance Performance improvement or performance problem Component: Send/Recv "zfs send/recv" feature labels Nov 1, 2022

behlendorf added the Status: Code Review Needed Ready for review and testing label Nov 1, 2022

ryao suggested changes Nov 2, 2022

View reviewed changes

module/zfs/bqueue.c Outdated Show resolved Hide resolved

ahrens force-pushed the bqueue branch from 4e87cd3 to 7133615 Compare November 2, 2022 15:32

ryao reviewed Nov 3, 2022

View reviewed changes

include/sys/bqueue.h Outdated Show resolved Hide resolved

grwilson approved these changes Nov 28, 2022

View reviewed changes

module/zfs/bqueue.c Outdated Show resolved Hide resolved

behlendorf added the Status: Revision Needed Changes are required for the PR to be accepted label Dec 9, 2022

ahrens force-pushed the bqueue branch from 7133615 to 786de9e Compare January 4, 2023 18:57

ahrens removed the Status: Revision Needed Changes are required for the PR to be accepted label Jan 4, 2023

behlendorf approved these changes Jan 5, 2023

View reviewed changes

ryao suggested changes Jan 9, 2023

View reviewed changes

ahrens force-pushed the bqueue branch from 786de9e to d6e7b17 Compare January 9, 2023 17:12

ahrens requested a review from ryao January 9, 2023 17:13

ryao approved these changes Jan 9, 2023

View reviewed changes

ahrens force-pushed the bqueue branch from d6e7b17 to 9364995 Compare January 9, 2023 22:18

behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jan 10, 2023

behlendorf merged commit fc45975 into openzfs:master Jan 10, 2023

ahrens deleted the bqueue branch February 7, 2023 17:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Batch enqueue/dequeue for bqueue #14121

Batch enqueue/dequeue for bqueue #14121

ahrens commented Nov 1, 2022

ryao left a comment

ryao commented Nov 2, 2022

ryao commented Nov 2, 2022 •

edited

Loading

gmelikov commented Nov 2, 2022

ahrens commented Nov 3, 2022

ahrens commented Jan 4, 2023

ryao commented Jan 5, 2023

ryao left a comment •

edited

Loading

ahrens commented Jan 9, 2023

ryao left a comment

Batch enqueue/dequeue for bqueue #14121

Batch enqueue/dequeue for bqueue #14121

Conversation

ahrens commented Nov 1, 2022

Motivation and Context

Description

How Has This Been Tested?

Types of changes

Checklist:

ryao left a comment

Choose a reason for hiding this comment

ryao commented Nov 2, 2022

ryao commented Nov 2, 2022 • edited Loading

gmelikov commented Nov 2, 2022

ahrens commented Nov 3, 2022

ahrens commented Jan 4, 2023

ryao commented Jan 5, 2023

ryao left a comment • edited Loading

Choose a reason for hiding this comment

ahrens commented Jan 9, 2023

ryao left a comment

Choose a reason for hiding this comment

ryao commented Nov 2, 2022 •

edited

Loading

ryao left a comment •

edited

Loading