Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

network: fix invariant violation in peer set #1366

Merged
merged 3 commits into from
Nov 24, 2020
Merged

Conversation

hdevalence
Copy link
Contributor

Closes #1183.

The peer set maintains a preselected ready service that it can use to
perform power-of-two-choices (p2c) routing of requests. Ready services
are stored by key (socket address) in an IndexMap, and the preselected
service is represented by an Option<usize> indexing that map. This
means that whenever the set of ready services changes (e.g., a service
is removed from the peer set, or a service is taken to be used to
process a request), the preselected index is invalidated. The original
P2C-only implementation maintained this invariant but did not document
it.

The change to inventory-based routing introduced a bug by failing to
maintain this invariant and appropriately invalidate the preselected
index. However, this was only noticeable approximately 1/N of the time
on the next request after an inventory-directed request, so the bug
occurred infrequently. Luckily, the use of .expect caused the bug to
be an immediate panic, making it possible to identify by inspecting all
uses of the ready service map.

The second commit in the PR tidies the code as a cleanup, but is kept
separate to be able to pinpoint the bugfix.

Closes #1183.

The peer set maintains a preselected ready service that it can use to
perform power-of-two-choices (p2c) routing of requests.  Ready services
are stored by key (socket address) in an `IndexMap`, and the preselected
service is represented by an `Option<usize>` indexing that map.  This
means that whenever the set of ready services changes (e.g., a service
is removed from the peer set, or a service is taken to be used to
process a request), the preselected index is invalidated.  The original
P2C-only implementation maintained this invariant but did not document
it.

The change to inventory-based routing introduced a bug by failing to
maintain this invariant and appropriately invalidate the preselected
index.  However, this was only noticeable approximately 1/N of the time
on the next request after an inventory-directed request, so the bug
occurred infrequently.  Luckily, the use of `.expect` caused the bug to
be an immediate panic, making it possible to identify by inspecting all
uses of the ready service map.
- rename functions more descriptively
- create a common `take_ready_service` function
- organize poll_ functions separately
Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, lets merge once CI passes.

@teor2345 teor2345 merged commit 8d6ac8e into main Nov 24, 2020
@teor2345 teor2345 deleted the fix-next-idx-bug branch November 24, 2020 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic: preselected index must be valid in peer_set
2 participants