K8SSAND-1697 ⁃ Make Bootstrap Operations Deterministic #381
Comments
Hey team! Please add your planning poker estimate with ZenHub @burmanm @adejanovski @Miles-Garnsey
A few data points to document what I'm seeing and what I'm going to modify: The distribution of pods in racks is deterministic and balanced already. It's deterministic because the distribution is done in However the order in which Cassandra processes are started is not deterministic:
I am thus going to modify the above methods, and make them behave in a strictly deterministic order. Note that this might slow down the overall startup time of the dc.
I'm rather inclined to respect the order in which racks were declared in the spec.
My 2c here: since rack names cannot be changed on a live datacenter, but the ordering in the manifest could be changed, using names would probably be more "deterministic".
I'm not sure here. Using the rack names means that one given rack will always be the first one to bootstrap, and that cannot be changed. What if the user wants another rack to be the first one? Using rack names would force users to name their racks with alphabetically sorted names reflecting the bootstrap order, which may be tricky depending on how they name them. E.g.
As a user, I can't think of a use case where you'd need a specific rack to start first, just for the order to be predictable. Do you have a case in mind?
Following up on our offline chat, we agree that as long as nodes get started by cycling through racks (vs starting several nodes in a single rack at once), how racks are ordered doesn't matter much.
+1 to using declaration order to determine which rack bootstraps first.
I’m fine with either approach as long as it’s deterministic and documented.
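To make the "cycle through racks" idea from the comments above concrete, here is a minimal Python sketch. It is illustrative only and is not the operator's actual code: the function name `round_robin_start_order`, the `(rack_name, pods)` input shape, and the pod names are all hypothetical. It starts one node per rack in turn, following the order in which racks are passed in.

```python
def round_robin_start_order(racks):
    """Return pod names in a deterministic, rack-cycling start order.

    `racks` is a list of (rack_name, pods) tuples (a hypothetical shape).
    The list order stands in for the declaration order in the spec.
    """
    order = []
    # Number of passes needed to cover the largest rack.
    depth = max((len(pods) for _, pods in racks), default=0)
    for i in range(depth):
        # One pass: pick the i-th pod of every rack that still has one.
        for _rack_name, pods in racks:
            if i < len(pods):
                order.append(pods[i])
    return order


# Hypothetical racks, listed in declaration order.
declared = [("west", ["w-0", "w-1"]), ("east", ["e-0", "e-1", "e-2"])]
print(round_robin_start_order(declared))
```

Passing `sorted(declared)` instead would give the name-based ordering suggested earlier in the thread. Either way, no rack starts a second node before every other rack with pending nodes has started one.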
What is missing?
When adding nodes to a cluster or creating a new cluster we can end up in a state where we do not bootstrap nodes in a balanced manner. This is possible when the number of nodes bootstrapping is greater than the number of racks.
Ideally we would sort the set of nodes to bootstrap by rack. The rack with the most nodes left to bootstrap would then bootstrap a node next. If several racks tie for the most nodes left to bootstrap, we should sort those racks by name.
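The selection rule described above could be sketched as follows. This is a minimal Python illustration under stated assumptions, not the operator's implementation: the function name `bootstrap_order` and the `pending_by_rack` mapping (rack name to number of nodes still to bootstrap) are hypothetical.

```python
def bootstrap_order(pending_by_rack):
    """Return the sequence of racks in which nodes would be bootstrapped.

    `pending_by_rack` maps a rack name to the number of nodes that still
    need to bootstrap in that rack (a hypothetical input shape).
    """
    remaining = dict(pending_by_rack)  # copy so the caller's dict is untouched
    order = []
    while any(count > 0 for count in remaining.values()):
        # Prefer the rack with the most pending nodes; break ties by name.
        rack = min(
            (r for r, count in remaining.items() if count > 0),
            key=lambda r: (-remaining[r], r),
        )
        order.append(rack)
        remaining[rack] -= 1
    return order


print(bootstrap_order({"rack1": 3, "rack2": 1, "rack3": 2}))
```

Because the tie-break uses rack names only, the result does not depend on the iteration order of the input mapping, so the same pending counts always give the same bootstrap sequence.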
Why do we need it?
Data can be skewed if we don't maintain balance during bootstrap operations. This is less of a concern with vnode clusters. When using single-token nodes, we may end up with multiple replicas on a single node.
Environment
Anything else we need to know?:
Y'all are awesome, thanks for the great work 🌟
┆Issue is synchronized with this Jira Task by Unito
┆friendlyId: K8SSAND-1697
┆priority: Medium