K8SSAND-1042 ⁃ Feature request: Add option to allow all pods to start in parallel #230
Comments
I think we'll have to commit something here so that we can toggle on an implementation using static IPs so that this feature can be tested?
I'm curious how we could detect if the cluster is using static IPs or not. Just a boolean in the spec? I assume there is a sidecar or something that handles setting up the appropriate addresses and routing.
Yep that's what I was thinking, something like Alternatively, we could add a flag something like
Let me ask the obvious: what are the risks of starting in parallel if static IPs are not used?
I assume this would fall under the
Do we still need to start seed nodes first before parallel starting the rest of the nodes?
If we start the seed nodes first (one by one), it should allow us to start other nodes in parallel even if we're not using static IPs. These nodes will then be able to connect to the cluster through the seeds and broadcast their new IP address.
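The ordering suggested in the comment above could be sketched roughly as follows. This is only an illustration, not cass-operator code: the `Node` type, its `IsSeed` field, and the `PlanStartup` function are hypothetical names invented for this example. It assumes the operator already knows which pods are seeds.

```go
package main

import "fmt"

// Node is a hypothetical, simplified view of a Cassandra pod.
// It is not a cass-operator type.
type Node struct {
	Name   string
	IsSeed bool
}

// PlanStartup returns ordered batches of node names.
// Batches are started in order; nodes within one batch may start concurrently.
// Each seed gets its own batch (one by one), and all remaining nodes
// share a single final batch (in parallel).
func PlanStartup(nodes []Node) [][]string {
	var batches [][]string
	var rest []string
	for _, n := range nodes {
		if n.IsSeed {
			batches = append(batches, []string{n.Name})
		} else {
			rest = append(rest, n.Name)
		}
	}
	if len(rest) > 0 {
		batches = append(batches, rest)
	}
	return batches
}

func main() {
	nodes := []Node{
		{Name: "dc1-r1-0", IsSeed: true},
		{Name: "dc1-r1-1"},
		{Name: "dc1-r2-0", IsSeed: true},
		{Name: "dc1-r2-1"},
	}
	for i, batch := range PlanStartup(nodes) {
		fmt.Printf("batch %d: %v\n", i, batch)
	}
}
```

A real implementation would also have to wait for each seed to become ready before moving on to the next batch, which this sketch leaves out.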
@bradfordcp, can we move the ticket to the product backlog or does it require a design session?
@rchernobelskiy Is this still a necessary feature?
From my personal perspective I still believe it would be a good feature to have.
I agree, there have been multiple incidents that were due to nodes which are already part of the ring being blocked from starting by cass-operator because another node was bootstrapping (which can take a while). What we need to identify is if a node had previously bootstrapped, and allow it to start concurrently with other nodes in that case if we have at least one available seed node.
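The rule proposed in the comment above can be written down as a small predicate. This is a minimal sketch under stated assumptions, not the change that was eventually merged: `NodeState`, its `Bootstrapped` field, `CanStartConcurrently`, and `SplitStartable` are hypothetical names. How the operator would actually learn that a node had previously bootstrapped is exactly the open question the comment raises, so here it is simply an input.

```go
package main

import "fmt"

// NodeState is a hypothetical summary of what is known about a pod.
// Bootstrapped is assumed to be true when the node has previously
// joined the ring; how that is detected is out of scope here.
type NodeState struct {
	Name         string
	Bootstrapped bool
}

// CanStartConcurrently applies the rule from the discussion:
// a node may start alongside others only if it has already bootstrapped
// and at least one seed node is available.
func CanStartConcurrently(n NodeState, availableSeeds int) bool {
	return n.Bootstrapped && availableSeeds >= 1
}

// SplitStartable partitions nodes into those that may start in parallel
// and those that must still go through the serial start path.
func SplitStartable(nodes []NodeState, availableSeeds int) (parallel, serial []string) {
	for _, n := range nodes {
		if CanStartConcurrently(n, availableSeeds) {
			parallel = append(parallel, n.Name)
		} else {
			serial = append(serial, n.Name)
		}
	}
	return parallel, serial
}

func main() {
	nodes := []NodeState{
		{Name: "dc1-r1-1", Bootstrapped: true},
		{Name: "dc1-r2-1", Bootstrapped: true},
		{Name: "dc1-r3-1", Bootstrapped: false}, // new node, still needs to bootstrap
	}
	parallel, serial := SplitStartable(nodes, 1)
	fmt.Println("parallel:", parallel)
	fmt.Println("serial:", serial)
}
```

With no available seed, every node falls back to the serial path, which matches the "at least one available seed node" condition in the comment.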
Solved in #673
Nice, cc @berndocklin and @Liwanshi we should look at adding this to Astra, it'll significantly reduce the time to resume a large stopped cluster.
Currently, when resuming a stopped cluster, all the Cassandra pods start up sequentially because the IPs for the pods change and Cassandra can only join one node at a time.
When using static IPs, however, there is no concern about the IPs changing, and therefore all the pods can start up in parallel.
An option to start all pods in parallel will significantly reduce the time to resume a large stopped cluster.
┆Issue is synchronized with this Jira Task by Unito
┆friendlyId: K8SSAND-1042
┆priority: Medium