-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Akka.Cluster: enable keep-majority
default SBR
#6628
Conversation
Enables `keep-majority` as the default SBR and turns it on by default for Akka.Cluster. This was a planned change for Akka.NET v1.5.0 but we didn't implement it.
I think I need to write some updates to the documentation and a release advisory for this change as well |
Looks like I need to update the |
keep-majority
default SBRkeep-majority
default SBR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Detailed my changes
@@ -18,6 +18,7 @@ | |||
[assembly: System.Runtime.Versioning.TargetFrameworkAttribute(".NETCoreApp,Version=v6.0", FrameworkDisplayName=".NET 6.0")] | |||
namespace Akka.Cluster | |||
{ | |||
[System.ObsoleteAttribute("No longer used as of Akka.NET v1.5.2")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AutoDowning
is now obsolete - it was a bad idea anyway.
@@ -37,7 +37,7 @@ public void ClusterSingletonManagerSettings_must_have_default_config() | |||
clusterSingletonManagerSettings.SingletonName.ShouldBe("singleton"); | |||
clusterSingletonManagerSettings.Role.ShouldBe(null); | |||
clusterSingletonManagerSettings.HandOverRetryInterval.TotalSeconds.ShouldBe(1); | |||
clusterSingletonManagerSettings.RemovalMargin.TotalSeconds.ShouldBe(0); | |||
clusterSingletonManagerSettings.RemovalMargin.TotalSeconds.ShouldBe(20); // now 20 due to default SBR settings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RemovalMargin
now uses SBR default value of 20 seconds (this is where the value is computed from.)
@@ -11,6 +11,48 @@ This document contains specific upgrade suggestions, warnings, and notices that | |||
<iframe width="560" height="315" src="https://www.youtube.com/embed/-UPestlIw4k" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe> | |||
<!-- markdownlint-enable MD033 --> | |||
|
|||
## Upgrading to Akka.NET v1.5.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upgrade advisories for this PR as well as #6389
|
||
#### Disabling the Default Downing Provider | ||
|
||
To disable the default Akka.Cluster downing provider, simply configure the following in your HOCON: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated this section from the Split Brain Resolver page - explains how to disable the new default SBR, since I'm fairly certain we'll get a question about that in the future.
@@ -38,6 +38,19 @@ Keep in mind that split brain resolver will NOT work when `akka.cluster.auto-dow | |||
|
|||
Beginning in Akka.NET v1.4.16, the Akka.NET project has ported the original split brain resolver implementations from Lightbend as they are now open source. The following section of documentation describes how Akka.NET's hand-rolled split brain resolvers are implemented. | |||
|
|||
> [!IMPORTANT] | |||
> As of Akka.NET v1.5.2, the `keep-majority` split brain resolution strategy is now enabled by default. This should be acceptable for the majority of Akka.Cluster users, but please read on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to make it clear that there is now a default SBR enabled.
DowningProviderType = typeof(AutoDowning); | ||
else | ||
DowningProviderType = typeof(NoDowning); | ||
DowningProviderType = !string.IsNullOrEmpty(downingProviderClassName) ? Type.GetType(downingProviderClassName, true) : typeof(NoDowning); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disabled loading of AutoDowning
here as it's no longer supported.
# formed in case of network partition. | ||
# Disable with "off" or specify a duration to enable auto-down. | ||
# If a downing-provider-class is configured this setting is ignored. | ||
auto-down-unreachable-after = off |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed auto-down-unreachable-after
setting from the base config.
@@ -63,7 +55,9 @@ akka { | |||
# `Akka.Cluster.IDowningProvider` having two argument constructor: | |||
# - argument 1: accepting an `ActorSystem` | |||
# - argument 2: accepting an `Akka.Cluster.Cluster` | |||
downing-provider-class = "" | |||
downing-provider-class = "Akka.Cluster.SBR.SplitBrainResolverProvider, Akka.Cluster" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the major change - the default SBR.
@@ -379,6 +379,7 @@ public static Config BaseConfig | |||
} | |||
} | |||
} | |||
cluster.downing-provider-class = """" #disable SBR by default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Backwards compat for the MNTR - keep the SBR disabled by default. Most tests probably don't want the SBR running anyway unless they explicitly enable it.
AutoDownUnreachableAfter = clusterConfig.GetTimeSpanWithOffSwitch("auto-down-unreachable-after"); | ||
if (AutoDownUnreachableAfter == default(TimeSpan)) // need to restore correct defaults now that it's been deleted from HOCON file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a work-around needed to restore the "correct" defaults for AutoDownUnreachableAfter
now that it's no longer included in the default cluster.conf
file.
@@ -58,7 +58,9 @@ public class ClusterShardingSpecConfig : MultiNodeClusterShardingConfig | |||
CommonConfig = ConfigurationFactory.ParseString($@" | |||
akka.cluster.sharding.verbose-debug-logging = on | |||
#akka.loggers = [""akka.testkit.SilenceAllTestEventListener""] | |||
|
|||
akka.cluster.downing-provider-class = ""Akka.Cluster.SBR.SplitBrainResolverProvider, Akka.Cluster"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This spec previously relied on auto-down-unreachable-after
- in order to make it pass again we needed to:
- Enable the SBR
- Set the
akka.cluster.split-brain-resolver.stable-after
to the lowest possible value (must be > 0.0s) - Set the
akka.cluster.down-removal-margin
to the lowest possible value (must be > 0.0s)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just have a question
To disable the default Akka.Cluster downing provider, simply configure the following in your HOCON: | ||
|
||
```hocon | ||
akka.cluster.downing-provider-class = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question, since we removed AutoDowning, what will happen if a user turn SBR off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answered in chat - nothing. AutoDowning was never enabled by default and we strongly discouraged users from ever enabling it. If the default SBR is disabled Akka.Cluster behaves exactly like it does today: unreachable nodes stay unreachable until downed manually by pbm
or some other process.
Local test runner appears to be crashing on Windows - I'll re-run it once the rest of the PR completes testing.
Do not press
|
Can't really run the Akka.Cluster.Sharding MNTR suite without it
Changes
Enables
keep-majority
as the default SBR and turns it on by default for Akka.Cluster. This was a planned change for Akka.NET v1.5.0 but we didn't implement it.Checklist
For significant changes, please ensure that the following have been completed (delete if not relevant):