Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: ClusterEvent.MemberRemoved can jump from MemberLeaving without being MemberExited #4948

Closed
Aaronontheweb opened this issue Apr 19, 2021 · 2 comments

Comments

@Aaronontheweb
Copy link
Member

Version: Akka.NET v1.4.18

Based on some build log failures detected in PR #4946

Akka.Cluster.Tests.ClusterSpec.A_cluster_must_leave_via_CoordinatedShutdownRun_when_member_status_is_Joining
Expected removed.PreviousStatus to equal MemberStatus.Exiting(3) by value, but found MemberStatus.Leaving(2).

This isn't legal according to the member transition spec - issue occurs here:

internal static ImmutableList<IMemberEvent> DiffMemberEvents(Gossip oldGossip, Gossip newGossip)
{
if (newGossip.Equals(oldGossip))
{
return ImmutableList<IMemberEvent>.Empty;
}
var newMembers = newGossip.Members.Except(oldGossip.Members);
var membersGroupedByAddress = newGossip.Members
.Concat(oldGossip.Members)
.GroupBy(m => m.UniqueAddress);
var changedMembers = membersGroupedByAddress
.Where(g => g.Count() == 2
&& (g.First().Status != g.Skip(1).First().Status
|| g.First().UpNumber != g.Skip(1).First().UpNumber))
.Select(g => g.First());
var memberEvents = CollectMemberEvents(newMembers.Union(changedMembers));
var removedMembers = oldGossip.Members.Except(newGossip.Members);
var removedEvents = removedMembers.Select(m => new MemberRemoved(m.Copy(status: MemberStatus.Removed), m.Status));
return memberEvents.Concat(removedEvents).ToImmutableList();
}

Going to need to see if we can reproduce this with a simple use case, as this is definitely a gossip propagation / merging / diffing bug.

@Aaronontheweb
Copy link
Member Author

The unit test that was able to reproduce this bug (and this unit test has a very low flip rate, historically)

[Fact]
public void A_cluster_must_leave_via_CoordinatedShutdownRun_when_member_status_is_Joining()
{
var sys2 = ActorSystem.Create("ClusterSpec2", ConfigurationFactory.ParseString(@"
akka.actor.provider = ""cluster""
akka.remote.dot-netty.tcp.port = 0
akka.coordinated-shutdown.run-by-clr-shutdown-hook = off
akka.coordinated-shutdown.terminate-actor-system = off
akka.coordinated-shutdown.run-by-actor-system-terminate = off
akka.cluster.run-coordinated-shutdown-when-down = off
akka.cluster.min-nr-of-members = 2
").WithFallback(Akka.TestKit.Configs.TestConfigs.DefaultConfig));
try
{
var probe = CreateTestProbe(sys2);
Cluster.Get(sys2).Subscribe(probe.Ref, typeof(ClusterEvent.IMemberEvent));
probe.ExpectMsg<ClusterEvent.CurrentClusterState>();
Cluster.Get(sys2).Join(Cluster.Get(sys2).SelfAddress);
probe.ExpectMsg<ClusterEvent.MemberJoined>();
CoordinatedShutdown.Get(sys2).Run(CoordinatedShutdown.UnknownReason.Instance);
probe.ExpectMsg<ClusterEvent.MemberLeft>();
// MemberExited might not be published before MemberRemoved
var removed = (ClusterEvent.MemberRemoved)probe.FishForMessage(m => m is ClusterEvent.MemberRemoved);
removed.PreviousStatus.Should().BeEquivalentTo(MemberStatus.Exiting);
}
finally
{
Shutdown(sys2);
}
}

@Aaronontheweb
Copy link
Member Author

Pretty sure this has been resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant