This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

Add support for recovery of async/semisync replicas of failed replication group members #1254

Merged: 1 commit, Oct 19, 2020
3 changes: 2 additions & 1 deletion docs/configuration-recovery.md
@@ -71,7 +71,8 @@ These hooks are available for recoveries:
- `PreFailoverProcesses`: executed immediately before `orchestrator` takes recovery action. Failure (nonzero exit code) of any of these processes aborts the recovery.
Hint: this gives you the opportunity to abort recovery based on some internal state of your system.
- `PostMasterFailoverProcesses`: executed at the end of a successful master recovery.
-- `PostIntermediateMasterFailoverProcesses`: executed at the end of a successful intermediate master recovery.
+- `PostIntermediateMasterFailoverProcesses`: executed at the end of a successful intermediate master or replication
+  group member with replicas recovery.
- `PostFailoverProcesses`: executed at the end of any successful recovery (including and adding to the above two).
- `PostUnsuccessfulFailoverProcesses`: executed at the end of any unsuccessful recovery.
- `PostGracefulTakeoverProcesses`: executed on planned, graceful master takeover, after the old master is positioned under the newly promoted master.
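
The gating behaviour described for `PreFailoverProcesses` (any nonzero exit code aborts the recovery) can be illustrated with a minimal Go sketch. This is not orchestrator's implementation: `runHooks`, `attemptRecovery`, and the use of `sh -c` to run each command are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"os/exec"
)

// runHooks runs each shell command in order and stops at the first one that
// exits with a nonzero status. Hypothetical helper, not orchestrator code.
func runHooks(commands []string) error {
	for _, command := range commands {
		if err := exec.Command("sh", "-c", command).Run(); err != nil {
			return fmt.Errorf("hook %q failed: %w", command, err)
		}
	}
	return nil
}

// attemptRecovery models the documented ordering: the pre-failover hooks gate
// the recovery, and the recovery action only runs if every hook succeeded.
func attemptRecovery(preFailover []string, doRecover func() error) (bool, error) {
	if err := runHooks(preFailover); err != nil {
		return false, fmt.Errorf("recovery aborted: %w", err)
	}
	if err := doRecover(); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	// The second hook exits nonzero, so the recovery function is never called.
	recovered, err := attemptRecovery(
		[]string{"exit 0", "exit 1"},
		func() error { return nil },
	)
	fmt.Println(recovered, err)
}
```

A hook script can therefore veto a recovery simply by exiting with a nonzero status, which is the opportunity the hint above refers to.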
17 changes: 8 additions & 9 deletions docs/faq.md
@@ -62,21 +62,20 @@ No.

### Does orchestrator support MySQL Group Replication?

-Partially. Replication groups in single primary mode are somewhat supported under MySQL 8.0. The extent of the support so far is:
+Partially. Replication groups in single primary mode are supported under MySQL 8.0. The extent of the support is:

* Orchestrator understands that all group members are part of the same cluster, retrieves replication group information
as part of instance discovery, stores it in its database, and exposes it via the API.
* The orchestrator web UI displays single primary group members. They are shown like this:
-* All group secondary members as replicating from the primary.
-* All group members have an icon that shows they are group members (as opposed to traditional async/semi-sync replicas).
+* All secondary group members as replicating from the primary.
+* All group members have an icon that shows they are group members (as opposed to traditional async/semi-sync
+replicas).
* Hovering over the icon mentioned above provides information about the state and role of the DB instance in the
group.
-* Some relocation operations are forbidden for group members. In particular, orchestrator will refuse to relocate a secondary group member, as it, by definition, replicates always from the group primary. It will also reject an attempt to relocate a group primary under a secondary of the same group.
-
-No support has been added (yet) to handling group member failure. If all you have is a single replication group, this is fine, because you don't need it; the group will handle all failures as long as it can secure a majority.
-
-If, however, you have the primary of a group as a replica to another instance; or you have replicas under your group
-members, know that this has not been tested and results are, therefore, unpredictable at the moment. It *might* work, but it might also create a singularity and suck your database under the event horizon.
+* Some relocation operations are forbidden for group members. In particular, orchestrator will refuse to relocate a
+secondary group member, as it, by definition, replicates always from the group primary. It will also reject an attempt
+to relocate a group primary under a secondary of the same group.
+* Traditional async/semisync replicas from failed group members are relocated to a different group member.
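
The two relocation restrictions listed in the FAQ can be expressed as a small validation function. The following Go sketch is illustrative only: the `instance` type, the `groupRole` values, and `checkRelocation` are hypothetical names and do not come from orchestrator's code base.

```go
package main

import "fmt"

// groupRole is a hypothetical representation of an instance's role in a
// replication group; the empty value means "not a group member".
type groupRole string

const (
	roleNone      groupRole = ""
	rolePrimary   groupRole = "PRIMARY"
	roleSecondary groupRole = "SECONDARY"
)

// instance is a reduced, hypothetical model of a discovered DB instance.
type instance struct {
	name      string
	groupName string
	role      groupRole
}

// checkRelocation returns an error when moving target under newSource would
// break one of the two rules described in the FAQ.
func checkRelocation(target, newSource instance) error {
	// Rule 1: a secondary group member always replicates from the group primary.
	if target.role == roleSecondary {
		return fmt.Errorf("%s is a secondary group member and cannot be relocated", target.name)
	}
	// Rule 2: a group primary cannot be placed under a secondary of its own group.
	if target.role == rolePrimary && newSource.role == roleSecondary &&
		target.groupName == newSource.groupName {
		return fmt.Errorf("%s is the primary of group %s and cannot replicate from its secondary %s",
			target.name, target.groupName, newSource.name)
	}
	return nil
}

func main() {
	primary := instance{name: "g1-a", groupName: "g1", role: rolePrimary}
	secondary := instance{name: "g1-b", groupName: "g1", role: roleSecondary}
	replica := instance{name: "r1"}

	fmt.Println(checkRelocation(secondary, replica))
	fmt.Println(checkRelocation(primary, secondary))
	fmt.Println(checkRelocation(replica, secondary))
}
```

In this sketch a traditional replica may still be moved under any group member, which is consistent with the last bullet: replicas of a failed member are relocated to a different member of the group.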

### Does orchestrator support Yet Another Type of Replication?

7 changes: 7 additions & 0 deletions go/inst/analysis.go
@@ -57,6 +57,8 @@ const (
	AllIntermediateMasterReplicasNotReplicating = "AllIntermediateMasterReplicasNotReplicating"
	FirstTierReplicaFailingToConnectToMaster = "FirstTierReplicaFailingToConnectToMaster"
	BinlogServerFailingToConnectToMaster = "BinlogServerFailingToConnectToMaster"
+	// Group replication problems
+	DeadReplicationGroupMemberWithReplicas = "DeadReplicationGroupMemberWithReplicas"
)

const (
@@ -110,6 +112,7 @@ const (
	AnalysisInstanceTypeMaster AnalysisInstanceType = "master"
	AnalysisInstanceTypeCoMaster AnalysisInstanceType = "co-master"
	AnalysisInstanceTypeIntermediateMaster AnalysisInstanceType = "intermediate-master"
+	AnalysisInstanceTypeGroupMember AnalysisInstanceType = "group-member"
)

// ReplicationAnalysis notes analysis on replication chain status, per instance
@@ -122,6 +125,7 @@ type ReplicationAnalysis struct {
	AnalyzedInstancePhysicalEnvironment string
	AnalyzedInstanceBinlogCoordinates BinlogCoordinates
	IsMaster bool
+	IsReplicationGroupMember bool
	IsCoMaster bool
	LastCheckValid bool
	LastCheckPartialSuccess bool
@@ -213,6 +217,9 @@ func (this *ReplicationAnalysis) GetAnalysisInstanceType() AnalysisInstanceType
	if this.IsCoMaster {
		return AnalysisInstanceTypeCoMaster
	}
+	if this.IsReplicationGroupMember {
+		return AnalysisInstanceTypeGroupMember
+	}
	if this.IsMaster {
		return AnalysisInstanceTypeMaster
	}
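
The order of the checks in `GetAnalysisInstanceType` matters: because the group-member check is placed before the master check, an analysis that has both `IsReplicationGroupMember` and `IsMaster` set is reported as `group-member`. The self-contained Go sketch below reproduces only the three branches visible in the hunk. The `analysis` struct, the lower-case names, and the `"other"` fallback are hypothetical stand-ins for the parts of the function not shown in the diff.

```go
package main

import "fmt"

type instanceType string

const (
	typeCoMaster    instanceType = "co-master"
	typeGroupMember instanceType = "group-member"
	typeMaster      instanceType = "master"
	// typeOther is a hypothetical placeholder for the branches that the diff
	// hunk does not show.
	typeOther instanceType = "other"
)

// analysis is a reduced stand-in for ReplicationAnalysis with only the three
// flags that the visible branches consult.
type analysis struct {
	IsCoMaster               bool
	IsReplicationGroupMember bool
	IsMaster                 bool
}

// instanceType mirrors the precedence shown in the hunk:
// co-master first, then group member, then master.
func (a analysis) instanceType() instanceType {
	if a.IsCoMaster {
		return typeCoMaster
	}
	if a.IsReplicationGroupMember {
		return typeGroupMember
	}
	if a.IsMaster {
		return typeMaster
	}
	return typeOther
}

func main() {
	both := analysis{IsReplicationGroupMember: true, IsMaster: true}
	fmt.Println(both.instanceType()) // group-member: the group check runs before the master check
}
```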