This repository has been archived by the owner on Feb 6, 2024. It is now read-only.

refactor: implement failover based distributed etcd lock #142

Closed


ZuLiangWang
Contributor

Which issue does this PR close?

Closes #

Rationale for this change

To ensure that user data is not lost in distributed mode, we need a mechanism that guarantees CeresDB never has more than one leader for a shard under any circumstances. An etcd lock was added in apache/horaedb#706; this PR refactors the scheduler to implement failover based on that distributed etcd lock.

What changes are included in this PR?

  • Add a shard watcher that is notified when a shard lock expires.
  • Refactor the scheduler module and add a shard lock delete callback to implement automatic failover.
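
The interaction between the two changes above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: every name in it (`ShardWatcher`, `LockEvent`, `Scheduler`, `OnShardLockExpired`) is hypothetical, and the etcd watch on the lock key is simulated with an in-memory channel so the example runs without an etcd cluster. It assumes the scheduler reacts to a lock deletion by handing the shard to another node.

```go
package main

import (
	"fmt"
	"sync"
)

// ShardID identifies a shard (hypothetical type for this sketch).
type ShardID uint32

// LockEvent stands in for the notification a watcher would receive when a
// shard's lock key is deleted or its lease expires. In a real deployment
// this would come from an etcd watch; here it is simulated.
type LockEvent struct {
	ShardID   ShardID
	OldLeader string
}

// ShardWatcher forwards lock-expiry events to a registered callback.
type ShardWatcher struct {
	events    chan LockEvent
	onExpired func(LockEvent)
	done      chan struct{}
}

func NewShardWatcher(onExpired func(LockEvent)) *ShardWatcher {
	return &ShardWatcher{
		events:    make(chan LockEvent, 16),
		onExpired: onExpired,
		done:      make(chan struct{}),
	}
}

// Start consumes events in a goroutine until Stop closes the channel.
func (w *ShardWatcher) Start() {
	go func() {
		defer close(w.done)
		for ev := range w.events {
			w.onExpired(ev)
		}
	}()
}

// Notify simulates the watch delivering a lock-deleted event.
func (w *ShardWatcher) Notify(ev LockEvent) { w.events <- ev }

// Stop closes the event stream and waits for pending callbacks to finish.
func (w *ShardWatcher) Stop() {
	close(w.events)
	<-w.done
}

// Scheduler tracks which node leads each shard.
type Scheduler struct {
	mu      sync.Mutex
	nodes   []string
	leaders map[ShardID]string
}

func NewScheduler(nodes []string) *Scheduler {
	return &Scheduler{nodes: nodes, leaders: make(map[ShardID]string)}
}

// OnShardLockExpired is the lock-delete callback: it reassigns the shard to
// the first known node that is not the previous leader.
func (s *Scheduler) OnShardLockExpired(ev LockEvent) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, n := range s.nodes {
		if n != ev.OldLeader {
			s.leaders[ev.ShardID] = n
			return
		}
	}
	// No other candidate: leave the shard without a leader.
	delete(s.leaders, ev.ShardID)
}

// Leader returns the current leader of a shard, if any.
func (s *Scheduler) Leader(id ShardID) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	n, ok := s.leaders[id]
	return n, ok
}

// runFailoverDemo wires the watcher to the scheduler and simulates the
// expiry of node-a's lock on shard 1.
func runFailoverDemo() string {
	s := NewScheduler([]string{"node-a", "node-b"})
	s.leaders[1] = "node-a"

	w := NewShardWatcher(s.OnShardLockExpired)
	w.Start()
	w.Notify(LockEvent{ShardID: 1, OldLeader: "node-a"})
	w.Stop()

	leader, _ := s.Leader(1)
	return leader
}

func main() {
	fmt.Println(runFailoverDemo())
}
```

In this sketch the scheduler only reassigns a shard after the watcher reports that the old lock is gone, which is the property that keeps two nodes from leading the same shard at once. The node-selection rule (first node that is not the old leader) is a placeholder chosen for brevity.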

Are there any user-facing changes?

None.

How is this change tested?

All unit tests and integration tests pass.

@ZuLiangWang ZuLiangWang force-pushed the refactor_cluster_procedure branch 4 times, most recently from 7da1c94 to 3ec790c Compare March 14, 2023 12:17
@ZuLiangWang ZuLiangWang force-pushed the refactor_cluster_procedure branch from 3ec790c to 8339883 Compare March 14, 2023 12:32
@ZuLiangWang ZuLiangWang force-pushed the refactor_cluster_procedure branch from d1496db to 2a5a993 Compare March 20, 2023 13:13
@ZuLiangWang ZuLiangWang force-pushed the refactor_cluster_procedure branch from 2a5a993 to 6937d3b Compare March 21, 2023 08:27