persist learner info #3771
Awesome job!
Good job~ It would be good to add a simple test case: write some learner info, restart the NebulaStore, and check that the state is correct. I'm concerned about the conversion between the storage address and the raft address.
Good job!
04b78af to 3a04aae
Why don't you merge them into one commit? It's hard to read.
ad43bb3 to 98a2dae
Did you consider removing parts in metaclient?
@liwenhui-soul Metaclient only calculates the diff between two versions of the local cache, but when a peer is added, the local cache is not updated.
Generally LGTM~
a05bdf4 to e71482f
Well done... This bug fix is way more complicated than I expected.
9f61253 to 6d1879b
What type of PR is this?
What problem(s) does this PR solve?
Issue(s) number:
#3689
Description:
Balancing is a long process. If storaged restarts while the cluster is balancing, some info is currently lost.
Specifically, all learner info is lost, because it is persisted neither in storaged nor in metad.
How do you solve it?
Now the partition balance process includes:
We persist all partition peer info locally in storaged, including each peer's status during balancing.
When storaged restarts, we join the locally persisted info with the info from meta to decide whether each part should be kept or removed, whether it should start as a learner or as a normal peer, and which peers it should start with.
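To make the restart decision concrete, here is a minimal sketch of the join between local peer info and meta. It is not the code in this PR: `PeerStatus`, `Action`, `Decision`, and `decideOnRestart` are hypothetical names, and the three rules in the comments are assumptions chosen only to illustrate the idea.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical status of a peer as persisted locally by storaged.
enum class PeerStatus { kNormal, kLearner };

// Hypothetical outcome for one part after storaged restarts.
enum class Action { kRemove, kStartAsPeer, kStartAsLearner };

struct Decision {
  Action action;
  std::set<std::string> peers;  // peers the part should start with
};

// Joins the locally persisted peers of one part with the peers meta lists
// for that part. Assumed rules (illustrative only):
//   1. meta lists this host        -> start as a normal peer, keeping any
//                                     locally known peers (e.g. learners)
//   2. only local info says learner -> start as a learner with local peers
//   3. otherwise                    -> the part is stale and is removed
Decision decideOnRestart(const std::string& self,
                         const std::map<std::string, PeerStatus>& localPeers,
                         const std::set<std::string>& metaPeers) {
  std::set<std::string> peers;
  for (const auto& kv : localPeers) {
    peers.insert(kv.first);
  }

  if (metaPeers.count(self) > 0) {
    peers.insert(metaPeers.begin(), metaPeers.end());
    return {Action::kStartAsPeer, peers};
  }

  auto it = localPeers.find(self);
  if (it != localPeers.end() && it->second == PeerStatus::kLearner) {
    return {Action::kStartAsLearner, peers};
  }

  return {Action::kRemove, {}};
}

// Usage: a learner that meta does not list yet is still started as a learner.
void exampleLearnerSurvivesRestart() {
  std::map<std::string, PeerStatus> local = {{"a:1", PeerStatus::kNormal},
                                             {"d:1", PeerStatus::kLearner}};
  Decision d = decideOnRestart("d:1", local, {"a:1", "b:1", "c:1"});
  assert(d.action == Action::kStartAsLearner);
}
```

The point of the sketch is that neither source is enough alone: meta does not know about in-flight learners, and local info cannot tell whether a part has since been moved away.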
Special notes for your reviewer, ex. impact of this fix, design document, etc:
Checklist:
Tests:
Affects:
Release notes:
Please confirm whether to be reflected in release notes and how to describe: