Set the backend again after recovering v3 backend from snapshot #13500
dcec6c7 to a0a1468
/kind bug
Could you possibly add a test that would prevent the problem in the future?
453fb91 to 79aa860
@serathius Thanks for the comment. Added a unit test.
Should I update the CHANGELOG-3.6?
I have another question: if that is the problem, it should happen 100 percent of the time, yet the problematic endpoint soon returned to normal. How did this happen?
Because the original db file has already been replaced with the snapshot db.
Some nits; overall LGTM.
Thanks for adding tests
cc @hexfusion
79aa860 to b1d7b68
@serathius Thanks for the comments. All resolved.
b1d7b68 to 6136c2a
Just updated the CHANGELOG-3.6 as well.
Thank you for fixing this.
I'm a little concerned (independently of this CL) about why the functional tests (https://github.com/etcd-io/etcd/tree/main/tests/functional) did not catch this.
Failure injection should lead to recovery from snapshot from time to time.
But I assume this issue triggers only if 2 conditions are met at the same time:
- the task was lagging enough to need to recover from a snapshot (and not the WAL) from a peer
- the task failed after downloading the file, but before removing it.
And we might not have such a combination.
6136c2a to 1682ce6
@ptabor resolved all your comments. Thanks. I can take a look at the functional test and check whether it's possible to add a case to cover this scenario in a separate PR.
1682ce6 to 7be1464
Just rebased.
Thank you. Will merge when the tests are green.
Fix issues/13494.
When etcd recovers the v3 backend from a snapshot, it closes the old backend (see backend.go#L107). So we should set the backend for the consistentIndex again in this case.
This PR should be backported to 3.5 as well; I will submit a separate PR soon.
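To illustrate the problem described above, here is a minimal, self-contained sketch. It is not etcd's actual code: the `backend` and `consistentIndex` types, the `recoverFromSnapshot` helper, and the `errClosed` error are all hypothetical stand-ins that only model the idea that a holder keeps a reference to a backend which recovery then closes and replaces.

```go
package main

import (
	"errors"
	"fmt"
)

// errClosed is a hypothetical error returned when a closed backend is read.
var errClosed = errors.New("backend is closed")

// backend is a hypothetical stand-in for a storage backend.
type backend struct {
	closed bool
	index  uint64
}

// Close marks the backend as unusable.
func (b *backend) Close() { b.closed = true }

// readIndex returns the stored index, or errClosed if the backend was closed.
func (b *backend) readIndex() (uint64, error) {
	if b.closed {
		return 0, errClosed
	}
	return b.index, nil
}

// consistentIndex is a hypothetical holder that reads the index through
// whichever backend it was last given.
type consistentIndex struct {
	be *backend
}

// SetBackend points the holder at a (new) backend.
func (ci *consistentIndex) SetBackend(be *backend) { ci.be = be }

// ConsistentIndex reads the index from the currently set backend.
func (ci *consistentIndex) ConsistentIndex() (uint64, error) {
	return ci.be.readIndex()
}

// recoverFromSnapshot models recovery: it closes the old backend and
// returns a fresh one carrying the snapshot's index.
func recoverFromSnapshot(old *backend, snapshotIndex uint64) *backend {
	old.Close()
	return &backend{index: snapshotIndex}
}

func main() {
	oldBE := &backend{index: 5}
	ci := &consistentIndex{}
	ci.SetBackend(oldBE)

	newBE := recoverFromSnapshot(oldBE, 42)

	// The holder still points at the closed backend here.
	_, err := ci.ConsistentIndex()
	fmt.Println("before SetBackend:", err)

	// Setting the backend again makes the holder use the recovered one.
	ci.SetBackend(newBE)
	idx, err := ci.ConsistentIndex()
	fmt.Println("after SetBackend:", idx, err)
}
```

In this sketch the first read fails because the holder still references the closed backend, and the second succeeds once the backend is set again, which is the shape of the fix in this PR.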
cc @ptabor @serathius @jingyih