fix: attempt to delete processes from instance-managers in unknown state #3127
CI failures:
e27dc1b to f75f01e
To test (in contrast with longhorn/longhorn#6552 (comment)):
Rerun RWX fast failover tests from longhorn/longhorn#6205 (comment) to ensure no regression.
Looks good. The refactor makes the logic clearer, too.
@james-munson raised a concern about the effect this might have on RWX fast failover. I will test it a bit before merging (and modify the test plan so QA also tests it eventually).
@james-munson, I ran case 1 from longhorn/longhorn#6205 (comment) a few times. The results were:
I think RWX fast failover is still working as expected.
LGTM. I left a styling comment, but I don't feel strongly about it. Feel free to ignore it if you prefer the current implementation.
Thank you for the investigation and the fix.
Longhorn 6552 Signed-off-by: Eric Weber <eric.weber@suse.com>
f75f01e to 25ca057
25ca057 to 5b534ef
@mergify backport v1.6.x v1.5.x
✅ Backports have been created
Which issue(s) this PR fixes:
longhorn/longhorn#6552
What this PR does / why we need it:
See longhorn/longhorn#6552 (comment) for context. It may be possible to delete engine and replica processes from an instance-manager even when the instance-manager's state is unknown. Doing so prevents the processes from becoming orphans if the instance-manager eventually recovers from the unknown state.
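A rough sketch of the idea, for readers who want to see the decision in isolation. This is not the longhorn-manager code (which is Go); every name here (`IMState`, `InstanceManager`, `should_attempt_delete`, `delete_instance`, `Unreachable`) is hypothetical, and the handling of states other than running and unknown is an assumption made only to keep the example small.

```python
from dataclasses import dataclass, field
from enum import Enum


class IMState(Enum):
    # Hypothetical state names, for illustration only.
    RUNNING = "running"
    UNKNOWN = "unknown"
    STOPPED = "stopped"


class Unreachable(Exception):
    """Raised by the toy client when the instance-manager cannot be contacted."""


@dataclass
class InstanceManager:
    name: str
    state: IMState
    reachable: bool = True
    processes: set = field(default_factory=set)

    def delete_process(self, process_name):
        # Stand-in for a call to the instance-manager's process API.
        if not self.reachable:
            raise Unreachable(self.name)
        self.processes.discard(process_name)


def should_attempt_delete(state):
    # Running: delete as usual.
    # Unknown: the manager may still be alive, so a delete is worth attempting.
    return state in (IMState.RUNNING, IMState.UNKNOWN)


def delete_instance(im, process_name):
    """Return True if no further delete is needed, False if the caller should retry."""
    if not should_attempt_delete(im.state):
        # Assumption for this sketch: nothing to delete in other states.
        return True
    try:
        im.delete_process(process_name)
    except Unreachable:
        if im.state is IMState.UNKNOWN:
            # Best effort failed; the process may still exist, so retry later.
            return False
        raise
    return True
```

With this shape, a manager that is in the unknown state but still answering requests has its engine or replica process removed right away, so nothing is left behind to become an orphan if the manager later recovers. A manager that is truly unreachable just yields a retry.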
This PR looks good in some local testing, but I want to put it through its paces a bit more before review.