Fix the bug where VirtualMachine::shutdown function throw affects multipass delete #3625
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3625      +/-   ##
==========================================
+ Coverage   88.85%   88.92%   +0.06%
==========================================
  Files         254      254
  Lines       14269    14271       +2
==========================================
+ Hits        12679    12690      +11
+ Misses       1590     1581       -9
Firstly, I don't think this is the proper way of fixing this. It's more of a hack than an actual fix. I think we need to go back to the question of "What is non-purge delete supposed to do?". Moving check_state_for_shutdown() to the daemon isn't the proper fix either, in my opinion. Instead, we should be evaluating what stop is supposed to do in the context of deleting a suspended VM.
Secondly, it would be nice if the commit history got cleaned up a bit. Currently, only f34db7f is relevant to the final diff, so the rest of the commits can be dropped.
To answer the question "What is non-purge delete supposed to do?", let's delve into the expected behavior of each state. Major states:
So, in summary, catching the exception and doing nothing is what skips the shutdown. In my opinion, this is a reasonable thing to do while the
There are other minor states like starting or restarting. In these states, the command blocks the daemon, so the user has to open another terminal to delete the starting VM. The ideal behavior would be to wait for the start to finish and then shut the VM down gracefully from the running state. However, I do not think we have good thread safety for this in the first place, so tackling these states would be beyond the scope of this PR.
This can be done, sorry about that. Initially I was thinking about putting two fixes into one PR.
2d13b4d to a9e57cc
Hmm, I would not say that those states are out of scope, but I think we could simplify and not wait for a graceful shutdown either. Instead, Here is an option to achieve that: |
@sharder996
|
This just occurred to me and I thought I'd note it down. We need to remember to test what happens when we pass multiple instance arguments or use
Do we do the right thing in all those cases? |
67da547 to 1addda1
I changed the names of the VMs to vm1, vm2 and vm3 for easier illustration. Thanks for bringing up the bulk operations. This boils down to how the command behaves when one of the VMs throws during the operation. Based on this line of code, it should exit on the first throwing VM and report the error to the CLI and to the user. For example, if vm1 is running, vm2 is suspended and vm3 is stopped, then
Regarding these three lines,
They do not throw, so they can apply the operation to every VM.
Hi @georgeliao, thanks for clarifying. I did not quite understand your last comment though. Do you mean that those commands do not go through the line of code you linked to earlier? In any case, I still think this should be manually tested, just to verify expectations. |
What I was trying to say is that things are simpler without throwing. The loop will not exit early. Every VM in that selection will get the operation applied or skipped. It is not that different from applying the operation to a single VM. |
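To make the difference concrete, here is a minimal, self-contained sketch of the two behaviors being compared: a bulk loop that stops at the first throwing VM, and the same loop with an operation that skips instead of throwing. Every name in it (`Vm`, `State`, `stop_or_throw`, `stop_or_skip`, `apply_to_all`) is hypothetical and chosen for illustration only; it does not reproduce the Multipass daemon code.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names come from the Multipass code base.
enum class State { running, suspended, stopped };

struct Vm
{
    std::string name;
    State state;
};

// Throwing variant: refuses to act on a suspended VM.
void stop_or_throw(Vm& vm)
{
    if (vm.state == State::suspended)
        throw std::runtime_error{"cannot stop suspended instance " + vm.name};
    vm.state = State::stopped;
}

// Non-throwing variant: a suspended VM is skipped and keeps its state.
void stop_or_skip(Vm& vm)
{
    if (vm.state != State::suspended)
        vm.state = State::stopped;
}

// Applies op to each VM in order and returns how many VMs were visited.
// The first exception ends the loop and is reported through `error`.
std::size_t apply_to_all(std::vector<Vm>& vms, const std::function<void(Vm&)>& op, std::string& error)
{
    std::size_t visited = 0;
    try
    {
        for (auto& vm : vms)
        {
            ++visited;
            op(vm);
        }
    }
    catch (const std::exception& e)
    {
        error = e.what();
    }
    return visited;
}
```

With vm1 running, vm2 suspended and vm3 stopped, the throwing variant visits two VMs and never reaches vm3, while the skipping variant visits all three and leaves vm2 suspended. Whether the real commands behave this way is exactly what the manual testing is meant to confirm.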
In some cases it will still need to throw, right? Anyway, just trying to make sure the behavior is verified experimentally. |
{
    std::unique_lock<std::mutex> lock{state_mutex};

-   force_shutdown = force;
+   force_shutdown = (shutdown_policy == ShutdownPolicy::Poweroff);
@sharder996 I just noticed that force_shutdown should be set to true later in order to truly pair up with this line. check_state_for_shutdown has a chance to throw, which leaves force_shutdown set to true without a forceful shutdown actually being executed. This is something we overlooked in the silencing error PR. I think the right place to set it might be after line 370, mpl::log(mpl::Level::info, vm_name, "Killing process");. What do you think? That makes:
mpl::log(mpl::Level::info, vm_name, "Killing process");
force_shutdown = true;
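The ordering problem described here can be shown with a small stand-alone model. This is only a sketch under assumptions: `FakeVm`, `transitioning`, `killed`, the two `shutdown_flag_*` methods and the `Powerdown` enumerator are invented for illustration, and the check is assumed to reject a transitioning VM regardless of policy. Only `force_shutdown`, `check_state_for_shutdown` and `ShutdownPolicy::Poweroff` are names taken from the discussion.

```cpp
#include <stdexcept>

// Powerdown is an assumed name; only Poweroff appears in the thread.
enum class ShutdownPolicy { Powerdown, Poweroff };

// Hypothetical, simplified model of a VM backend.
struct FakeVm
{
    bool transitioning = false;  // stands in for a state the check rejects
    bool force_shutdown = false; // flag under discussion
    bool killed = false;         // stands in for the VM process being killed

    // Assumption: the check throws for a transitioning VM, whatever the policy.
    void check_state_for_shutdown(ShutdownPolicy) const
    {
        if (transitioning)
            throw std::runtime_error{"invalid state for shutdown"};
    }

    // Early assignment: the flag is set before the check can throw,
    // so a throw leaves it true although nothing was killed.
    void shutdown_flag_first(ShutdownPolicy policy)
    {
        force_shutdown = (policy == ShutdownPolicy::Poweroff);
        check_state_for_shutdown(policy);
        if (force_shutdown)
            killed = true;
    }

    // Late assignment: the flag is set only where the process is killed.
    void shutdown_flag_late(ShutdownPolicy policy)
    {
        check_state_for_shutdown(policy);
        if (policy == ShutdownPolicy::Poweroff)
        {
            killed = true;
            force_shutdown = true;
        }
    }
};
```

In this model, a throw from the check leaves the early variant with `force_shutdown == true` and `killed == false`, whereas the late variant keeps the two in step.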
Makes sense to me.
@ricab @sharder996, the follow-up code work is done based on our team discussion, so it is ready for another round of review. Please read the intro description first to get an overview of the steps and the functional testing cases.
LGTM! Thanks for the work to make this better, Jia!
b0d80ca to 1257e80
Quick review and LGTM, thanks Jia!
Leaving it with @sharder996.
@georgeliao Just needs the private side now 😄 |
1257e80 to 21b00b8
21b00b8 to dcf6b7b
…voke non-force shutdown.
…he VirtualMachine::shutdown function handle that.
…n, so it no longer contains info of the client code.
…d of INVALID_ARGUMENT to avoid unwanted matches.
…tion error message change.
…suspended vm based on shutdown policy.
…with the false value resetting.
dcf6b7b to 7d01280
LGTM, thanks Jia!
There is a commit message that I find confusing, but I am not going to block on that. Merge away!
Fix the bug where VirtualMachine::shutdown function throw affects multipass delete
close #3624
close #3606
A few things were done based on our team discussion:
- … daemon.cpp was removed because it is the responsibility of the VirtualMachine::shutdown function to check the state.
- … VMStateInvalidException exception to grpc::StatusCode::FAILED_PRECONDITION, and the CLI side interprets this code accordingly.
- … VirtualMachine::shutdown is replaced by a three-value enum, ShutdownPolicy; the corresponding refactoring was performed as well.

Functional testing:
Please cover at least the following scenarios:
- multipass stop on a deleted or non-existing VM does not trigger the --force suggestion.
- multipass stop on a VM in an invalid state does trigger the --force suggestion. Invalid states include suspended, suspending, starting and restarting; the transitioning states require running multipass stop in a second terminal.
- multipass delete on a deleted or non-existing VM does not trigger the --purge suggestion.
- multipass delete on a suspended or stopped VM retains the state, meaning the recovered VM should have the same state.
- multipass delete on a running VM shuts the VM down gracefully, so it recovers to the stopped state.
- multipass delete on a VM in an invalid state does trigger the --purge suggestion. Here the invalid states are only the transitioning states (suspending, starting and restarting); they all require a second terminal to run.

Note:
- … shutdown and check_state_for_shutdown interface change. I would rather wait for the public PR review to finish and change that later.
- … test_daemon_snapshot_restore.cpp. It is not a big problem to create the stop and delete versions of this, but it might take more time.
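The scenarios listed above can be summarized as a small decision function. The following is a sketch under stated assumptions, not the merged implementation: the thread names only `ShutdownPolicy::Poweroff`, so the enumerators `Powerdown` and `Halt`, the `Action` and `State` enums, the functions `decide_shutdown` and `to_status`, and the assumption that a forced shutdown proceeds from any non-stopped state are all hypothetical. `StatusCode` is a local stand-in for `grpc::StatusCode` so that the example compiles without gRPC.

```cpp
#include <stdexcept>

// Hypothetical state set, taken from the states named in the testing list.
enum class State { running, stopped, suspended, suspending, starting, restarting };

// Only Poweroff is named in the thread; Powerdown (graceful stop) and
// Halt (used by delete) are assumed names for the other two values.
enum class ShutdownPolicy { Powerdown, Poweroff, Halt };

struct VMStateInvalidException : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Local stand-in for grpc::StatusCode, to keep the sketch self-contained.
enum class StatusCode { OK, FAILED_PRECONDITION };

enum class Action { none, graceful_shutdown, forced_shutdown };

// Decides what a shutdown request should do, or throws if the state is invalid.
Action decide_shutdown(State state, ShutdownPolicy policy)
{
    if (state == State::stopped)
        return Action::none;

    // Assumption: a forced shutdown proceeds from any non-stopped state.
    if (policy == ShutdownPolicy::Poweroff)
        return Action::forced_shutdown;

    const bool transitioning =
        state == State::suspending || state == State::starting || state == State::restarting;
    if (transitioning)
        throw VMStateInvalidException{"instance is in a transitional state"};

    if (state == State::suspended)
    {
        if (policy == ShutdownPolicy::Halt)
            return Action::none; // delete keeps the suspended state
        throw VMStateInvalidException{"instance is suspended"};
    }

    return Action::graceful_shutdown; // only the running state remains
}

// Maps the exception to a status code, as the description says the daemon does.
StatusCode to_status(State state, ShutdownPolicy policy)
{
    try
    {
        decide_shutdown(state, policy);
        return StatusCode::OK;
    }
    catch (const VMStateInvalidException&)
    {
        return StatusCode::FAILED_PRECONDITION;
    }
}
```

Under these assumptions, a non-forced stop of a suspended VM and a delete of a starting VM both yield `FAILED_PRECONDITION`, which is the code the CLI would translate into the --force or --purge suggestion, while a delete of a suspended VM yields `OK` and leaves the state untouched.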