-
Notifications
You must be signed in to change notification settings - Fork 494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
F #6099: Improve FT Hook Script host_error.rb #6182
Conversation
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
share/hooks/ft/host_error.rb
Outdated
@@ -283,6 +290,15 @@ def states_xpath(*arr) | |||
log "delete #{vm_id}" | |||
vm.delete | |||
when :migrate | |||
vm_ds_id = vm.retrieve_elements("/VM/HISTORY_RECORDS/HISTORY[position()=last()]/DS_ID")[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linting " --> ' I think there is no interpolation so single quotes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also is this "exception" save? I mean retrieve elements what happen if no vm_sd_id is found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch (pun intended) :p
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be safe, as in the vm_id array should only contain VMs which were either in states ACTIVE, SUSPENDED, or POWEROFF (3, 5, 8) when pulled and should therefore have been scheduled at least once. I'll add a check for it in the event some edge case occurs that I'm unaware of.
share/hooks/ft/host_error.rb
Outdated
vm_ds_id = vm.retrieve_elements("/VM/HISTORY_RECORDS/HISTORY[position()=last()]/DS_ID")[0] | ||
|
||
ds_xpath = "/DATASTORE_POOL/DATASTORE[ID=\"#{vm_ds_id}\"]/TEMPLATE/SHARED" | ||
is_shared = ds_pool.retrieve_elements(ds_xpath)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here what if no is_shared?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe migrate or not migrate by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe migrate or not migrate by default?
the idea is to not migrate non shared VMs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also be safe, however if a datastore somehow doesn't contain SHARED it should skip and it seems this code would allow it to pass. I'll add a check here as well.
if is_shared == "NO" | ||
log "Skipping VM #{vm_id} deployed on non-shared datastore" | ||
next | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So suggestion could be a rescue for this block that would skip the migration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking at the code again, I agree the rescue seems like the cleanest way about this
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
Adds checks to the "migrate" option of the Fault Tolerance hook script
host_error.rb
to skip virtual machines which are hosted on the failed host but are not deployed to a shared datastore.The whole datastore pool is gathered at first, and that pool is used to reference the shared value to keep the API calls low.
This resolves #6099
Should apply to 6.6 and up