Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F #6099: Improve FT Hook Script host_error.rb #6182

Closed
wants to merge 4 commits into from
Closed

Conversation

onenhansen
Copy link
Contributor

Adds checks to the "migrate" option of the Fault Tolerance hook script host_error.rb to skip virtual machines which are hosted on the failed host but are not deployed to a shared datastore.

The whole datastore pool is gathered at first, and that pool is used to reference the shared value to keep the API calls low.

This resolves #6099

Should apply to 6.6 and up

Signed-off-by: Neal Hansen <nhansen@opennebula.io>
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
Signed-off-by: Neal Hansen <nhansen@opennebula.io>
@tinova tinova requested review from rsmontero and sk4zuzu April 19, 2023 09:37
@@ -283,6 +290,15 @@ def states_xpath(*arr)
log "delete #{vm_id}"
vm.delete
when :migrate
vm_ds_id = vm.retrieve_elements("/VM/HISTORY_RECORDS/HISTORY[position()=last()]/DS_ID")[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linting " --> ' I think there is no interpolation so single quotes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also is this "exception" save? I mean retrieve elements what happen if no vm_sd_id is found?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch (pun intended) :p

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be safe, as in the vm_id array should only contain VMs which were either in states ACTIVE, SUSPENDED, or POWEROFF (3, 5, 8) when pulled and should therefore have been scheduled at least once. I'll add a check for it in the event some edge case occurs that I'm unaware of.

vm_ds_id = vm.retrieve_elements("/VM/HISTORY_RECORDS/HISTORY[position()=last()]/DS_ID")[0]

ds_xpath = "/DATASTORE_POOL/DATASTORE[ID=\"#{vm_ds_id}\"]/TEMPLATE/SHARED"
is_shared = ds_pool.retrieve_elements(ds_xpath)[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here what if no is_shared?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe migrate or not migrate by default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe migrate or not migrate by default?

the idea is to not migrate non shared VMs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be safe, however if a datastore somehow doesn't contain SHARED it should skip and it seems this code would allow it to pass. I'll add a check here as well.

if is_shared == "NO"
log "Skipping VM #{vm_id} deployed on non-shared datastore"
next
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So suggestion could be a rescue for this block that would skip the migration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking at the code again, I agree the rescue seems like the cleanest way about this

Signed-off-by: Neal Hansen <nhansen@opennebula.io>
@onenhansen onenhansen closed this Apr 20, 2023
@onenhansen onenhansen deleted the f-6099 branch April 20, 2023 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fault Tolerance hook script improvements
3 participants