
F #2352 #2359 #2981 #3130 #3233: Fix fs_lvm cleanup and offline migration issues #3201

Merged
merged 11 commits into master from fix-lvm2 on Apr 17, 2019

Conversation

kvaps
Contributor

@kvaps kvaps commented Apr 8, 2019

Note for reviewer:

This PR includes the following fixes:

  1. Remove the check for /VM/TEMPLATE/DISK[DISK_ID=$DISK_ID]/TYPE, which solves the problems with offline and datastore migration. Fixes "fs_lvm offline migration does not work" #2359 and "LVM datastore migration does not work" #3130. A check for LCM_STATE is added instead, so the fs_lvm driver now correctly deactivates the logical volume when the VM is undeployed (see the sketch after this list).

  2. Fix directory cleanup when terminating an undeployed VM on storage with the BRIDGE_LIST variable set. The expression for finding DS_SYS_ID never worked when DST_PATH is a directory. Fixes "Destroy VM isn't working with BRIDGE_LIST and UNDEPLOYED VMs" #2981 (comment) and "LVM cleanup executes on master instead of storage node" #2352.

  3. Fix the case where the node in BRIDGE_LIST has no activated LVM device, so zeroing does not work. The device is now always activated before zeroing.
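
A minimal sketch of the idea in point 1 (the variable names and the exact state list are assumptions, not the driver's verbatim code): the mv action decides whether to deactivate the source LV from LCM_STATE instead of the disk TYPE attribute.

# Sketch only: deactivate the source logical volume when the VM is being
# stopped or undeployed, driven by LCM_STATE rather than DISK/TYPE.
# SRC_HOST, SRC_PATH and LCM_STATE are assumed to be set by the TM framework.
UNDEPLOY_STATES="10|30|41|42"   # stop/undeploy-like states; the exact list is debated below

if [[ "$LCM_STATE" =~ ^(${UNDEPLOY_STATES})$ ]]; then
    ssh "$SRC_HOST" "sudo lvchange -an \"$SRC_PATH\""
fi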

kvaps added 4 commits April 8, 2019 12:56
Signed-off-by: kvaps <kvapss@gmail.com>
Signed-off-by: kvaps <kvapss@gmail.com>
Signed-off-by: kvaps <kvapss@gmail.com>
Signed-off-by: kvaps <kvapss@gmail.com>
@kvaps
Contributor Author

kvaps commented Apr 8, 2019

@xorel please review the changes

@rsmontero rsmontero requested a review from xorel April 9, 2019 13:21
Member

@xorel xorel left a comment


Could you please also rebase to get rid of the conflict with the missing DD_BLOCK_SIZE?

@kvaps
Contributor Author

kvaps commented Apr 10, 2019

Could you please also rebase to get rid of the conflict with the missing DD_BLOCK_SIZE?

Done. I've also replaced the DST_PATH check with the internal is_disk function:

if [ `is_disk $DST_PATH` -eq 1 ]; then
    DS_SYS_ID=$(echo $DST_PATH | $AWK -F '/' '{print $(NF-2)}')
else
    DS_SYS_ID=$(echo $DST_PATH | $AWK -F '/' '{print $(NF-1)}')
fi
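
For illustration only (hypothetical paths under the default /var/lib/one/datastores layout), both branches recover the same system datastore ID:

# Disk device path: the datastore ID is the third-from-last path component
echo /var/lib/one/datastores/104/53/disk.0 | awk -F '/' '{print $(NF-2)}'   # prints 104

# Directory path: the datastore ID is the second-from-last path component
echo /var/lib/one/datastores/104/53 | awk -F '/' '{print $(NF-1)}'          # prints 104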

Signed-off-by: kvaps <kvapss@gmail.com>
@kvaps kvaps force-pushed the fix-lvm2 branch 3 times, most recently from 3bd9400 to d67c4b0 on April 14, 2019 20:19
Signed-off-by: kvaps <kvapss@gmail.com>
@kvaps
Contributor Author

kvaps commented Apr 15, 2019

Found the problem: when we execute the delete operation on the BRIDGE_LIST node, the device is not removed from the device mapper on the current node where the volume was activated, and that prevents attaching new drives in the same place.

In other words, we should use the BRIDGE_LIST node only if the VM is stopped or undeployed.

@kvaps kvaps changed the title F #2352 #2359 #2981 #3130: Fix fs_lvm cleanup and offline migration issues F #2352 #2359 #2981 #3130 #3233: Fix fs_lvm cleanup and offline migration issues Apr 15, 2019
@xorel
Member

xorel commented Apr 15, 2019

That's a good point.

OK, I did some tests and realized that detecting UNDEPLOYED using the LCM check 10|30|41|42 is wrong: when fs_lvm/delete is called, the LCM_STATE is always 11 (EPILOG) no matter what the previous state was (except some error states). That was introduced by me by mistake.

So I am afraid the only way to detect whether BRIDGE_LIST should be used is to compare the $DST_HOST with the HOST from the VM history, like you did before.

And a third point: it seems that we are completely missing the device-mapper cleanup; I think we should call dmsetup remove after each lvremove.

This has now bloated quite a lot, and I am thinking of starting again from scratch and tackling the issues one by one.

@kvaps
Contributor Author

kvaps commented Apr 15, 2019

using the LCM check 10|30|41|42 is wrong, when fs_lvm/delete is called the LCM_STATE is always 11 (EPILOG) no matter the previous state (except some error states), that was introduced by me by mistake.

You're right: this check works fine for the fs_lvm/mv action, but it is wrong to use it for delete.
Let me fix that.

And a third point: it seems that we are completely missing the device-mapper cleanup; I think we should call dmsetup remove after each lvremove.

We don't have to, because that is done automatically by the lvremove command.
It just needs to be run on the same host where the VM is located (if it is deployed), as in the sketch below.
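
A sketch with assumed variable names (not the driver's exact code): run the removal on the host where the LV is active, and lvremove drops the device-mapper entry by itself, so no separate dmsetup remove is needed.

# VG_NAME and LV_NAME are placeholders for the volume group and logical volume.
ssh "$LAST_HOST" "sudo lvremove -f \"$VG_NAME/$LV_NAME\""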

@kvaps kvaps force-pushed the fix-lvm2 branch 3 times, most recently from 6949aa7 to e429d1c on April 15, 2019 17:13
@kvaps
Contributor Author

kvaps commented Apr 15, 2019

@xorel, I've found an elegant solution: [ "$LAST_HOST" != "$DST_HOST" ] means we are running on the frontend.

Commit a075eb9 was force-pushed to the branch to replace the last wrong changes.
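
A sketch of how the comparison can be derived (the XPath expression and the get_destination_host helper are assumptions for illustration, not the verbatim driver code):

# LAST_HOST: hostname from the VM's most recent history record.
LAST_HOST=$(onevm show -x "$VMID" | \
    xmllint --xpath 'string(/VM/HISTORY_RECORDS/HISTORY[last()]/HOSTNAME)' -)

if [ "$LAST_HOST" != "$DST_HOST" ]; then
    # Running on the frontend: pick a node from BRIDGE_LIST to do the work.
    DST_HOST=$(get_destination_host)
fi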

Signed-off-by: kvaps <kvapss@gmail.com>
@xorel
Member

xorel commented Apr 16, 2019

@xorel, I've found an elegant solution: [ "$LAST_HOST" != "$DST_HOST" ] means we are running on the frontend.

OK, this seems to work better.

One more detail: while testing I noticed that after an offline migration the LV remains active on the SRC_HOST. I suppose the mv action should always deactivate the LV, and only exit early if the action is stop or undeploy. I don't see any other scenario where the SRC LV should not be deactivated.

@kvaps
Contributor Author

kvaps commented Apr 16, 2019

after offline migration the LV remains active on the SRC_HOST, I suppose the mv action should always deactivate the LV and only exit early if the action is stop or undeploy.

Indeed.

I don't see any other scenario where the SRC LV should not be deactivated.

When copying from the frontend (deploy):

! [[ "$LCM_STATE" =~ ^(31|50)$ ]]
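
Putting it together, a sketch of the resulting mv-side logic (variable names assumed; the 31|50 state list comes from the comment above):

# Deactivate the source LV after an offline move, except when the copy
# originates from the frontend (deploy), i.e. LCM_STATE 31 or 50.
if ! [[ "$LCM_STATE" =~ ^(31|50)$ ]]; then
    ssh "$SRC_HOST" "sudo lvchange -an \"$SRC_PATH\""
fi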

@kvaps
Contributor Author

kvaps commented Apr 16, 2019

Found another problem: undeploy --> deploy on a selected datastore is not working.
But it seems that it is not our bug; the TM driver is always called with the same path during deploy.

Signed-off-by: kvaps <kvapss@gmail.com>
@kvaps
Contributor Author

kvaps commented Apr 16, 2019

after offline migration the LV remains active on the SRC_HOST, I suppose the mv action should always deactivate the LV and only exit early if the action is stop or undeploy.

Indeed.

I don't see any other scenario where the SRC LV should not be deactivated.

When copying from the frontend (deploy).

Solved.

@rsmontero rsmontero merged commit 99ded0f into OpenNebula:master Apr 17, 2019
rsmontero pushed a commit that referenced this pull request Apr 17, 2019
F #2352 #2359 #2981 #3130 #3233: Fix fs_lvm cleanup and offline migration issues (#3201)

Signed-off-by: kvaps <kvapss@gmail.com>
(cherry picked from commit 99ded0f)
rsmontero pushed a commit that referenced this pull request Aug 20, 2024
* Include field path for ALL form errors
* Fix VNC validation

Signed-off-by: Victor Hansson <vhansson@opennebula.io>
Development

Successfully merging this pull request may close these issues.

Destroy VM isn't working with BRIDGE_LIST and UNDEPLOYED VMs
fs_lvm offline migration does not work