[BUG]: privTgt mount is lost after vxflexos-node pod restart #1546

Closed
donatwork opened this issue Oct 28, 2024 · 4 comments
Labels
area/csi-powerflex Issue pertains to the CSI Driver for Dell EMC PowerFlex type/bug Something isn't working. This is the default label associated with a bug issue.

@donatwork
Contributor

Bug Description

We have observed two issues when privTgt is lost:

  1. Restarting a user deployment that consumes PowerFlex with "oc rollout restart deployment " leaves the new pod stuck in the ContainerCreating state if it is scheduled to the same node as the old pod.

  2. Any operation that causes the kubelet to restart (for example, replacing the cluster certificate) will cause the vxflexos-node pod to receive NodePublishVolume events for existing PVCs, leading to continuous warning events in the user pod (though the user pod remains in a running state).

Logs

Warning FailedMount 40s (x598 over 20h) kubelet MountVolume.SetUp failed for volume "vrtestraven-7169900ccd" : rpc error: code = Internal desc = Device already in use and mounted elsewhere for privTgt /var/lib/kubelet/plugins/vxflexos.emc.dell.com/disks/1dabfae970da540f-b83f174c000002a0
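
For triage, the event above can be split into its parts (volume name, gRPC code, description, privTgt path). The following is a minimal sketch that assumes the message layout of the single log line quoted in this issue; other driver or kubelet versions may word it differently, and the function name `parse_failed_mount` is hypothetical.

```python
import re
from typing import Dict, Optional

# Pattern modeled on the one FailedMount line quoted in this issue.
# The layout is an assumption; adjust it if your events are worded differently.
FAILED_MOUNT_RE = re.compile(
    r'MountVolume\.SetUp failed for volume "(?P<volume>[^"]+)"\s*:\s*'
    r'rpc error: code = (?P<code>\w+) desc = (?P<desc>.*?)'
    r'(?: for privTgt (?P<priv_tgt>\S+))?$'
)


def parse_failed_mount(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the named parts of a FailedMount message, or None if it does not match."""
    match = FAILED_MOUNT_RE.search(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        'Warning FailedMount 40s (x598 over 20h) kubelet MountVolume.SetUp failed '
        'for volume "vrtestraven-7169900ccd" : rpc error: code = Internal desc = '
        'Device already in use and mounted elsewhere for privTgt '
        '/var/lib/kubelet/plugins/vxflexos.emc.dell.com/disks/'
        '1dabfae970da540f-b83f174c000002a0'
    )
    print(parse_failed_mount(sample))
```

The `priv_tgt` group is optional, so messages without a privTgt path still parse and report `None` for that field.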

Screenshots

No response

Additional Environment Information

Problem is not related to a particular platform.

Steps to Reproduce

  1. Create a deployment that consumes PowerFlex through PowerFlex CSI.

  2. Restart the vxflexos-node pod on the same node as the deployment.

  3. Enter the newly restarted vxflexos-node pod. Run the mount command and observe that the privTgt is lost, with only the target mount displayed.
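
The manual check in the last step can be approximated with a small script. This is a rough sketch, not part of the driver: it assumes the mount table is read in `/proc/mounts` format, that the privTgt mount and the pod target mount show the same source device, and that PowerFlex devices are named `/dev/scini*`. The privTgt directory is taken from the log line in this issue; the function names are hypothetical.

```python
from typing import Dict, List

# privTgt directory as it appears in the log line in this issue.
PRIV_DIR = "/var/lib/kubelet/plugins/vxflexos.emc.dell.com/disks"
# Standard kubelet directory for per-pod volume mounts.
PODS_DIR = "/var/lib/kubelet/pods"
# Assumed device name prefix for PowerFlex block devices.
DEVICE_PREFIX = "/dev/scini"


def mounts_by_device(mounts_text: str) -> Dict[str, List[str]]:
    """Group mount points by source device from /proc/mounts-style text."""
    grouped: Dict[str, List[str]] = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        grouped.setdefault(fields[0], []).append(fields[1])
    return grouped


def devices_missing_priv_tgt(mounts_text: str) -> List[str]:
    """Return devices mounted under a pod directory but not under PRIV_DIR."""
    missing = []
    for device, points in mounts_by_device(mounts_text).items():
        if not device.startswith(DEVICE_PREFIX):
            continue
        has_target = any(p.startswith(PODS_DIR + "/") for p in points)
        has_priv = any(p.startswith(PRIV_DIR + "/") for p in points)
        if has_target and not has_priv:
            missing.append(device)
    return missing


if __name__ == "__main__":
    # Run inside the vxflexos-node pod (or on the node) to list affected devices.
    with open("/proc/mounts") as handle:
        print(devices_missing_priv_tgt(handle.read()))
```

A non-empty result corresponds to the state described above: the target mount is present but its privTgt mount is gone.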

Expected Behavior

The node pods should restart without error.

CSM Driver(s)

csi-powerflex 1.11

Installation Type

No response

Container Storage Modules Enabled

No response

Container Orchestrator

OCP: 4.16.6

Operating System

RHCOS

@donatwork donatwork added type/bug Something isn't working. This is the default label associated with a bug issue. area/csi-powerflex Issue pertains to the CSI Driver for Dell EMC PowerFlex labels Oct 28, 2024
@donatwork
Contributor Author

link: 28509

@donatwork
Contributor Author

Fix merged into main for CSM 1.12 release. Closing issue.

@donatwork
Contributor Author

Need to add changes to csm-operator.

@donatwork
Contributor Author

Issue resolved. Closing JIRA.


3 participants