Driver name s3.csi.aws.com not found in the list of registered CSI drivers #107
Did you install this EKS add-on? If not, then the CSI driver is not installed, but installing it that way works.
Yes, I installed it via EKS add-ons; the driver was already installed. The strange thing is that even with it installed, I get the error: ... driver name s3.csi.aws.com not found in the list of registered CSI drivers
If I understand it correctly, the CSIDriver object is in place but the resource can't find it.
Hey, I have noticed this issue as well. Usually it happens when the pod with the mounted volume spawns before the CSI driver pod is ready on that node. This is an error I have had before with other CSI drivers, such as the FSx CSI driver, and they have a mechanism with startup taints to prevent pods from starting before the CSI pod is present. The FSx CSI driver documentation has a reference for this startup taint. Hope this helps somehow! 😃
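A quick way to confirm whether the driver has actually registered on a given node before a workload pod lands there is sketched below; the node name is a placeholder:

```bash
# List the CSI drivers the kubelet on a given node has registered;
# s3.csi.aws.com should appear once the s3-csi-node pod is ready on that node.
kubectl get csinode <node-name> -o jsonpath='{.spec.drivers[*].name}'; echo

# The cluster-wide CSIDriver object being present is not enough on its own:
kubectl get csidriver s3.csi.aws.com
```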
I found the same error on some nodes, not all of them. I found that by running:

But I have 5 nodes, not only 3. The error only occurs on nodes with some custom taints, so I deleted those taints from the nodes and then it worked.
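A sketch of how to spot this kind of mismatch, comparing where the driver's DaemonSet pods are actually running against the taints on each node; this assumes the default install in kube-system with a DaemonSet named s3-csi-node:

```bash
# Which nodes are actually running an s3-csi-node pod?
kubectl get pods -n kube-system -o wide | grep s3-csi-node

# Desired vs. ready pod counts for the DaemonSet
kubectl get daemonset s3-csi-node -n kube-system

# Taints on every node, to see which ones the DaemonSet cannot tolerate
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
```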
Closing the issue for now, feel free to re-open if this issue persists.
I have encountered a similar issue to what's been described in the thread. It seems to be a timing problem when communicating with the Mountpoint S3 CSI driver. I'm attempting to mount multiple pods to the same static PVC, which is linked to an S3 bucket. This setup is for running Spark driver and executor jobs, essentially using the S3 bucket as a shuffle disk in place of an EBS volume. Here's the configuration I'm using:

Initially, my driver pods only mount successfully after a few attempts and restarting the pod. The error encountered is:
The Spark driver pod works on the second attempt of running the same job. However, the executor pods error out with the same issue on the first attempt. On the second attempt, both driver and executor pods run successfully.

@patrickpa suggested a possible solution using a node startup taint, which I haven't tried yet but plan to explore later and will update accordingly. However, I've run into a different issue related to file renaming. All the Spark executor pods failed with the following error, causing the job to terminate:
This leads me to question the viability of using an S3 bucket with this CSI driver for this purpose. Given that the Spark driver and executors need to create, update, and delete files in the S3 bucket, are there any limitations or considerations I might be missing here? Any insights would be greatly appreciated.
I redid the installation using the deploy files, and it was successful. For some reason the AWS add-on was not replicating the driver pods on all nodes.
Has anyone found a way to resolve this problem? In my case I use auto-scaling in my cluster; whenever a new node is provisioned via the auto-scaler, the s3-csi-driver add-on doesn't get configured on the new node, which is annoying, but I also have the efs-driver add-on, which works fine.

Update: When I check the DaemonSet, s3-csi-node should be two pods, one on each node, but the desired state is only one.
Check the taints of your nodes and the tolerations of your s3-csi-node and ebs-csi-node, which may be different.
@wcw84 When I check the tolerations for the s3-csi-node:

And for the efs-csi-node:

So I am not sure what stops the s3-csi-node pods from being scheduled. Note: I have installed both from the EKS web interface in AWS.
I found the issue: my GPU node has the following taint, and I removed it:
These taints prevent the s3-csi-node pods from being scheduled on the node.
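A small sketch of removing such a taint from a node, as described above; the node name, taint key, and effect are placeholders since the actual taint is not shown in the comment:

```bash
# Remove a NoSchedule taint from a node (the trailing '-' deletes the taint).
# <node-name> and example.com/custom-taint are placeholders, not values from this thread.
kubectl taint nodes <node-name> example.com/custom-taint:NoSchedule-
```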
@dlakhaws any chance we can reopen this given the activity on the issue? I am experiencing similar issues to @surya9teja; it seems problematic that the driver won't run on nodes when some taints are used.
Reopened. @spolloni are you also using the EKS add-on?
@spolloni If you install the driver via the Helm chart or the Kustomization manifests instead of the EKS add-on, you can configure the tolerations for the s3-csi-node DaemonSet yourself.
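A sketch of what that Helm-based install might look like, assuming the chart exposes node.tolerateAllTaints and node.tolerations values mirroring the add-on schema quoted later in this thread; the repo URL and value names should be verified against the chart's values.yaml:

```bash
# Assumed values: node.tolerateAllTaints / node.tolerations; verify against the chart before use.
helm repo add aws-mountpoint-s3-csi-driver https://awslabs.github.io/mountpoint-s3-csi-driver
helm repo update

helm upgrade --install aws-mountpoint-s3-csi-driver \
  aws-mountpoint-s3-csi-driver/aws-mountpoint-s3-csi-driver \
  --namespace kube-system \
  --set node.tolerateAllTaints=true
```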
@surya9teja ok, thanks for the tip!
@unexge yes I am. I just updated to the latest version (1.7.0) to confirm that the issue persists. I am completely ignorant about how this driver works, but do you think this issue is "generally" fixable in the add-on install without the workaround suggested above?
Hey @spolloni, our EKS add-on doesn't allow configuring tolerations at the moment. We plan to support that; it's tracked by #109. Meanwhile, I think the only workaround is using our Helm chart / Kustomization manifest to configure tolerations, as @surya9teja suggested.
sounds good. thanks for the help @unexge!
v1.8.0 of our EKS add-on has been released with support for configuring tolerations:

```
$ aws eks describe-addon-configuration --addon-name aws-mountpoint-s3-csi-driver --addon-version v1.8.0-eksbuild.1
{
    "addonName": "aws-mountpoint-s3-csi-driver",
    "addonVersion": "v1.8.0-eksbuild.1",
    "configurationSchema": "{\"$schema\":\"https://json-schema.org/draft/2019-09/schema\",\"additionalProperties\":false,\"description\":\"Configurable parameters for Mountpoint for S3 CSI Driver\",\"properties\":{\"node\":{\"additionalProperties\":false,\"properties\":{\"tolerateAllTaints\":{\"default\":false,\"description\":\"Mountpoint for S3 CSI Driver Pods will tolerate all taints and will be scheduled in all nodes\",\"type\":\"boolean\"},\"tolerations\":{\"default\":[],\"items\":{\"type\":\"object\"},\"title\":\"Tolerations for Mountpoint for S3 CSI Driver Pods\",\"type\":\"array\"}},\"type\":\"object\"}},\"type\":\"object\"}",
    "podIdentityConfiguration": []
}
```

You can set `tolerateAllTaints` or `tolerations` under `node` in your add-on configuration. For example:

```
$ aws eks create-addon --cluster-name ... \
    --addon-name aws-mountpoint-s3-csi-driver \
    --service-account-role-arn ... \
    --configuration-values '{"node":{"tolerateAllTaints":true}}'
```

Closing the issue now. Could you please try upgrading to v1.8.0 with a toleration config to see if that solves the problem? Please let us know if the issue persists.
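For a cluster where the add-on is already installed, the same configuration can likely be applied with aws eks update-addon rather than re-creating the add-on; the sketch below uses a narrower tolerations list instead of tolerateAllTaints, and the cluster name and taint key are placeholders:

```bash
# Sketch: apply a toleration config to an already-installed add-on.
# <cluster-name> and example.com/custom-taint are placeholders, not values from this thread.
aws eks update-addon \
  --cluster-name <cluster-name> \
  --addon-name aws-mountpoint-s3-csi-driver \
  --addon-version v1.8.0-eksbuild.1 \
  --configuration-values '{"node":{"tolerations":[{"key":"example.com/custom-taint","operator":"Exists","effect":"NoSchedule"}]}}'
```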
/kind bug
What happened?
When mounting the volume on the pod, it cannot locate the driver.
Warning FailedMount 12s (x8 over 76s) kubelet MountVolume.MountDevice failed for volume "s3-pv" : kubernetes.io/csi: attacher.MountDevice failed to create newCsiDriverClient: driver name s3.csi.aws.com not found in the list of registered CSI drivers
What you expected to happen?
That it mounts the volume normally, without failure.
How to reproduce it (as minimally and precisely as possible)?
Apply the example yaml
Anything else we need to know?:
Is it necessary to create a new storage class?
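On the storage class question: with static provisioning no StorageClass is needed; the claim just sets storageClassName to an empty string. A minimal sketch along the lines of the driver's static provisioning example, with the bucket name, region, and sizes as placeholders:

```bash
# Minimal static-provisioning sketch; bucket name, region, and sizes are placeholders.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: s3-pv
spec:
  capacity:
    storage: 1200Gi              # ignored by the driver, but required by Kubernetes
  accessModes:
    - ReadWriteMany
  mountOptions:
    - allow-delete
    - region us-east-1
  csi:
    driver: s3.csi.aws.com
    volumeHandle: s3-csi-driver-volume   # must be unique per PV
    volumeAttributes:
      bucketName: my-example-bucket
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: s3-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""           # empty string: no StorageClass, bind to the static PV
  resources:
    requests:
      storage: 1200Gi
  volumeName: s3-pv
EOF
```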
Environment
Kubernetes version (use `kubectl version`):