Driver crashes unexpectedly with "Failed to read /host/proc/mounts", requiring pod restart #284
Comments
Thanks for opening the bug report, @dienhartd. We'll investigate further. Would you be able to review …
Could you please also let us know what operating system you're running on the cluster nodes? …
I have the same problem; I was running on Amazon Linux 2.
Thanks for sharing, @John-Funcity. Please can you open a new issue so we can get logs relevant to your problem, and also include information such as the …
[    2.280531] systemd[1]: systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
[ 2.291650] systemd[1]: Detected virtualization amazon.
[ 2.295150] systemd[1]: Detected architecture x86-64.
[ 2.298554] systemd[1]: Running in initial RAM disk.
[ 2.302928] systemd[1]: No hostname configured.
[ 2.306128] systemd[1]: Set hostname to <localhost>.
[ 2.309546] systemd[1]: Initializing machine ID from VM UUID.
[ 2.336041] systemd[1]: Reached target Local File Systems.
[ 2.340338] systemd[1]: Reached target Swap.
[ 2.344257] systemd[1]: Created slice Root Slice.
[ 2.497890] XFS (nvme0n1p1): Mounting V5 Filesystem
[ 2.666828] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[ 3.033970] XFS (nvme0n1p1): Ending clean mount
[ 3.253141] systemd-journald[863]: Received SIGTERM from PID 1 (systemd).
[ 3.309998] printk: systemd: 18 output lines suppressed due to ratelimiting
[ 3.537461] SELinux: Runtime disable is deprecated, use selinux=0 on the kernel cmdline.
[ 3.543529] SELinux: Disabled at runtime.
[    3.610275] audit: type=1404 audit(1732528464.939:2): enforcing=0 old_enforcing=0 auid=4294967 …
Thanks @John-Funcity for the information, but could you please open a new issue so we're able to root cause the issues separately from this one. Please include the dmesg logs and other logs, following the logging guide: https://github.com/awslabs/mountpoint-s3-csi-driver/blob/main/docs/LOGGING.md …
Maybe this problem? …
MountVolume.SetUp failed for volume "s3-models-pv" : rpc error: code = Internal desc = Could not mount "xxxx-models-test" at "/var/lib/kubelet/pods/xxxxxxxxx/volumes/kubernetes.io …
Same issue and logs. When I delete the CSI pod running on the affected node, the error is fixed.
Thanks for the reports @John-Funcity @fatihmete. Would you be able to share any log that might be relevant from …
The error does not follow a specific pattern, and I cannot predict when it will happen. I am getting the same error as above.
The CSI pods appear to be working without errors. I will add the logs when the problem occurs again.
@dannycjones would you prefer I opened a new issue as well? Seems I'm getting the exact same issue. Running k3s v1.30.6+k3s1 on Ubuntu 22.04 (also on Ubuntu 24.04) and s3-mountpoint 1.10.0. I am able to access /proc/mounts on the host, but I don't see anything in there related to s3 or CSI, what do we expect to find in there relating to the s3 csi driver? Not much in dmesg (not sure if this is relevant):
…
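For readers wondering what a driver-related entry in /proc/mounts could look like: Mountpoint for S3 exposes its mounts via FUSE, so they would typically carry a FUSE filesystem type naming mountpoint-s3. The sketch below (the sample entry and field layout are illustrative assumptions, not taken from this thread) shows one way to scan the file for such entries:

```python
# Minimal sketch: scan /proc/mounts-style text for Mountpoint S3 entries.
# Assumption: the driver's FUSE mounts report a filesystem type containing
# "mountpoint-s3" (e.g. "fuse.mountpoint-s3"); the exact string may differ
# by driver version.

def find_s3_mounts(mounts_text):
    """Return (device, mountpoint, fstype) tuples for Mountpoint S3 entries."""
    matches = []
    for line in mounts_text.splitlines():
        # /proc/mounts fields: device, mountpoint, fstype, options, dump, pass
        fields = line.split()
        if len(fields) >= 3 and "mountpoint-s3" in fields[2]:
            matches.append((fields[0], fields[1], fields[2]))
    return matches

# Synthetic /proc/mounts excerpt; the s3 line is a hypothetical example.
sample = """\
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
mountpoint-s3 /var/lib/kubelet/pods/abc/volumes/kubernetes.io~csi/s3-pv/mount fuse.mountpoint-s3 rw,nosuid,nodev 0 0
"""
print(find_s3_mounts(sample))
```

On a node where the driver holds no active mounts (as the commenter observed), this scan would return nothing even though the driver pod is healthy, so an empty result by itself is not evidence of the crash.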
My team has now moved this data from s3 to EFS. That said, when we were using s3 and the s3 mountpoint driver, I'm not positive I had access to … This was usually Amazon Linux 2, though I believe this also happened with the Ubuntu AMI.
/kind bug
What happened?
Periodically, without warning, one of my s3 mountpoint driver pods will crash with gRPC errors until I delete it. This usually causes a dependent pod to fail to start. The replacement pod created immediately after deletion works fine, but manual intervention is required once dependent pod crashes due to the missing PV are noticed.
What you expected to happen?
The error not to occur.
How to reproduce it (as minimally and precisely as possible)?
Unclear.
Anything else we need to know?:
Logs
Environment
Kubernetes version (use kubectl version):
Client Version: v1.31.1
Server Version: v1.30.5-eks-ce1d5eb
Driver version: v1.9.0
Installation of the s3 mountpoint driver is through eksctl, i.e. eksctl create addon aws-mountpoint-s3-csi-driver
Was directed by @muddyfish to file this issue here: #174 (comment)