warning: fileutil: file already locked #15360
Comments
Here is the link to the file responsible for that error: https://github.com/etcd-io/etcd/blob/main/client/pkg/fileutil/purge.go#L72 There was a discussion of this error last month in #15100, including some parameters that might help prevent it. Can you please confirm which filesystem you are using? |
I think in our case it was a customer running etcd over NFS, so it's not entirely surprising they would get an error like that. |
Hello ✋🏽 I have a customer who faced this issue as well. The error is raised from here, and it wraps the error from here. However, there was no other process but
Any ideas about the cause of this? |
I can reproduce this with a fresh db when I run Steps:
Run benchmark:
Warning message observed in log:
|
Thanks for the repro @cenkalti - confirming I can also reproduce this by following those steps, just on a standard NVMe SSD with standard |
It's expected behavior. The WAL files stay locked until a snapshot with a higher index is created. Only 5 WAL files are retained by default, and the snapshot-count is 100K by default. When the sixth WAL file is created, etcd tries to remove the oldest WAL file, but that file has not been released yet because the snapshot has not been created yet. But we can lower the default value for snapshotCount (e.g. set it to 10K by default). FYI: etcd/server/etcdserver/server.go, Line 77 in 7a98ab3
I think the majority of maintainers have already reached an agreement on this, so please feel free to deliver a PR for this. The related discussion: |
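The retention behavior described above can be sketched as a small in-memory model. This is not etcd's code: the walFile type and the purge and releaseThrough helpers are hypothetical, and the model assumes only what the comment states, namely that the oldest file is removed once more than the retained number exist, and that removal is skipped with a warning while the file is still locked.

```go
package main

import "fmt"

// walFile is a hypothetical stand-in for a WAL segment on disk.
type walFile struct {
	name   string
	locked bool // true while the server still holds the segment's lock
}

// purge drops the oldest files until at most maxFiles remain. It stops at
// the first file that is still locked and records a warning instead.
func purge(files []walFile, maxFiles int) ([]walFile, []string) {
	var warnings []string
	for len(files) > maxFiles {
		oldest := files[0]
		if oldest.locked {
			warnings = append(warnings,
				fmt.Sprintf("failed to lock file %s: file already locked", oldest.name))
			break
		}
		files = files[1:]
	}
	return files, warnings
}

// releaseThrough unlocks the n oldest files, imitating a snapshot whose
// index covers those segments.
func releaseThrough(files []walFile, n int) {
	for i := 0; i < n && i < len(files); i++ {
		files[i].locked = false
	}
}

func main() {
	// Six segments exist and no snapshot has been taken, so all are locked.
	files := make([]walFile, 6)
	for i := range files {
		files[i] = walFile{name: fmt.Sprintf("%016x.wal", i), locked: true}
	}

	kept, warns := purge(files, 5)
	fmt.Println("before snapshot:", len(kept), "files,", len(warns), "warning(s)")

	// A snapshot releases the oldest segment, so the next pass can remove it.
	releaseThrough(files, 1)
	kept, warns = purge(files, 5)
	fmt.Println("after snapshot:", len(kept), "files,", len(warns), "warning(s)")
}
```

In this model the warning is only a sign that the purge pass ran before the snapshot did. A lower snapshot count shortens that window because segments are released sooner.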
Thanks for the behavior clarification @ahrtr, it was good context to read through the previous discussions. I've raised a tentative pull request to see if there is now consensus to reduce back to 10k 🙏🏻 |
Still getting this in etcd 3.5.13. Our cluster is set up by kops. Attached is some output for a specific file entry. This, however, happens frequently in the cluster.
stern -l k8s-app=etcd-manager-main --kubeconfig ~/.kube/kubeconfig | grep /rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wa
+ etcd-manager-main-i-007dde6f9b72ff95f › init-etcd-3-4-13
+ etcd-manager-main-i-007dde6f9b72ff95f › kops-utils-cp
+ etcd-manager-main-i-007dde6f9b72ff95f › etcd-manager
+ etcd-manager-main-i-007dde6f9b72ff95f › init-etcd-symlinks-3-4-13
+ etcd-manager-main-i-007dde6f9b72ff95f › init-etcd-symlinks-3-5-13
+ etcd-manager-main-i-007dde6f9b72ff95f › init-etcd-3-5-13
+ etcd-manager-main-i-0104720a905b41f00 › init-etcd-3-4-13
+ etcd-manager-main-i-0104720a905b41f00 › kops-utils-cp
+ etcd-manager-main-i-0104720a905b41f00 › etcd-manager
+ etcd-manager-main-i-0104720a905b41f00 › init-etcd-symlinks-3-4-13
+ etcd-manager-main-i-0104720a905b41f00 › init-etcd-3-5-13
+ etcd-manager-main-i-0104720a905b41f00 › init-etcd-symlinks-3-5-13
+ etcd-manager-main-i-03ab4ad4d4f9318ce › kops-utils-cp
+ etcd-manager-main-i-03ab4ad4d4f9318ce › init-etcd-3-4-13
+ etcd-manager-main-i-03ab4ad4d4f9318ce › init-etcd-symlinks-3-4-13
+ etcd-manager-main-i-03ab4ad4d4f9318ce › etcd-manager
+ etcd-manager-main-i-03ab4ad4d4f9318ce › init-etcd-3-5-13
+ etcd-manager-main-i-03ab4ad4d4f9318ce › init-etcd-symlinks-3-5-13
- etcd-manager-main-i-0104720a905b41f00 › kops-utils-cp
- etcd-manager-main-i-007dde6f9b72ff95f › kops-utils-cp
- etcd-manager-main-i-007dde6f9b72ff95f › init-etcd-3-5-13
- etcd-manager-main-i-007dde6f9b72ff95f › init-etcd-symlinks-3-5-13
- etcd-manager-main-i-007dde6f9b72ff95f › init-etcd-symlinks-3-4-13
- etcd-manager-main-i-007dde6f9b72ff95f › init-etcd-3-4-13
- etcd-manager-main-i-0104720a905b41f00 › init-etcd-symlinks-3-4-13
- etcd-manager-main-i-0104720a905b41f00 › init-etcd-3-4-13
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"info","ts":"2024-10-23T13:06:05.159115Z","caller":"wal/wal.go:785","msg":"created a new WAL segment","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:16:36.113918Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:17:06.114143Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:17:36.114629Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:18:06.114902Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:18:36.115556Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:19:06.115693Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:19:36.116597Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:20:06.117199Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:20:36.117541Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
etcd-manager-main-i-0104720a905b41f00 etcd-manager {"level":"warn","ts":"2024-10-23T13:21:06.118584Z","caller":"fileutil/purge.go:80","msg":"failed to lock file","path":"/rootfs/mnt/master-vol-02872b62a36b652e5/data/pq8MghF2NAjsaA7JWOJAsg/member/wal/000000000000b2ed-00000000387e67e9.wal","error":"fileutil: file already locked"}
- etcd-manager-main-i-0104720a905b41f00 › init-etcd-3-5-13 |
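To judge how often the warning repeats for each WAL segment, output like the above can be summarized with a small filter. This is a rough Go sketch that assumes each relevant line ends with a JSON object carrying the level, msg and path fields seen in the log above. The logEntry type and countLockWarnings helper are hypothetical names, and lines without a JSON payload (such as the pod attach and detach markers) are skipped.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// logEntry holds the JSON fields of interest from one log line.
type logEntry struct {
	Level string `json:"level"`
	Msg   string `json:"msg"`
	Path  string `json:"path"`
}

// countLockWarnings counts "failed to lock file" warnings per file path.
// Any prefix before the first "{" (pod and container name) is ignored.
func countLockWarnings(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		start := strings.Index(line, "{")
		if start < 0 {
			continue // no JSON payload on this line
		}
		var e logEntry
		if err := json.Unmarshal([]byte(line[start:]), &e); err != nil {
			continue // not a well-formed JSON log entry
		}
		if e.Level == "warn" && e.Msg == "failed to lock file" {
			counts[e.Path]++
		}
	}
	return counts
}

func main() {
	// Read log lines from standard input, allowing for long lines.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	for path, n := range countLockWarnings(lines) {
		fmt.Printf("%d\t%s\n", n, path)
	}
}
```

Piping the log output into this program prints one count per path, which makes it easier to see whether the warnings stop once the segment is eventually released.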
What happened?
Got a warning-level log:
What did you expect to happen?
I don't know the effect of this warning
How can we reproduce it (as minimally and precisely as possible)?
I don't know how it came about
Anything else we need to know?
No response
Etcd version (please run commands below)
Etcd configuration (command line flags or environment variables)
Minimum configuration
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
No response
Relevant log output
No response