failed to watch file, no space left on device #2399

Closed
dave-yotta opened this issue Jul 30, 2021 · 2 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

dave-yotta commented Jul 30, 2021

We've started getting the following error when trying to tail some pods in a hosted CI environment using kind:

2021-07-29T11:33:30.7825671Z failed to watch file "/var/log/pods/default_run-tests-2xcj7_18057078-1be1-4bba-a458-55ae462508cc/run-tests/0.log": no space left on device

It looks similar to #717 (comment), with the sysctl -w fs.inotify.max_user_watches=524288 workaround.

I'm not sure how to reproduce this. Is there a way to configure these user limits when creating the kind node?
Also, I'm not sure what's using up the watches. Any advice on how to check that? I guess I would need to exec a command on the node container? :/
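
One way to check what is holding inotify watches is to read /proc: on Linux, each watch on an inotify file descriptor shows up as a line beginning with "inotify" in /proc/<pid>/fdinfo/<fd>. The sketch below is only an illustration and is not from this thread; the function name count_inotify_watches is made up here. It assumes it is run as root on the Docker host, where processes running inside the kind node container should normally be visible in /proc as well.

```python
import os
from collections import Counter


def count_inotify_watches(proc_root="/proc"):
    """Return a Counter mapping (pid, command name) -> number of inotify watches.

    Hypothetical helper: counts "inotify ..." lines in /proc/<pid>/fdinfo/<fd>.
    Processes that cannot be read (exited, or owned by another user when not
    running as root) are silently skipped.
    """
    counts = Counter()
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return counts  # no /proc available (for example, not on Linux)

    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        pid_dir = os.path.join(proc_root, entry)
        fdinfo_dir = os.path.join(pid_dir, "fdinfo")
        try:
            fds = os.listdir(fdinfo_dir)
        except OSError:
            continue

        watches = 0
        for fd in fds:
            try:
                with open(os.path.join(fdinfo_dir, fd)) as f:
                    # One line per watch registered on this inotify descriptor.
                    watches += sum(1 for line in f if line.startswith("inotify "))
            except OSError:
                continue

        if watches:
            try:
                with open(os.path.join(pid_dir, "comm")) as f:
                    name = f.read().strip()
            except OSError:
                name = "?"
            counts[(int(entry), name)] = watches
    return counts


if __name__ == "__main__":
    # Print the ten processes holding the most watches.
    for (pid, name), n in count_inotify_watches().most_common(10):
        print(f"{n:8d}  pid={pid}  {name}")
```

Comparing the totals with the value in /proc/sys/fs/inotify/max_user_watches gives a rough idea of how close a user is to the limit, keeping in mind that the limit is applied per user rather than per process.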

Here are the versions:

kind v0.11.1 go1.16.4 linux/amd64
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.1-5-g76a04fc", GitCommit:"8b4b09487463415374368af3bbc4ff2e6366477b", GitTreeState:"clean", BuildDate:"2021-06-25T21:59:28Z", GoVersion:"go1.15.7", Compiler:"gc", Platform:"linux/amd64"}
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Build with BuildKit (Docker Inc., 0.6.0+azure)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 16
 Server Version: 20.10.7+azure
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7eba5930496d9bbe375fdf71603e610ad737d2b2
 runc version: 4144b63817ebcc5b358fc2c8ef95f7cddd709aa7
 init version: 
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-1055-azure
 Operating System: Ubuntu 18.04.5 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 6.791GiB
 Name: fv-az418-368
 ID: 6G24:VSM3:URBE:WI5E:LMC3:PBF2:4HAD:XQB6:6LEX:EBQN:U3TZ:T62J
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: githubactions
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

kind create logs:

Creating cluster "kind" ...
 • Ensuring node image (kindest/node:v1.21.1) 🖼  ...
 ✓ Ensuring node image (kindest/node:v1.21.1) 🖼
 • Preparing nodes 📦   ...
 ✓ Preparing nodes 📦 
 • Writing configuration 📜  ...
 ✓ Writing configuration 📜
 • Starting control-plane 🕹️  ...
 ✓ Starting control-plane 🕹️
 • Installing CNI 🔌  ...
 ✓ Installing CNI 🔌
 • Installing StorageClass 💾  ...
 ✓ Installing StorageClass 💾

some more info about the kube/host machine:

Set kubectl context to "kind-kind"
NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
standard (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  2s
node stats: ncpu=2, ram=7121248 kB

dave-yotta added the kind/bug label (Categorizes issue or PR as related to a bug.) on Jul 30, 2021

dave-yotta (Author) commented Jul 30, 2021

OK, I've fixed it by running sudo sysctl -w fs.inotify.max_user_watches=524288 on the host CI machine after installing kind. I'll close this; I don't think this is something I can blame you guys for!
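
For context, inotify_add_watch reports ENOSPC ("no space left on device") when the per-user watch limit is reached, which fits with raising fs.inotify.max_user_watches on the host making the error go away. A CI job could check the limit before creating the cluster. The sketch below is an assumption-laden illustration rather than anything from kind itself: the function names are invented here, and the threshold 524288 is simply the value used in the command above.

```python
import os

# Kernel view of the fs.inotify.* sysctls on Linux.
INOTIFY_SYSCTL_DIR = "/proc/sys/fs/inotify"


def read_inotify_limits(base=INOTIFY_SYSCTL_DIR, names=("max_user_watches",)):
    """Read the named inotify limits; a value is None if it cannot be read."""
    limits = {}
    for name in names:
        try:
            with open(os.path.join(base, name)) as f:
                limits[name] = int(f.read().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


def check_limits(limits, required):
    """Return one message per limit that is unreadable or below its minimum."""
    problems = []
    for name, minimum in required.items():
        current = limits.get(name)
        if current is None:
            problems.append(f"fs.inotify.{name}: could not be read")
        elif current < minimum:
            problems.append(
                f"fs.inotify.{name}={current} is below {minimum}; "
                f"try: sudo sysctl -w fs.inotify.{name}={minimum}"
            )
    return problems


if __name__ == "__main__":
    # 524288 is the value from the sysctl command in this thread.
    found = check_limits(read_inotify_limits(), {"max_user_watches": 524288})
    for message in found:
        print(message)
    if not found:
        print("inotify limits look sufficient")
```

Note that sysctl -w does not persist across reboots, so on a hosted CI runner the command (or a check like this one) would presumably need to run in every job.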

tao12345666333 (Member) commented

Thanks for the feedback.

I will change the label.

/remove-kind bug
/kind support

k8s-ci-robot added the kind/support label (Categorizes issue or PR as a support question.) and removed the kind/bug label (Categorizes issue or PR as related to a bug.) on Jul 30, 2021