What happened?

I ran

while [ 1 ]; do dd if=/dev/urandom bs=1024 count=1024 | etcdctl put key || break; done

from two machines against my etcd cluster, and the cluster became unstable.
On the client side I get:

"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"} Error: context deadline exceeded
On the etcd side, the whole cluster becomes unstable for 1-2 minutes, then one of several things can happen (see the health-check commands sketched after this list):
Full recovery, all nodes rejoin
Two nodes rejoin and become stable; one node gets into a broken state and never rejoins
No nodes rejoin and the cluster remains offline
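For anyone reproducing this, a quick way to see which members have rejoined after each event; the endpoints and certificate paths below are placeholders, not my exact values:

# check liveness of every member in the cluster member list
etcdctl --endpoints=https://etcd-0:2379,https://etcd-1:2379,https://etcd-2:2379 --cacert=/etc/etcd/ca.crt --cert=/etc/etcd/client.crt --key=/etc/etcd/client.key endpoint health --cluster

# per-member DB size, leader and raft term
etcdctl --endpoints=https://etcd-0:2379 --cacert=/etc/etcd/ca.crt --cert=/etc/etcd/client.crt --key=/etc/etcd/client.key endpoint status --cluster -w table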
What did you expect to happen?
I expected etcd to handle this gracefully, either by rate limiting the client or at least by always fully recovering when it does become overloaded.
Having the whole cluster become indefinitely unstable isn't very desirable.
How can we reproduce it (as minimally and precisely as possible)?
I used the following machines (all VMs):
3 etcd nodes - 8 cores, 32Gi RAM, 50Gi disk
2 Kubernetes master nodes - 36 cores, 125Gi RAM, 120Gi disk
kube-apiserver etcdCompactionInterval=2m30s
I ran the equivalent of the following (I had to pass various flags for the certs and addresses):
while [ 1 ]; do dd if=/dev/urandom bs=1024 count=1024 | etcdctl put key || break; done
From both masters.
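For completeness, the fully flagged version looked roughly like the following; the endpoints and certificate paths here are placeholders rather than my real values:

while [ 1 ]; do dd if=/dev/urandom bs=1024 count=1024 | etcdctl --endpoints=https://etcd-0:2379,https://etcd-1:2379,https://etcd-2:2379 --cacert=/etc/etcd/ca.crt --cert=/etc/etcd/client.crt --key=/etc/etcd/client.key put key || break; done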
A couple of notes here. To avoid hitting the etcd quota limit, I sometimes have to manually stop the loop on one master until we hit a compaction period, and then start it again.
You can get the timeout even with a single master, but etcd usually recovers gracefully from that. I used two because it triggers the issue faster.
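Instead of waiting for the apiserver's compaction interval, the space can in principle be reclaimed by hand. A minimal sketch (not what I actually ran), assuming jq is available, with a placeholder endpoint and TLS flags omitted for brevity:

# compact to the current revision, then defragment to actually reclaim backend space
rev=$(etcdctl --endpoints=https://etcd-0:2379 endpoint status --write-out=json | jq -r '.[0].Status.header.revision')
etcdctl --endpoints=https://etcd-0:2379 compaction "$rev"
etcdctl --endpoints=https://etcd-0:2379 defrag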
Anything else we need to know?
I was originally wanting to test how etcd handled different values of --quota-backend-bytes when I ran into this.
I was trying to fill the database to see whether performance would still be good when it was nearly full, and to measure recovery time.
The easiest way I've found to replicate it is to set --quota-backend-bytes to 25Gi, turn off compaction on k8s, and run the above commands.
I never got it to hit the --quota-backend-bytes limit, as the cluster would always break before getting that far.
However, I have also replicated it with --quota-backend-bytes set to 8Gi; it just required a bit more effort to make sure I didn't hit the limit (as described above), but it still regularly occurs within 15 minutes. It seems harder to have it fail badly at the 8Gi limit, but not impossible.
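For reference, those quotas expressed as byte values; the 8Gi figure matches the ETCD_QUOTA_BACKEND_BYTES setting shown in the configuration section below, the 25Gi figure is simply the equivalent GiB-to-bytes conversion:

ETCD_QUOTA_BACKEND_BYTES=8589934592     # 8Gi  = 8 * 1024^3 bytes
ETCD_QUOTA_BACKEND_BYTES=26843545600    # 25Gi = 25 * 1024^3 bytes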
Generally, it seems that histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (le)) just continues to increase until it hits an inflection point and the cluster falls over.
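If it helps anyone reproducing this, that p99 can be pulled straight from the Prometheus HTTP API; a minimal sketch, with the Prometheus address as a placeholder:

# query the p99 backend commit latency described above
curl -sG 'http://prometheus:9090/api/v1/query' --data-urlencode 'query=histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[5m])) by (le))'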
Etcd version (please run commands below)
We are using v3.5.3 from https://quay.io/repository/coreos/etcd?tab=tags&tag=latest
Etcd configuration (command line flags or environment variables)
We just use the standard entrypoint for the image mentioned above - no extra command line flags.
I'll need to check whether I can share this, but it is a pretty standard configuration; shout if it's needed.
I guess the main difference is:
ETCD_QUOTA_BACKEND_BYTES=8589934592
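As a rough illustration of how that setting is applied (a sketch only; this assumes running the image directly with docker, which is not exactly how our deployment works):

docker run -e ETCD_QUOTA_BACKEND_BYTES=8589934592 quay.io/coreos/etcd:v3.5.3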
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
I'll need to check whether I can share this, but it is a pretty standard setup; shout if it's needed.
Relevant log output
No response