Test etcd container in K8s cluster #15139
Comments
Not sure about it; have we got any complaints about problems with running etcd in K8s? There are many different ways to deploy etcd in K8s, and I don't see any benefit in just testing one of them. I think it's up to the projects packaging etcd for K8s to test it properly. On the other hand, the etcd project distributes a container image that those projects use. This is the artifact passed between the etcd project and K8s distributions. We might want to test the image itself to eliminate the chance that we break K8s users. To test the etcd image we don't need to run K8s; we can just start etcd with docker, which should be much simpler.
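A docker-based smoke test along those lines might look like the sketch below. This is a hedged illustration, not a committed design: the image tag, the `/usr/local/bin` binary paths, and the key name are assumptions, not taken from the thread.

```shell
#!/usr/bin/env bash
# Hypothetical smoke test for the etcd container image. Image tag and binary
# paths are assumptions for illustration; substitute the artifact under test.
set -u

smoke_test_etcd_image() {
  local image="${1:-gcr.io/etcd-development/etcd:v3.5.7}"
  local name="etcd-smoke"

  # Start a single-member cluster in the background.
  docker run -d --rm --name "${name}" "${image}" \
    /usr/local/bin/etcd \
    --listen-client-urls http://0.0.0.0:2379 \
    --advertise-client-urls http://127.0.0.1:2379 || return 1

  # Wait for the member to become healthy, then exercise a basic write/read.
  local i
  for i in 1 2 3 4 5 6 7 8 9 10; do
    docker exec "${name}" /usr/local/bin/etcdctl endpoint health && break
    sleep 1
  done
  docker exec "${name}" /usr/local/bin/etcdctl put smoke-key ok &&
    docker exec "${name}" /usr/local/bin/etcdctl get smoke-key
  local rc=$?
  docker stop "${name}" >/dev/null
  return "${rc}"
}

# Invoke manually where docker is available, e.g.:
#   smoke_test_etcd_image gcr.io/etcd-development/etcd:v3.5.7
```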
The idea to add a test is good; however, I would want to avoid adding a new testing framework at this time, as we still haven't addressed the issues of test flakiness and of deduplicating the integration and e2e tests.
Testing the etcd image itself is a must-have. We can do some sanity testing with a simple command.

Testing the etcd POD/container in K8s is nice-to-have to me. Note we are not going to test K8s; instead we are testing etcd's integration with K8s. K8s is the most important user of etcd, and I think it deserves the effort. At least we should run some sanity test in K8s before publishing each release. Some issues which aren't discovered in etcd's existing workflow/pipeline might be relatively easier to reproduce in a K8s environment, such as #14420. We also received an issue related to etcd's DNS name in K8s.

Automatically testing the etcd image itself using docker should be an easy task. But I'm not sure what's the best approach to sanity test the etcd image in K8s. An automatic test is definitely preferred, but a manual test might also be accepted as a first step.
I think you are talking about two different use cases: etcd being used by K8s as storage, running in the control plane, and etcd deployed in K8s, running on nodes. K8s is the biggest user of etcd for the first use case; the second is much less popular. I would like to first classify the errors to decide which use case we should focus on, for example the issues you linked above.
Overall I'm not as concerned about K8s testing, as every etcd release is immediately tested in K8s CI in both scenarios and used for detecting performance regressions. I don't think tightening the feedback loop will decrease the number of issues in a noticeable way.
Yes, it's good. The only minor concern is that it's post verification. It would be better to do some simple verification before we publish each release.
Ideally we can cover both. But the former seems not feasible, and it's OK to leverage the K8s CI even though it's post verification. The latter should be relatively easy.
Please see the discussion in #14402. It should be the former case.
As a first step, let's add a GitHub workflow to do some sanity testing on the etcd image on each PR. Anyone, please feel free to deliver a PR for this. Please refer to the release documentation.
The pseudo code for this seems to be:
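(The pseudo code itself did not survive in this thread; based on the follow-up comment about building the binaries, building the image, and verifying the local image, a hedged sketch could be as follows. The make target and Dockerfile location are assumptions about the repo layout.)

```shell
#!/usr/bin/env bash
# Hedged sketch of the proposed PR workflow steps: build binaries, build the
# image, then verify the local image. Make target and Dockerfile path are
# assumptions, not confirmed by the thread.
set -eu

build_and_verify_image() {
  local tag="etcd:pr-test"

  make build                          # 1. build the etcd binaries
  docker build -t "${tag}" .          # 2. build the container image locally
  docker run --rm "${tag}" \
    /usr/local/bin/etcd --version     # 3. sanity-check the local image
}

# Run inside a checkout of etcd.git where docker is available:
#   build_and_verify_image
```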
Will that constitute the "first step"? I can work on a PR which accomplishes the above. Is the next step etcd being deployed in K8s, running on nodes?
No need to push the image to any registry for now; I think we just need to build the binaries, build the image, and then verify the local image. Actually we already have a release workflow which builds images, so I would suggest adding an additional step or job to it. Please feel free to deliver a PR for this. Thanks.
Since Kubernetes is the most important user of etcd, running etcd as the K8s apiserver's backend makes more sense. But it seems not an easy job to automate ourselves. Just as I mentioned in #15139 (comment), K8s's CI verifies each etcd release, but it's post verification instead of ex-ante. It's also acceptable if we do this manually via K8s's CI, but it should be clearly documented.
@pchan any update?
This issue is a priority to me, and I expect us to add some sanity tests per my comment above (#15139 (comment)), and also add some cases to verify the image architecture (please see #15270).
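An architecture check along the lines of the #15270 discussion could be sketched as below. The expected-platform string and image tag are assumptions for illustration; this inspects the image that was actually pulled on the test host.

```shell
#!/usr/bin/env bash
# Hedged sketch of an image-architecture verification. The expected platform
# and image tag are assumptions, not prescribed by the issue.
set -u

check_image_arch() {
  local image="$1" expected="$2"   # expected, e.g. "linux/arm64"
  local actual

  # docker exposes the image's OS and CPU architecture via image inspect.
  actual="$(docker image inspect --format '{{.Os}}/{{.Architecture}}' "${image}")" || return 1
  [ "${actual}" = "${expected}" ]
}

# Usage, assuming the image was pulled for the platform under test:
#   check_image_arch gcr.io/etcd-development/etcd:v3.5.7 linux/arm64
```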
I will prioritize this and come up with a PR within this week. |
What would you like to be added?
Currently all the existing automatic test cases run etcd as a process instead of a container. But usually etcd runs as a POD/container (e.g. a static POD) in K8s. So it would be better to add an automatic test case that runs the etcd container in a K8s cluster.
At least, we should do some sanity testing for such a scenario. For example, all members should be able to communicate with each other via the K8s service name (e.g. etcd-1.etcd-headless-svc.svc.cluster.local). I am not sure whether we can use Kind to do this. The other alternative I am thinking of is to leverage the existing K8s pipeline to do such testing, but in that case we should clearly document the guide/steps, so that everyone can do it.
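A Kind-based sanity check for the service-DNS scenario might be sketched as follows. The pod, service, and namespace names mirror the example name above but are assumptions; so is the plain-HTTP client port.

```shell
#!/usr/bin/env bash
# Hedged sketch: from inside the cluster, resolve each member's headless-
# service DNS name and hit its health endpoint. Pod/service/namespace names
# and the client port are assumptions, not taken from the issue.
set -u

check_member_dns() {
  local pod="$1" svc="${2:-etcd-headless-svc}" ns="${3:-default}"
  local fqdn="${pod}.${svc}.${ns}.svc.cluster.local"

  # Exec into the member pod and query its own advertised DNS name.
  kubectl -n "${ns}" exec "${pod}" -- \
    /usr/local/bin/etcdctl --endpoints "http://${fqdn}:2379" endpoint health
}

# Usage against a Kind cluster running a 3-member etcd StatefulSet:
#   for p in etcd-0 etcd-1 etcd-2; do check_member_dns "$p"; done
```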
Why is this needed?
It can improve the quality of etcd releases.