Sonobuoy failing to test clusters with custom taints on some nodes. #507

Closed
mauilion opened this issue Jul 24, 2018 · 3 comments · Fixed by #509
Assignees
Labels
kind/bug Behavior isn't as expected or intended p1-important

Comments

@mauilion

What steps did you take and what happened:

I have a set of nodes that have the following taints:

terraform-az1-0
  effect:NoExecute key:node-role.kubernetes.io/etcd
  effect:NoSchedule key:node-role.kubernetes.io/etcd
terraform-az1-1
  effect:NoSchedule key:node-role.kubernetes.io/master
terraform-az1-2
terraform-az2-0
  effect:NoExecute key:node-role.kubernetes.io/etcd
  effect:NoSchedule key:node-role.kubernetes.io/etcd
terraform-az2-1
  effect:NoSchedule key:node-role.kubernetes.io/master
terraform-az2-2
terraform-az3-0
  effect:NoExecute key:node-role.kubernetes.io/etcd
  effect:NoSchedule key:node-role.kubernetes.io/etcd
terraform-az3-1
  effect:NoSchedule key:node-role.kubernetes.io/master
terraform-az3-2

When I run Sonobuoy against this cluster, the tests fail.
While debugging, I found that part of the problem is that the default tolerations do not let pods land on every node:

      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - key: CriticalAddonsOnly
        operator: Exists

I've tried to work around the issue by specifying a toleration of

  tolerations:
  - operator: "Exists"

in the sonobuoy.yaml file. An example is here: https://gist.github.com/mauilion/3e13dfbf649cc4e30fa56ea2fd258b12
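
To make the mismatch concrete, here is a minimal sketch of the taint/toleration matching rules as the Kubernetes docs describe them. It is not Sonobuoy or scheduler code; the function names (`tolerates`, `schedulable`) and the constants are made up for illustration, and the sketch treats every taint as blocking (it ignores `PreferNoSchedule`).

```python
# Illustrative sketch only: hypothetical helpers, not Sonobuoy's implementation.

def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")  # Equal is the default operator
    effect = toleration.get("effect", "")

    # An empty effect on the toleration matches every effect.
    if effect and effect != taint["effect"]:
        return False
    # An empty key (only valid with Exists) matches every key.
    if key and key != taint["key"]:
        return False
    if operator == "Exists":
        return True
    if not key:
        return False  # an empty key requires the Exists operator
    return toleration.get("value", "") == taint.get("value", "")


def schedulable(tolerations, taints):
    """A pod fits a node only if every taint is tolerated by some toleration."""
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in taints)


# The default tolerations quoted above.
DEFAULT_TOLERATIONS = [
    {"effect": "NoSchedule", "key": "node-role.kubernetes.io/master", "operator": "Exists"},
    {"key": "CriticalAddonsOnly", "operator": "Exists"},
]
# The catch-all workaround from sonobuoy.yaml.
CATCH_ALL = [{"operator": "Exists"}]

# Taints taken from the node listing above.
ETCD_TAINTS = [
    {"key": "node-role.kubernetes.io/etcd", "effect": "NoExecute"},
    {"key": "node-role.kubernetes.io/etcd", "effect": "NoSchedule"},
]
MASTER_TAINTS = [{"key": "node-role.kubernetes.io/master", "effect": "NoSchedule"}]

for name, taints in [("etcd", ETCD_TAINTS), ("master", MASTER_TAINTS), ("untainted", [])]:
    print(name, schedulable(DEFAULT_TOLERATIONS, taints), schedulable(CATCH_ALL, taints))
```

Under these rules the defaults cover the master and untainted nodes but not the etcd nodes, while a single keyless `Exists` toleration covers all of them.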

What did you expect to happen:
I expected either that the default tolerations would be very permissive, or that, when given an additional set of tolerations, Sonobuoy would merge them into the spec it deploys.
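
The merge behavior I have in mind could look roughly like this. This is only a sketch of the expectation, not how Sonobuoy builds its manifests; `merge_tolerations` and the variable names are hypothetical.

```python
# Illustrative sketch only: hypothetical helper, not Sonobuoy's implementation.

def merge_tolerations(defaults, extra):
    """Append user-supplied tolerations to the defaults, skipping exact duplicates."""
    merged = []
    seen = set()
    for tol in list(defaults) + list(extra):
        # Two tolerations are treated as identical when all their fields match.
        identity = tuple(sorted(tol.items()))
        if identity not in seen:
            seen.add(identity)
            merged.append(dict(tol))
    return merged


defaults = [
    {"effect": "NoSchedule", "key": "node-role.kubernetes.io/master", "operator": "Exists"},
    {"key": "CriticalAddonsOnly", "operator": "Exists"},
]
user_supplied = [{"operator": "Exists"}]

# The deployed spec would carry the defaults followed by the user's additions.
for tol in merge_tolerations(defaults, user_supplied):
    print(tol)
```

With this approach the defaults stay in place and the extra catch-all toleration is added after them, so supplying tolerations can only widen where the pods are allowed to run.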

Environment:

  • Sonobuoy version: v0.11.4

  • Kubernetes version: (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:05:37Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes installer & version:
    kubeadm 1.10.3

  • Cloud provider or hardware configuration:
    VMW

  • OS (e.g. from /etc/os-release):
    Container Linux

  • Sonobuoy tarball (which contains * below)

@timothysc self-assigned this Jul 24, 2018
@timothysc
Contributor

So right now we can fix our daemonset plugins pretty easily, but upstream tests are a different beast.

@mauilion do you have a list of tests that fail, or is it just the systemd-plugin?

@timothysc added the p1-important, kind/enhancement, and kind/bug labels and removed the kind/enhancement label Jul 24, 2018
@mauilion
Author

The logs from the failed tests are here:
https://k8s.work/201807241952_sonobuoy_05e0dca1-4cc8-4b04-b4a0-dc4369c7dc65.tar.gz

The output has improved in a recent version of Sonobuoy, because I can now grab results. I haven't parsed the output yet to see which tests failed or succeeded.

@timothysc
Contributor

Awesome, thanks @mauilion!

2 participants