[release-1.22] K3s CI failures due to tiller crashes #3843

Closed
brandond opened this issue Aug 13, 2021 · 3 comments

@brandond
Member

brandond commented Aug 13, 2021

K3s CI has been failing due to tiller crashes. When this happens, the helm_v2 ls command hangs, and CI eventually times out and fails the PR.

The upstream issue (helm/helm#4753) was closed because helm v2 (and tiller with it) is no longer supported. Since we can't drop helm v2 for another 2 minor Kubernetes versions, we need to do our best to keep it running until then.

Ref:

 tiller --listen=127.0.0.1:44134 --storage=secret
[main] 2021/08/12 22:49:15 Starting Tiller v2.17.0 (tls=false)
[main] 2021/08/12 22:49:15 GRPC listening on 127.0.0.1:44134
[main] 2021/08/12 22:49:15 Probes listening on :44135
[main] 2021/08/12 22:49:15 Storage driver is Secret
[main] 2021/08/12 22:49:15 Max history per release is 0
fatal error: concurrent map iteration and map write
++ helm_v2 ls --all '^traefik-crd$' --output json
++ jq -r '.Releases | length'

goroutine 30 [running]:
runtime.throw(0x1974ffd, 0x26)
        /usr/local/go/src/runtime/panic.go:1116 +0x72 fp=0xc000316bb8 sp=0xc000316b88 pc=0x434c92
runtime.mapiternext(0xc000316df8)
        /usr/local/go/src/runtime/map.go:853 +0x552 fp=0xc000316c38 sp=0xc000316bb8 pc=0x40f8c2
runtime.mapiterinit(0x172d320, 0xc000478a80, 0xc000316df8)
        /usr/local/go/src/runtime/map.go:843 +0x1c4 fp=0xc000316c58 sp=0xc000316c38 pc=0x40f274
k8s.io/helm/vendor/google.golang.org/grpc.(*Server).GetServiceInfo(0xc000803200, 0xd021d4ba)
        /go/src/k8s.io/helm/vendor/google.golang.org/grpc/server.go:452 +0x96 fp=0xc000316e68 sp=0xc000316c58 pc=0x8bfbf6
k8s.io/helm/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).InitializeMetrics(0xc000294000, 0xc000803200)
        /go/src/k8s.io/helm/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server_metrics.go:133 +0x43 fp=0xc000316f88 sp=0xc000316e68 pc=0x8dacc3
k8s.io/helm/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.Register(...)
        /go/src/k8s.io/helm/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:38
main.start.func2(0xc0000f0a80)
        /go/src/k8s.io/helm/cmd/tiller/tiller.go:238 +0x5b fp=0xc000316fd8 sp=0xc000316f88 pc=0x15be70b
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc000316fe0 sp=0xc000316fd8 pc=0x4676c1
created by main.start
        /go/src/k8s.io/helm/cmd/tiller/tiller.go:230 +0xa76

goroutine 1 [select]:
main.start()
        /go/src/k8s.io/helm/cmd/tiller/tiller.go:248 +0xb5d
main.main()
        /go/src/k8s.io/helm/cmd/tiller/tiller.go:125 +0x161

goroutine 13 [chan receive]:
k8s.io/helm/vendor/k8s.io/klog.(*loggingT).flushDaemon(0x2acb9e0)
        /go/src/k8s.io/helm/vendor/k8s.io/klog/klog.go:1018 +0x8b
created by k8s.io/helm/vendor/k8s.io/klog.init.0
        /go/src/k8s.io/helm/vendor/k8s.io/klog/klog.go:404 +0x6c

goroutine 29 [runnable]:
k8s.io/helm/vendor/google.golang.org/grpc.(*Server).Serve(0xc000803200, 0x1bd5140, 0xc0004ee100, 0xc0001212c0, 0xc0001212c0)
        /go/src/k8s.io/helm/vendor/google.golang.org/grpc/server.go:514
main.start.func1(0xc0001aa500, 0x1bd5140, 0xc0004ee100, 0xc0000f0a20)
        /go/src/k8s.io/helm/cmd/tiller/tiller.go:225 +0x120
created by main.start
        /go/src/k8s.io/helm/cmd/tiller/tiller.go:221 +0xa51
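
For readers unfamiliar with this class of crash: the trace shows goroutine 30 iterating a map inside grpc's GetServiceInfo (reached via go-grpc-prometheus Register) when the Go runtime detected a write to the same map. The writer is not visible in the trace; a plausible assumption is that it is work done on the other goroutine started by main.start. Below is a minimal Go sketch of the general pattern and one common remedy, guarding the map with a sync.RWMutex. The registry type and its methods are hypothetical stand-ins; this is not the gRPC Server API and not necessarily the fix that went into helm-controller.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for a server that keeps its
// registered services in a map. It is NOT the gRPC Server type.
type registry struct {
	mu       sync.RWMutex
	services map[string][]string // service name -> method names
}

func newRegistry() *registry {
	return &registry{services: make(map[string][]string)}
}

// Register writes to the map while holding the write lock.
func (r *registry) Register(name string, methods ...string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.services[name] = methods
}

// Info iterates the map while holding the read lock and returns a copy,
// so callers never touch the shared map directly.
func (r *registry) Info() map[string][]string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make(map[string][]string, len(r.services))
	for name, methods := range r.services {
		out[name] = append([]string(nil), methods...)
	}
	return out
}

func main() {
	r := newRegistry()
	var wg sync.WaitGroup
	wg.Add(2)

	// Writer goroutine: registers services, analogous to the assumed writer.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			r.Register(fmt.Sprintf("svc-%d", i), "List", "Get")
		}
	}()

	// Reader goroutine: iterates the map, analogous to the metrics setup.
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = r.Info()
		}
	}()

	wg.Wait()
	fmt.Println("registered services:", len(r.Info()))
}
```

Without the lock, the same two goroutines could trigger the "concurrent map iteration and map write" fatal error seen above. Under the same assumption, another option is to finish all registration before starting the goroutine that reads the map, which avoids the lock entirely.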
brandond added this to the v1.22.0+k3s1 milestone Aug 13, 2021
brandond self-assigned this Aug 13, 2021
brandond added a commit to brandond/helm-controller that referenced this issue Aug 13, 2021
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond added a commit to brandond/helm-controller that referenced this issue Aug 13, 2021
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond added a commit to k3s-io/helm-controller that referenced this issue Aug 13, 2021
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
brandond added a commit to k3s-io/helm-controller that referenced this issue Aug 13, 2021
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
fapatel1 modified the milestones: v1.22.0+k3s1, v1.22.2+k3s1 Aug 23, 2021
bmdepesa added the kind/dev-validation label (Dev will be validating this issue) Aug 24, 2021
@rancher-max
Contributor

Confirmed that rancher/klipper-helm:v0.6.5-build20210915 is the image used on master branch commit debb5086. That is a later version than needed for this fix, and it is validated as working in CI.

@brandond
Member Author

fwiw I'm still seeing some flakes due to the chart not deploying when tiller hangs, so I might need to take another pass at it. Haven't seen any user complaints about it though so it's low priority.

@brandond
Member Author

Reopening, not fixed by previous change.

brandond reopened this Oct 22, 2021
brandond changed the title from "K3s CI failures due to tiller crashes" to "[release-1.22] K3s CI failures due to tiller crashes" Oct 22, 2021
brandond modified the milestones: v1.22.2+k3s1, v1.22.3+k3s1 Oct 22, 2021