Operator crash if one container in pod not created properly #1104
Comments
Can you provide yaml to reproduce this?
If it's the latest snapshot then the panic seems to be at (have referenced a particular commit as that's what the snapshot currently is and master will move on)
Based on "Not creating container service for output-transformer" I take it there's an output-transformer in the graph?
Hi Clive and Ryan, thanks for the response. Below is one yaml file that crashes the operator. We have another case that also crashes the operator; that one also has two images (containers), and the error message is very similar (the error log is placed after the yaml, at the bottom). ================== yaml =====================
=================== error log for another SeldonDep ==================
Interesting, looks like the code flow goes through a line commented 'a user-supplied container may not be a pu so we may not create service for that' and then goes on to reference the service, which is naturally nil, so it blows up. Maybe it needs a continue there so that it skips that container and moves on to the next one. Will look into this a bit further.
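To illustrate the suspected failure mode and the proposed `continue`, here is a minimal sketch. It is not the operator's actual code: `Container`, `Service`, `createContainerService`, and `collectServices` are hypothetical stand-ins, and the assumption is only that service creation can return nil for a container that is not a predictive unit in the graph.

```go
package main

import "fmt"

// Service is a hypothetical stand-in for the Service object built per container.
type Service struct {
	Name string
	Port int
}

// Container is a hypothetical stand-in for a user-supplied pod container.
// InGraph marks whether the container is a predictive unit in the graph.
type Container struct {
	Name    string
	InGraph bool
}

// createContainerService returns nil for containers that are not in the graph,
// mirroring the "Not creating container service" case seen in the log.
func createContainerService(c Container, port int) *Service {
	if !c.InGraph {
		return nil
	}
	return &Service{Name: "svc-" + c.Name, Port: port}
}

// collectServices skips containers without a service instead of
// dereferencing a nil pointer.
func collectServices(containers []Container) []*Service {
	var out []*Service
	for i, c := range containers {
		svc := createContainerService(c, 9000+i)
		if svc == nil {
			fmt.Printf("Not creating container service for %s\n", c.Name)
			continue // without this guard, using svc below would panic
		}
		out = append(out, svc)
	}
	return out
}

func main() {
	containers := []Container{
		{Name: "classifier", InGraph: true},
		{Name: "output-transformer", InGraph: false},
	}
	for _, svc := range collectServices(containers) {
		fmt.Println(svc.Name, svc.Port)
	}
}
```

With the guard in place, the container outside the graph is logged and skipped, and only the graph container ends up with a service.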
@yufengshan I notice your output-transformer is not part of the seldon graph definition. Is that intentional? Are you calling the output-transformer directly from your model code? Just double-checking (either way what you report is a bug and we will be publishing a new snapshot with a fix).
@ryandawsonuk Thanks Ryan for the quick response. 👍 , looking forward to testing the new image. The transformer is directly called from the model code |
Component: Operator, Version: 0.5.1-SNAPSHOT
When deploying a SeldonDep, if one of the containers in the pod fails to be created, the operator panics and keeps restarting until the deployment is completely removed.
========================
Crash log looks like below:
2019-11-13T19:45:44.464Z INFO controllers.SeldonDeployment pSvcName {"seldondeployment": "prehac-mlflow-artifact/mlflow-binary-no-orchestrator", "val": "seldon-9545cbc497aba25c3cb921fc8df42d7f"}
2019-11-13T19:45:44.464Z INFO controllers.SeldonDeployment Not creating container service for output-transformer {"seldondeployment": "prehac-mlflow-artifact/mlflow-binary-no-orchestrator"}
E1113 19:45:44.464932 1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/runtime/runtime.go:76
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/runtime/runtime.go:65
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/panic.go:522
/usr/local/go/src/runtime/panic.go:82
/usr/local/go/src/runtime/signal_unix.go:390
/workspace/controllers/seldondeployment_controller.go:366
/workspace/controllers/seldondeployment_controller.go:1175
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.2.0/pkg/internal/controller/controller.go:216
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.2.0/pkg/internal/controller/controller.go:192
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.2.0/pkg/internal/controller/controller.go:171
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/wait/wait.go:152
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/wait/wait.go:153
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/wait/wait.go:88
/usr/local/go/src/runtime/asm_amd64.s:1337
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x120 pc=0x117bcf9]
goroutine 188 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/runtime/runtime.go:58 +0x105
panic(0x12ddce0, 0x2160910)
/usr/local/go/src/runtime/panic.go:522 +0x1b5
github.com/seldonio/seldon-core/operator/controllers.createComponents(0xc000284810, 0xc001695ba0, 0x165af00, 0xc001322760, 0x16, 0xc0003fb7a0, 0x1d)
/workspace/controllers/seldondeployment_controller.go:366 +0x799
github.com/seldonio/seldon-core/operator/controllers.(*SeldonDeploymentReconciler).Reconcile(0xc000284810, 0xc0003fb7c0, 0x16, 0xc0003fb7a0, 0x1d, 0x2174b40, 0x42bd21, 0x162f120, 0xc00164bd88)
/workspace/controllers/seldondeployment_controller.go:1175 +0x2e3
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00010a0a0, 0x1326c00, 0xc00000c380, 0x1326c00)
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.2.0/pkg/internal/controller/controller.go:216 +0x149
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00010a0a0, 0xc00149a600)
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.2.0/pkg/internal/controller/controller.go:192 +0xb5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc00010a0a0)