SeldonDeployment stuck on creating when an environment variable is a reference #1211
Comments
Do you add any extra Volumes to your Pod when it gets stuck, or is the env the only change? You could check the Seldon manager logs to see whether it thinks it needs to keep reconciling the Deployment, which could be why this is stuck.
Only the envs change, but for Jaeger we actually have to specify it twice: once in "spec.predictors.componentSpecs.spec.containers.env", and again in "spec.predictors.spec.svcOrchSpec.env". If the environment variable is specified in either of them in the aforementioned way, the SeldonDeployment gets stuck in Creating. I think you are right: I found an error log with the message "Reconcile Error":
The relevant error seems to be "Operation cannot be fulfilled on seldondeployments.machinelearning.seldon.io "simplemodelm": the object has been modified; please apply your changes to the latest version and try again"
I think that error is transitory and may not be the cause. But if you see it trying to create the Deployment multiple times, that could be the problem. In the past this has been due to defaults added by k8s, after which the Operator thinks the Deployment has changed. It could be the
Can you test by adding the
Sorry, maybe I misunderstood. I tried modifying the SeldonDeployment's apiVersion to v1 or machinelearning.seldon.io/v1, but I got the following error:
In our yaml it's set to "machinelearning.seldon.io/v1alpha2", or am I supposed to put apiVersion somewhere else as well? This is the whole yaml:
No. Here is an example I tested that works
We'll look into fixing the bug.
Yep, that fixed it, thanks!
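As a rough sketch of how the defaulting mentioned earlier in the thread can cause this: when a `fieldRef` is written without an `apiVersion`, Kubernetes fills in `apiVersion: v1` on the stored object, so the created Deployment no longer matches what was declared. Assuming that is the difference the Operator keeps detecting, stating the default explicitly in the env entry might look like the fragment below. The `status.hostIP` field path is an illustrative assumption and is not taken from the thread.

```yaml
# Sketch only: an env entry with the Kubernetes default written out explicitly.
# Assumption: the same entry is used in both places where the variable is set
# (the container env and the svcOrchSpec env).
env:
  - name: JAEGER_AGENT_HOST
    valueFrom:
      fieldRef:
        apiVersion: v1             # the value Kubernetes would otherwise default in
        fieldPath: status.hostIP   # assumed field path, for illustration
```

With the default spelled out, the declared spec and the object returned by the API server should agree on this field, which is the condition under which the Operator would stop seeing a change.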
We have a SeldonDeployment that we want to use with Jaeger (https://www.jaegertracing.io/), and that involves setting certain environment variables.
From what we see, our issue stems from environment variables that are references. Namely, we want to specify the "JAEGER_AGENT_HOST" variable, whose value should come from the Kubernetes status, like so:
However, if we do this, then the SeldonDeployment's status is stuck on Creating:
If we change the environment variables definition to:
then status changes to Available.
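For illustration, the two forms of the env entry could look roughly like the following. This is a sketch rather than the exact manifest: the `status.hostIP` field path (a common choice for reaching a per-node Jaeger agent) and the literal `jaeger-agent` host name are assumptions.

```yaml
# Form 1 (assumed): value resolved from the Pod's status via the downward API.
# With a reference like this, the SeldonDeployment stays in Creating.
env:
  - name: JAEGER_AGENT_HOST
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP   # assumed field path, for illustration
---
# Form 2 (assumed): plain literal value.
# With a literal like this, the status changes to Available.
env:
  - name: JAEGER_AGENT_HOST
    value: "jaeger-agent"          # hypothetical host name
```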
Our main problem is that the model service does not start at all while the status is stuck at Creating. The weird thing is that even though the SeldonDeployment's status is Creating, the underlying Deployment and pods start successfully.
We use seldon core operator version 0.4.0 and our model image starts from "seldonio/seldon-core-s2i-python3:0.13".
Any help on how to solve this would be appreciated.