Failing API still "updating" in cortex get #1680

Closed
@RobertLucian

Version

>= 0.23

Description

When an API fails to start (for example, because of an exception in the predictor's constructor), the deployment enters a restart loop (as described by the API's deployment spec), but the API's state doesn't change in cortex get (or in the Python client). The state stays stuck at updating instead of switching to error or another appropriate state.

Steps to reproduce

Take the iris classifier test example and add a raise statement to the predictor's constructor. Deploy it on either cloud provider (AWS or GCP). Then run cortex get and note that the API's state never changes from updating to error.
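A minimal sketch of such a failing predictor, assuming the usual Cortex Python predictor shape (a PythonPredictor class with a constructor that takes config and a predict method). The exception type and message are made up for illustration, and the real iris example's model-loading code is omitted:

```python
class PythonPredictor:
    def __init__(self, config):
        # Simulate a startup failure: the constructor never completes,
        # so the API container keeps crashing and restarting.
        raise RuntimeError("simulated failure while initializing the predictor")

    def predict(self, payload):
        # Never reached, because construction always fails.
        return "unreachable"
```

With a predictor like this deployed, the expectation is that cortex get eventually reports error rather than updating.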

Solution

When creating a stage 2 service with s6, if a service exits with a non-zero exit code, export that exit code to stage 3 before sending the kill signal to all other services, as in this example:

... redirfd -w 1 /var/run/s6/env-stage3/S6_STAGE2_EXITED s6-echo -n -- \${1} ...
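The idea behind that snippet can be illustrated outside of execline. The following is only a rough Python analogy, not the actual s6 or Cortex code: the function names are hypothetical, and the only detail taken from the issue is the S6_STAGE2_EXITED file name. One step records a non-zero exit code in a shared directory, and a later step reads it back so the failure is not lost:

```python
import os
import tempfile

# File name taken from the issue's snippet; everything else here is illustrative.
EXIT_CODE_FILE = "S6_STAGE2_EXITED"


def export_exit_code(code: int, env_dir: str) -> None:
    """Record a non-zero exit code (no trailing newline, like `s6-echo -n`)."""
    if code != 0:
        with open(os.path.join(env_dir, EXIT_CODE_FILE), "w") as f:
            f.write(str(code))


def read_exit_code(env_dir: str) -> int:
    """Return the recorded exit code, or 0 if nothing was exported."""
    try:
        with open(os.path.join(env_dir, EXIT_CODE_FILE)) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as env_dir:
        export_exit_code(1, env_dir)    # a service failed with exit code 1
        print(read_exit_code(env_dir))  # the later stage can still see the failure
```

The point of the design is that the final stage can exit with the failing service's code, so the container is reported as failed instead of appearing to restart cleanly.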

Labels

bug: Something isn't working