Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add /health/ping and /health/status endpoint to Python REST Microservices #1026

Merged
merged 12 commits into from
Nov 14, 2019

Conversation

kparaju
Copy link
Contributor

@kparaju kparaju commented Nov 1, 2019

This PR will add two endpoints to the python REST flask app.

/health/ping: quick check to make sure the flask app is alive
/health/status: more through check to make sure that the model is healthy

(I'm open to changing the url of these endpoints, these seemed to fit well)

There are two ways we are planning on using these endpoints:

  1. After we docker image is built, we will load it up and use /health/ping and /health/status to verify that the final artifact is good.
  2. Use it in k8s health checks to make sure that the model is up and predicting (using /health/status) and restart the pod if anything funky happens.
  3. In the future, it can also be added to the orchestrator so that clients can know the status and potentially use the Circuit Breaker pattern.

Tested by doing:

seldon-core-microservice MyModel REST --service-type MODEL --persistence 0 --workers=2

and

(base) ➜  ~ curl localhost:5000/health/ping
pong%
(base) ➜  ~ curl localhost:5000/health/status
{"data":{"names":[],"tensor":{"shape":[1],"values":[3]}},"meta":{}}
(base) ➜  ~

@seldondev
Copy link
Collaborator

Hi @kparaju. Thanks for your PR.

I'm waiting for a SeldonIO or seldonio member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kparaju kparaju marked this pull request as ready for review November 1, 2019 17:05
@axsaucedo
Copy link
Contributor

/ok-to-test

@axsaucedo
Copy link
Contributor

This is great @kparaju! Would be great to add something to the docs. I guess we can to this ourselves (unless you want to add something).

By the way, the linter seems to be failing - you can make sure it works by running make -C python setup_linter run_linter_check (details for this in https://github.com/SeldonIO/seldon-core/blob/master/CONTRIBUTING.md)

@kparaju
Copy link
Contributor Author

kparaju commented Nov 1, 2019

@axsaucedo I added some docs as well as ran the linter -- sorry about that.

@axsaucedo
Copy link
Contributor

/test integration

Copy link
Contributor

@axsaucedo axsaucedo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks solid! All tests pass so happy to land, however there's still the GRPC route missing, as currently the implementation is done in Python but not in GRPC. You can see this in this line:

prediction_pb2_grpc.add_SeldonServicer_to_server(seldon_model, server)

@ukclivecox
Copy link
Contributor

This looks solid! All tests pass so happy to land, however there's still the GRPC route missing, as currently the implementation is done in Python but not in GRPC. You can see this in this line:

prediction_pb2_grpc.add_SeldonServicer_to_server(seldon_model, server)

@axsaucedo I don't think we would change the gRPC as that would require updating the prediction protos - but maybe you didn'tmean this. Better would be the have a lightweight Flask running on a separate port that has just the health endpoints exposed on http?

@axsaucedo
Copy link
Contributor

This looks solid! All tests pass so happy to land, however there's still the GRPC route missing, as currently the implementation is done in Python but not in GRPC. You can see this in this line:

prediction_pb2_grpc.add_SeldonServicer_to_server(seldon_model, server)

@axsaucedo I don't think we would change the gRPC as that would require updating the prediction protos - but maybe you didn'tmean this. Better would be the have a lightweight Flask running on a separate port that has just the health endpoints exposed on http?

Oh yeah you are right actually. Even if the model is a GRPC one, we'd still want to have health endpoints! Ok happy with that then, will resolve request for review

@axsaucedo
Copy link
Contributor

/hold

@axsaucedo
Copy link
Contributor

Oh my bad, resolving the comment automatically approved it. @cliveseldon I've added a [slash] hold, to approve just do [slash] hold remove

@axsaucedo
Copy link
Contributor

/lgtm

@seldondev
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: axsaucedo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@seldondev
Copy link
Collaborator

New changes are detected. LGTM label has been removed.

@seldondev seldondev removed the lgtm label Nov 6, 2019
@kparaju
Copy link
Contributor Author

kparaju commented Nov 13, 2019

@cliveseldon @axsaucedo updated with documentation about how to use these problems in your SeldonDeployment YAML.

@ukclivecox
Copy link
Contributor

/hold cancel

@axsaucedo
Copy link
Contributor

/test integration

@seldondev
Copy link
Collaborator

@kparaju: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
integration 51ff940 link /test integration

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@axsaucedo axsaucedo merged commit d185edc into SeldonIO:master Nov 14, 2019
@axsaucedo axsaucedo removed their assignment May 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants