Support autoscaler in SeldonDeployment #277
I think this would require some changes to the SeldonDeployment specification as the /scale operation assumes a single location for "replicas" in the definition. At present "replicas" is defined on a per-predictor basis. One option is:
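The YAML snippet that originally followed this comment did not survive the page export. A minimal sketch of what a SeldonDeployment-wide `replicas` field could look like (the exact field placement and API version are assumptions based on the surrounding text, not the merged design):

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: my-model
spec:
  replicas: 3            # hypothetical deployment-wide default; the /scale subresource would target this
  predictors:
  - name: default
    # per-predictor replicas omitted here, so the spec-level value would apply
    graph:
      name: classifier
      type: MODEL
```

With a single well-known `replicas` path at the top of the spec, the CRD's `/scale` subresource has one location to read and write, which is what the Kubernetes HorizontalPodAutoscaler requires.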
This would allow you to specify per-predictor replicas as now but use a SeldonDeployment wide replicas if you wish and allow autoscaling to use this. Separately, people may wish to define replicas on a per PodTemplateSpec level inside each predictor. If we wanted to do this also we could:
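The snippet for this second option was also lost in export. One way to express a per-PodTemplateSpec replica count is via a metadata annotation on the component spec; the annotation key below is purely illustrative:

```yaml
spec:
  predictors:
  - name: default
    componentSpecs:
    - metadata:
        annotations:
          seldon.io/replicas: "2"   # hypothetical per-PodTemplateSpec override
      spec:
        containers:
        - name: classifier
          image: my-model:0.1
```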
The lowest-level replicas setting would take precedence, with the order ending at `spec.predictors[].componentSpecs[].metadata.annotations`. Feedback welcome.
I agree that auto-scaling support would add great value. What I am wondering: now that we have separate deployments, e.g. for the model images and the engine in the case of single-model serving and other basic constructs, would we want to autoscale those together? I admit it would be pretty hard to find a good solution for this (maybe by specifying relative replica ratios for the predictors of the SeldonDeployment?), so as a first step the solution proposed above would be fine.
Btw, could you please provide some background info, or maybe point to a doc, regarding why the Kubernetes deployment structure was changed so that the model container and the engine are now in separate K8s Deployments?
The current latest master versions have the ability to run the service orchestrator internal to the first predictor deployment or as a separate deployment. By default the latest code will use the same deployment as the first podTemplateSpec defined in your SeldonDeployment graph. This should cover most use cases and is best for latency. We need to update docs to add the annotation to allow this configuration option. |
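The annotation mentioned here appears to be the one controlling whether the service orchestrator (engine) runs in its own deployment. A sketch of how it might be set (the annotation key `seldon.io/engine-separate-pod` is my assumption about the configuration option referred to; check the docs once updated):

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: my-model
  annotations:
    seldon.io/engine-separate-pod: "true"   # assumed key: run the orchestrator as a separate deployment
spec:
  predictors:
  - name: default
    graph:
      name: classifier
      type: MODEL
```

Leaving the annotation unset would give the default described above: the orchestrator shares the deployment of the first podTemplateSpec, which avoids an extra network hop and is best for latency.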
PR #437 adds the ability to add HorizontalPodAutoscaler Specs for the defined PodTemplateSpecs. |
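A sketch of what attaching an HPA spec to a defined PodTemplateSpec could look like. The `hpaSpec` field name follows the wording of PR #437, but the exact schema shown (min/max replicas plus a CPU utilization metric, mirroring the Kubernetes `HorizontalPodAutoscalerSpec` of that era) is an assumption:

```yaml
spec:
  predictors:
  - name: default
    componentSpecs:
    - hpaSpec:
        minReplicas: 1
        maxReplicas: 4
        metrics:
        - type: Resource
          resource:
            name: cpu
            targetAverageUtilization: 70    # scale out when average CPU exceeds 70%
      spec:
        containers:
        - name: classifier
          image: my-model:0.1
```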
SeldonDeployment predictor supports replicas. It would be great if it could also support an autoscaler.