Allow v2 as protocol name #3906

Merged
merged 7 commits into from
Feb 10, 2022
4 changes: 2 additions & 2 deletions doc/source/analytics/explainers.md
@@ -82,13 +82,13 @@ If you were port forwarding to Ambassador on localhost:8003 then the API call wo
http://localhost:8003/seldon/seldon/income-explainer/default/api/v1.0/explain
```

The explain method is also supported for tensorflow and v2 kfserving protocols. The full list of endpoint URIs is:
The explain method is also supported for tensorflow and v2 protocols. The full list of endpoint URIs is:

| Protocol | URI |
| ------ | ----- |
| seldon | `http://<host>/<ingress-path>/api/v1.0/explain` |
| tensorflow | `http://<host>/<ingress-path>/v1/models/<model-name>:explain` |
| kfserving | `http://<host>/<ingress-path>/v2/models/<model-name>/infer` |
| v2 | `http://<host>/<ingress-path>/v2/models/<model-name>/infer` |


Note: for `tensorflow` protocol we support similar non-standard extension as for the [prediction API](../graph/protocols.md#rest-and-grpc-tensorflow-protocol), `http://<host>/<ingress-path>/v1/models/:explain`.
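
The URI table in this hunk can be read as a simple mapping from protocol name to path suffix. The Go sketch below illustrates that mapping only; `explainURI` is a hypothetical helper written for this example and is not part of Seldon Core, and the host and ingress path are taken from the Ambassador example earlier in the same file.

```go
package main

import (
	"fmt"
	"strings"
)

// explainURI is a hypothetical helper (not a Seldon Core API) that builds
// the explain endpoint for a protocol, following the URI table above.
func explainURI(host, ingressPath, protocol, modelName string) (string, error) {
	base := "http://" + host + "/" + strings.Trim(ingressPath, "/")
	switch protocol {
	case "seldon":
		return base + "/api/v1.0/explain", nil
	case "tensorflow":
		return base + "/v1/models/" + modelName + ":explain", nil
	case "v2":
		// Per the table, the v2 protocol uses the infer path for explanations.
		return base + "/v2/models/" + modelName + "/infer", nil
	default:
		return "", fmt.Errorf("unknown protocol %q", protocol)
	}
}

func main() {
	uri, err := explainURI("localhost:8003", "seldon/seldon/income-explainer/default", "seldon", "")
	if err != nil {
		panic(err)
	}
	fmt.Println(uri)
}
```

For the `seldon` protocol this reproduces the port-forwarding URL shown before the table; the model name is only used by the `tensorflow` and `v2` paths.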
2 changes: 1 addition & 1 deletion doc/source/analytics/logging.md
@@ -71,7 +71,7 @@ spec:
- name: LOGGER_KAFKA_TOPIC
value: seldon
replicas: 1
protocol: kfserving
protocol: v2

```

16 changes: 8 additions & 8 deletions doc/source/graph/protocols.md
@@ -6,7 +6,7 @@ Seldon Core supports the following data planes:

* [REST and gRPC Seldon protocol](#rest-and-grpc-seldon-protocol)
* [REST and gRPC Tensorflow Serving Protocol](#rest-and-grpc-tensorflow-protocol)
* [REST and gRPC V2 KFServing Protocol](#v2-kfserving-protocol)
* [REST and gRPC V2 Protocol](#v2-protocol)

## REST and gRPC Seldon Protocol

@@ -40,17 +40,17 @@ General considerations:
* The name of the model in the `graph` section of the SeldonDeployment spec must match the name of the model loaded onto the Tensorflow Server.


## V2 KFServing Protocol
## V2 Protocol

Seldon has collaborated with the [NVIDIA Triton Server
Project](https://github.com/triton-inference-server/server) and the [KFServing
Project](https://github.com/kubeflow/kfserving) to create a new ML inference
Project](https://github.com/triton-inference-server/server) and the [KServe
Project](https://github.com/kserve) to create a new ML inference
protocol.
The core idea behind this joint effort is that this new protocol will become
the standard inference protocol and will be used across multiple inference
services.

In Seldon Core, this protocol can be used by specifying `protocol: kfserving` on
In Seldon Core, this protocol can be used by specifying `protocol: v2` on
your `SeldonDeployment`.
For example,

@@ -61,7 +61,7 @@ metadata:
name: sklearn
spec:
name: iris-predict
protocol: kfserving
protocol: v2
predictors:
- graph:
children: []
@@ -75,7 +75,7 @@ spec:
name: default
```

At present, the `kfserving` protocol is only supported in a subset of
At present, the `v2` protocol is only supported in a subset of
pre-packaged inference servers.
In particular,

@@ -86,4 +86,4 @@ In particular,
| [XGBOOST_SERVER](../servers/xgboost.md) | ✅ | [Seldon MLServer](https://github.com/seldonio/mlserver) |
| [MLFLOW_SERVER](../servers/mlflow.md) | ✅ | [Seldon MLServer](https://github.com/seldonio/mlserver) |

You can try out the `kfserving` in [this example notebook](../examples/protocol_examples.html).
You can try out the `v2` in [this example notebook](../examples/protocol_examples.html).
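
To make the protocol section more concrete, here is a rough Go sketch of what a REST inference call against a `protocol: v2` deployment might look like. It only builds the request and does not send it. The host, the ingress path `seldon/seldon/sklearn`, the model name `classifier`, and the input tensor name `predict` are all assumptions chosen for illustration; the `inputs` / `name` / `shape` / `datatype` / `data` layout follows the V2 inference protocol.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// inferInput mirrors one input tensor of a V2 inference request.
type inferInput struct {
	Name     string    `json:"name"`
	Shape    []int     `json:"shape"`
	Datatype string    `json:"datatype"`
	Data     []float64 `json:"data"`
}

// inferRequest is the top-level V2 inference request body.
type inferRequest struct {
	Inputs []inferInput `json:"inputs"`
}

// buildInferRequest is a hypothetical helper: it assembles (but does not
// send) a POST to /v2/models/<model-name>/infer with a single FP32 row.
func buildInferRequest(host, ingressPath, modelName string, features []float64) (*http.Request, []byte, error) {
	body, err := json.Marshal(inferRequest{Inputs: []inferInput{{
		Name:     "predict", // assumed tensor name
		Shape:    []int{1, len(features)},
		Datatype: "FP32",
		Data:     features,
	}}})
	if err != nil {
		return nil, nil, err
	}
	url := "http://" + host + "/" + ingressPath + "/v2/models/" + modelName + "/infer"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, body, nil
}

func main() {
	req, body, err := buildInferRequest("localhost:8003", "seldon/seldon/sklearn", "classifier", []float64{5.1, 3.5, 1.4, 0.2})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(string(body))
}
```

Passing the returned request to `http.DefaultClient.Do` would perform the call against a running deployment.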
2 changes: 1 addition & 1 deletion doc/source/graph/svcorch.md
@@ -17,7 +17,7 @@ At present, we support the following protocols:
| --- | --- | --- | --- |
| Seldon | `seldon` | [OpenAPI spec for Seldon](https://docs.seldon.io/projects/seldon-core/en/latest/reference/apis/openapi.html) |
| Tensorflow | `tensorflow` | [REST API](https://www.tensorflow.org/tfx/serving/api_rest) and [gRPC API](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto) reference |
| KFServing | `kfserving` | [KFServing Dataplane reference](https://github.com/kubeflow/kfserving/tree/master/docs/predict-api/v2) |
| V2 | `v2` | [KServe Dataplane reference](https://github.com/kserve/kserve/tree/master/docs/predict-api/v2) |

These protocols are supported by some of our pre-packaged servers out of the
box.
2 changes: 1 addition & 1 deletion doc/source/python/python_component.md
@@ -464,7 +464,7 @@ class Model:
```

#### Validation
Output of developer-defined `metadata` method will be validated to follow the [kfserving dataplane proposal](https://github.com/kubeflow/kfserving/blob/master/docs/predict-api/v2/required_api.md#model-metadata) protocol, see [this](https://github.com/SeldonIO/seldon-core/issues/1638) GitHub issue for details:
Output of developer-defined `metadata` method will be validated to follow the [V2 dataplane proposal](https://github.com/kserve/kfserve/blob/master/docs/predict-api/v2/required_api.md#model-metadata) protocol, see [this](https://github.com/SeldonIO/seldon-core/issues/1638) GitHub issue for details:
```javascript
$metadata_model_response =
{
6 changes: 3 additions & 3 deletions doc/source/reference/apis/metadata.md
@@ -183,7 +183,7 @@ Example response:
```


## Deep dive: SeldonMessage and kfserving metadata reference
## Deep dive: SeldonMessage and kfserving V2 metadata reference

You can define inputs/outputs of your model metadata using one of two formats:
- `v1` format that closely correlates to the current structure of `SeldonMessage`
@@ -300,9 +300,9 @@ custom:
```


### kfserving TensorMetadata
### V2 TensorMetadata

You can easily define metadata for your models that is compatible with [kfserving dataplane proposal](https://github.com/kubeflow/kfserving/blob/master/docs/predict-api/v2/required_api.md#model-metadata) specification.
You can easily define metadata for your models that is compatible with [kfserving V2 dataplane proposal](https://github.com/kubeflow/kfserving/blob/master/docs/predict-api/v2/required_api.md#model-metadata) specification.
```javascript
$metadata_model_response =
{
2 changes: 1 addition & 1 deletion doc/source/reference/apis/prediction.md
@@ -117,7 +117,7 @@ message SeldonMessageMetadata
string messagetype = 1;
google.protobuf.Value schema = 2;

// KFserving tesnor metadata fields
// v2 tensor metadata fields
string name = 3;
string datatype = 4;
repeated int64 shape = 5;
16 changes: 5 additions & 11 deletions doc/source/servers/mlflow.md
@@ -85,16 +85,10 @@ notebook](../examples/server_examples.html#Serve-MLflow-Elasticnet-Wines-Model)
or check our [talk at the Spark + AI Summit
2019](https://www.youtube.com/watch?v=D6eSfd9w9eA).

## V2 KFServing protocol [Incubating]

.. Warning::
Support for the V2 KFServing protocol is still considered an incubating
feature.
This means that some parts of Seldon Core may still not be supported (e.g.
tracing, graphs, etc.).
## V2 protocol

The MLFlow server can also be used to expose an API compatible with the [V2
KFServing Protocol](../graph/protocols.md#v2-kfserving-protocol).
Protocol](../graph/protocols.md#v2-protocol).
Note that, under the hood, it will use the [Seldon
MLServer](https://github.com/SeldonIO/MLServer) runtime.

@@ -142,8 +136,8 @@ $ gsutil cp -r ../model gs://seldon-models/test/elasticnet_wine_<uuid>
```

- deploy the model to seldon-core
In order to enable support for the V2 KFServing protocol, it's enough to
specify the `protocol` of the `SeldonDeployment` to use `kfserving`.
In order to enable support for the V2 protocol, it's enough to
specify the `protocol` of the `SeldonDeployment` to use `v2`.
For example,

```yaml
@@ -152,7 +146,7 @@ kind: SeldonDeployment
metadata:
name: mlflow
spec:
protocol: kfserving # Activate the v2 protocol
protocol: v2 # Activate the v2 protocol
name: wines
predictors:
- graph:
16 changes: 5 additions & 11 deletions doc/source/servers/sklearn.md
@@ -82,21 +82,15 @@ Acceptable values for the `method` parameter are `predict`, `predict_proba`,
`decision_function`.


## V2 KFServing protocol [Incubating]

.. Warning::
Support for the V2 KFServing protocol is still considered an incubating
feature.
This means that some parts of Seldon Core may still not be supported (e.g.
tracing, graphs, etc.).
## V2 protocol

The SKLearn server can also be used to expose an API compatible with the [V2
KFServing Protocol](../graph/protocols.md#v2-kfserving-protocol).
V2 Protocol](../graph/protocols.md#v2-protocol).
Note that, under the hood, it will use the [Seldon
MLServer](https://github.com/SeldonIO/MLServer) runtime.

In order to enable support for the V2 KFServing protocol, it's enough to
specify the `protocol` of the `SeldonDeployment` to use `kfserving`.
In order to enable support for the V2 protocol, it's enough to
specify the `protocol` of the `SeldonDeployment` to use `v2`.
For example,

```yaml
@@ -106,7 +100,7 @@ metadata:
name: sklearn
spec:
name: iris-predict
protocol: kfserving # Activate the V2 protocol
protocol: v2 # Activate the V2 protocol
predictors:
- graph:
children: []
36 changes: 0 additions & 36 deletions doc/source/servers/tempo.md
@@ -4,39 +4,3 @@

For more details see the [Tempo documentation](https://tempo.readthedocs.io/en/latest/).

An example Tempo model yaml for Seldon Core is shown below:

```python
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
annotations:
seldon.io/tempo-description: ''
seldon.io/tempo-model: '{"model_details": {"name": "numpyro-divorce", "local_folder":
"/home/clive/work/mlops/fork-tempo/docs/examples/custom-model/artifacts", "uri":
"s3://tempo/divorce", "platform": "custom", "inputs": {"args": [{"ty": "numpy.ndarray",
"name": "marriage"}, {"ty": "numpy.ndarray", "name": "age"}]}, "outputs": {"args":
[{"ty": "numpy.ndarray", "name": null}]}, "description": ""}, "protocol": "tempo.kfserving.protocol.KFServingV2Protocol",
"runtime_options": {"runtime": "tempo.seldon.SeldonKubernetesRuntime", "docker_options":
{"defaultRuntime": "tempo.seldon.SeldonDockerRuntime"}, "k8s_options": {"replicas":
1, "minReplicas": null, "maxReplicas": null, "authSecretName": "minio-secret",
"serviceAccountName": null, "defaultRuntime": "tempo.seldon.SeldonKubernetesRuntime",
"namespace": "production"}, "ingress_options": {"ingress": "tempo.ingress.istio.IstioIngress",
"ssl": false, "verify_ssl": true}}}'
labels:
seldon.io/tempo: 'true'
name: numpyro-divorce
namespace: production
spec:
predictors:
- graph:
envSecretRefName: minio-secret
implementation: TEMPO_SERVER
modelUri: s3://tempo/divorce
name: numpyro-divorce
serviceAccountName: tempo-pipeline
type: MODEL
name: default
replicas: 1
protocol: kfserving
```
2 changes: 1 addition & 1 deletion doc/source/servers/triton.md
@@ -12,7 +12,7 @@ kind: SeldonDeployment
metadata:
name: triton
spec:
protocol: kfserving
protocol: v2
predictors:
- graph:
implementation: TRITON_SERVER
16 changes: 5 additions & 11 deletions doc/source/servers/xgboost.md
@@ -46,21 +46,15 @@ spec:
You can try out a [worked notebook](../examples/server_examples.html) with a
similar example.

## V2 KFServing protocol [Incubating]

.. Warning::
Support for the V2 KFServing protocol is still considered an incubating
feature.
This means that some parts of Seldon Core may still not be supported (e.g.
tracing, graphs, etc.).
## V2 protocol

The XGBoost server can also be used to expose an API compatible with the [V2
KFServing protocol](../graph/protocols.md#v2-kfserving-protocol).
protocol](../graph/protocols.md#v2-protocol).
Note that, under the hood, it will use the [Seldon
MLServer](https://github.com/SeldonIO/MLServer) runtime.

In order to enable support for the V2 KFServing protocol, it's enough to
specify the `protocol` of the `SeldonDeployment` to use `kfserving`.
In order to enable support for the V2 protocol, it's enough to
specify the `protocol` of the `SeldonDeployment` to use `v2`.
For example,

```yaml
@@ -70,7 +64,7 @@ metadata:
name: xgboost
spec:
name: iris
protocol: kfserving # Activate the V2 protocol
protocol: v2 # Activate the V2 protocol
predictors:
- graph:
children: []
1 change: 1 addition & 0 deletions executor/api/constants.go
@@ -2,6 +2,7 @@ package api

const ProtocolSeldon = "seldon"
const ProtocolTensorflow = "tensorflow"
const ProtocolV2 = "v2"
const ProtocolKFServing = "kfserving"

const TransportRest = "rest"
4 changes: 2 additions & 2 deletions executor/api/rest/client.go
@@ -263,7 +263,7 @@ func (smc *JSONRestClient) modifyMethod(method string, modelName string) string
case client.SeldonMetadataPath:
return "/v1/models/" + modelName + "/metadata"
}
case api.ProtocolKFServing:
case api.ProtocolV2, api.ProtocolKFServing:
switch method {
case client.SeldonPredictPath, client.SeldonTransformInputPath, client.SeldonTransformOutputPath:
return "/v2/models/" + modelName + "/infer"
@@ -346,7 +346,7 @@ func (smc *JSONRestClient) Chain(ctx context.Context, modelName string, msg payl
return msg, nil
case api.ProtocolTensorflow: // Attempt to chain tensorflow payload
return ChainTensorflow(msg)
case api.ProtocolKFServing:
case api.ProtocolV2, api.ProtocolKFServing:
return ChainKFserving(msg)
}
return nil, errors.Errorf("Unknown protocol %s", smc.Protocol)
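
The change in this file relies on Go's multi-value `case` clause so that `v2` and the older `kfserving` name take the same branch. The standalone sketch below isolates that pattern. The constant names match `executor/api/constants.go`, but `inferPath` is a simplified stand-in written for this example, not the executor's `modifyMethod`.

```go
package main

import "fmt"

// Protocol names as declared in executor/api/constants.go.
const (
	ProtocolSeldon     = "seldon"
	ProtocolTensorflow = "tensorflow"
	ProtocolV2         = "v2"
	ProtocolKFServing  = "kfserving"
)

// inferPath is a hypothetical, simplified helper: it returns the V2 infer
// path when the protocol is either "v2" or its legacy alias "kfserving".
func inferPath(protocol, modelName string) (string, error) {
	switch protocol {
	case ProtocolV2, ProtocolKFServing:
		// One clause, two values: both names route to the same path.
		return "/v2/models/" + modelName + "/infer", nil
	}
	return "", fmt.Errorf("no V2 inference path for protocol %q", protocol)
}

func main() {
	for _, p := range []string{ProtocolV2, ProtocolKFServing, ProtocolSeldon} {
		path, err := inferPath(p, "iris")
		fmt.Println(p, path, err)
	}
}
```

Keeping the old value in the same clause is what lets existing `protocol: kfserving` deployments continue to work after the rename.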
2 changes: 1 addition & 1 deletion executor/api/rest/server.go
@@ -178,7 +178,7 @@ func (r *SeldonRestApi) Initialise() {
r.Router.NewRoute().Path("/v1/models/{"+ModelHttpPathVariable+"}/metadata").Methods("GET", "OPTIONS").HandlerFunc(r.wrapMetrics(metric.MetadataHttpServiceName, r.metadata))
// Enabling for standard seldon core feedback API endpoint with standard schema
r.Router.NewRoute().Path("/api/v1.0/feedback").Methods("OPTIONS", "POST").HandlerFunc(r.wrapMetrics(metric.FeedbackHttpServiceName, r.feedback))
case api.ProtocolKFServing:
case api.ProtocolV2, api.ProtocolKFServing:
r.Router.NewRoute().Path("/v2/models/{"+ModelHttpPathVariable+"}/infer").Methods("OPTIONS", "POST").HandlerFunc(r.wrapMetrics(metric.PredictionHttpServiceName, r.predictions))
r.Router.NewRoute().Path("/v2/models/infer").Methods("OPTIONS", "POST").HandlerFunc(r.wrapMetrics(metric.PredictionHttpServiceName, r.predictions)) // Nonstandard path - Seldon extension
r.Router.NewRoute().Path("/v2/models/{"+ModelHttpPathVariable+"}/ready").Methods("GET", "OPTIONS").HandlerFunc(r.wrapMetrics(metric.StatusHttpServiceName, r.status))
8 changes: 4 additions & 4 deletions executor/cmd/executor/main.go
@@ -152,7 +152,7 @@ func runGrpcServer(wg *sync.WaitGroup, shutdown chan bool, lis net.Listener, log
tensorflowGrpcServer := tensorflow.NewGrpcTensorflowServer(predictor, client, serverUrl, namespace)
serving.RegisterPredictionServiceServer(grpcServer, tensorflowGrpcServer)
serving.RegisterModelServiceServer(grpcServer, tensorflowGrpcServer)
case api.ProtocolKFServing:
case api.ProtocolV2, api.ProtocolKFServing:
kfservingGrpcServer := kfserving.NewGrpcKFServingServer(predictor, client, serverUrl, namespace)
kfproto.RegisterGRPCInferenceServiceServer(grpcServer, kfservingGrpcServer)
}
@@ -247,8 +247,8 @@ func main() {
log.Fatal("Required argument predictor missing")
}

if !(*protocol == api.ProtocolSeldon || *protocol == api.ProtocolTensorflow || *protocol == api.ProtocolKFServing) {
log.Fatal("Protocol must be seldon, tensorflow or kfserving")
if !(*protocol == api.ProtocolSeldon || *protocol == api.ProtocolTensorflow || *protocol == api.ProtocolV2 || *protocol == api.ProtocolKFServing) {
log.Fatal("Protocol must be seldon, tensorflow or v2")
}

if *serverType == "kafka" {
@@ -384,7 +384,7 @@ func main() {
clientGrpc = seldon.NewSeldonGrpcClient(predictor, *sdepName, annotations)
case api.ProtocolTensorflow:
clientGrpc = tensorflow.NewTensorflowGrpcClient(predictor, *sdepName, annotations)
case api.ProtocolKFServing:
case api.ProtocolV2, api.ProtocolKFServing:
clientGrpc = kfserving.NewKFServingGrpcClient(predictor, *sdepName, annotations)
default:
log.Fatalf("Failed to create grpc client. Unknown protocol %s: %v", *protocol, err)
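
The startup check in this file accepts four protocol values while the fatal message names three of them. One possible way to keep the accepted set and the message in sync is to derive both from a single table, as in the sketch below. `acceptedProtocols` and `validateProtocol` are hypothetical names for this illustration and do not exist in the executor.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// acceptedProtocols is a hypothetical lookup table of valid protocol names,
// including the legacy "kfserving" alias for "v2".
var acceptedProtocols = map[string]bool{
	"seldon":     true,
	"tensorflow": true,
	"v2":         true,
	"kfserving":  true,
}

// validateProtocol returns nil for an accepted protocol, otherwise an error
// whose message lists every accepted value in sorted order.
func validateProtocol(p string) error {
	if acceptedProtocols[p] {
		return nil
	}
	names := make([]string, 0, len(acceptedProtocols))
	for n := range acceptedProtocols {
		names = append(names, n)
	}
	sort.Strings(names) // map iteration order is random, so sort for a stable message
	return fmt.Errorf("protocol must be one of %s, got %q", strings.Join(names, ", "), p)
}

func main() {
	fmt.Println(validateProtocol("v2"))
	fmt.Println(validateProtocol("grpc"))
}
```

Whether the legacy alias should appear in the user-facing message is a design choice; the PR's message omits it, which steers new users toward `v2`.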
@@ -3,7 +3,7 @@ kind: SeldonDeployment
metadata:
name: triton
spec:
protocol: kfserving
protocol: v2
predictors:
- graph:
children: []
10 changes: 5 additions & 5 deletions helm-charts/seldon-core-operator/values.yaml
@@ -119,15 +119,15 @@ predictor_servers:
seldon:
defaultImageVersion: "1.13.0-dev"
image: seldonio/mlflowserver
kfserving:
v2:
defaultImageVersion: "1.0.0.rc2-mlflow"
image: seldonio/mlserver
SKLEARN_SERVER:
protocols:
seldon:
defaultImageVersion: "1.13.0-dev"
image: seldonio/sklearnserver
kfserving:
v2:
defaultImageVersion: "1.0.0.rc2-sklearn"
image: seldonio/mlserver
TENSORFLOW_SERVER:
@@ -143,17 +143,17 @@ predictor_servers:
seldon:
defaultImageVersion: "1.13.0-dev"
image: seldonio/xgboostserver
kfserving:
v2:
defaultImageVersion: "1.0.0.rc2-xgboost"
image: seldonio/mlserver
TRITON_SERVER:
protocols:
kfserving:
v2:
defaultImageVersion: "21.08-py3"
image: nvcr.io/nvidia/tritonserver
TEMPO_SERVER:
protocols:
kfserving:
v2:
defaultImageVersion: "1.0.0.rc2-slim"
image: seldonio/mlserver
