Predict protocol V2 proposal (kubeflow#709)
* Predict protocol V2 proposal

* Add clarifying statement about use of platform

* Model version should be specified as a string

* Add endorsements

* Fix typo

* Update docs/predict-api/v2/required_api.md

Co-Authored-By: Animesh Singh <singhan@us.ibm.com>

* Update required_api.md

Co-authored-by: Animesh Singh <singhan@us.ibm.com>
deadeyegoodwin and animeshsingh authored Mar 9, 2020
1 parent 3eaf7c6 commit 073f1bf
Showing 3 changed files with 857 additions and 4 deletions.
12 changes: 8 additions & 4 deletions docs/README.md
@@ -15,14 +15,18 @@ The InferenceService Data Plane architecture consists of a static graph of compo
**Transformer**: The transformer enables users to define a pre and post processing step before the prediction and explanation workflows. Like the explainer, it is configured with relevant environment variables too. For common use cases, KFServing provides out-of-the-box transformers like Feast.

# Data Plane (V1)
KFServing has a standardized prediction workflow across all model frameworks.

Note: We are actively developing a V2 data plane protocol to improve performance (i.e. GRPC).
KFServing has a standardized prediction workflow across all model frameworks.

## Predict
All InferenceServices speak the Tensorflow V1 HTTP API: https://www.tensorflow.org/tfx/serving/api_rest#predict_api.
All InferenceServices speak the Tensorflow V1 HTTP API: https://www.tensorflow.org/tfx/serving/api_rest#predict_api.

Note: Only Tensorflow models support the fields "signature_name" and "inputs".

## Explain
All InferenceServices that are deployed with an Explainer support a standardized explanation API. This interface is identical to the Tensorflow V1 HTTP API with the addition of an ":explain" verb.
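
As an illustration of the V1 workflow described above, the following is a minimal sketch of how a client might build the request URL and JSON body for the predict and explain verbs. The URL shape and the row-oriented "instances" field follow the TensorFlow Serving REST API linked above; the host, the model name, and the helper names `v1_url` and `v1_body` are hypothetical and not part of KFServing.

```python
import json


def v1_url(host: str, model_name: str, verb: str = "predict") -> str:
    """Build a V1 data-plane URL such as http://<host>/v1/models/<name>:predict.

    `host` and `model_name` are placeholders chosen by the caller.
    """
    if verb not in ("predict", "explain"):
        raise ValueError(f"unsupported verb: {verb}")
    return f"http://{host}/v1/models/{model_name}:{verb}"


def v1_body(instances: list) -> str:
    """Serialize a row-oriented request body.

    Only "instances" is used here, since "signature_name" and "inputs"
    are supported by TensorFlow models only.
    """
    return json.dumps({"instances": instances})


if __name__ == "__main__":
    # Hypothetical host and model name, for illustration only.
    print(v1_url("localhost:8080", "my-model"))
    print(v1_url("localhost:8080", "my-model", verb="explain"))
    print(v1_body([[1.0, 2.0, 3.0]]))
```

The resulting body would be sent with an HTTP POST to the built URL. The explain URL only makes sense for an InferenceService deployed with an Explainer.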

# Data Plane (V2)
The second version of the data-plane protocol addresses several issues found with the V1 data-plane protocol, including performance and generality across a large number of model frameworks and servers.

## Predict
The V2 protocol proposes both HTTP/REST and GRPC APIs. See the [complete proposal](/docs/predict-api/v2) for more information.
19 changes: 19 additions & 0 deletions docs/predict-api/v2/README.md
@@ -0,0 +1,19 @@
# Predict Protocol - Version 2

The *Predict Protocol, version 2* is a set of HTTP/REST and GRPC APIs
for inference / prediction servers. By implementing this protocol both
inference clients and servers will increase their utility and
portability by being able to operate seamlessly on platforms that have
standardized around this protocol.

The protocol is composed of a required set of APIs that must be
implemented by a compliant server. This required set of APIs is
described in [required_api.md](./required_api.md).

The protocol supports an extension mechanism as a required part of the
API, but no specific extensions are required to be implemented by a
compliant server.

The protocol is not yet finalized and so feedback is welcome. To
provide feedback open an issue and prepend the title with "[Predict
Protocol V2]".
