Updated docs based on comments
rszper committed Jul 12, 2022
1 parent a86ba96 commit 91038b3
Showing 2 changed files with 7 additions and 6 deletions.
@@ -22,7 +22,7 @@ You can use Apache Beam with the RunInference API to use machine learning (ML) m

## Why use the RunInference API?

- RunInference leverages existing Apache Beam concepts, such as the `BatchElements` transform and the `Shared` class, and it allows you to build multi-model pipelines. In addition, the RunInference API allows you to find the input that determined the prediction without returning to the full input data.
+ RunInference leverages existing Apache Beam concepts, such as the `BatchElements` transform and the `Shared` class, and it allows you to build multi-model pipelines. In addition, the RunInference API has built-in capabilities for dealing with [keyed values](#use-the-prediction-results-object).
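For example, the keyed-values support means `RunInference` can consume `(key, example)` tuples when the model handler is wrapped in `KeyedModelHandler` from `apache_beam.ml.inference.base`. A minimal sketch, assuming a pickled scikit-learn model at a hypothetical path:

```python
import apache_beam as beam
import numpy as np
from apache_beam.ml.inference.base import KeyedModelHandler, RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Wrapping the handler makes RunInference accept (key, example) tuples
# and emit (key, PredictionResult) tuples.
keyed_handler = KeyedModelHandler(
    SklearnModelHandlerNumpy(model_uri='gs://my-bucket/model.pkl'))  # hypothetical path

with beam.Pipeline() as pipeline:
    predictions = (
        pipeline
        | 'Create' >> beam.Create([('example-1', np.array([1.0, 2.0]))])
        | 'Inference' >> RunInference(keyed_handler))
```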

### BatchElements PTransform

@@ -34,12 +34,12 @@ For more information, see the [`BatchElements` transform documentation](https://
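As a rough illustration of the batching behavior (the batch-size bounds here are arbitrary), `BatchElements` turns a `PCollection` of elements into a `PCollection` of lists:

```python
import apache_beam as beam

with beam.Pipeline() as pipeline:
    batches = (
        pipeline
        | 'Create' >> beam.Create(range(100))
        # Each output element is a list of roughly min_batch_size to
        # max_batch_size inputs; the final batch can be smaller.
        | 'Batch' >> beam.BatchElements(min_batch_size=8, max_batch_size=32)
        | 'Print' >> beam.Map(print))
```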

### Shared helper class

- Instead of loading a model for each thread in a worker, we use the `Shared` class, which allows us to load one model that is shared across all threads of each worker in a DoFn. For more information, see the
+ Instead of loading a model for each thread in the process, we use the `Shared` class, which allows us to load one model that is shared across all threads of each worker in a DoFn. For more information, see the
[`Shared` class documentation](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/shared.py#L20).
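Outside of RunInference, the pattern looks roughly like the following sketch, where `load_model` is a hypothetical loader for your framework:

```python
import apache_beam as beam
from apache_beam.utils import shared

class PredictDoFn(beam.DoFn):
    def __init__(self, shared_handle):
        self._shared_handle = shared_handle

    def setup(self):
        # acquire() calls load_model once and caches the result, so every
        # thread in the worker process reuses the same model object.
        self._model = self._shared_handle.acquire(load_model)

    def process(self, element):
        yield self._model.predict(element)

# Usage: predictions = data | beam.ParDo(PredictDoFn(shared.Shared()))
```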

### Multi-model pipelines

- The RunInference API allows you to build complex multi-model pipelines with minimum effort. Multi-model pipelines are useful for A/B testing and for building out ensembles for tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, language detection, coreference resolution, and more.
+ The RunInference API can be composed into multi-model pipelines. Multi-model pipelines are useful for A/B testing and for building out ensembles for tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, language detection, coreference resolution, and more.

### Prediction results
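Each output of RunInference is a `PredictionResult` that pairs the input example with its inference, so a prediction can be traced back to the input that produced it. A small sketch of consuming one:

```python
from apache_beam.ml.inference.base import PredictionResult

def format_prediction(result: PredictionResult) -> str:
    # result.example is the model input; result.inference is the model output.
    return f'{result.example} -> {result.inference}'

# Usage: predictions | beam.Map(format_prediction)
```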

@@ -99,7 +99,7 @@ with pipeline as p:
with pipeline as p:
   data = p | 'Read' >> beam.ReadFromSource('a_source')
   model_a_predictions = data | RunInference(ModelHandlerA)
-  model_b_predictions = data | RunInference(ModelHandlerB)
+  model_b_predictions = model_a_predictions | beam.Map(some_post_processing) | RunInference(ModelHandlerB)
```
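For A/B testing, a similar pipeline can instead route disjoint slices of the input to each model. A rough sketch using `beam.Partition`, reusing `data` and the model handlers from the example above (the hash-based 50/50 split is arbitrary):

```python
# Split the input into two buckets, then run a different model on each.
partitions = data | 'Split' >> beam.Partition(
    lambda element, n: hash(element) % n, 2)
model_a_predictions = partitions[0] | 'RunA' >> RunInference(ModelHandlerA)
model_b_predictions = partitions[1] | 'RunB' >> RunInference(ModelHandlerB)
```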

### Use a key handler
@@ -182,4 +182,5 @@ the same size. Depending on the language model and encoding technique, this opti
## Related links

* [RunInference transforms](/documentation/transforms/python/elementwise/runinference)
- * [RunInference API pipeline examples](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/inference)
+ * [RunInference API pipeline examples](https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/inference)
+ * [apache_beam.ml.inference package](/releases/pydoc/current/apache_beam.ml.inference.html#apache_beam.ml.inference.RunInference)
website/www/site/content/en/documentation/sdks/python.md (2 changes: 1 addition and 1 deletion)
@@ -48,7 +48,7 @@ language-specific implementation guidance.

## Using Beam Python SDK in your ML pipelines

- To use the Beam Python SDK with your machine learning pipelines, you can either use the RunInference API or TensorFlow.
+ To use the Beam Python SDK with your machine learning pipelines, use the RunInference API for PyTorch and scikit-learn models. If you use a TensorFlow model, you can use the library from `tfx_bsl`. Further TensorFlow integrations are planned.
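The `tfx_bsl` flavor looks roughly like the following sketch, assuming a TensorFlow SavedModel at a hypothetical path and a placeholder collection of `tf.train.Example` protos:

```python
import apache_beam as beam
from tfx_bsl.public.beam import RunInference
from tfx_bsl.public.proto import model_spec_pb2

inference_spec = model_spec_pb2.InferenceSpecType(
    saved_model_spec=model_spec_pb2.SavedModelSpec(
        model_path='gs://my-bucket/saved_model'))  # hypothetical path

with beam.Pipeline() as pipeline:
    predictions = (
        pipeline
        | 'Read' >> beam.Create(examples)  # placeholder: tf.train.Example protos
        | 'Inference' >> RunInference(inference_spec))
```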

You can create multiple types of transforms using the RunInference API: the API takes multiple types of setup parameters from model handlers, and the parameter type determines the model implementation. For more information,
see [Machine Learning](/documentation/sdks/python-machine-learning).
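For instance, the handler you construct is what selects the framework, not `RunInference` itself. The following sketch uses the scikit-learn and PyTorch handlers with hypothetical model paths and a hypothetical model class:

```python
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# The handler's parameter types determine the model implementation that runs.
sklearn_inference = RunInference(
    SklearnModelHandlerNumpy(model_uri='gs://my-bucket/model.pkl'))
pytorch_inference = RunInference(
    PytorchModelHandlerTensor(
        state_dict_path='gs://my-bucket/model.pt',
        model_class=MyTorchModel,          # hypothetical torch.nn.Module subclass
        model_params={'input_dim': 10}))   # hypothetical constructor kwargs
```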