
Commit

Apply suggestions
michaelbenayoun committed Dec 12, 2022
1 parent 147caea commit a4c180e
Showing 2 changed files with 6 additions and 6 deletions.
docs/source/onnxruntime/usage_guides/models.mdx (4 changes: 2 additions & 2 deletions)
@@ -70,10 +70,10 @@ Sequence-to-sequence (Seq2Seq) models can also be used when running inference wi
are exported to the ONNX format, they are decomposed into three parts that are later combined during inference:
- The encoder part of the model
- The decoder part of the model + the language modeling head
-- The same decoder part of the + language modeling head but taking and using pre-computed key / values as inputs and
+- The same decoder part of the model + language modeling head but taking and using pre-computed key / values as inputs and
outputs. This makes inference faster.

-Here is an example on how you can load a T5 model to the ONNX format and run inference for a translation task:
+Here is an example of how you can load a T5 model to the ONNX format and run inference for a translation task:


```python
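# The body of this code block is collapsed in the diff above; what follows is only a
# minimal sketch of how such a T5 translation example typically looks with
# optimum.onnxruntime, not the snippet from the commit. It assumes the
# ORTModelForSeq2SeqLM class and the Transformers pipeline API; from_transformers=True
# exports the checkpoint to ONNX at load time (newer Optimum releases use export=True).
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSeq2SeqLM

model_id = "t5-small"  # hypothetical checkpoint; any exportable Seq2Seq model works similarly
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = ORTModelForSeq2SeqLM.from_pretrained(model_id, from_transformers=True)

# The three exported parts (encoder, decoder, decoder reusing pre-computed key/values)
# are wired together by the ORT model, so it can be used like a regular Transformers model.
onnx_translation = pipeline("translation_en_to_fr", model=model, tokenizer=tokenizer)
print(onnx_translation("He never went out without a book under his arm."))
```
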
docs/source/onnxruntime/usage_guides/pipelines.mdx (8 changes: 4 additions & 4 deletions)
@@ -56,10 +56,10 @@ There are tags on the Model Hub that allow you to filter for a model you'd like
<Tip>

To be able to load the model with the ONNX Runtime backend, the export to ONNX needs
-to be supported for the architecture considered.
+to be supported for the considered architecture.

-You can check the list of the supported architectures
-[here](/exporters/onnx/package_reference/configuration#Supported_architectures).
+You can check the list of supported architectures
+[here](/exporters/onnx/package_reference/configuration#Supported-architectures).

</Tip>
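For context around this tip (not part of the commit's diff), loading a Hub checkpoint with the ONNX Runtime backend through a pipeline can look roughly like the sketch below. It assumes `optimum.pipelines.pipeline` with `accelerator="ort"` and a hypothetical text-classification checkpoint; the exact call in the guide may differ.

```python
from optimum.pipelines import pipeline

# Hypothetical checkpoint id; the architecture must be supported by the ONNX exporter,
# as the tip above points out.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    accelerator="ort",
)
print(classifier("ONNX Runtime speeds up this pipeline."))
```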

@@ -143,7 +143,7 @@ For example, here is how you can load the [`~onnxruntime.ORTModelForQuestionAnsw
The [`~pipelines.pipeline`] function can not only run inference on vanilla ONNX Runtime checkpoints - you can also use
checkpoints optimized with the [`~optimum.onnxruntime.ORTQuantizer`] and the [`~optimum.onnxruntime.ORTOptimizer`].

-Below you can find two examples on how you could the [`~optimum.onnxruntime.ORTOptimizer`] and the
+Below you can find two examples of how you could use the [`~optimum.onnxruntime.ORTOptimizer`] and the
[`~optimum.onnxruntime.ORTQuantizer`] to optimize/quantize your model and use it for inference afterwards.

### Quantizing with the `ORTQuantizer`
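The quantization example under this heading is collapsed in the diff, so what follows is only a rough, hedged sketch of how `ORTQuantizer` can be combined with a pipeline for inference afterwards. The checkpoint id, the `avx512_vnni` preset, the save directory, and the `model_quantized.onnx` file name are assumptions rather than content taken from the guide.

```python
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # hypothetical checkpoint
save_dir = "distilbert_quantized"

# Export the checkpoint to ONNX, then apply a dynamic quantization preset
onnx_model = ORTModelForSequenceClassification.from_pretrained(model_id, from_transformers=True)
quantizer = ORTQuantizer.from_pretrained(onnx_model)
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir=save_dir, quantization_config=qconfig)

# Load the quantized weights (assumed to be saved as model_quantized.onnx) and use them
# in a pipeline like any other ONNX Runtime checkpoint
quantized_model = ORTModelForSequenceClassification.from_pretrained(
    save_dir, file_name="model_quantized.onnx"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
classifier = pipeline("text-classification", model=quantized_model, tokenizer=tokenizer)
print(classifier("Quantized ONNX models still work with the pipeline API."))
```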
