Commit

Merge branch 'master' into pd_enable_sin_and_cos
xczhai authored Aug 28, 2023
2 parents ba6314d + b7d73cb commit 0fec5cc
Showing 58 changed files with 1,541 additions and 254 deletions.
20 changes: 13 additions & 7 deletions docs/notebooks/002-openvino-api-with-output.rst
@@ -292,13 +292,15 @@ TensorFlow Model
TensorFlow models saved in frozen graph format can also be passed to
``read_model`` starting in OpenVINO 2022.3.

**NOTE**: Directly loading TensorFlow models is available as a
.. note::

Directly loading TensorFlow models is available as a
preview feature in the OpenVINO 2022.3 release. Fully functional
support will be provided in the upcoming 2023 releases. Currently,
support is limited to the frozen graph inference format. Other
TensorFlow model formats must be converted to OpenVINO IR using
`model conversion
API <https://docs.openvino.ai/2023.0/openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_TensorFlow.html>`__.
`model conversion API <https://docs.openvino.ai/2023.0/openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_TensorFlow.html>`__.
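
As a rough, hedged sketch of what loading such a frozen graph might look like (the ``.pb`` path below is hypothetical):

.. code:: ipython3

    from openvino.runtime import Core

    core = Core()
    # Hypothetical path to a TensorFlow frozen graph (.pb) file.
    tf_model = core.read_model(model="model/frozen_graph.pb")
    compiled_tf_model = core.compile_model(model=tf_model, device_name="CPU")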


.. code:: ipython3
@@ -563,9 +565,11 @@ classes (``C``). The output is returned as 32-bit floating point.
Doing Inference on a Model
--------------------------

**NOTE** this notebook demonstrates only the basic synchronous
inference API. For an async inference example, please refer to `Async
API notebook <115-async-api-with-output.html>`__
.. note::

This notebook demonstrates only the basic synchronous
inference API. For an async inference example, please refer to
`Async API notebook <115-async-api-with-output.html>`__
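
A minimal, hedged sketch of that basic synchronous flow (the IR path is hypothetical and the input shape is assumed to be static):

.. code:: ipython3

    import numpy as np
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(model="model.xml")            # hypothetical IR path
    compiled_model = core.compile_model(model=model, device_name="CPU")

    input_layer = compiled_model.input(0)
    output_layer = compiled_model.output(0)

    # Dummy input matching the model's (assumed static) input shape.
    input_data = np.random.rand(*input_layer.shape).astype(np.float32)

    # Synchronous inference: the call blocks until the result is ready.
    result = compiled_model([input_data])[output_layer]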

The diagram below shows a typical inference pipeline with OpenVINO

@@ -926,7 +930,9 @@ model will be loaded to the GPU. After running this cell once, the model
will be cached, so subsequent runs of this cell will load the model from
the cache.

*Note: Model Caching is also available on CPU devices*
.. note::

Model Caching is also available on CPU devices
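
A hedged sketch of enabling the cache through the ``CACHE_DIR`` property (the cache directory and IR path are illustrative; the same configuration also works on CPU devices):

.. code:: ipython3

    from pathlib import Path
    from openvino.runtime import Core

    core = Core()
    cache_dir = Path("model_cache")              # illustrative cache location
    cache_dir.mkdir(exist_ok=True)

    model = core.read_model(model="model.xml")   # hypothetical IR path
    # The first compilation populates the cache; later runs load from it.
    compiled_model = core.compile_model(
        model=model,
        device_name="GPU",
        config={"CACHE_DIR": str(cache_dir)},
    )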

.. code:: ipython3
20 changes: 12 additions & 8 deletions docs/notebooks/102-pytorch-to-openvino-with-output.rst
@@ -237,14 +237,17 @@ Optimizer Python API should be used for these purposes. More details
regarding PyTorch model conversion can be found in OpenVINO
`documentation <https://docs.openvino.ai/2023.0/openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_PyTorch.html>`__

**Note**: Please, take into account that direct support PyTorch
.. note::

Please take into account that direct support for PyTorch
model conversion is an experimental feature. Model coverage will be
increased in the next releases. In cases where PyTorch model
conversion fails, you can still try to export the model to ONNX
format. Please refer to this
format. Please, refer to this
`tutorial <102-pytorch-to-openvino-with-output.html>`__
which explains how to convert a PyTorch model to ONNX and then to OpenVINO.


The ``convert_model`` function accepts the PyTorch model object and
returns the ``openvino.runtime.Model`` instance ready to load on a
device using ``core.compile_model`` or save on disk for next usage using
@@ -501,8 +504,8 @@ Run OpenVINO Model Inference with Static Input Shape `⇑ <#top>`__
5: hamper - 2.35%
Benchmark OpenVINO Model Inference with Static Input Shape
`<#top>`__
Benchmark OpenVINO Model Inference with Static Input Shape `<#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. code:: ipython3
@@ -645,8 +648,9 @@ OpenVINO IR is similar to the original PyTorch model.
5: hamper - 2.35%
Benchmark OpenVINO Model Inference Converted From Scripted Model
`<#top>`__
Benchmark OpenVINO Model Inference Converted From Scripted Model `<#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


.. code:: ipython3
@@ -772,8 +776,8 @@ similar to the original PyTorch model.
5: hamper - 2.35%
Benchmark OpenVINO Model Inference Converted From Traced Model
`<#top>`__
Benchmark OpenVINO Model Inference Converted From Traced Model `<#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. code:: ipython3
@@ -18,7 +18,7 @@ Source of the

**Table of contents**:

- `Preparation <#1preparation>`__
- `Preparation <#preparation>`__

- `Imports <#imports>`__
- `Settings <#settings>`__
12 changes: 6 additions & 6 deletions docs/notebooks/104-model-tools-with-output.rst
@@ -423,7 +423,9 @@ In the next cell, define the ``benchmark_model()`` function that calls
``benchmark_app``. This makes it easy to try different combinations. In
the cell below that, you display available devices on the system.

**Note**: In this notebook, ``benchmark_app`` runs for 15 seconds to
.. note::

In this notebook, ``benchmark_app`` runs for 15 seconds to
give a quick indication of performance. For more accurate
performance, it is recommended to run inference for at least one
minute by setting the ``t`` parameter to 60 or higher, and run
@@ -432,6 +434,7 @@ the cell below that, you display available devices on the system.
command prompt where you have activated the ``openvino_env``
environment.
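
For instance, a quick, hedged run straight from the notebook might look like the cell below (the IR path is a placeholder; raise ``-t`` to 60 or more for steadier numbers):

.. code:: ipython3

    # Placeholder model path; run from a prompt with openvino_env activated
    # for the most reliable measurements.
    ! benchmark_app -m model.xml -d CPU -api async -t 15 -b 1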


.. code:: ipython3
def benchmark_model(model_xml, device="CPU", seconds=60, api="async", batch=1):
@@ -523,9 +526,7 @@ Benchmark command:
.. code:: ipython3
benchmark_model(model_path, device="GPU", seconds=15, api="async")
benchmark_model(model_path, device="GPU", seconds=15, api="async")
.. raw:: html

@@ -534,8 +535,7 @@

.. code:: ipython3
benchmark_model(model_path, device="MULTI:CPU,GPU", seconds=15, api="async")
benchmark_model(model_path, device="MULTI:CPU,GPU", seconds=15, api="async")
.. raw:: html
5 changes: 4 additions & 1 deletion docs/notebooks/105-language-quantize-bert-with-output.rst
@@ -593,7 +593,9 @@ Finally, measure the inference performance of OpenVINO ``FP32`` and
`Benchmark Tool <https://docs.openvino.ai/2023.0/openvino_inference_engine_tools_benchmark_tool_README.html>`__
in OpenVINO.

**Note**: The ``benchmark_app`` tool is able to measure the
.. note::

The ``benchmark_app`` tool is able to measure the
performance of the OpenVINO Intermediate Representation (OpenVINO IR)
models only. For more accurate performance, run ``benchmark_app`` in
a terminal/command prompt after closing other applications. Run
@@ -602,6 +604,7 @@ in OpenVINO.
Run ``benchmark_app --help`` to see an overview of all command-line
options.


.. code:: ipython3
# Inference FP32 model (OpenVINO IR)
6 changes: 3 additions & 3 deletions docs/notebooks/106-auto-device-with-output.rst
@@ -36,18 +36,18 @@ first inference.

- `Import modules and create Core <#import-modules-and-create-core>`__
- `Convert the model to OpenVINO IR format <#convert-the-model-to-openvino-ir-format>`__
- `(1) Simplify selection logic <#1-simplify-selection-logic>`__
- `(1) Simplify selection logic <#simplify-selection-logic>`__

- `Default behavior of Core::compile_model API without device_name <#default-behavior-of-core::compile_model-api-without-device_name>`__
- `Explicitly pass AUTO as device_name to Core::compile_model API <#explicitly-pass-auto-as-device_name-to-core::compile_model-api>`__

- `(2) Improve the first inference latency <#2-improve-the-first-inference-latency>`__
- `(2) Improve the first inference latency <#improve-the-first-inference-latency>`__

- `Load an Image <#load-an-image>`__
- `Load the model to GPU device and perform inference <#load-the-model-to-gpu-device-and-perform-inference>`__
- `Load the model using AUTO device and do inference <#load-the-model-using-auto-device-and-do-inference>`__

- `(3) Achieve different performance for different targets <#3-achieve-different-performance-for-different-targets>`__
- `(3) Achieve different performance for different targets <#achieve-different-performance-for-different-targets>`__

- `Class and callback definition <#class-and-callback-definition>`__
- `Inference with THROUGHPUT hint <#inference-with-throughput-hint>`__
@@ -342,11 +342,11 @@ Create a quantized model from the pre-trained ``FP16`` model and the
calibration dataset. The optimization process contains the following
steps:

::

1. Create a Dataset for quantization.
2. Run `nncf.quantize` for getting an optimized model. The `nncf.quantize` function provides an interface for model quantization. It requires an instance of the OpenVINO Model and quantization dataset. Optionally, some additional parameters for the configuration quantization process (number of samples for quantization, preset, ignored scope, etc.) can be provided. For more accurate results, we should keep the operation in the postprocessing subgraph in floating point precision, using the `ignored_scope` parameter. `advanced_parameters` can be used to specify advanced quantization parameters for fine-tuning the quantization algorithm. In this tutorial we pass range estimator parameters for activations. For more information see [Tune quantization parameters](https://docs.openvino.ai/2023.0/basic_quantization_flow.html#tune-quantization-parameters).
3. Serialize OpenVINO IR model using `openvino.runtime.serialize` function.
1. Create a Dataset for quantization.
2. Run ``nncf.quantize`` to get an optimized model. The ``nncf.quantize`` function provides an interface for model quantization. It requires an instance of the OpenVINO Model and a quantization dataset. Optionally, some additional parameters for configuring the quantization process (number of samples for quantization, preset, ignored scope, etc.) can be provided. For more accurate results, we should keep the operations in the postprocessing subgraph in floating point precision, using the ``ignored_scope`` parameter. ``advanced_parameters`` can be used to specify advanced quantization parameters for fine-tuning the quantization algorithm. In this tutorial we pass range estimator parameters for activations. For more information, see
`Tune quantization parameters <https://docs.openvino.ai/2023.0/basic_quantization_flow.html#tune-quantization-parameters>`__.
3. Serialize the OpenVINO IR model using the ``openvino.runtime.serialize`` function. The three steps are sketched together below.
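
A hedged, illustrative sketch of those three steps (``model`` and ``data_source`` are assumed to exist already, and the ignored-scope node name is purely illustrative):

.. code:: ipython3

    import nncf
    from openvino.runtime import serialize

    # 1. Wrap the calibration data source; transform_fn maps a dataset item
    #    to the model input.
    def transform_fn(data_item):
        images, _ = data_item
        return images

    calibration_dataset = nncf.Dataset(data_source, transform_fn)

    # 2. Quantize, keeping the postprocessing subgraph in floating point.
    quantized_model = nncf.quantize(
        model,                          # openvino.runtime.Model to optimize
        calibration_dataset,
        subset_size=300,
        ignored_scope=nncf.IgnoredScope(names=["postprocessing_node"]),
    )

    # 3. Serialize the optimized model to OpenVINO IR.
    serialize(quantized_model, "model_int8.xml")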

.. code:: ipython3

@@ -663,7 +663,9 @@ Tool <https://docs.openvino.ai/latest/openvino_inference_engine_tools_benchmark_
is used to measure the inference performance of the ``FP16`` and
``INT8`` models.

**NOTE**: For more accurate performance, it is recommended to run
.. note::

For more accurate performance, it is recommended to run
``benchmark_app`` in a terminal/command prompt after closing other
applications. Run ``benchmark_app -m model.xml -d CPU`` to benchmark
async inference on CPU for one minute. Change ``CPU`` to ``GPU`` to
7 changes: 5 additions & 2 deletions docs/notebooks/108-gpu-device-with-output.rst
@@ -553,16 +553,19 @@ manually specify devices to use. Below is an example showing how to use

``compiled_model = core.compile_model(model=model, device_name="AUTO", config={"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT"})``

**Important**: **The “THROUGHPUT”, “MULTI”, and
.. important::

The “THROUGHPUT”, “MULTI”, and
“CUMULATIVE_THROUGHPUT” modes are only applicable to asynchronous
inferencing pipelines. The example at the end of this article shows
how to set up an asynchronous pipeline that takes advantage of
parallelism to increase throughput.** To learn more, see
parallelism to increase throughput. To learn more, see
`Asynchronous
Inferencing <https://docs.openvino.ai/2023.0/openvino_docs_ie_plugin_dg_async_infer_request.html>`__
in OpenVINO as well as the `Asynchronous Inference
notebook <https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/115-async-api>`__.


Performance Comparison with benchmark_app `<#top>`__
###############################################################################################################################

24 changes: 15 additions & 9 deletions docs/notebooks/109-latency-tricks-with-output.rst
@@ -21,20 +21,24 @@ many hints simultaneously, like more inference threads + shared memory.
It should give even better performance, but we recommend testing it
anyway.

**NOTE**: We especially recommend trying
.. note::

We especially recommend trying
``OpenVINO IR model + CPU + shared memory in latency mode`` or
``OpenVINO IR model + CPU + shared memory + more inference threads``.
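
A hedged sketch of that combination on CPU (the IR path and thread count are illustrative, not tuned recommendations):

.. code:: ipython3

    import numpy as np
    from openvino.runtime import Core, Tensor

    core = Core()
    model = core.read_model(model="model.xml")     # hypothetical IR path
    compiled_model = core.compile_model(
        model=model,
        device_name="CPU",
        config={
            "PERFORMANCE_HINT": "LATENCY",         # optimize for single-request latency
            "INFERENCE_NUM_THREADS": "8",          # illustrative thread count
        },
    )

    input_image = np.zeros(list(compiled_model.input(0).shape), dtype=np.float32)
    # shared_memory=True wraps the NumPy buffer instead of copying it.
    input_tensor = Tensor(array=input_image, shared_memory=True)

    infer_request = compiled_model.create_infer_request()
    infer_request.set_input_tensor(input_tensor)
    infer_request.infer()
    result = infer_request.get_output_tensor(0).data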

The quantization and pre-post-processing API are not included here as
they change the precision (quantization) or processing graph
(prepostprocessor). You can find examples of how to apply them to
optimize performance on OpenVINO IR files in
`111-detection-quantization <../111-detection-quantization>`__ and
`118-optimize-preprocessing <../118-optimize-preprocessing>`__.
`111-detection-quantization <111-yolov5-quantization-migration-with-output.html>`__ and
`118-optimize-preprocessing <118-optimize-preprocessing-with-output.html>`__.

|image0|

**NOTE**: Many of the steps presented below will give you better
.. note::

Many of the steps presented below will give you better
performance. However, some of them may not change anything if they
are strongly dependent on either the hardware or the model. Please
run this notebook on your computer with your model to learn which of
@@ -45,7 +49,7 @@ optimize performance on OpenVINO IR files in
result in different performance.

A similar notebook focused on the throughput mode is available
`here <109-throughput-tricks.ipynb>`__.
`here <109-throughput-tricks-with-output.html>`__.

**Table of contents**:

@@ -193,7 +197,9 @@ Hardware `⇑ <#top>`__
The code below lists the available hardware we will use in the
benchmarking process.

**NOTE**: The hardware you have is probably completely different from
.. note::

The hardware you have is probably completely different from
ours. It means you can see completely different results.

.. code:: ipython3
@@ -606,9 +612,9 @@ Other tricks `⇑ <#top>`__
There are other tricks for performance improvement, such as quantization
and pre-post-processing or dedicated to throughput mode. To get even
more from your model, please visit
`111-detection-quantization <../111-detection-quantization>`__,
`118-optimize-preprocessing <../118-optimize-preprocessing>`__, and
`109-throughput-tricks <109-throughput-tricks.ipynb>`__.
`111-detection-quantization <111-yolov5-quantization-migration-with-output.html>`__,
`118-optimize-preprocessing <118-optimize-preprocessing-with-output.html>`__, and
`109-throughput-tricks <109-latency-tricks-with-output.html>`__.

Performance comparison `<#top>`__
###############################################################################################################################
24 changes: 15 additions & 9 deletions docs/notebooks/109-throughput-tricks-with-output.rst
@@ -26,12 +26,14 @@ The quantization and pre-post-processing API are not included here as
they change the precision (quantization) or processing graph
(prepostprocessor). You can find examples of how to apply them to
optimize performance on OpenVINO IR files in
`111-detection-quantization <../111-detection-quantization>`__ and
`118-optimize-preprocessing <../118-optimize-preprocessing>`__.
`111-detection-quantization <111-yolov5-quantization-migration-with-output.html>`__ and
`118-optimize-preprocessing <118-optimize-preprocessing-with-output.html>`__.

|image0|

**NOTE**: Many of the steps presented below will give you better
.. note::

Many of the steps presented below will give you better
performance. However, some of them may not change anything if they
are strongly dependent on either the hardware or the model. Please
run this notebook on your computer with your model to learn which of
@@ -42,7 +44,7 @@ optimize performance on OpenVINO IR files in
result in different performance.

A similar notebook focused on the latency mode is available
`here <109-latency-tricks.ipynb>`__.
`here <109-latency-tricks-with-output.html>`__.

**Table of contents**:

@@ -180,7 +182,9 @@ Hardware `⇑ <#top>`__
The code below lists the available hardware we will use in the
benchmarking process.

**NOTE**: The hardware you have is probably completely different from
.. note::

The hardware you have is probably completely different from
ours. It means you can see completely different results.

.. code:: ipython3
@@ -616,7 +620,9 @@ automatically spawns the pool of InferRequest objects (also called
“jobs”) and provides synchronization mechanisms to control the flow of
the pipeline.

**NOTE**: Asynchronous processing cannot guarantee outputs to be in
.. note::

Asynchronous processing cannot guarantee outputs to be in
the same order as inputs, so be careful in applications where
the order of frames matters, e.g., videos.
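
A hedged sketch of restoring order with ``userdata`` (assumes ``compiled_model`` and a ``frames`` list of preprocessed inputs already exist):

.. code:: ipython3

    from openvino.runtime import AsyncInferQueue

    results = {}

    def on_done(request, frame_id):
        # userdata carries the original frame index so results can be re-ordered.
        results[frame_id] = request.get_output_tensor(0).data.copy()

    infer_queue = AsyncInferQueue(compiled_model)
    infer_queue.set_callback(on_done)

    for i, frame in enumerate(frames):
        infer_queue.start_async({0: frame}, userdata=i)
    infer_queue.wait_all()

    ordered_results = [results[i] for i in range(len(frames))]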

@@ -662,9 +668,9 @@ options, quantization and pre-post-processing or dedicated to latency
mode. To get even more from your model, please visit `advanced
throughput
options <https://docs.openvino.ai/2023.0/openvino_docs_deployment_optimization_guide_tput_advanced.html>`__,
`109-latency-tricks <109-latency-tricks.ipynb>`__,
`111-detection-quantization <../111-detection-quantization>`__, and
`118-optimize-preprocessing <../118-optimize-preprocessing>`__.
`109-latency-tricks <109-latency-tricks-with-output.html>`__,
`111-detection-quantization <111-yolov5-quantization-migration-with-output.html>`__, and
`118-optimize-preprocessing <118-optimize-preprocessing-with-output.html>`__.

Performance comparison `<#top>`__
###############################################################################################################################
5 changes: 4 additions & 1 deletion docs/notebooks/110-ct-scan-live-inference-with-output.rst
@@ -116,7 +116,9 @@ To measure the inference performance of the IR model, use
is a command-line application that can be run in the notebook with
``! benchmark_app`` or ``%sx benchmark_app`` commands.

**Note**: The ``benchmark_app`` tool is able to measure the
.. note::

The ``benchmark_app`` tool is able to measure the
performance of the OpenVINO Intermediate Representation (OpenVINO IR)
models only. For more accurate performance, run ``benchmark_app`` in
a terminal/command prompt after closing other applications. Run
@@ -125,6 +127,7 @@ is a command-line application that can be run in the notebook with
Run ``benchmark_app --help`` to see an overview of all command-line
options.


.. code:: ipython3
core = Core()