Commit

Refactor build_model() out of benchmark_model() (#48)
* Refactor build out of benchmarking

Signed-off-by: Jeremy Fowers <jeremy.fowers@amd.com>
jeremyfowers committed Dec 6, 2023
1 parent 6fe1442 commit 2e84a3d
Showing 20 changed files with 286 additions and 565 deletions.
9 changes: 8 additions & 1 deletion .github/workflows/publish-to-test-pypi.yml
Expand Up @@ -5,6 +5,7 @@ on:
branches: ["main", "canary"]
tags:
- v*
- RC*
pull_request:
branches: ["main", "canary"]

Expand Down Expand Up @@ -33,7 +34,13 @@ jobs:
models=$(turnkey models location --quiet)
turnkey $models/selftest/linear.py
- name: Publish distribution package to PyPI
if: startsWith(github.ref, 'refs/tags')
if: startsWith(github.ref, 'refs/tags/v')
uses: pypa/gh-action-pypi-publish@release/v1
with:
password: ${{ secrets.PYPI_API_TOKEN }}
- name: Publish distribution package to Test PyPI
if: startsWith(github.ref, 'refs/tags/RC')
uses: pypa/gh-action-pypi-publish@release/v1
with:
password: ${{ secrets.TEST_PYPI_API_TOKEN }}
repository_url: https://test.pypi.org/legacy/
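
The routing implied by the two tag conditions above can be sketched in Python. This only illustrates the prefix matching that `startsWith()` performs; the function name `publish_target` and its return labels are hypothetical and are not part of the workflow.

```python
from typing import Optional


def publish_target(ref: str) -> Optional[str]:
    """Return which package index a Git ref would publish to (illustrative only)."""
    # Mirrors: if: startsWith(github.ref, 'refs/tags/v')
    if ref.startswith("refs/tags/v"):
        return "pypi"
    # Mirrors: if: startsWith(github.ref, 'refs/tags/RC')
    if ref.startswith("refs/tags/RC"):
        return "test-pypi"
    # Branch pushes and pull requests match neither publish step
    return None


for ref in ("refs/tags/v1.0.0", "refs/tags/RC1", "refs/heads/main"):
    print(ref, "->", publish_target(ref))
```

The previous condition, `startsWith(github.ref, 'refs/tags')`, matched every tag ref, so it could not have distinguished `v*` tags from `RC*` tags.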
3 changes: 0 additions & 3 deletions .github/workflows/test_turnkey.yml
Expand Up @@ -53,8 +53,6 @@ jobs:
# turnkey examples
# Note: we clear the default cache location prior to each example run
rm -rf ~/.cache/turnkey
python examples/model_api/hello_world.py
rm -rf ~/.cache/turnkey
python examples/files_api/onnx_opset.py --onnx-opset 15
rm -rf ~/.cache/turnkey
turnkey examples/cli/scripts/hello_world.py
Expand All @@ -71,7 +69,6 @@ jobs:
cd test/
python cli.py
python analysis.py
python model_api.py
- name: Test example plugins
shell: bash -el {0}
run: |
11 changes: 5 additions & 6 deletions docs/code.md
Expand Up @@ -11,8 +11,8 @@ The TurnkeyML source code has a few major top-level directories:
- `models`: the corpora of models that makes up the TurnkeyML models (see [the models readme](https://github.com/onnx/turnkeyml/blob/main/models/readme.md)).
- Each subdirectory under `models` represents a corpus of models pulled from somewhere on the internet. For example, `models/torch_hub` is a corpus of models from [Torch Hub](https://github.com/pytorch/hub).
- `src/turnkey`: source code for the TurnkeyML tools (see [Benchmarking Tools](#benchmarking-tools) for a description of how the code is used).
- `src/turnkeyml/analyze`: functions for profiling a model script, discovering model instances, and invoking `benchmark_model()` on those instances.
- `src/turnkeyml/run`: implements the runtime and device plugin APIs and the built-in runtimes and devices.
- `src/turnkeyml/analyze`: functions for profiling a model script, discovering model instances, and invoking `build_model()` and/or `BaseRT.benchmark()` on those instances.
- `src/turnkeyml/run`: implements `BaseRT`, an abstract base class that defines TurnkeyML's vendor-agnostic benchmarking functionality. This module also includes the runtime and device plugin APIs and the built-in runtimes and devices.
- `src/turnkeyml/cli`: implements the `turnkey` CLI and reporting tool.
- `src/turnkeyml/common`: functions common to the other modules.
- `src/turnkeyml/version.py`: defines the package version number.
Expand All @@ -29,10 +29,9 @@ TurnkeyML provides two main tools, the `turnkey` CLI and benchmarking APIs. Inst
1. The default command for `turnkey` CLI runs the `benchmark_files()` API, which is implemented in [files_api.py](https://github.com/onnx/turnkeyml/blob/main/src/turnkeyml/files_api.py).
- Other CLI commands are also implemented in `cli/`, for example the `report` command is implemented in `cli/report.py`.
1. The `benchmark_files()` API takes in a set of scripts, each of which should invoke at least one model instance, to evaluate and passes each into the `evaluate_script()` function for analysis, which is implemented in [analyze/script.py](https://github.com/onnx/turnkeyml/blob/main/src/turnkeyml/analyze/script.py).
1. `evaluate_script()` uses a profiler to discover the model instances in the script, and passes each into the `benchmark_model()` API, which is defined in [model_api.py](https://github.com/onnx/turnkeyml/blob/main/src/turnkeyml/model_api.py).
1. The `benchmark_model()` API prepares the model for benchmarking (e.g., exporting and optimizing an ONNX file), which creates an instance of a `*Model` class, where `*` can be CPU, GPU, etc. The `*Model` classes are defined in [run/](https://github.com/onnx/turnkeyml/blob/main/src/turnkeyml/run/).
1. The `*Model` classes provide a `.benchmark()` method that benchmarks the model on the device and returns an instance of the `MeasuredPerformance` class, which includes the performance statistics acquired during benchmarking.
1. `benchmark_model()` and the `*Model` classes are built using [`build_model()`](#model-build-tool)
1. `evaluate_script()` uses a profiler to discover the model instances in the script, and passes each into the `build_model()` API, which is defined in [build_api.py](https://github.com/onnx/turnkeyml/blob/main/src/turnkeyml/build_api.py).
1. The `build_model()` API prepares the model for benchmarking (e.g., exporting and optimizing an ONNX file).
1. `evaluate_script()` passes the build into `BaseRT.benchmark()`, which benchmarks the model on the device and returns an instance of the `MeasuredPerformance` class containing the performance statistics acquired during benchmarking.
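
The build-then-benchmark flow in the steps above can be sketched with stand-in classes. Everything below is hypothetical: `RuntimeSketch`, `PerfSketch`, `build_model_sketch()`, `evaluate_script_sketch()`, and the `_execute()` hook are illustrative names, not the actual TurnkeyML classes or signatures, and deriving throughput from a single latency value is an assumption.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class PerfSketch:
    """Hypothetical stand-in for MeasuredPerformance."""
    mean_latency: float  # milliseconds
    throughput: float    # inferences per second


class RuntimeSketch(ABC):
    """Hypothetical stand-in for BaseRT: shared flow plus a device-specific hook."""

    def benchmark(self, build: Dict[str, Any]) -> PerfSketch:
        latency_ms = self._execute(build)
        # Assumption: throughput is the inverse of the measured latency
        return PerfSketch(mean_latency=latency_ms, throughput=1000.0 / latency_ms)

    @abstractmethod
    def _execute(self, build: Dict[str, Any]) -> float:
        """Run the build on the device and return a latency in milliseconds."""


class FixedLatencyRuntime(RuntimeSketch):
    """Toy runtime that reports a constant latency instead of using a real device."""

    def __init__(self, latency_ms: float) -> None:
        self.latency_ms = latency_ms

    def _execute(self, build: Dict[str, Any]) -> float:
        return self.latency_ms


def build_model_sketch(model_name: str) -> Dict[str, Any]:
    """Stand-in for the build step; a real build would export and optimize ONNX."""
    return {"model": model_name, "artifact": f"{model_name}.onnx"}


def evaluate_script_sketch(models: List[str], runtime: RuntimeSketch) -> Dict[str, PerfSketch]:
    """Build each discovered model, then pass the build to the runtime's benchmark()."""
    return {name: runtime.benchmark(build_model_sketch(name)) for name in models}


print(evaluate_script_sketch(["linear"], FixedLatencyRuntime(2.0)))
```

Keeping the build step separate from the runtime class means a model can be built once and the resulting build handed to any runtime that implements the hook, which appears to be the motivation for this refactor.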

# Model Build Tool

2 changes: 1 addition & 1 deletion docs/readme.md
Expand Up @@ -3,7 +3,7 @@
This directory contains documentation for the TurnkeyML project:
- [code.md](https://github.com/onnx/turnkeyml/blob/main/docs/code.md): Code organization for the benchmark and tools.
- [install.md](https://github.com/onnx/turnkeyml/blob/main/docs/install.md): Installation instructions for the tools.
- [tools_user_guide.md](https://github.com/onnx/turnkeyml/blob/main/docs/tools_user_guide.md): User guide for the tools: `turnkey` CLI, `benchmark_files()`, and `benchmark_model()`.
- [tools_user_guide.md](https://github.com/onnx/turnkeyml/blob/main/docs/tools_user_guide.md): User guide for the tools: the `turnkey` CLI and the `benchmark_files()` and `build_model()` APIs.
- [versioning.md](https://github.com/onnx/turnkeyml/blob/main/docs/versioning.md): Defines the semantic versioning rules for the `turnkey` package.

There is more useful documentation available in:
58 changes: 15 additions & 43 deletions docs/tools_user_guide.md
Expand Up @@ -51,8 +51,8 @@ Where `your_script.py` is a Python script that instantiates and executes a PyTor

The `turnkey` CLI performs the following steps:
1. [Analysis](#analysis): profile the Python script to identify the PyTorch models within
2. [Build](#build): call the `benchmark_files()` [API](#the-turnkey-api) to prepare each model for benchmarking
3. [Benchmark](#benchmark): call the `benchmark_model()` [API](#the-turnkey-api) on each model to gather performance statistics
2. [Build](#build): call the `build_model()` [API](#the-turnkey-api) to prepare each model for benchmarking
3. [Benchmark](#benchmark): call the `BaseRT.benchmark()` method on each model to gather performance statistics

_Note_: The benchmarking methodology is defined [here](#benchmark). If you are looking for more detailed instructions on how to install turnkey, you can find that [here](https://github.com/onnx/turnkeyml/blob/main/docs/install.md).

Expand All @@ -64,31 +64,11 @@ _Note_: The benchmarking methodology is defined [here](#benchmark). If you are l

Most of the functionality provided by the `turnkey` CLI is also available in the API:
- `turnkey.benchmark_files()` provides the same benchmarking functionality as the `turnkey` CLI: it takes a list of files and target device, and returns performance results.
- `turnkey.benchmark_model()` provides a subset of this functionality: it takes a model and its inputs, and returns performance results.
- The main difference is that `benchmark_model()` does not include the [Analysis](#analysis) feature, and `benchmark_files()` does.
- `turnkey.build_model(model, inputs)` is used to programmatically [build](#build) a model instance through a sequence of model-to-model transformations (e.g., starting with an fp32 PyTorch model and ending with an fp16 ONNX model).

Generally speaking, the `turnkey` CLI is a command line interface for the `benchmark_files()` API, which internally calls `benchmark_model()`, which in turn calls `build_model()`. You can read more about this code organization [here](https://github.com/onnx/turnkeyml/blob/main/docs/code.md).
Generally speaking, the `turnkey` CLI is a command line interface for the `benchmark_files()` API, which in turn calls `build_model()` and then performs benchmarking using `BaseRT.benchmark()`. You can read more about this code organization [here](https://github.com/onnx/turnkeyml/blob/main/docs/code.md).

For an example of `benchmark_model()`, the following script:

```python
from turnkeyml import benchmark_model

model = YourModel() # Instantiate a torch.nn.module
results = model(**inputs)
perf = benchmark_model(model, inputs)
```

Will print an output like this:

```
> Performance of YourModel on device Intel® Xeon® Platinum 8380 is:
> latency: 0.033 ms
> throughput: 21784.8 ips
```

`benchmark_model()` returns a `MeasuredPerformance` object that includes members:
`BaseRT.benchmark()` returns a `MeasuredPerformance` object that includes members:
- `latency_units`: unit of time used for measuring latency, which is set to `milliseconds (ms)`.
- `mean_latency`: average benchmarking latency, measured in `latency_units`.
- `throughput_units`: unit used for measuring throughput, which is set to `inferences per second (IPS)`.
Expand Down Expand Up @@ -135,7 +115,7 @@ A **runtime** is a piece of software that executes a model on a device.

**Analysis** is the process by which `benchmark_files()` inspects a Python script or ONNX file and identifies the models within.

`benchmark_files()` performs analysis by running and profiling your file(s). When a model object (see [Model](#model) is encountered, it is inspected to gather statistics (such as the number of parameters in the model) and/or pass it to the `benchmark_model()` API for benchmarking.
`benchmark_files()` performs analysis by running and profiling your file(s). When a model object (see [Model](#model)) is encountered, it is inspected to gather statistics (such as the number of parameters in the model) and/or passed to the build and benchmark APIs.

> _Note_: the `turnkey` CLI and `benchmark_files()` API both run your entire python script(s) whenever python script(s) are passed as input files. Please ensure that these scripts are safe to run, especially if you got them from the internet.
Expand Down Expand Up @@ -205,12 +185,14 @@ The *build cache* is a location on disk that holds all of the artifacts from you

## Benchmark

*Benchmark* is the process by which the `benchmark_model()` API collects performance statistics about a [model](#model). Specifically, `benchmark_model()` takes a [build](#build) of a model and executes it on a target device using target runtime software (see [Devices and Runtimes](#devices-and-runtimes)).
*Benchmark* is the process by which `BaseRT.benchmark()` collects performance statistics about a [model](#model). `BaseRT` is an abstract base class that defines the common benchmarking infrastructure that TurnkeyML provides across devices and runtimes.

Specifically, `BaseRT.benchmark()` takes a [build](#build) of a model and executes it on a target device using target runtime software (see [Devices and Runtimes](#devices-and-runtimes)).

By default, `benchmark_model()` will run the model 100 times to collect the following statistics:
By default, `BaseRT.benchmark()` will run the model 100 times to collect the following statistics:
1. Mean Latency, in milliseconds (ms): the average time it takes the runtime/device combination to execute the model/inputs combination once. This includes the time spent invoking the device and transferring the model's inputs and outputs between host memory and the device (when applicable).
1. Throughput, in inferences per second (IPS): the number of times the model/inputs combination can be executed on the runtime/device combination per second.
> - _Note_: `benchmark_model()` is not aware of whether `inputs` is a single input or a batch of inputs. If your `inputs` is actually a batch of inputs, you should multiply `benchmark_model()`'s reported IPS by the batch size.
> - _Note_: `BaseRT.benchmark()` is not aware of whether `inputs` is a single input or a batch of inputs. If your `inputs` is actually a batch of inputs, you should multiply `BaseRT.benchmark()`'s reported IPS by the batch size.
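
The two statistics and the batch-size correction described above can be sketched as follows. This is a minimal illustration, not TurnkeyML's implementation: the helper names `time_iterations()` and `summarize()` are hypothetical, and computing throughput as the inverse of mean latency is an assumption.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def time_iterations(run_once: Callable[[], object], iterations: int = 100) -> List[float]:
    """Call run_once() repeatedly and record each call's latency in milliseconds."""
    latencies_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms: List[float], batch_size: int = 1) -> Dict[str, float]:
    """Compute mean latency (ms), throughput (IPS), and batch-adjusted inputs per second."""
    mean_latency = mean(latencies_ms)
    # Assumption: one execution per mean latency interval
    throughput = 1000.0 / mean_latency
    return {
        "mean_latency": mean_latency,
        "throughput": throughput,
        # The benchmark cannot tell a batch from a single input, so scale manually
        "inputs_per_second": throughput * batch_size,
    }


latencies = time_iterations(lambda: sum(range(1000)), iterations=100)
print(summarize(latencies, batch_size=8))
```

Here `batch_size` must be supplied by the caller, which matches the note above: the reported IPS counts executions, not individual inputs.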
# Devices and Runtimes

Expand All @@ -226,7 +208,7 @@ If you are using a remote machine, it must:
- include the target device
- have `miniconda`, `python>=3.8`, and `docker>=20.10` installed

When you call `turnkey` CLI or `benchmark_model()`, the following actions are performed on your behalf:
When you call the `turnkey` CLI or `benchmark_files()`, the following actions are performed on your behalf:
1. Perform a `build`, which exports all models from the script to ONNX and prepares for benchmarking.
1. Set up the benchmarking environment by loading a container and/or setting up a conda environment.
1. Run the benchmarks.
Expand All @@ -253,7 +235,6 @@ Valid values of `TYPE` include:
Also available as API arguments:
- `benchmark_files(device=...)`
- `benchmark_model(device=...)`.

> For a detailed example, see the [CLI Nvidia tutorial](https://github.com/onnx/turnkeyml/blob/main/examples/cli/readme.md#nvidia-benchmarking).
Expand All @@ -274,9 +255,8 @@ Each device type has its own default runtime, as indicated below.

This feature is also available as an API argument:
- `benchmark_files(runtime=[...])`
- `benchmark_model(runtime=...)`

> _Note_: Inputs to `torch-eager` and `torch-compiled` are not downcasted to FP16 by default. Downcast inputs before benchmarking for a fair comparison between runtimes.
> _Note_: Inputs to `torch-eager` and `torch-compiled` are not downcast to FP16 by default. You must perform your own downcast or quantization of inputs if needed for apples-to-apples comparisons with other runtimes.
# Additional Commands and Options

Expand Down Expand Up @@ -381,7 +361,6 @@ Process isolation mode applies a timeout to each subprocess. The default timeout

Also available as API arguments:
- `benchmark_files(cache_dir=...)`
- `benchmark_model(cache_dir=...)`
- `build_model(cache_dir=...)`

> See the [Cache Directory tutorial](https://github.com/onnx/turnkeyml/blob/main/examples/cli/cache.md#cache-directory) for a detailed example.
Expand All @@ -392,7 +371,6 @@ Also available as API arguments:

Also available as API arguments:
- `benchmark_files(lean_cache=True/False, ...)` (default False)
- `benchmark_model(lean_cache=True/False, ...)` (default False)

> _Note_: useful for benchmarking many models, since the `build` artifacts from the models can take up a significant amount of hard drive space.
Expand All @@ -409,7 +387,6 @@ Takes one of the following values:

Also available as API arguments:
- `benchmark_files(rebuild=...)`
- `benchmark_model(rebuild=...)`
- `build_model(rebuild=...)`

### Sequence
Expand All @@ -421,7 +398,6 @@ Usage:

Also available as API arguments:
- `benchmark_files(sequence=...)`
- `benchmark_model(sequence=...)`
- `build_model(sequence=...)`

### Set Script Arguments
Expand Down Expand Up @@ -460,7 +436,6 @@ Usage:

Also available as API arguments:
- `benchmark_files(onnx_opset=...)`
- `benchmark_model(onnx_opset=...)`
- `build_model(onnx_opset=...)`

> _Note_: ONNX opset can also be set by an environment variable. The --onnx-opset argument takes precedence over the environment variable. See [TURNKEY_ONNX_OPSET](#set-the-onnx-opset).
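
The precedence described in this note can be sketched in a few lines. This is an illustration under stated assumptions rather than the tool's actual resolution logic: `resolve_onnx_opset()` is a hypothetical helper, and the default of `14` used in the example call is a placeholder, not necessarily the value of `DEFAULT_ONNX_OPSET`.

```python
import os
from typing import Mapping, Optional


def resolve_onnx_opset(
    cli_value: Optional[int],
    default_opset: int,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Pick the opset: explicit argument first, then environment, then default."""
    env = os.environ if env is None else env
    # 1. An explicit --onnx-opset / onnx_opset=... argument wins
    if cli_value is not None:
        return cli_value
    # 2. Otherwise fall back to the TURNKEY_ONNX_OPSET environment variable
    env_value = env.get("TURNKEY_ONNX_OPSET")
    if env_value:
        return int(env_value)
    # 3. Otherwise use the built-in default
    return default_opset


# Placeholder default of 14, with the environment variable set to 16
print(resolve_onnx_opset(None, 14, {"TURNKEY_ONNX_OPSET": "16"}))
```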
Expand All @@ -474,11 +449,10 @@ Usage:

Also available as API arguments:
- `benchmark_files(iterations=...)`
- `benchmark_model(iterations=...)`

### Analyze Only

Instruct `turnkey` or `benchmark_model()` to only run the [Analysis](#analysis) phase of the `benchmark` command.
Instruct `turnkey` or `benchmark_files()` to only run the [Analysis](#analysis) phase of the `benchmark` command.

Usage:
- `turnkey benchmark INPUT_FILES --analyze-only`
Expand All @@ -493,7 +467,7 @@ Also available as an API argument:
### Build Only

Instruct `turnkey`, `benchmark_files()`, or `benchmark_model()` to only run the [Analysis](#analysis) and [Build](#build) phases of the `benchmark` command.
Instruct `turnkey` or `benchmark_files()` to only run the [Analysis](#analysis) and [Build](#build) phases of the `benchmark` command.

Usage:
- `turnkey benchmark INPUT_FILES --build-only`
Expand All @@ -503,7 +477,6 @@ Usage:
Also available as API arguments:
- `benchmark_files(build_only=True/False)` (default False)
- `benchmark_model(build_only=True/False)` (default False)

> See the [Build Only tutorial](https://github.com/onnx/turnkeyml/blob/main/examples/cli/build.md#build-only) for a detailed example.
Expand All @@ -515,7 +488,6 @@ None of the built-in runtimes support such arguments, however plugin contributor

Also available as API arguments:
- `benchmark_files(rt_args=Dict)` (default None)
- `benchmark_model(rt_args=Dict)` (default None)

## Cache Commands

Expand Down Expand Up @@ -635,7 +607,7 @@ export TURNKEY_DEBUG=True

### Set the ONNX Opset

By default, `turnkey`, `benchmark_files()`, and `benchmark_model()` will use the default ONNX opset defined in `turnkey.common.build.DEFAULT_ONNX_OPSET`. You can set a different default ONNX opset by setting the `TURNKEY_ONNX_OPSET` environment variable.
By default, `turnkey`, `benchmark_files()`, and `build_model()` will use the default ONNX opset defined in `turnkey.common.build.DEFAULT_ONNX_OPSET`. You can set a different default ONNX opset by setting the `TURNKEY_ONNX_OPSET` environment variable.

For example:

@@ -1,7 +1,7 @@
"""
This script is an example of a sequence.py file for Sequence Plugin. Such a sequence.py
can be used to redefine the build phase of the turnkey CLI, benchmark_files(),
and benchmark_model() to have any custom behavior.
and build_model() to have any custom behavior.
In this example sequence.py file we are setting the build sequence to simply
export from pytorch to ONNX. This differs from the default build sequence, which
62 changes: 0 additions & 62 deletions examples/model_api/hello_world.py

This file was deleted.

1 change: 0 additions & 1 deletion examples/readme.md
Expand Up @@ -2,6 +2,5 @@

This directory contains examples to help you learn how to use the tools. The examples are split up into two sub-directories:
1. `examples/cli`: a tutorial series for the `turnkey` CLI. This is the recommended starting point.
1. `examples/model_api`: scripts that demonstrate how to use the `turnkey.benchmark_model()` API.
1. `examples/files_api`: scripts that demonstrate how to use the `turnkey.benchmark_files()` API.
1. `examples/build_api`: scripts that demonstrate how to use the `turnkey.build_model()` API.
1 change: 0 additions & 1 deletion src/turnkeyml/__init__.py
@@ -1,7 +1,6 @@
from turnkeyml.version import __version__

from .files_api import benchmark_files
from .model_api import benchmark_model
from .cli.cli import main as turnkeycli
from .build_api import build_model
from .common.build import load_state