Begin testing models from the ONNX Model Zoo. (#23)
Progress on #6.

A sample test report HTML file is available here:
https://scotttodd.github.io/iree-test-suites/onnx_models/report_2024_09_17.html

These new tests:

* Download models from https://github.com/onnx/models
* Extract metadata from the models to determine which functions to call
with random data
* Run the models through [ONNX Runtime](https://onnxruntime.ai/) as a
reference implementation
* Import the models using `iree-import-onnx` (until we have a better
API: iree-org/iree#18289)
* Compile the models using `iree-compile` (currently just for `llvm-cpu`
but this could be parameterized later)
* Run the models using `iree-run-module`, checking outputs using
`--expected_output` and the reference data
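
In rough outline, each test follows this flow (a simplified sketch; the suite's
real helpers in `conftest.py` also handle artifact caching, opset upgrades, and
logging, and the function names here are illustrative):

```python
import subprocess

import numpy as np
import onnxruntime


def run_reference(onnx_path: str, input_data: np.ndarray) -> np.ndarray:
    # Run the model through ONNX Runtime to produce a reference output.
    session = onnxruntime.InferenceSession(onnx_path)
    input_name = session.get_inputs()[0].name
    return session.run(None, {input_name: input_data})[0]


def compile_and_run(mlir_path: str, vmfb_path: str, input_arg: str, expected_arg: str):
    # Compile for llvm-cpu, then run with iree-run-module, which compares
    # the program's output against the reference via --expected_output.
    subprocess.run(
        ["iree-compile", mlir_path, "--iree-hal-target-backends=llvm-cpu", "-o", vmfb_path],
        check=True,
    )
    subprocess.run(
        ["iree-run-module", f"--module={vmfb_path}", "--device=local-task", input_arg, expected_arg],
        check=True,
    )
```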

Tests are written in Python using a set of pytest helper functions. As
the tests run, they can log details about what commands they are
running. When run locally, the `artifacts/` directory will contain all
the relevant files. More can be done in follow-up PRs to improve the
ergonomics there (like generating flagfiles).

Each test case can use XFAIL like
`@pytest.mark.xfail(raises=IreeRunException)`. As we test across
multiple backends or want to configure the test suite from another repo
(e.g. [iree-org/iree](https://github.com/iree-org/iree)), we can explore
more expressive marks.
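
For example (a sketch; `IreeRunException` here stands in for the suite's
run-failure exception type):

```python
import pytest


class IreeRunException(Exception):
    """Stand-in for the exception raised when iree-run-module fails."""


@pytest.mark.xfail(raises=IreeRunException)
def test_model_expected_to_fail():
    # pytest records XFAIL if this exception is raised, and reports XPASS
    # once the underlying issue is fixed and the test starts passing.
    raise IreeRunException("iree-run-module returned a nonzero exit code")
```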

Note that unlike the ONNX _operator_ tests, these tests use
`onnxruntime` and `iree-import-onnx` at test time. The operator tests
handle that as an infrequently run offline step. We could do something
similar here, but the test inputs and outputs for real models can be
rather large, which gets into Git LFS or cloud storage territory.

If this test authoring model works well enough, we can do something
similar for other ML frameworks like TFLite
(#5).
ScottTodd authored Sep 19, 2024
1 parent a02aca9 commit 7b8bdf7
Showing 13 changed files with 816 additions and 1 deletion.
61 changes: 61 additions & 0 deletions .github/workflows/test_onnx_models.yml
@@ -0,0 +1,61 @@
# Copyright 2024 The IREE Authors
#
# Licensed under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

name: Test ONNX Models
on:
  push:
    branches:
      - main
    paths:
      - ".github/workflows/test_onnx_models.yml"
      - "onnx_models/**"
  pull_request:
    paths:
      - ".github/workflows/test_onnx_models.yml"
      - "onnx_models/**"
  workflow_dispatch:
  schedule:
    # Runs at 3:00 PM UTC, which is 8:00 AM PDT
    - cron: "0 15 * * *"

concurrency:
  # A PR number if a pull request and otherwise the commit hash. This cancels
  # queued and in-progress runs for the same PR (presubmit) or commit
  # (postsubmit). The workflow name is prepended to avoid conflicts between
  # different workflows.
  group: ${{ github.workflow }}-${{ github.event.number || github.sha }}
  cancel-in-progress: true

jobs:
  test-onnx-models:
    runs-on: ubuntu-24.04
    env:
      VENV_DIR: ${{ github.workspace }}/.venv
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      # Install Python packages.
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Setup Python venv
        run: python3 -m venv ${VENV_DIR}
      - name: Install IREE nightly release Python packages
        run: |
          source ${VENV_DIR}/bin/activate
          python3 -m pip install -r onnx_models/requirements-iree.txt

      # Run tests.
      - name: Run ONNX models test suite
        run: |
          source ${VENV_DIR}/bin/activate
          pytest onnx_models/ \
            -rA \
            --log-cli-level=info \
            --timeout=120 \
            --durations=0
10 changes: 9 additions & 1 deletion README.md
@@ -18,11 +18,19 @@ See https://groups.google.com/g/iree-discuss/c/GIWyj8hmP0k/ for context.
* Built with [cmake](https://cmake.org/) and run via
[ctest](https://cmake.org/cmake/help/latest/manual/ctest.1.html) (for now?).

### [onnx_models/](onnx_models/) : Open Neural Network Exchange models

[![Test ONNX Models](https://github.com/iree-org/iree-test-suites/actions/workflows/test_onnx_models.yml/badge.svg?branch=main)](https://github.com/iree-org/iree-test-suites/actions/workflows/test_onnx_models.yml?query=branch%3Amain)

* Tests that import, compile, and run ONNX models through IREE then compare
the outputs against a reference (ONNX Runtime).
* Runnable via [pytest](https://docs.pytest.org/).

### [onnx_ops/](onnx_ops/) : Open Neural Network Exchange operations

[![Test ONNX Ops](https://github.com/iree-org/iree-test-suites/actions/workflows/test_onnx_ops.yml/badge.svg?branch=main)](https://github.com/iree-org/iree-test-suites/actions/workflows/test_onnx_ops.yml?query=branch%3Amain)

* 1250+ tests for [ONNX](https://onnx.ai/) framework
[operators](https://onnx.ai/onnx/operators/).
- * Runnable via [pytest](https://docs.pytest.org/en/stable/) using a
+ * Runnable via [pytest](https://docs.pytest.org/) using a
    configurable set of flags to `iree-compile` and `iree-run-module`.
1 change: 1 addition & 0 deletions onnx_models/.gitignore
@@ -0,0 +1 @@
artifacts/*
171 changes: 171 additions & 0 deletions onnx_models/README.md
@@ -0,0 +1,171 @@
# ONNX Model Tests

This test suite exercises ONNX (Open Neural Network Exchange: https://onnx.ai/)
models. Most pretrained models are sourced from https://github.com/onnx/models.

Testing follows several stages:

```mermaid
graph LR
Model --> ImportMLIR["Import into MLIR"]
ImportMLIR --> CompileIREE["Compile with IREE"]
CompileIREE --> RunIREE["Run with IREE"]
RunIREE --> Check
Model --> LoadONNX["Load into ORT"]
LoadONNX --> RunONNX["Run with ORT"]
RunONNX --> Check
Check["Compare results"]
```

## Quickstart

1. Set up your virtual environment and install requirements:

```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements.txt
```

* To use `iree-compile` and `iree-run-module` from Python packages:

```bash
python -m pip install -r requirements-iree.txt
```

* To use local versions of `iree-compile` and `iree-run-module`, put them on
your `$PATH` ahead of your `.venv/Scripts` directory:

```bash
export PATH=path/to/iree-build:$PATH
```

2. Run pytest using typical flags:

```bash
pytest \
  -rA \
  --log-cli-level=info \
  --durations=0
```

See https://docs.pytest.org/en/stable/how-to/usage.html for other options.

## Advanced pytest usage

* The `log-cli-level` option can also be set to `debug`, `warning`, or `error`.
See https://docs.pytest.org/en/stable/how-to/logging.html.
* Run only tests matching a name pattern:

```bash
pytest -k resnet
```

* Skip "medium" sized tests using custom markers
  (https://docs.pytest.org/en/stable/example/markers.html); a sketch of how
  such a marker is declared appears at the end of this section:

```bash
pytest -m "not size_medium"
```

* Ignore xfail marks
(https://docs.pytest.org/en/stable/how-to/skipping.html#ignoring-xfail):

```bash
pytest --runxfail
```

* Run tests in parallel using https://pytest-xdist.readthedocs.io/
(note that this swallows some logging):

```bash
# Run with an automatic number of threads (usually one per CPU core).
pytest -n auto
# Run on an explicit number of threads.
pytest -n 4
```

* Create an HTML report using https://pytest-html.readthedocs.io/en/latest/index.html:

```bash
pytest --html=report.html --self-contained-html --log-cli-level=info
```

See also
https://docs.pytest.org/en/latest/how-to/output.html#creating-junitxml-format-files
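
For reference, a custom marker like the `size_medium` used above could be
declared and applied roughly as follows (a sketch; the suite's actual marker
registration may differ):

```python
# In pytest.ini (or the [tool.pytest.ini_options] table), markers are
# registered so pytest does not warn about unknown marks:
#
#   [pytest]
#   markers =
#       size_medium: tests that download and run medium-sized models
import pytest


@pytest.mark.size_medium
def test_resnet50():
    ...
```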

## Debugging tests outside of pytest

Each test generates some files as it runs:

```text
├── artifacts
│   └── vision
│       └── classification
│           ├── mnist-12_version17_cpu.vmfb (Program compiled using IREE's llvm-cpu target)
│           ├── mnist-12_version17_input_0.bin (Random input generated using numpy)
│           ├── mnist-12_version17_output_0.bin (Reference output from onnxruntime)
│           ├── mnist-12_version17.mlir (The model imported to MLIR)
│           ├── mnist-12_version17.onnx (The model upgraded to a minimum supported version)
│           └── mnist-12.onnx (The downloaded ONNX model)
```
Running a test with logging enabled will show what the test is doing:
```console
pytest --log-cli-level=debug -k mnist
======================================= test session starts =======================================
platform win32 -- Python 3.11.2, pytest-8.3.3, pluggy-1.5.0
rootdir: D:\dev\projects\iree-test-suites\onnx_models
configfile: pytest.ini
plugins: reportlog-0.4.0, timeout-2.3.1, xdist-3.6.1
collected 17 items / 16 deselected / 1 selected
tests/vision/classification_models_test.py::test_mnist
------------------------------------------ live log call ------------------------------------------
INFO onnx_models.utils:utils.py:125 Upgrading 'artifacts\vision\classification\mnist-12.onnx' to 'artifacts\vision\classification\mnist-12_version17.onnx'
DEBUG onnx_models.conftest:conftest.py:90 Session input [0]
DEBUG onnx_models.conftest:conftest.py:91 name: 'Input3'
DEBUG onnx_models.conftest:conftest.py:94 shape: [1, 1, 28, 28]
DEBUG onnx_models.conftest:conftest.py:95 numpy shape: (1, 1, 28, 28)
DEBUG onnx_models.conftest:conftest.py:96 type: 'tensor(float)'
DEBUG onnx_models.conftest:conftest.py:97 iree parameter: 1x1x28x28xf32
DEBUG onnx_models.conftest:conftest.py:129 Session output [0]
DEBUG onnx_models.conftest:conftest.py:130 name: 'Plus214_Output_0'
DEBUG onnx_models.conftest:conftest.py:131 shape (actual): (1, 10)
DEBUG onnx_models.conftest:conftest.py:132 type (numpy): 'float32'
DEBUG onnx_models.conftest:conftest.py:133 iree parameter: 1x10xf32
DEBUG onnx_models.conftest:conftest.py:217 OnnxModelMetadata(inputs=[IreeModelParameterMetadata(name='Input3', type='1x1x28x28xf32', data_file=WindowsPath('D:/dev/projects/iree-test-suites/onnx_models/artifacts/vision/classification/mnist-12_version17_input_0.bin'))], outputs=[IreeModelParameterMetadata(name='Plus214_Output_0', type='1x10xf32', data_file=WindowsPath('D:/dev/projects/iree-test-suites/onnx_models/artifacts/vision/classification/mnist-12_version17_output_0.bin'))])
INFO onnx_models.utils:utils.py:135 Importing 'artifacts\vision\classification\mnist-12_version17.onnx' to 'artifacts\vision\classification\mnist-12_version17.mlir'
INFO onnx_models.conftest:conftest.py:160 Launching compile command:
cd D:\dev\projects\iree-test-suites\onnx_models && iree-compile artifacts\vision\classification\mnist-12_version17.mlir --iree-hal-target-backends=llvm-cpu -o artifacts\vision\classification\mnist-12_version17_cpu.vmfb
INFO onnx_models.conftest:conftest.py:180 Launching run command:
cd D:\dev\projects\iree-test-suites\onnx_models && iree-run-module --module=artifacts\vision\classification\mnist-12_version17_cpu.vmfb --device=local-task --input=1x1x28x28xf32=@artifacts\vision\classification\mnist-12_version17_input_0.bin --expected_output=1x10xf32=@artifacts\vision\classification\mnist-12_version17_output_0.bin
PASSED [100%]
================================ 1 passed, 16 deselected in 1.81s =================================
```
For this test case there is one input with shape/type `1x1x28x28xf32` stored at
`artifacts/vision/classification/mnist-12_version17_input_0.bin` and one output
with shape/type `1x10xf32` stored at
`artifacts/vision/classification/mnist-12_version17_output_0.bin`.
We can reproduce the compile and run commands with:
```bash
iree-compile \
artifacts/vision/classification/mnist-12_version17.mlir \
--iree-hal-target-backends=llvm-cpu \
-o artifacts/vision/classification/mnist-12_version17_cpu.vmfb
iree-run-module \
--module=artifacts/vision/classification/mnist-12_version17_cpu.vmfb \
--device=local-task \
--input=1x1x28x28xf32=@artifacts/vision/classification/mnist-12_version17_input_0.bin \
--expected_output=1x10xf32=@artifacts/vision/classification/mnist-12_version17_output_0.bin
```
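
The `.bin` files are raw tensor buffers, so they can also be inspected
directly with numpy (a sketch, assuming plain little-endian float32 data as
the `1x10xf32` signature suggests):

```python
import numpy as np

# Load the reference output written by the test and print the 10 class scores.
data = np.fromfile(
    "artifacts/vision/classification/mnist-12_version17_output_0.bin",
    dtype=np.float32,
).reshape(1, 10)
print(data)
```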
Empty file added onnx_models/__init__.py
Empty file.
