Add image height and width to ONNX dynamic axes #18915

Merged 1 commit into main on Sep 7, 2022

Conversation

@lewtun (Member) commented on Sep 7, 2022

What does this PR do?

This PR enables dynamic axes for the image height/width of ONNX vision models. This allows users to change the height and width of their inputs at runtime, using values different from those used to trace the model during the export (usually 224 x 224 pixels).
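For reference, the local onnx directory loaded in the example below can be produced with the ONNX export CLI; a sketch of the invocation (the --feature value is an assumption chosen to match ORTModelForImageClassification):

python -m transformers.onnx --model=microsoft/resnet-50 --feature=image-classification onnx/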

Here's an example with ResNet and optimum:

import requests
from PIL import Image
from optimum.onnxruntime import ORTModelForImageClassification
from transformers import AutoFeatureExtractor

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
# Raw image size 480 x 640 pixels
image = Image.open(requests.get(url, stream=True).raw)
# Resize image to 40 x 40 pixels
preprocessor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-50", do_resize=True, size=40)
# Load the model from a local directory containing the ONNX export
model = ORTModelForImageClassification.from_pretrained("onnx")
inputs = preprocessor(images=image, return_tensors="pt")
outputs = model(**inputs)
logits = outputs.logits
logits.shape  # torch.Size([1, 1000]) -- the 1000 ImageNet classes

I've also checked that the slow tests pass:

RUN_SLOW=1 pytest tests/onnx/test_onnx_v2.py -k "beit or clip or convnext or data2vec-vision or deit or detr or layoutlmv3 or levit or mobilevit or resnet or vit" -s

@@ -332,7 +332,7 @@ def inputs(self) -> Mapping[str, Mapping[int, str]]:
         return OrderedDict(
             [
                 ("input_ids", {0: "batch", 1: "sequence"}),
-                ("pixel_values", {0: "batch"}),
+                ("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
@lewtun (Member Author):

I noticed that the CLIP export was also missing num_channels as a dynamic axis, so I included it here as well.
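For readers unfamiliar with the mechanics: these per-input mappings are ultimately passed as the dynamic_axes argument of torch.onnx.export, which keeps the named axes symbolic in the exported graph. A minimal sketch with a toy model (for illustration only, not the transformers export code):

import torch

# Toy stand-in for a vision backbone (assumption for illustration)
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3),
    torch.nn.AdaptiveAvgPool2d(1),  # exports as GlobalAveragePool, so spatial dims may vary
    torch.nn.Flatten(),
)
dummy = torch.randn(1, 3, 224, 224)  # tracing shapes only; not baked into the graph

torch.onnx.export(
    model,
    (dummy,),
    "model.onnx",
    input_names=["pixel_values"],
    output_names=["logits"],
    # Axes named here stay dynamic, exactly like the mapping in this diff
    dynamic_axes={
        "pixel_values": {0: "batch", 1: "num_channels", 2: "height", 3: "width"},
        "logits": {0: "batch"},
    },
)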

@@ -203,7 +203,7 @@ def inputs(self) -> Mapping[str, Mapping[int, str]]:
                 ("input_ids", {0: "batch", 1: "sequence"}),
                 ("attention_mask", {0: "batch", 1: "sequence"}),
                 ("bbox", {0: "batch", 1: "sequence"}),
-                ("pixel_values", {0: "batch", 1: "sequence"}),
+                ("pixel_values", {0: "batch", 1: "num_channels", 2: "height", 3: "width"}),
@lewtun (Member Author):

Following #17976, I've renamed sequence to num_channels.
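As a quick sanity check that the axes really are symbolic, one can inspect the exported graph with onnxruntime and run it at a different resolution; a sketch, assuming the model.onnx produced by the snippet above:

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
# Dynamic axes are reported by name rather than as fixed integers
print(session.get_inputs()[0].shape)  # ['batch', 'num_channels', 'height', 'width']

# Run at 40 x 40 even though the model was traced at 224 x 224
pixel_values = np.random.rand(2, 3, 40, 40).astype(np.float32)
(logits,) = session.run(None, {"pixel_values": pixel_values})
print(logits.shape)  # (2, 8) for the toy model above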

@HuggingFaceDocBuilderDev commented on Sep 7, 2022:

The documentation is not available anymore as the PR was closed or merged.

@regisss (Contributor) left a comment:

LGTM, thanks @lewtun!!

@sgugger (Collaborator) left a comment:

LGTM, thanks for working on this!

@lewtun merged commit 6519150 into main on Sep 7, 2022
@lewtun deleted the lewtun/add-onnx-vision-features branch on Sep 7, 2022
oneraghavan pushed a commit to oneraghavan/transformers that referenced this pull request on Sep 26, 2022