"Failed to convert to numpy array for model '_cortex_default'" when making a prediction on an ONNX model #1186

Closed
@RobertLucian

Version

0.18.0

Description

When the input shape of an ONNX model has been set to strings (indicating that the axes are dynamic), making a prediction fails with an error of this kind:

cortex.lib.exceptions.UserException: error: key 'input_ids' for model '_cortex_default': failed to convert to numpy array for model '_cortex_default': cannot reshape array of size 6 into shape (1,1)

Here's an example of a model's input shapes:

model input      type    shape
attention_mask   int64   (batch, sequence)
input_ids        int64   (batch, sequence)
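The failure looks consistent with the symbolic axes being coerced to a concrete size of 1 before reshaping. Here is a minimal numpy sketch of that suspected behavior (resolve_dim is a hypothetical helper illustrating the bug, not actual Cortex code, and the token ids are made up):

```python
import numpy as np

def resolve_dim(dim):
    # Hypothetical fallback: a symbolic (string) axis such as "batch"
    # or "sequence" gets replaced with 1 instead of staying dynamic.
    return dim if isinstance(dim, int) else 1

declared_shape = ("batch", "sequence")  # dynamic axes from the ONNX model
resolved_shape = tuple(resolve_dim(d) for d in declared_shape)  # (1, 1)

# A 6-token input, shaped (1, 6) like the tokenizer produces:
input_ids = np.array([[0, 717, 83, 10, 26267, 2]], dtype=np.int64)

try:
    input_ids.reshape(resolved_shape)
except ValueError as e:
    print(e)  # cannot reshape array of size 6 into shape (1,1)
```

Reshaping the size-6 array into the bogus (1, 1) shape raises exactly the ValueError quoted in the traceback above.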

Steps to reproduce

Run all of the following steps within a single directory.

Creating environment/model

Create a virtual environment for Python 3.6.9 and install the following pip dependencies:

onnxruntime==1.3.0
torch==1.5.0
transformers==3.0.0
scipy==1.4.1

Within that environment, run the following instructions to export the XLM-Roberta model in ONNX format:

from transformers.convert_graph_to_onnx import convert
convert(framework="pt", model="xlm-roberta-base", output="./output/xlm-roberta-base.onnx", opset=11)

Then run the following to optimize the exported model and convert it to FP16:

python -m onnxruntime_tools.optimizer_cli --input ./output/xlm-roberta-base.onnx --output ./output/xlm-roberta-base.onnx --model_type bert --float16
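At this point it can be worth confirming that the exported model really does declare symbolic axes. A hedged sketch (describe_inputs is our own helper, not part of onnxruntime; it works on any object exposing get_inputs() the way an InferenceSession does):

```python
def describe_inputs(session):
    """Return (name, type, shape) tuples for a session-like object
    exposing get_inputs(); symbolic axes show up as strings."""
    return [(i.name, i.type, tuple(i.shape)) for i in session.get_inputs()]

# Assuming the optimized model from the step above exists on disk:
# import onnxruntime as ort
# sess = ort.InferenceSession("./output/xlm-roberta-base.onnx")
# for name, typ, shape in describe_inputs(sess):
#     print(name, typ, shape)
# Expected output includes shapes like ('batch', 'sequence'),
# matching the input-shape table in the description above.
```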

Creating the Cortex deployment

Create a cortex.yaml config file with the following content:

# cortex.yaml

- name: api
  predictor:
    type: onnx
    model_path: ./output/xlm-roberta-base.onnx
    path: predictor.py
    image: cortexlabs/onnx-predictor-cpu:0.18.0

Create a predictor.py script with the following content:

# predictor.py

from transformers import XLMRobertaTokenizer
from scipy.special import softmax
import time


class ONNXPredictor:
    def __init__(self, onnx_client, config):
        self.client = onnx_client
        self.tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")

    def predict(self, payload):
        start = time.time()
        model_inputs = self.tokenizer.encode_plus(payload["text"], max_length=512, return_tensors="pt", truncation=True)
        inputs_onnx = {k: v.cpu().detach().numpy() for k, v in model_inputs.items()}
        print(self.client._signatures)

        output = self.client.predict(inputs_onnx)
        output = softmax(output[0], axis=1)[0].tolist()

        end = time.time()
        return {"output": output, "time": end - start}
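For reference, the dict the predictor hands to client.predict contains plain int64 numpy arrays of shape (1, sequence_length); for the example payload used below, the sequence length is 6, which matches the "size 6" in the error. A sketch with hard-coded stand-in values (the token ids are illustrative, not real XLM-R tokenizer output — only the shapes and dtypes matter):

```python
import numpy as np

# Illustrative stand-in for the tokenizer.encode_plus(...) output
# after the .cpu().detach().numpy() conversion in predict():
inputs_onnx = {
    "input_ids": np.array([[0, 3293, 83, 10, 26267, 2]], dtype=np.int64),
    "attention_mask": np.ones((1, 6), dtype=np.int64),
}

for name, arr in inputs_onnx.items():
    print(name, arr.dtype, arr.shape)  # each is int64 with shape (1, 6)
```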

Copy the pip dependencies listed above into a requirements.txt file and, from the same directory as the cortex.yaml config file, run cortex deploy -e local. Wait for the API to become live, then run:

curl http://localhost:8888 -X POST -H "Content-Type: application/json" -d '{"text": "That is a nice"}'

Error

The above command will return a non-200 response code. Inspect the logs with cortex get api. The expected error is:

cortex.lib.exceptions.UserException: error: key 'input_ids' for model '_cortex_default': failed to convert to numpy array for model '_cortex_default': cannot reshape array of size 6 into shape (1,1)

Labels

bug (Something isn't working)