### Version

0.18.0

### Description
When an ONNX model's input shapes have been set to strings (indicating that the axes are dynamic), making a prediction fails with an error of this kind:

```
cortex.lib.exceptions.UserException: error: key 'input_ids' for model '_cortex_default': failed to convert to NumPy array for model '_cortex_default': cannot reshape array of size 6 into shape (1,1)
```
Here's an example of a model's input shapes:
| model input    | type  | shape             |
| -------------- | ----- | ----------------- |
| attention_mask | int64 | (batch, sequence) |
| input_ids      | int64 | (batch, sequence) |
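For reference, once the model has been exported (see the steps below), the input shapes can be inspected with onnxruntime; dynamic axes are reported as strings. A minimal sketch, assuming the export path used in the reproduction steps:

```python
# inspect_inputs.py -- sketch; the model path is an assumption matching the steps below
import onnxruntime as ort

session = ort.InferenceSession("./output/xlm-roberta-base.onnx")
for model_input in session.get_inputs():
    # dynamic axes show up as strings, e.g. ['batch', 'sequence']
    print(model_input.name, model_input.type, model_input.shape)
```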
### Steps to reproduce

Run all of the following steps within the same directory.
#### Creating the environment/model

Create a virtual environment for Python 3.6.9 and install the following pip dependencies:

```
onnxruntime==1.3.0
torch==1.5.0
transformers==3.0.0
scipy==1.4.1
```
Within that environment, run the following to export the XLM-Roberta model to ONNX format:

```python
from transformers.convert_graph_to_onnx import convert

convert(framework="pt", model="xlm-roberta-base", output="./output/xlm-roberta-base.onnx", opset=11)
```
Now run the following to optimize the model and convert it to float16 (this step uses the `onnxruntime_tools` package, in addition to the dependencies listed above):

```bash
python -m onnxruntime_tools.optimizer_cli --input ./output/xlm-roberta-base.onnx --output ./output/xlm-roberta-base.onnx --model_type bert --float16
```
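As a quick sanity check (a sketch, assuming the `onnx` package is installed), the optimized model can be loaded to confirm that the dynamic axes survived the optimization:

```python
# check_inputs.py -- sketch; assumes the onnx package is available in the environment
import onnx

model = onnx.load("./output/xlm-roberta-base.onnx")
for graph_input in model.graph.input:
    # dynamic dims carry a name (e.g. 'batch', 'sequence') instead of an integer value
    dims = [d.dim_param or d.dim_value for d in graph_input.type.tensor_type.shape.dim]
    print(graph_input.name, dims)
```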
#### Creating the Cortex deployment
Create a `cortex.yaml` config file with the following content:

```yaml
# cortex.yaml

- name: api
  predictor:
    type: onnx
    model_path: ./output/xlm-roberta-base.onnx
    path: predictor.py
    image: cortexlabs/onnx-predictor-cpu:0.18.0
```
Create a `predictor.py` script with the following content:

```python
# predictor.py

from transformers import XLMRobertaTokenizer
from scipy.special import softmax
import time


class ONNXPredictor:
    def __init__(self, onnx_client, config):
        self.client = onnx_client
        self.tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")

    def predict(self, payload):
        start = time.time()
        # tokenize the payload; yields (1, sequence) tensors
        model_inputs = self.tokenizer.encode_plus(payload["text"], max_length=512, return_tensors="pt", truncation=True)
        inputs_onnx = {k: v.cpu().detach().numpy() for k, v in model_inputs.items()}
        # debug: print the input signatures the ONNX client detected for the model
        print(self.client._signatures)
        output = self.client.predict(inputs_onnx)
        output = softmax(output[0], axis=1)[0].tolist()
        end = time.time()
        return {"output": output, "time": end - start}
```
Copy the pip dependencies listed above into a `requirements.txt` file and, from the same directory as the `cortex.yaml` config file, run `cortex deploy -e local`. Wait for the API to be live, then run:

```bash
curl http://localhost:8888 -X POST -H "Content-Type: application/json" -d '{"text": "That is a nice"}'
```
### Error

The above command will return a non-200 response code. Inspect the logs with `cortex get api`. The expected error is:

```
cortex.lib.exceptions.UserException: error: key 'input_ids' for model '_cortex_default': failed to convert to numpy array for model '_cortex_default': cannot reshape array of size 6 into shape (1,1)
```
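For context on the numbers in the error: the payload "That is a nice" tokenizes to 6 token ids, which is where the "size 6" comes from, while Cortex appears to be reshaping to (1,1), presumably treating the dynamic (batch, sequence) axes as fixed dimensions of 1. The sketch below prints the tokenized shapes and runs the same inputs through onnxruntime directly; it assumes a plain float32 export at the same path (i.e. the output of the convert step, before the `--float16` optimization):

```python
# repro_direct.py -- sketch; assumes a float32 export of xlm-roberta-base at this path
import onnxruntime as ort
from transformers import XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model_inputs = tokenizer.encode_plus("That is a nice", max_length=512, return_tensors="pt", truncation=True)
inputs_onnx = {k: v.cpu().detach().numpy() for k, v in model_inputs.items()}

# each array has shape (1, 6): one batch entry, six token ids -- the "size 6" in the error
for name, arr in inputs_onnx.items():
    print(name, arr.shape)

# onnxruntime itself accepts the (batch, sequence) inputs without any reshaping
session = ort.InferenceSession("./output/xlm-roberta-base.onnx")
outputs = session.run(None, inputs_onnx)
print([o.shape for o in outputs])
```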