
Is There a Sample Showing How to Convert to ONNX? #41

Closed
SeymourNickelson opened this issue Nov 9, 2023 · 8 comments

Comments

@SeymourNickelson

This looks like a really cool project. Thanks for your hard work.

Could someone please provide a sample of how to convert to ONNX? I'm new to this and I'm having a hard time figuring out how to provide the sample input. I see some others were having the same problem in the closed issue #23. While some said they were able to export to ONNX, nobody provided a code sample of the export call.

I see the forward method of the ForwardTransformer class takes a dictionary with tensors for the following keys: text, start_index, and text_len.

Since the model takes variable-length text as input (not a fixed size) and returns variable-length output, I assume I have to tell torch.onnx.export that the "text" input has a dynamic shape and that the output has a dynamic shape. I tried setting dynamic_axes but didn't have any success. If anyone could provide a sample it would be much appreciated.
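
For reference, this is the general shape of the dynamic_axes mapping I've been trying (the axis names here are just placeholders, not working code):

# Placeholder sketch: mark the batch and sequence axes of the "text"
# input and of the output as symbolic so they can vary at inference time.
dynamic_axes = {
    'text': {0: 'batch_size', 1: 'seq_len'},
    'phonemes': {0: 'batch_size', 1: 'seq_len'},
}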

@SeymourNickelson SeymourNickelson changed the title Is there a sample of How to Convert to ONNX? Is There a Sample Showing How to Convert to ONNX? Nov 9, 2023
@SeymourNickelson
Author

SeymourNickelson commented Nov 14, 2023

I was able to export a DeepPhonemizer model to ONNX, but the shapes on the exported model aren't dynamic (so I can't feed it text of arbitrary length). I used the following to create the ONNX model:

# Define a dictionary input that matches the model's expected input format

sample_input = {
    'text': torch.tensor([[2, 23, 23, 23, 15, 15, 15, 22, 22, 22, 21, 21, 21, 12, 12, 12, 20, 20,
                           20, 16, 16, 16, 33, 33, 33, 16, 16, 16, 21, 21, 21, 14, 14, 14, 6, 0,
                           0, 0],
                          [2, 16, 16, 16, 20, 20, 20, 23, 23, 23, 22, 22, 22, 26, 26, 26, 16, 16,
                           16, 20, 20, 20, 23, 23, 23, 8, 8, 8, 9, 9, 9, 19, 19, 19, 12, 12,
                           12, 6]]),
    'text_len': torch.tensor([35, 38]),
    'start_index': torch.tensor([2, 2])
}

output_names = ["phonemes"]
input_names = ["text"]

dynamic_axes = {"text": {0: "text"},
                "phonemes": {0: "batch_size"}}

# Export the model to ONNX
torch.onnx.export(theModel,
                  (sample_input, {}),
                  onnx_path,
                  verbose=True,
                  input_names=input_names,
                  output_names=output_names,
                  dynamic_axes=dynamic_axes)

I experimented with also including 'text_len' and 'start_index' in the input_names array, but I couldn't get the model to export to ONNX with those keys included. Clearly I'm not properly telling the converter that I want a dynamic shape. Any help would be appreciated; it would be awesome to be able to deploy this with the ONNX runtime.
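
For reference, here is roughly what I was attempting for all three dict keys (an untested sketch; theModel and onnx_path as above):

input_names = ["text", "text_len", "start_index"]
output_names = ["phonemes"]

# Mark batch and sequence axes of "text" as symbolic, plus the batch
# axis of the per-sample tensors and of the output.
dynamic_axes = {
    "text": {0: "batch_size", 1: "seq_len"},
    "text_len": {0: "batch_size"},
    "start_index": {0: "batch_size"},
    "phonemes": {0: "batch_size", 1: "seq_len"},
}

torch.onnx.export(theModel,
                  (sample_input, {}),
                  onnx_path,
                  input_names=input_names,
                  output_names=output_names,
                  dynamic_axes=dynamic_axes)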

@SeymourNickelson
Author

I had better luck with the newer ONNX exporter provided by PyTorch.
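
A minimal sketch of the TorchDynamo-based route (torch.onnx.dynamo_export, available since PyTorch 2.1), which is presumably the exporter meant here; theModel and sample_input are assumed from the earlier comment, and this is untested against this model:

# Sketch: the dynamo-based exporter traces the model itself and takes
# an ExportOptions flag to keep shapes dynamic.
export_options = torch.onnx.ExportOptions(dynamic_shapes=True)
onnx_program = torch.onnx.dynamo_export(theModel, sample_input,
                                        export_options=export_options)
onnx_program.save("deep_phonemizer.onnx")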

@NextDevX

NextDevX commented May 16, 2024

Hi @SeymourNickelson ,
Have you found a way to convert the model to onnx format and use it?

@debasish-mihup

@NextDevX Kafan1986 here, replying to you from another account. As mentioned earlier, this version won't give you token probabilities separately, and you need to manually remove consecutive duplicates from the final output (see the snippet after the inference example below). Should be quite easy.

import torch
import torch.nn as nn
from dp.preprocessing.text import Preprocessor
from torch.nn import TransformerEncoderLayer, LayerNorm, TransformerEncoder
from dp.model.utils import _make_len_mask, PositionalEncoding


class MihupPostProcessing(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x shape: [T, N, vocab]
        x = x.transpose(0, 1)  # -> [N, T, vocab]
        y = x.argmax(2)        # greedy token ids, shape [N, T]
        y = y[y != 0]          # drop padding/blank ids (flattens the batch)
        # y = torch.unique_consecutive(y)  # not supported by the ONNX exporter;
        # consecutive duplicates are removed outside the model instead
        return y


class ForwardTransformerHelper(nn.Module):
    def __init__(self,
                 encoder_vocab_size: int,
                 decoder_vocab_size: int,
                 d_model=512,
                 d_fft=1024,
                 layers=4,
                 dropout=0.1,
                 heads=1) -> None:
        super().__init__()

        self.d_model = d_model

        self.embedding = nn.Embedding(encoder_vocab_size, d_model)
        self.pos_encoder = PositionalEncoding(d_model, dropout)

        encoder_layer = TransformerEncoderLayer(d_model=d_model,
                                                nhead=heads,
                                                dim_feedforward=d_fft,
                                                dropout=dropout,
                                                activation='relu')
        encoder_norm = LayerNorm(d_model)
        self.encoder = TransformerEncoder(encoder_layer=encoder_layer,
                                          num_layers=layers,
                                          norm=encoder_norm)

        self.fc_out = nn.Linear(d_model, decoder_vocab_size)
        self.custom_mihup_postprocessing = MihupPostProcessing()

    def forward(self, input_tensor: torch.Tensor):
        x = input_tensor.transpose(0, 1)  # shape: [T, N]
        src_pad_mask = _make_len_mask(x).to(x.device)
        x = self.embedding(x)
        x = self.pos_encoder(x)
        x = self.encoder(x, src_key_padding_mask=src_pad_mask)
        x = self.fc_out(x)
        x = self.custom_mihup_postprocessing(x)
        return x

    @classmethod
    def from_config(cls, config: dict) -> 'ForwardTransformerHelper':
        preprocessor = Preprocessor.from_config(config)
        return ForwardTransformerHelper(
            encoder_vocab_size=preprocessor.text_tokenizer.vocab_size,
            decoder_vocab_size=preprocessor.phoneme_tokenizer.vocab_size,
            d_model=config['model']['d_model'],
            d_fft=config['model']['d_fft'],
            layers=config['model']['layers'],
            dropout=config['model']['dropout'],
            heads=config['model']['heads']
        )

_checkpoint_path = "trans_small_wer0.374_per0.074.pt"
_device = torch.device('cpu')
checkpoint = torch.load(_checkpoint_path, map_location=_device)
model = ForwardTransformerHelper.from_config(config=checkpoint['config'])
model.load_state_dict(checkpoint['model'])
model.eval()

model_step = checkpoint['step']
print('Initializing phonemizer with model step {0}'.format(model_step))

input_dummy = torch.tensor([[1, 20, 20, 20, 23, 23, 23, 16, 16, 16, 16, 16, 16, 11, 11, 11, 16, 16, 16,  9,  9,  9,  2]])
# Torch inference
torch_output = model.forward(input_dummy)
print("torch_output: ", torch_output)

torch.onnx.export(model,               # model being run
                  input_dummy,                         # model input (or a tuple for multiple inputs)
                  "trans_small_wer0.374_per0.074.onnx",   # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=17,          # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['embedding'],
                  output_names=['custom_mihup_postprocessing'],
                  dynamic_axes={'embedding': {1: 'input_length'}, 'custom_mihup_postprocessing': {0:  'output_length'}})

import onnxmltools
import onnx
from onnxmltools.utils.float16_converter import convert_float_to_float16
onnx_model = onnxmltools.utils.load_model("trans_small_wer0.374_per0.074.onnx")
onnx_model = convert_float_to_float16(onnx_model)
onnx.save(onnx_model,"trans_small_wer0.374_per0.074_fp16.onnx")
print("fp16 quant complete")

print("Model converted successfully")

# Inference example
import onnxruntime
x = torch.tensor([[ 1,  4,  4,  4, 10, 10, 10,  3,  3,  3, 21, 21, 21, 25, 25, 25,  3,  3,
          3, 22, 22, 22, 11, 11, 11,  2]])
onnx_runtime_input = x.detach().numpy()
ort_session = onnxruntime.InferenceSession("trans_small_wer0.374_per0.074_fp16.onnx")
ort_inputs = {ort_session.get_inputs()[0].name: onnx_runtime_input}
ort_outs = ort_session.run(None, ort_inputs)
print(ort_outs)
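
And the manual duplicate removal mentioned above, applied to the session output (a minimal sketch, assuming the flat array of token ids returned by the run above):

import numpy as np

ids = ort_outs[0]
# Keep each element whose predecessor differs -- the manual stand-in
# for torch.unique_consecutive on the exported model's output.
deduped = ids[np.insert(ids[1:] != ids[:-1], 0, True)]
print(deduped)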

@NextDevX

NextDevX commented May 16, 2024

@debasish-mihup

Thank you for your answer.
Unfortunately I am getting this error. I'm stuck here because I'm unfamiliar with ONNX.


Traceback (most recent call last):
  File "C:\Users\IdeaPad\Desktop\DeepPhonemizer\test.py", line 10, in <module>
    ort_outs = ort_session.run(None, ort_inputs)
  File "C:\Users\IdeaPad\Desktop\DeepPhonemizer\venv\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'/encoder/layers.0/self_attn/Reshape_4' Status Message: C:\a\_work\1\s\onnxruntime\core\providers\cpu\tensor\reshape_helper.h:45 onnxruntime::ReshapeHelper::ReshapeHelper input_shape_size == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{26,1,512}, requested shape:{23,4,128}

@artnoage

@NextDevX Did you manage to make it run after all?

@NextDevX

@NextDevX Did you manage to make it run after all?

Yes, I can use onnxruntime and run it.

@artnoage

artnoage commented Jul 24, 2024 via email
