
[TFLite] Support PRelu #4298

Merged
merged 1 commit into from
Nov 10, 2019

Conversation

FrozenGene
Member

As discussed in the forum topic https://discuss.tvm.ai/t/missing-tflite-operators/3150/8, we want to have PRelu support in TFLite.

@apivovarov

@tqchen tqchen merged commit 2f65a87 into apache:master Nov 10, 2019
@tqchen
Member

tqchen commented Nov 10, 2019

Thanks @FrozenGene

@apivovarov
Contributor

apivovarov commented Nov 12, 2019

@FrozenGene
I tried to compile my model and got the following "unable to unify" errors. (I also added model visualization screenshots at the end.)

  %0 = nn.pad(%input_1, pad_width=[[0, 0], [0, 1], [0, 1], [0, 0]]);
  %1 = nn.conv2d(%0, %v_param_1, strides=[2, 2], channels=16, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO");
  %2 = nn.bias_add(%1, %v_param_2, axis=3);
  %3 = nn.prelu(%2, %v_param_3, axis=3) tensor type `Tensor[(16), float32]` has 1 dimensions, while `Tensor[(1, 1, 16), float32]` has 3 dimensions; unable to unify: `Tensor[(16), float32]` and `Tensor[(1, 1, 16), float32]`; ;
  %4 = nn.conv2d(%3, %v_param_4, channels=8, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO");
  %5 = nn.bias_add(%4, %v_param_5, axis=3);
  %6 = nn.prelu(%5, %v_param_6, axis=3) tensor type `Tensor[(8), float32]` has 1 dimensions, while `Tensor[(1, 1, 8), float32]` has 3 dimensions; unable to unify: `Tensor[(8), float32]` and `Tensor[(1, 1, 8), float32]`; ;
  %7 = nn.pad(%6, pad_width=[[0, 0], [1, 1], [1, 1], [0, 0]]);
  %112 = nn.conv2d(%111, %v_param_89, channels=32, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO");
  %113 = nn.bias_add(%112, %v_param_90, axis=3);
  %114 = add(%105, %113);
  %115 = nn.prelu(%114, %v_param_91, axis=3) tensor type `Tensor[(32), float32]` has 1 dimensions, while `Tensor[(1, 1, 32), float32]` has 3 dimensions; unable to unify: `Tensor[(32), float32]` and `Tensor[(1, 1, 32), float32]`; ;
  %116 = nn.conv2d(%115, %v_param_92, channels=16, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO");
  %117 = nn.bias_add(%116, %v_param_93, axis=3);
  %118 = nn.prelu(%117, %v_param_94, axis=3) tensor type `Tensor[(16), float32]` has 1 dimensions, while `Tensor[(1, 1, 16), float32]` has 3 dimensions; unable to unify: `Tensor[(16), float32]` and `Tensor[(1, 1, 16), float32]`; ;
  %119 = nn.pad(%118, pad_width=[[0, 0], [1, 1], [1, 1], [0, 0]]);
  %538 = nn.conv2d(%537, %v_param_425, channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO");
  %539 = nn.bias_add(%538, %v_param_426, axis=3);
  %540 = add(%531, %539);
  %541 = nn.prelu(%540, %v_param_427, axis=3) tensor type `Tensor[(256), float32]` has 1 dimensions, while `Tensor[(1, 1, 256), float32]` has 3 dimensions; unable to unify: `Tensor[(256), float32]` and `Tensor[(1, 1, 256), float32]`; ;
  %542 = nn.max_pool2d(%541, pool_size=[2, 2], strides=[2, 2], layout="NHWC");
  %543 = nn.conv2d(%541, %v_param_428, strides=[2, 2], channels=128, kernel_size=[2, 2], data_layout="NHWC", kernel_layout="HWIO");
  %544 = nn.bias_add(%543, %v_param_429, axis=3);
  %545 = nn.prelu(%544, %v_param_430, axis=3) tensor type `Tensor[(128), float32]` has 1 dimensions, while `Tensor[(1, 1, 128), float32]` has 3 dimensions; unable to unify: `Tensor[(128), float32]` and `Tensor[(1, 1, 128), float32]`; ;
  %546 = nn.pad(%545, pad_width=[[0, 0], [1, 1], [1, 1], [0, 0]]);
  %620 = nn.bias_add(%619, %v_param_490, axis=3);
  %621 = add(%612, %620);
  %622 = nn.prelu(%621, %v_param_491, axis=3) tensor type `Tensor[(256), float32]` has 1 dimensions, while `Tensor[(1, 1, 256), float32]` has 3 dimensions; unable to unify: `Tensor[(256), float32]` and `Tensor[(1, 1, 256), float32]`; ;
  %623 = nn.conv2d(%622, %v_param_492, channels=128, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO");
  %624 = nn.bias_add(%623, %v_param_493, axis=3);
  %625 = nn.prelu(%624, %v_param_494, axis=3) tensor type `Tensor[(128), float32]` has 1 dimensions, while `Tensor[(1, 1, 128), float32]` has 3 dimensions; unable to unify: `Tensor[(128), float32]` and `Tensor[(1, 1, 128), float32]`; ;
  %626 = nn.pad(%625, pad_width=[[0, 0], [1, 1], [1, 1], [0, 0]]);
  %627 = nn.conv2d(%626, %v_param_495, groups=128, channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI");
  %628 = nn.bias_add(%627, %v_param_496, axis=3);
  %629 = nn.conv2d(%628, %v_param_497, channels=256, kernel_size=[1, 1], data_layout="NHWC", kernel_layout="HWIO");
  %630 = nn.bias_add(%629, %v_param_498, axis=3);
  %631 = add(%622, %630);
  %632 = nn.prelu(%631, %v_param_499, axis=3) tensor type `Tensor[(256), float32]` has 1 dimensions, while `Tensor[(1, 1, 256), float32]` has 3 dimensions; unable to unify: `Tensor[(256), float32]` and `Tensor[(1, 1, 256), float32]`; ;
  %633 = nn.conv2d(%632, %v_param_502, channels=42, kernel_size=[2, 2], data_layout="NHWC", kernel_layout="HWIO");

https://www.dropbox.com/s/lr9wmv0dminvd10/Screenshot%202019-11-12%2013.53.12.png?dl=0

https://www.dropbox.com/s/6jppn19hae2yzte/Screenshot%202019-11-12%2013.54.17.png?dl=0

# Tensors
index  name                     type     shape              buffer  quantization
0      input_1                  FLOAT32  [1, 256, 256, 3]   0       None
1      conv2d/Kernel            FLOAT32  [16, 3, 3, 3]      1       None
2      conv2d/Bias              FLOAT32  [16]               2       None
3      conv2d                   FLOAT32  [1, 128, 128, 16]  0       None
4      p_re_lu/Alpha            FLOAT32  [1, 1, 16]         3       None
5      p_re_lu                  FLOAT32  [1, 128, 128, 16]  0       None
6      conv2d_1/Kernel          FLOAT32  [8, 1, 1, 16]      4       None
7      conv2d_1/Bias            FLOAT32  [8]                5       None
8      conv2d_1                 FLOAT32  [1, 128, 128, 8]   0       None
9      p_re_lu_1/Alpha          FLOAT32  [1, 1, 8]          6       None
10     p_re_lu_1                FLOAT32  [1, 128, 128, 8]   0       None
11     depthwise_conv2d/Kernel  FLOAT32  [1, 3, 3, 8]       7       None

@apivovarov
Contributor

apivovarov commented Nov 12, 2019

The PRELU docs say that H and W can be "shared", which is what we see in my custom TFLite model.
The PRELU alpha tensor with index 4 has shape [1, 1, 16]. H and W are 1, i.e. shared; alpha parameters only exist for the channel dimension.
Also, the alpha tensor is 3-D, not 4-D, probably because the batch dimension is always shared and simply omitted from the alpha tensor shape. If TVM Relay needs the PRELU alpha tensor to have the same rank as the main input tensor, then we should prepend a 1 to the alpha tensor shape for the batch dimension: [1, 1, 16] -> [1, 1, 1, 16].
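The shape question above can be illustrated outside of TVM. A minimal NumPy sketch (not TVM code; the function name `prelu` is just for illustration) showing that a TFLite-style `[1, 1, C]` alpha and a flattened 1-D `[C]` alpha broadcast to the same per-channel slopes over an NHWC input:

```python
import numpy as np

def prelu(x, alpha):
    # PReLU: identity for positive values, per-channel slope alpha for negatives.
    return np.where(x > 0, x, x * alpha)

x = np.random.randn(1, 4, 4, 16).astype("float32")          # NHWC input
alpha_tflite = np.random.rand(1, 1, 16).astype("float32")   # TFLite alpha: [1, 1, C]

out_3d = prelu(x, alpha_tflite)            # [1, 1, 16] broadcasts over NHWC
out_1d = prelu(x, alpha_tflite.flatten())  # [16], the 1-D form Relay's nn.prelu expects

assert np.allclose(out_3d, out_1d)
```

Since H, W, and batch entries of alpha are all 1, flattening (or prepending a batch dimension) changes only the rank, not the values applied to each channel.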

zxy844288792 pushed a commit to neo-ai/tvm that referenced this pull request Nov 13, 2019
@FrozenGene
Member Author

Could you share your custom TFLite model with prelu? It seems we should reshape the alpha tensor to 4-D to meet TVM's requirement.

@apivovarov
Contributor

I can't share the file, unfortunately. I'll try to create a similar TFLite file to reproduce the issue.

@FrozenGene
Member Author

OK. You could refer to my unit test to see how the prelu is constructed. Because TF doesn't have a prelu op, TFLite just recognizes an op pattern to produce prelu. Then we can see how to solve your model's problem.
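The pattern mentioned above can be sketched in NumPy. Keras emits PReLU as a combination of two ReLUs, `relu(x) - alpha * relu(-x)`, which the TFLite converter fuses into a single PRELU op; this is a minimal illustration (function names are hypothetical, not from the TVM codebase):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def prelu_pattern(x, alpha):
    # The two-ReLU decomposition that TF/Keras emits and TFLite fuses into PRELU:
    # positive part passes through, negative part is scaled by alpha.
    return relu(x) - alpha * relu(-x)

def prelu_direct(x, alpha):
    return np.where(x > 0, x, alpha * x)

x = np.random.randn(2, 8).astype("float32")
alpha = np.full(8, 0.25, dtype="float32")
assert np.allclose(prelu_pattern(x, alpha), prelu_direct(x, alpha))
```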

@apivovarov
Contributor

I will test it. BTW, the model is hand_landmark.tflite. It is publicly available: https://github.com/google/mediapipe/blob/master/mediapipe/models/hand_landmark.tflite

@apivovarov
Contributor

I tried the fix with the hand_landmark.tflite model. Compilation works, and the outputs match those of the TFLite runtime. TVM performance is 2.5x faster on CPU. Thank you!
