Slice op for NC4HW4 input tensor does not respect C4 data layout #2967
Comments
Can you use testMNNFromOnnx.py to reproduce the error?
Reproduced, debugging...
It's a bug in region fuse; you can modify this code to solve it: --- a/source/core/TensorUtils.cpp
Thank you for the patch - it works!
As a simpler workaround, you can disable region fuse: diff --git a/source/core/TensorUtils.cpp b/source/core/TensorUtils.cpp // fuse srcRegion and dstRegion to dstRegion if return true
Well, it works, but unfortunately networks with a Slice op that were unaffected by this bug (slice channel counts are multiples of 4) show a performance degradation of ~15%.
The best way is to update...
MNN: 2.9.3
API: C++
I encountered a problem where the Slice op does not respect the data layout (NC4HW4) for models converted from the Caffe framework.
Here is a simple Caffe prototxt.
It slices a 1x8x6x6 tensor into 4 tensors: 1x1x6x6, 1x4x6x6, 1x2x6x6, 1x1x6x6.
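The original prototxt attachment is not shown here; a hypothetical reconstruction consistent with the described shapes (a Slice layer over axis 1 with split points at channels 1, 5, and 7, and assumed blob names "data" / "out0".."out3") might look like:

```protobuf
# Hypothetical sketch, not the author's original file:
# slices a 1x8x6x6 input blob along the channel axis into 1/4/2/1 channels.
layer {
  name: "slice"
  type: "Slice"
  bottom: "data"
  top: "out0"
  top: "out1"
  top: "out2"
  top: "out3"
  slice_param {
    axis: 1
    slice_point: 1
    slice_point: 5
    slice_point: 7
  }
}
```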
After converting it with the MNNConvert tool, it turned out that the 4th channel of the second slice contains wrong values.
After debugging the MNN code, I found that it probably happens because the data format for Caffe models is NC4HW4, so the input data is packed into two 1x4x6x6 chunks, and the second slice should take three channels from the first chunk and the fourth channel from the second chunk.
However, when it reaches the final method that does the data copying, MNNTranspose32Bit, it does not account for C4 data chunking.
For this particular model I tried to add a quick hack after this line:
to force proceeding to the next C4 chunk, and the results are as expected.
Interestingly, the temporary output Tensor in CPURaster has NCHW format, which is converted to NC4HW4 as the last step. Maybe converting the input tensor to NCHW layout before slicing was forgotten.
Also, I tried to convert an ONNX model that does the same thing, and it works correctly. However, debugging shows that it uses a different tensor layout internally: NCHW.
I attach sample models (slice_caffe.mnn - converted from Caffe, slice_onnx.mnn - converted from ONNX) and sample code (inference.cpp) that can be used to reproduce the bug (it creates a 1x8x6x6 tensor filled with values 0 to 287, passes it through the slice model, and checks that the outputs match the input):
slice_caffe.mnn.zip
slice_onnx.mnn.zip
inference.cpp.zip