
[VTA][Relay] Reshape mismatch and Compile fail when running yolo-v3-tiny model in tutorial deploy_detection.py #7301

Closed
codgeek opened this issue Jan 17, 2021 · 5 comments


codgeek commented Jan 17, 2021

Reproduce steps:
Set USE_VTA_FSIM ON in build/config.cmake, then build TVM with VTA enabled on the host.

cd ${TVM_HOME} && python vta/tutorials/frontend/legacy/deploy_detection.py

Description:
I am using the newest code (as of 'Fri Jan 15 07:59:28 2021') with a Pynq Z1 board. The error occurs inside graph_pack() when running yolo-v3-tiny in deploy_detection.py.

File "vta/tutorials/frontend/legacy/deploy_detection.py", line 243, in <module>
    stop_name_idx=pack_dict[MODEL_NAME][3],
  File "tvm/vta/python/vta/top/graphpack.py", line 599, in graph_pack
    expr = run_opt_pass(expr, transform.InferType())
  File "tvm/vta/python/vta/top/graphpack.py", line 30, in run_opt_pass
    mod = opt_pass(mod)
  File "tvm/python/tvm/ir/transform.py", line 127, in call
    return _ffi_transform_api.RunPass(self, mod)
  File "tvm/python/tvm/ffi/ctypes/packed_func.py", line 237, in call
    raise get_last_ffi_error()
File "tvm/src/relay/analysis/type_solver.cc", line 622
TVMError: 
  Check failed: false == false: [00:55:37]  tvm/src/relay/op/tensor/transform.cc:703: 
  Check failed: oshape_sum == data_shape_sum (172380 vs. 173056) : Input tensor shape and reshaped shape are not compatible, 
reshape data_shape:[1, 1, 16, 16, 26, 26], oshape:[1, 255, 26, 26]

I've added more shape info in src/relay/op/tensor/transform.cc as follows:

diff --git a/src/relay/op/tensor/transform.cc b/src/relay/op/tensor/transform.cc
index ecfde359d..7e150e2c9 100644
--- a/src/relay/op/tensor/transform.cc
+++ b/src/relay/op/tensor/transform.cc
@@ -386,7 +386,7 @@ bool TransposeRel(const Array<Type>& types, int num_inputs, const Attrs& attrs,
   // check dimension match
   ICHECK(!axes.defined() || static_cast<int>(axes.size()) == ndim)
       << "Dimension mismatch: axes has " << axes.size() << " elements"
-      << ", but data.ndim = " << ndim;
+      << ", but data.ndim = " << ndim << ", transpose data_shape:" << data->shape << ", axes:" << axes;
   // construct int_axes
   std::vector<int> int_axes;
   int_axes.reserve(ndim);
@@ -701,7 +701,7 @@ bool ReshapeRel(const Array<Type>& types, int num_inputs, const Attrs& attrs,
   }
   if (!found_dynamic) {
     ICHECK_EQ(oshape_sum, data_shape_sum)
-        << "Input tensor shape and reshaped shape are not compatible";
+        << "Input tensor shape and reshaped shape are not compatible" << ", reshape data_shape:" << data_shape << ", oshape:" << oshape;
   }
 
   reporter->Assign(types[1], TensorType(oshape, data->dtype));

Problem:
A Relay IR pass is trying to reshape a source tensor of shape [1, 1, 16, 16, 26, 26] into shape [1, 255, 26, 26], while the packed channel dimension is 16*16 = 256.
I've tried to locate the faulty op: the %162-th op in the compiled module has an output shape of (1, 255, 26, 26). The real question is why its channel dimension has been transformed into 256 at an earlier stage.
I'm hoping someone who knows VTA or Relay can offer some help, thanks a lot!
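
For reference, the element counts line up with the numbers in the ICHECK message (172380 vs. 173056). Below is a minimal sketch of the same count check in plain Python, under my assumption that graph_pack pads the 255-channel YOLO head up to the next multiple of BLOCK_OUT = 16 (i.e. 256) while leaving the literal newshape of the later reshape op untouched:

    from functools import reduce
    from operator import mul

    # Shapes printed by the patched ICHECK in ReshapeRel (see diff above).
    data_shape = [1, 1, 16, 16, 26, 26]  # packed tensor: channel axis split into 16 x 16 = 256
    oshape = [1, 255, 26, 26]            # newshape requested by the original reshape op

    print(reduce(mul, data_shape))  # 173056 = 256 * 26 * 26
    print(reduce(mul, oshape))      # 172380 = 255 * 26 * 26
    # ReshapeRel's ICHECK_EQ(oshape_sum, data_shape_sum) fails because the counts
    # differ by exactly one channel plane: (256 - 255) * 26 * 26 = 676 elements.

The graph_pack call in question is quoted below.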

            print("quant modules:", mod)
            # Perform graph packing and constant folding for VTA target
            mod = graph_pack(
                mod["main"],
                env.BATCH,
                env.BLOCK_OUT,
                env.WGT_WIDTH,
                start_name=pack_dict[MODEL_NAME][0],
                stop_name=pack_dict[MODEL_NAME][1],
                start_name_idx=pack_dict[MODEL_NAME][2],
                stop_name_idx=pack_dict[MODEL_NAME][3],
            )
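
For context, pack_dict is the per-model lookup table defined in the tutorial; the entry layout assumed by the indexing above is sketched below (the operator names and indices are illustrative, based on the yolov3-tiny entry discussed in #7059, and may differ between checkouts):

    # Illustrative layout of a pack_dict entry as indexed by the graph_pack call above:
    # [start_name, stop_name, start_name_idx, stop_name_idx]
    pack_dict = {
        "yolov3-tiny": ["nn.max_pool2d", "cast", 4, 186],  # 186 is the cast index after the #7059 fix
    }
    start_name, stop_name, start_name_idx, stop_name_idx = pack_dict["yolov3-tiny"]

graph_pack only rewrites the region of the graph between the start and stop markers into VTA's blocked layout, so these indices determine where packing begins and ends. The Relay ops around the failing reshape are: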

%160 = (%156, %157, %159);
%161 = concatenate(%160, axis=2) /* ty=Tensor[(1, 3, 85, 26, 26), float32] */;
%162 = reshape(%161, newshape=[-1, 255, 26, 26]) /* ty=Tensor[(1, 255, 26, 26), float32] */;
%163 = cast(%111, dtype="int8") /* ty=Tensor[(1, 256, 13, 13), int8] */;

Related changes:
After some searching, these are the related issues I've found.
@fprotopapa fixed another issue, #7059, in the pack_dict configuration, changing the cast index from 185 to 186. 👍 Did your deploy_detection.py run successfully after this change? Looking forward to hearing about your experience, many thanks!

@huajsj added the yolo-v3-tiny support in this commit.

Some additional configuration details:
I've verified deploy_detection.py on my MacBook Pro (Darwin Kernel Version 16.4.0) and on an Ubuntu Linux server (Ubuntu 18.04.2 LTS, GNU/Linux 5.4.0-58-generic x86_64); the same problem occurs on both.

target: xilinx pynq Z1.

host: macbook pro
sys-platform: x86_64, macOS-10.12.3-x86_64-i386-64bit
os-version: Darwin Kernel Version 16.4.0:

python: 3.6
llvm: 10.0.0
cmake: 3.15.3
tvm: built from the newest code on 'Fri Jan 15 07:59:28 2021' (https://github.com/apache/tvm/tree/c9474639dd3761b78a457ab274603d87a3dcf9b8)

codgeek changed the title from "[VTA][Relay] Reshape mismatch and Compile fail when running yolo-v3-tiny model in deploy_detection.py" to "[VTA][Relay] Reshape mismatch and Compile fail when running yolo-v3-tiny model in tutorial deploy_detection.py" on Jan 17, 2021

codgeek commented Jan 17, 2021

Some new discoveries:
After resetting to the version where the yolo-v3-tiny support from @huajsj was merged (09c55fd1f3354d2280bb792a252590ac6bd68e58), the detection task works! Inference time on the Pynq Z1 is reduced to 615.49ms, from 4520ms for the arm_cpu version.

A strong inference is that some change in the Relay transforms or VTA packing between that commit and the latest version leads to the error described above. Could any kind soul offer some help?

Cannot find config for target=ext_dev -device=vta -keys=cpu -model=pynq_1x16_i8w8a32_15_15_18_17, workload=('conv2d_nchw_winograd.arm_cpu', A fallback configuration is used, which may bring great performance regression.
yolov3-tiny inference graph built in 14.41s!
Performed inference in 615.49ms (std = 0.02) for 1 samples
Average per sample inference time: 615.49ms


huajsj commented Aug 12, 2021

@codgeek, thanks for reporting this issue; please check whether PR #8731 fixes it.


codgeek commented Dec 9, 2021

@codgeek, thanks for reporting this issue; please check whether PR #8731 fixes it.

Sorry for replying after several months. It works now! Thank you so much! 👍👍

codgeek closed this as completed on Dec 9, 2021

rassB commented Jul 23, 2023

Hey, I am trying to deploy my own NN on VTA and I am having this exact problem. How did @huajsj fix this?


codgeek commented Jul 28, 2023

Hey, I am trying to deploy my own NN on VTA and I am having this exact problem. How did @huajsj fix this?

See PR #8731.
