
[DOC] Add Android Tutorial #2977

Merged · 6 commits into apache:master · Apr 16, 2019

Conversation

@tkat0 (Contributor) commented Apr 6, 2019

Hello. I am the author of tkat0/chainer-nnvm-example.

Thank you for referencing it from the official documentation.

I wanted the official tutorials to include one for Android, so I have sent this PR.

The repository above used NNVM, but I created this tutorial using Relay.

I built the documentation with the following command and uploaded it here, and confirmed that it renders correctly.

docker/bash.sh tvmai/ci-gpu:v0.51 TVM_TUTORIAL_EXEC_PATTERN=deploy_model_on_android\.py make -C docs html

Thank you.

@tkat0 (Contributor, Author) commented Apr 6, 2019

The Jenkins documentation build failed, so I wrote a minimal set of commands that reproduces the Jenkins build. I can reproduce the failure locally, so I will fix it.

Commands to reproduce:
# login ci container
docker/bash.sh tvmai/ci-gpu:v0.51

# build tvm
rm -fr build
mkdir -p build
cd build
cp ../cmake/config.cmake .
echo set\(USE_CUBLAS ON\) >> config.cmake
echo set\(USE_CUDNN ON\) >> config.cmake
echo set\(USE_CUDA ON\) >> config.cmake
echo set\(USE_OPENGL ON\) >> config.cmake
echo set\(USE_LLVM llvm-config-6.0\) >> config.cmake
echo set\(USE_NNPACK ON\) >> config.cmake
echo set\(NNPACK_PATH /NNPACK/build/\) >> config.cmake
echo set\(USE_RPC ON\) >> config.cmake
echo set\(USE_SORT ON\) >> config.cmake
echo set\(USE_GRAPH_RUNTIME ON\) >> config.cmake
echo set\(USE_STACKVM_RUNTIME ON\) >> config.cmake
echo set\(USE_GRAPH_RUNTIME_DEBUG ON\) >> config.cmake
echo set\(USE_ANTLR ON\) >> config.cmake
echo set\(USE_BLAS openblas\) >> config.cmake
echo set\(CMAKE_CXX_COMPILER g++\) >> config.cmake
echo set\(CMAKE_CXX_FLAGS -Werror\) >> config.cmake
cmake ..
make -j12

# build documents
rm -rf docs/_build/
mkdir -p docs/_build/html
rm -rf docs/tutorials
rm -rf python/tvm/*.pyc python/tvm/*/*.pyc python/tvm/*/*/*.pyc
PYTHONPATH=`pwd`/../python make -C docs html 

However, when I limited the build to only this tutorial, it completed successfully.

-PYTHONPATH=`pwd`/../python make -C docs html
+PYTHONPATH=`pwd`/../python TVM_TUTORIAL_EXEC_PATTERN=deploy_model_on_android\.py make -C docs html

The error occurs in relay.build, and the Relay graph looks strange: MobileNetV2 has only one input, so it is odd that input_2 appears.

WARNING: /workspace/tutorials/frontend/deploy_model_on_android.py failed to execute correctly: Traceback (most recent call last):
  File "/workspace/tutorials/frontend/deploy_model_on_android.py", line 249, in <module> 
    target_host=target_host, params=params)
  File "/workspace/docs/../python/tvm/relay/build_module.py", line 276, in build
    func = optimize(func, target, params)
  File "/workspace/docs/../python/tvm/relay/build_module.py", line 163, in optimize
    func = ir_pass.infer_type(func)
  File "/workspace/docs/../python/tvm/relay/ir_pass.py", line 353, in infer_type
    return _ir_pass.infer_type(expr, mod)
  File "/workspace/docs/../python/tvm/_ffi/_ctypes/function.py", line 190, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /workspace/build/libtvm.so(+0x4ac55f) [0x7fd6fda7c55f]
  [bt] (7) /workspace/build/libtvm.so(+0x65defd) [0x7fd6fdc2defd]
  [bt] (6) /workspace/build/libtvm.so(+0x65ea8d) [0x7fd6fdc2ea8d]
  [bt] (5) /workspace/build/libtvm.so(+0x49349a) [0x7fd6fda6349a]
  [bt] (4) /workspace/build/libtvm.so(+0x4861c4) [0x7fd6fda561c4]
  [bt] (3) /workspace/build/libtvm.so(+0x4a5de3) [0x7fd6fda75de3]
  [bt] (2) /workspace/build/libtvm.so(+0x4a5b03) [0x7fd6fda75b03]
  [bt] (1) /workspace/build/libtvm.so(+0x4a3017) [0x7fd6fda73017]
  [bt] (0) /workspace/build/libtvm.so(+0x151512) [0x7fd6fd721512]
  File "/workspace/src/relay/pass/type_infer.cc", line 620
TVMError: Check failed: checked_type.as<IncompleteTypeNode>() == nullptr: Cannot resolve type of Var(input_2) at (nullptr)

@tkat0 tkat0 changed the title [DOC] Add Android Tutorial [WIP][DOC] Add Android Tutorial Apr 6, 2019
@tkat0 (Contributor, Author) commented Apr 6, 2019

By dumping the Relay IR, I found that when the tutorial runs independently the input is named input_1, but otherwise it becomes input_2 🤔

-%212 = fn (%input_1: Tensor[(1, 3, 224, 224), float32], %v_param_1: Tensor[(16, 3, 3, 3), float32], 
+%212 = fn (%input_2, %v_param_1: Tensor[(16, 3, 3, 3), float32], 

@tkat0 tkat0 changed the title [WIP][DOC] Add Android Tutorial [DOC] Add Android Tutorial Apr 6, 2019
@tkat0 (Contributor, Author) commented Apr 7, 2019

I have fixed it by inserting the following call to clear the TensorFlow session. Could you please review?

keras.backend.clear_session()
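The likely mechanism behind the fix: Keras auto-generates layer names with a process-wide counter, so loading a second model in the same session names its input `input_2`, which no longer matches a `shape_dict` keyed on `input_1`. The sketch below is purely illustrative (a stand-in class, not Keras internals) and only mimics that counter behavior:

```python
# Illustrative sketch only -- NOT Keras internals. Keras keeps
# process-wide counters for auto-generated layer names, so building a
# second model in the same session yields "input_2" instead of "input_1".
class NameCounter:
    def __init__(self):
        self._counts = {}

    def unique_name(self, prefix):
        # Each call bumps the per-prefix counter: input_1, input_2, ...
        self._counts[prefix] = self._counts.get(prefix, 0) + 1
        return "%s_%d" % (prefix, self._counts[prefix])

    def clear(self):
        # Analogous in spirit to keras.backend.clear_session():
        # reset the global naming state.
        self._counts = {}

counter = NameCounter()
first = counter.unique_name("input")   # "input_1" -- standalone run
second = counter.unique_name("input")  # "input_2" -- stale session state
counter.clear()
third = counter.unique_name("input")   # "input_1" again after clearing
print(first, second, third)  # input_1 input_2 input_1
```

This is why the tutorial only failed when built alongside the other tutorials: earlier scripts in the same process had already consumed the `input_1` name.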

@tqchen (Member) commented Apr 7, 2019

@grwlf @eqy please review

@eqy (Contributor) left a review:

Overall I think this looks good at a high level. The main issue I run into with Android is that it is often possible to generate libraries that cause an application to crash when loaded or run on an actual device.

ctx = remote.cpu(0)
elif test_target == 'opencl':
ctx = remote.cl(0)
elif test_target == 'vulkan':
@eqy (Contributor):

Can you confirm whether Vulkan works on your device? Which device are you targeting?

@tkat0 (Contributor, Author):

My device is a Xiaomi Mi 5. It has a Snapdragon 820 with an Adreno 530 GPU and supports Vulkan.

I found a report for the same device: https://vulkan.gpuinfo.org/displayreport.php?id=3653
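The snippet under review selects the device context from a `test_target` string. A minimal sketch of that dispatch logic is below; `remote.cpu(0)`, `remote.cl(0)`, and `remote.vulkan(0)` are TVM RPC session methods as of this PR, but a stand-in class is used here so the example is self-contained:

```python
# Sketch of the tutorial's context-selection logic. In the real tutorial,
# `remote` is a TVM RPC session; this stand-in just records the choice.
class FakeRemote:
    def cpu(self, dev_id):
        return ("cpu", dev_id)

    def cl(self, dev_id):
        return ("opencl", dev_id)

    def vulkan(self, dev_id):
        return ("vulkan", dev_id)

def pick_context(remote, test_target):
    # Map the test_target string onto a device context on the phone.
    if test_target == "opencl":
        return remote.cl(0)
    elif test_target == "vulkan":
        return remote.vulkan(0)
    # Default: run on the device's CPU.
    return remote.cpu(0)

print(pick_context(FakeRemote(), "vulkan"))  # ('vulkan', 0)
```

The reviewer's question matters because every branch here must correspond to a runtime actually compiled into the APK, or the app will crash on the device.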

shape_dict = {input_name: x.shape}
func, params = relay.frontend.from_keras(keras_mobilenet_v2, shape_dict)

with relay.build_config(opt_level=1):
@eqy (Contributor):

Why do we use opt_level 1 here?

@tkat0 (Contributor, Author):

While developing, there was a model that could only be built with opt_level = 1, but the current sample also works with opt_level = 3, so I want to change it. As a result, Vulkan got a little faster.

-with relay.build_config(opt_level=1):
+with relay.build_config(opt_level=3):

#    # cpu
#    TVM prediction top-1: tiger cat
#    Evaluate inference time cost...
-#    Mean inference time (std dev): 38.89 ms (6.48 ms)
+#    Mean inference time (std dev): 37.92 ms (19.67 ms)
#
#    # opencl
#    TVM prediction top-1: tiger cat
#    Evaluate inference time cost...
-#    Mean inference time (std dev): 418.82 ms (3.65 ms)
+#    Mean inference time (std dev): 419.83 ms (7.49 ms)
#
#    # vulkan
#    TVM prediction top-1: tiger cat
#    Evaluate inference time cost...
-#    Mean inference time (std dev): 511.98 ms (4.53 ms)
+#    Mean inference time (std dev): 465.80 ms (4.52 ms)
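The "mean (std dev)" figures above are aggregates over repeated runs of the module (in the tutorial, collected via TVM's time evaluator). A small sketch of that aggregation in plain Python, with hypothetical latency samples purely to show the reporting format:

```python
import statistics

def summarize(times_ms):
    # Mean and population standard deviation of per-run latencies,
    # formatted like the "mean (std dev)" lines in the tutorial output.
    mean = statistics.mean(times_ms)
    std = statistics.pstdev(times_ms)
    return "%.2f ms (%.2f ms)" % (mean, std)

# Hypothetical per-run latencies in ms, just to illustrate the format.
runs = [40.0, 38.0, 36.0, 39.0, 37.0]
print(summarize(runs))  # 38.00 ms (1.41 ms)
```

Reporting the standard deviation alongside the mean is what makes the CPU-vs-GPU comparison above meaningful despite run-to-run noise.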

@tkat0 (Contributor, Author):

done -> 1f908a9

@eqy (Contributor):

It is not an important detail for this PR, but did you do tuning for the GPU configs? It seems strange that the performance would be so far behind CPU.

@tkat0 (Contributor, Author):

Not for this model, but I have used auto-tuning before. Unfortunately, I remember that it was not faster than the CPU.

Currently, setting target = 'opencl' seems to use a schedule written for CUDA, so I think I need a schedule for Adreno. I am writing one for myself, using this as a reference.

Do you know whether there are plans to develop Adreno support at dmlc/tvm or neo-ai/tvm?

@eqy eqy merged commit 0f8686e into apache:master Apr 16, 2019
@eqy (Contributor) commented Apr 16, 2019

Thanks @tkat0, this is merged.

@tkat0 tkat0 deleted the android-tutorial branch April 16, 2019 01:14
@tkat0 tkat0 restored the android-tutorial branch April 18, 2019 15:02
wweic pushed a commit to wweic/tvm that referenced this pull request May 13, 2019
* fix APP_STL for latest android ndk

* add vulkan sdk for tutorial

* add android tutorial

* fix of invalid input layer name

* update relay build opt_level 1 -> 3
wweic pushed a commit to neo-ai/tvm that referenced this pull request May 13, 2019
* fix APP_STL for latest android ndk

* add vulkan sdk for tutorial

* add android tutorial

* fix of invalid input layer name

* update relay build opt_level 1 -> 3