Add documentation for Adreno deployment #22

Open

wants to merge 99 commits into base: dbarinov/main
Conversation

dsbarinov1
Copy link

@echuraev @elvin-n review please :)

Comment on lines 258 to 259
.. |High-level overview of the Adreno A5x architecture for OpenCL| image:: https://i.ibb.co/yXm6CkG/2022-10-21-14-39-08.png
.. |Android deployment pipeline| image:: https://i.ibb.co/xMQrgLn/Untitled-Frame-2.jpg


Could you please try to find and check where the image files should be located if you use them in the documentation? I believe they should be somewhere in the TVM repository.

Introduction
------------

Adreno is a series of graphicfdgs processing unit (GPU) semiconductor


Suggested change
Adreno is a series of graphicfdgs processing unit (GPU) semiconductor
Adreno is a series of graphics processing unit (GPU) semiconductor


The Adreno GPU accelerates the rendering of complex geometries to
deliver high-performance graphics and a rich user experience with low
power consumption


Suggested change
power consumption
power consumption.

deliver high-performance graphics and a rich user experience with low
power consumption

This guide will demonstrate the benefits of using textures with Adreno,


the moment, TVM is able to benefit from this by having texture support
for Adreno. The graph below shows the Adreno A5x architecture.

|High-level overview of the Adreno A5x architecture for OpenCL| Fig. 1


Move the description to the next line. And probably make it italic.

Comment on lines 51 to 60
# .. code-block:: bash
#
# find ~/Android/sdk/ndk/ -name libc++_shared.so
#
# (Linux)
# .. code-block:: bash
#
# find ~/Library/Android/sdk/ndk/ -name libc++_shared.so
#
# (MacOS)


Suggested change
# .. code-block:: bash
#
# find ~/Android/sdk/ndk/ -name libc++_shared.so
#
# (Linux)
# .. code-block:: bash
#
# find ~/Library/Android/sdk/ndk/ -name libc++_shared.so
#
# (MacOS)
# .. code-block:: bash
#
# find ${ANDROID_NDK_HOME} -name libc++_shared.so
#

Comment on lines 110 to 116
import os
import numpy as np
import mxnet.gluon as gluon
import tvm
from tvm import relay, rpc
from tvm.contrib import utils, ndk
from tvm.contrib import graph_executor


Suggested change
import os
import numpy as np
import mxnet.gluon as gluon
import tvm
from tvm import relay, rpc
from tvm.contrib import utils, ndk
from tvm.contrib import graph_executor
import os
import numpy as np
import mxnet.gluon as gluon
import tvm
from tvm import relay, rpc
from tvm.contrib import utils, ndk
from tvm.contrib import graph_executor

Comment on lines 201 to 202
target="opencl -device=adreno"
target_host="llvm -mtriple=arm64-linux-android"


Should we add a description about target_host?


BTW, this is a deprecated usage of target/target_host. Currently you have to create a Python Target object and initialize it with the appropriate target_host.

######################################################################
# Load a test image
# -----------------
# As an example we would use classical cat image from ImageNet


Display this image in the document.

# Evaluate inference time cost...
# Execution time summary:
# mean (ms) median (ms) max (ms) min (ms) std (ms)
# 76.6101 76.6053 77.6530 75.5720 0.6789


Did you try to convert this script into a web page?


It must not be converted manually. It is done automatically when the documentation is built.


Yes, I know, but it would probably be better to convert it locally and make sure that everything is fine in the final document.


The main value is the .py script, which can potentially be executed by the user. Why do we care about the doc? It should be converted automatically...


Because the web page will be automatically generated from this .py script and presented in the official TVM documentation, and I want to be sure that everything looks correct. This is why I asked. I don't think it's a problem to build this page locally and just check that everything is fine.

@dsbarinov1
Author

@echuraev @elvin-n some fixes done

docs/how_to/deploy/adreno.rst (outdated, resolved)
docs/how_to/deploy/adreno.rst (outdated, resolved)
docs/how_to/deploy/adreno.rst (resolved)
gallery/how_to/deploy_models/deploy_model_on_adreno.py (outdated, resolved)
Comment on lines 201 to 202
target="opencl -device=adreno"
target_host="llvm -mtriple=arm64-linux-android"

BTW, this is a deprecated usage of target/target_host. Currently you have to create a Python Target object and initialize it with the appropriate target_host.

docs/how_to/deploy/adreno.rst (outdated, resolved)
@dsbarinov1 dsbarinov1 changed the base branch from main to dbarinov/main November 1, 2022 20:23
@dsbarinov1
Author

Please review "Fix docs" commit @elvin-n @echuraev

tmoreau89 and others added 4 commits November 2, 2022 06:10
…o reduce tuning time (apache#13259)

* [MetaSchedule] Swap the order of RewriteTensorize and VerifyGPUCode to
reduce tuning time

* add comment
See issue apache#13227.

Co-authored-by: driazati <9407960+driazati@users.noreply.github.com>

Since the process of building TVM for Adreno is exactly the same as the
process of building TVM for Android, please refer to these instructions:
`TVM RPC
Server <https://github.com/apache/tvm/tree/main/apps/cpp_rpc>`__
Server <https://github.com/apache/tvm/tree/main/apps/cpp_rpc>`_.


Do we need the trailing _ at the end of the line?

)

.. |High-level overview of the Adreno A5x architecture for OpenCL| image:: images/adreno_architecture.png
.. |Android deployment pipeline| image:: images/android_deployment_pipeline.jpg

Probably make the boxes for "Pre-trained model" and "Compiled library (.so)" the same size. Also, the top and bottom arrows in GraphExecutor are in different positions: the bottom arrow is further left than the top one. It would probably be better if they were on the same vertical.

mod, target_host=target_host, target=target, params=params
)

.. |High-level overview of the Adreno A5x architecture for OpenCL| image:: images/adreno_architecture.png

Probably add a reference to where you took this image from (I mean the Qualcomm guide).


+1

Comment on lines 19 to 20
enabled. It will also provide :ref:`example code<Build and deploy model for Adreno>` to better understand the differences with compiling and deploying models
on Adreno devices.

I'm not sure, but I think this sounds better:

Suggested change
enabled. It will also provide :ref:`example code<Build and deploy model for Adreno>` to better understand the differences with compiling and deploying models
on Adreno devices.
enabled. It will also provide :ref:`example code<Build and deploy model for Adreno>` to better understand the differences in compiling and deploying models
for Adreno devices.

Advantages of the Textures
--------------------------

One of the advantages of Adreno is its clever handling of textures. At

Suggested change
One of the advantages of Adreno is its clever handling of textures. At
One of the Adreno's advantages is clever handling of textures. At

---------------------------------

In this section we will focus on target, needed to compile and deploy models for Adreno, demonstrate
the generation of kernels with and without textures and, in addition, the

Suggested change
the generation of kernels with and without textures and, in addition, the
the differences in generated kernels with and without textures and, in addition, the

Comment on lines +107 to +98
|Android deployment pipeline|

*Fig.2 Deployment pipeline on Adreno devices*

Will we describe what the reader can see in this picture?


__kernel void tvmgen_default_fused_nn_conv2d_kernel0(__write_only image2d_t pad_temp_global_texture, __read_only image2d_t p0) {
// body..


I would add some more details about the differences, e.g.:

image2d_t is a built-in OpenCL type that represents a two-dimensional image object and provides several additional functions. Also, when we use image2d_t we read 4 elements at a time, which helps utilize the hardware more efficiently.

"""


#################################################################

Why did you use """ for comments before, but # now? Probably use one comment style for large multiline comments?

Comment on lines +78 to +85
# export TVM_TRACKER_HOST=0.0.0.0
# export TVM_TRACKER_PORT=9190
#
# check that the tracker is running and the device is available
#
# .. code-block:: bash
#
# python -m tvm.exec.query_rpc_tracker --port 9190

IMHO, using environment variables is not the most convenient way to configure the tracker. You can also pass all of these settings through the command line.
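For example (a sketch in the tutorial's own rst code-block style; the rpc_tracker module and its --host/--port flags are assumed from TVM's RPC tooling), the tracker can be configured directly on the command line:

```rst
.. code-block:: bash

   # start the tracker, binding the host and port explicitly
   python3 -m tvm.exec.rpc_tracker --host=0.0.0.0 --port=9190

   # check that the tracker is running and the device is available
   python3 -m tvm.exec.query_rpc_tracker --port 9190
```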

lhutton1 and others added 8 commits November 2, 2022 14:15
This commit ensures that constant folding is applied when a desired
layout is selected during compilation. It ensures that
`layout_transform` operations are removed where possible so that
pattern matching for BYOC backends can work effectively.

A test has been added to check this regression.
…3252)

This commit applies additional write permission to the "tvm-venv"
group virtual environment. Currently after entering a container from
a newly built image it doesn't seem possible to install/update Python
packages. E.g. updating pip will give errors such as:
```
$ pip install --upgrade pip
ERROR: Could not install packages due to an OSError: [Errno 13]
Permission denied: '/venv/apache-tvm-py3.7/bin/pip' Check the
permissions.
```

Enabling write access for this group fixes this as long as the
current user is a member of the "tvm-venv" group.
* [Hexagon] Tests pylint

* fix error

* Fix buffer name
- Fix clang 15.0.3 '-Wunused-but-set-variable' and '-Wunused-lambda-capture' warnings by removing / commenting-out code.
…ing function_def_to_graph_def (apache#13260)

[TF2] Import graph_def to default graph before calling function_def_to_graph_def
apache#13247)

There is a local variable referenced before assignment in the convert_interpolate function. I think the variable 'size' is the one actually meant to be referenced.
…he#13274)

This reverts commit 5acf3f9.

Reverting since this is causing some spam from the ASF Infra bot related
to https://issues.apache.org/jira/browse/INFRA-23834. As in that issue
the protections have been applied manually by ASF Infra so this revert
shouldn't have any real effect
@dsbarinov1
Author

@echuraev small fixes done

bkmgit and others added 9 commits November 3, 2022 20:55
Minimal dependencies for Fedora/CentOS

This commit indicates how to install minimal set of
dependencies for building Apache TVM on Fedora and
CentOS. It supplements existing information for
Ubuntu and MacOS.
Fix occurrences of clang's `-Wdocumentation-unknown-command` warning.
Fix code to address a valid `-Wredundant-move` clang warning.
* [ETHOSN] Inline non-compute-intensive partitions

Adds a pass that analyzes functions partitioned for the NPU and inlines
those that are deemed "non-compute-intensive" back to the main function
so that they can be considered for other backends. The current heuristic
for deciding that a function is non-compute-intensive is to check that
the operations in the function collectively contain no multiply-accumulate
operations. This heuristic is not optimal; optimization is left for
future exploration.

This pass is inspired by the "IsComputeIntensiveGraph" pass in the
TensorRT integration.

Change-Id: I20c197702f5252f102cfc1e4b4635ab836aa7835

* Address comments

* 'inline_non_compute_intensive_partitions' -> 'is_inline_non_compute
_intensive_partitions_enabled'.
* remove no MAC operations.
* fix network test.

Change-Id: Ie1015b27f37e47544bed6f0aff819ee4649de579

* Fix failing unit tests due to optimization

Change-Id: I0ee0af071dc77c91e0ef0f6753506cb40d1d1859

* Add future exploration suggestions

Change-Id: Ie918d7f1059f032282f1f5eeffda38f4febcd59c
* [ETHOSN] Throw error message when inference fails

Previously the runtime would silently skip inference failures and return
random values as the result. This can make spotting inference failures
challenging. The runtime now throws a fatal error when inference did not
complete successfully along with an error message that gives some
details about the error that occurred.

Change-Id: Iadb6da04ad1c906e3ec49959eb3da0978295aebf

* Address comments

* clarify test file brief
* add test case for running status
* add driver stack reference to WaitStatus class

Change-Id: I792742892b761534904816135ae2ffcb3f028b2c
This PR introduces a new argument for EvolutionarySearch that limits the failures (defined as rounds with no new generated candidate) in the `SampleInitPopulation` stage. In this way we can avoid the task hanging forever in special cases, e.g., when some postproc always fails. This should fix apache#12330.
…he#13269)

The current type checker for the TIR schedule had an issue with typing on Python 3.9.
This simple patch fixes the problem.
…marking (apache#13255)

This PR adds features to the `python/tvm/meta_schedule/testing/torchbench/run.py`.

- Integrate with the TVM PyTorch integration to handle boolean tensor and unaligned memory.
- Deduplicate collected tuning tasks to prevent thousands of tasks created by hundreds of subgraphs with similar structure.
- Add option to cast the model to float32, which is more numerically stable than float16 and prevents inaccurate results from many models.
- Add option to choose search strategy in MetaSchedule.
- Inspect output error if the actual output doesn't match the expectation. Also save the actual output and expected output for further analysis if needed.
- Save subgraphs and their example input for debug purpose.
- Print MetaSchedule profiling information at the end of execution.
- Detach PyTorch tensor before exporting to dlpack.
- Fix the sys path to avoid conflict with the `benchmarks` package installed by TorchBench dependency.
- Trim all command line args passed in, in order to prevent breaking some TorchBench model that depends on args.
- Empty cuda cache before starting the actual benchmark.
Add tensor rank check for `nn.instance_norm`.
masahi and others added 10 commits November 11, 2022 09:08
…el workload (apache#13334)

* [MetaSchedule] Add a new schedule rule to inline all scalar constants

* add doc

* reorg

* identify constant block by its structure, not by name
…che#13354)

This PR introduces a check to prevent records with a run time of zero from entering the training data of the cost model. This is because when working on microTVM there are cases where the run time of certain successful runs is so tiny that it gets recorded as zero. In such cases, a runtime of 0 would break the XGBoost model because it introduces infinite running speed in GFLOPs. A regression test was also added.
It seems like there is some inconsistency across Python versions, which makes PR apache#13269 fail on Python 3.10.
This patch fixes the issue.

Co-authored-by: Junru Shao <junrushao1994@gmail.com>
…tiLevelTilingTensorCore` (apache#13357)

* Fuse shared to global store loops in MultiLevelTilingTensorCore

* update test
…etConsumers() (apache#13344)

Currently there are two versions of the `GetConsumers()` and `GetProducers()` implementations. Make them consistent to avoid possible bugs when there are WAR dependencies.
As part of effort of more formal TIR semantics, we want to more
explicitly differentiate TIR AST nodes (defined in `tir/expr.h`)
and TIR ops (defined in `tir/op.h`).

A naming convention is that:
- Lowercased methods, for example, `tvm.tir.mul`, means an TIR op, which
  will be eagerly constant-folded, i.e. `mul(1, 2)` returns `3`
  immediately rather than creating an AST node.
- Capitalized callable, for example, `Mul`, means creating an AST node
  without constant folding.

This PR makes this behavior more explicit by printing `T.Mul(a, b)`
directly when `a` and `b` are both constants, rather than sugaring it
into `mul(a, b)` or `a * b`, so that the difference between an op and
an AST node is clarified.

Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>

Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
[FQ2I] Add cast back to output data type after AvgPool2d

This commit fixes the following issue:
For the sequence of qnn.dequantize -> avg_pool2d -> conv2d ->
qnn.quantize FQ2I pass inserts qnn.requantize (or cast) to int32
unconditionally before AvgPool2d. As a result fake quantized qnn.conv2d
gets input as int32 dtype, but it is forbidden for qnn.conv2d (supports
only uint8/int8/int16).

This commit adds the following:
Add cast back to output data type after AvgPool2d. This preserve input
dtype == output dtype for this op.
@dsbarinov1
Author

@elvin-n please, review final adjustments :)

junrushao and others added 18 commits November 11, 2022 11:40
This PR adds all common TIR intrinsics like `T.int32x4`, `T.floatx4`.

Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
apache#13345)

Fix 2 issues of cache related primitives:
*  Fix region_cover checking for cache related primitives
*  Fix CacheLocDetector for nested SeqStmt

Co-authored-by: Min Chen <chen.min@intellif.com>
This PR introduces some minor restructuring of the `python/tvm/script`
folder structure to make it more convenient for future upstreaming.

Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
In this PR, the skipped tests script will also check if tests in the `required_tests_to_run.json` have not been skipped. If there are skipped tests, they will be added to the returned comment. 

I am not entirely sure where it's best to place the `required_tests_to_run` file, so I left it in `tvm/ci/scripts/`. I am happy to take suggestions.

Aims to prevent situations such as apache#12529
…pache#13326)

Preivously, the block SREF reuse only included a single step of
changes, and would have an incorrect mapping if multiple sequential
changes to the TIR block occurred.  This could happen if a
`BufferStore` was updated, followed by replacement of `Block` iter
vars/values.  This commit tracks the Block replacements across each
usage, to ensure the SREF instances remain valid.
Merging apache#13368 caused CI to pass but run more than it needed to due to
some failures in determination. This fixes the interpolation to use `"`
which should correctly pass through the variables

Co-authored-by: driazati <driazati@users.noreply.github.com>
This PR does not merge `main` if CI is running already on `main`. It aims to avoid a case where a race happens between two subsequent commits, and one of them merges the other.

Fixes apache#12392.
)

This enables int64 biases for quantized fully connected, requantize
and transpose convolution in TFLite networks. It goes on top of existing
int16 support for TFLite frontend.

Add a test case using DS_CNN int16 quantized.