Add documentation for Adreno deployment #22
base: dbarinov/main
Commits on Nov 1, 2022
[COMMUNITY] Jyotsna Verma -> Reviewer (apache#13251)
adding Jyotsna to reviewers list
Commit SHA: 6551b71
[skip ci] Revert "fix GPU other build (apache#13235)" (apache#13261)
This reverts commit e9e8c4b.
Commit SHA: 87f52af
Commits on Nov 2, 2022
[MetaSchedule] Swap the order of RewriteTensorize and VerifyGPUCode to reduce tuning time (apache#13259)
* [MetaSchedule] Swap the order of RewriteTensorize and VerifyGPUCode to reduce tuning time
* add comment
Commit SHA: 7536068
[CI] Skip failing Caffe tests due to broken URL (apache#13228)
See issue apache#13227. Co-authored-by: driazati <9407960+driazati@users.noreply.github.com>
Commit SHA: 84fadc4
[TVMC] Apply constant folding when converting layout (apache#13216)
This commit ensures that constant folding is applied when a desired layout is selected during compilation. It ensures that `layout_transform` operations are removed where possible so that pattern matching for BYOC backends can work effectively. A test has been added to check this regression.
Commit SHA: 4ecf303
Apply group write permissions to Python virtual environment (apache#13252)
This commit applies additional write permission to the "tvm-venv" group virtual environment. Currently, after entering a container from a newly built image, it doesn't seem possible to install or update Python packages. For example, updating pip gives errors such as:
```
$ pip install --upgrade pip
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/venv/apache-tvm-py3.7/bin/pip'
Check the permissions.
```
Enabling write access for this group fixes this as long as the current user is a member of the "tvm-venv" group.
Commit SHA: c3c1454
Commit SHA: da4bb4a
[Hexagon] Add pylint on tests (apache#13233)
* [Hexagon] Tests pylint * fix error * Fix buffer name
Commit SHA: d261fa8
[build][relay][te][tir] remove unused vars / args (apache#13266)
- Fix clang 15.0.3 '-Wunused-but-set-variable' and '-Wunused-lambda-capture' warnings by removing / commenting-out code.
Christian Convey authored Nov 2, 2022
Commit SHA: 404d95f
[Frontend][Tensorflow2] Import graph_def to default graph before calling function_def_to_graph_def (apache#13260)
[TF2] Import graph_def to default graph before calling function_def_to_graph_def
Commit SHA: ff6aaeb
Commits on Nov 3, 2022
[Frontend][PaddlePaddle] Fix UnboundLocalError: local variable 'shape… (apache#13247)
There is a local variable referenced before assignment in the convert_interpolate function. I think the variable 'size' is the one that was meant to be referenced.
Commit SHA: d998187
[skip ci] Revert "[ci] Protect release branches (apache#13208)" (apache#13274)
This reverts commit 5acf3f9. Reverting since this is causing some spam from the ASF Infra bot related to https://issues.apache.org/jira/browse/INFRA-23834. As noted in that issue, the protections have been applied manually by ASF Infra, so this revert shouldn't have any real effect.
Commit SHA: e9ba986
[Docs] Minimal dependencies for Fedora/CentOS (apache#13248)
Minimal dependencies for Fedora/CentOS
This commit indicates how to install a minimal set of dependencies for building Apache TVM on Fedora and CentOS. It supplements existing information for Ubuntu and MacOS.
Commit SHA: f15afd2
[build][doc] Fix clang doxygen warnings (apache#13270)
Fix occurrences of clang's `-Wdocumentation-unknown-command` warning.
Christian Convey authored Nov 3, 2022
Commit SHA: 9df3a33
[build][tir] fix clang redundant-move warning (apache#13268)
Fix code to address a valid `-Wredundant-move` clang warning.
Christian Convey authored Nov 3, 2022
Commit SHA: 0d55312
[ETHOSN] Inline non-compute-intensive partitions (apache#13092)
* [ETHOSN] Inline non-compute-intensive partitions
Adds a pass that analyzes functions partitioned for the NPU and inlines those that are deemed "non-compute-intensive" back to the main function so that they can be considered for other backends. The current heuristic for deciding a non-compute-intensive function is to collectively check that all of the operations in the function have no multiply accumulate operations. This heuristic is not optimal; optimization is left for future exploration. This pass is inspired by the "IsComputeIntensiveGraph" pass in the TensorRT integration.
Change-Id: I20c197702f5252f102cfc1e4b4635ab836aa7835
* Address comments
* 'inline_non_compute_intensive_partitions' -> 'is_inline_non_compute_intensive_partitions_enabled'.
* remove no MAC operations.
* fix network test.
Change-Id: Ie1015b27f37e47544bed6f0aff819ee4649de579
* Fix failing unit tests due to optimization
Change-Id: I0ee0af071dc77c91e0ef0f6753506cb40d1d1859
* Add future exploration suggestions
Change-Id: Ie918d7f1059f032282f1f5eeffda38f4febcd59c
Commit SHA: 75921fb
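The heuristic described in apache#13092 can be sketched in plain Python. This is only an illustration under stated assumptions: the operator names in `MAC_OPS`, the function names, and the dictionary layout of the partitions are hypothetical and are not the Ethos-N integration's actual data structures.

```python
# Hypothetical set of operators that contain multiply-accumulate (MAC) work.
MAC_OPS = {"conv2d", "dense", "depthwise_conv2d"}


def is_compute_intensive(ops):
    """Return True if any operator in the partition performs MAC operations."""
    return any(op in MAC_OPS for op in ops)


def split_partitions(partitions):
    """Separate partitions to keep offloaded from those to inline back into main."""
    keep, inline = {}, {}
    for name, ops in partitions.items():
        target = keep if is_compute_intensive(ops) else inline
        target[name] = ops
    return keep, inline


# Hypothetical partition names and contents, for illustration only.
partitions = {
    "ethos-n_0": ["conv2d", "relu"],
    "ethos-n_1": ["reshape", "concatenate"],
}
keep, inline = split_partitions(partitions)
print(sorted(keep), sorted(inline))
```

A partition without any MAC operator ends up in `inline`, which mirrors the idea of handing it back to the main function so other backends can consider it.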
[ETHOSN] Throw error message when inference fails (apache#13022)
* [ETHOSN] Throw error message when inference fails
Previously the runtime would silently skip inference failures and return random values as the result. This can make spotting inference failures challenging. The runtime now throws a fatal error when inference did not complete successfully, along with an error message that gives some details about the error that occurred.
Change-Id: Iadb6da04ad1c906e3ec49959eb3da0978295aebf
* Address comments
* clarify test file brief
* add test case for running status
* add driver stack reference to WaitStatus class
Change-Id: I792742892b761534904816135ae2ffcb3f028b2c
Commit SHA: 47da418
[MetaSchedule] Fix Task Hanging in EvolutionarySearch (apache#13246)
This PR introduces a new argument for EvolutionarySearch that limits the failures (defined as rounds in which no new candidate is generated) in the `SampleInitPopulation` stage. In this way we can avoid the task hanging forever in special cases, e.g., when some postproc always fails. This should fix apache#12330.
Commit SHA: 1d1db35
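The idea of bounding failed sampling rounds can be illustrated with a small Python sketch. It is not the MetaSchedule implementation: the function name, the `max_fail_count` parameter, and the convention that a failed round returns `None` are all assumptions made for this example.

```python
def sample_init_population(sample_one, population_size, max_fail_count):
    """Collect candidates, giving up after `max_fail_count` rounds that yield nothing."""
    population = []
    fail_count = 0
    while len(population) < population_size and fail_count < max_fail_count:
        candidate = sample_one()
        if candidate is None:
            # A round that produced no new candidate counts as a failure.
            fail_count += 1
        else:
            population.append(candidate)
    return population


def always_fails():
    """Stand-in for a sampler whose postprocessing always rejects the candidate."""
    return None


# Without the failure limit this call would loop forever.
print(sample_init_population(always_fails, population_size=4, max_fail_count=10))
```

With a sampler that always succeeds, the loop ends as soon as `population_size` candidates are collected, so the limit only matters in the pathological case.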
[Bugfix][TIR] Fix version conflict with `typing` for Python 3.9 (apache#13269)
The current type checker for TIR schedule had an issue with typing for Python 3.9. This simple patch fixes the problem.
Commit SHA: 215f0e2
[MetaSchedule] Improve the script for TorchBench model tuning & benchmarking (apache#13255)
This PR adds features to `python/tvm/meta_schedule/testing/torchbench/run.py`:
- Integrate with the TVM PyTorch integration to handle boolean tensors and unaligned memory.
- Deduplicate collected tuning tasks to prevent thousands of tasks being created by hundreds of subgraphs with similar structure.
- Add an option to cast the model to float32, which is more stable numerically than float16 and prevents inaccurate results from many models.
- Add an option to choose the search strategy in MetaSchedule.
- Inspect the output error if the actual output doesn't match the expectation. Also save the actual output and expected output for further analysis if needed.
- Save subgraphs and their example inputs for debugging purposes.
- Print MetaSchedule profiling information at the end of execution.
- Detach PyTorch tensors before exporting to dlpack.
- Fix the sys path to avoid a conflict with the `benchmarks` package installed by a TorchBench dependency.
- Trim all command line args passed in, in order to avoid breaking TorchBench models that depend on args.
- Empty the cuda cache before starting the actual benchmark.
Commit SHA: b98b9f9
[Relay] Add tensor rank check for `nn.instance_norm` (apache#13280)
Add tensor rank check for `nn.instance_norm`.
Commit SHA: 90ed632
[Relay] Enhancement for fold_scale_axis and simplify_expr (apache#13275)
Convert add(%1, %1) to multiply(%1, 2f); enhance fold_scale_axis to fold multiply(%1, 2f) into conv.
Signed-off-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
Commit SHA: b1a099b
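The first rewrite in this commit, turning `add(%1, %1)` into `multiply(%1, 2f)`, can be sketched on a toy expression tree. The tuple encoding `(op, arg, ...)` and the `simplify` function are assumptions for illustration; Relay's `simplify_expr` pass works on Relay IR, not on tuples.

```python
def simplify(expr):
    """Rewrite add(x, x) into multiply(x, 2.0), bottom-up, on a tuple-encoded tree."""
    if not isinstance(expr, tuple):
        return expr  # a leaf such as a variable name
    op, *args = expr
    args = [simplify(arg) for arg in args]
    if op == "add" and len(args) == 2 and args[0] == args[1]:
        return ("multiply", args[0], 2.0)
    return (op, *args)


print(simplify(("add", "x", "x")))
print(simplify(("conv2d", ("add", "x", "x"), "w")))
```

Once the addition is expressed as a multiplication by a constant, a scale-folding pass has a chance to absorb the constant into the adjacent convolution, which is the second half of what the commit message describes.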
Commits on Nov 4, 2022
[skip-ci][COMMUNITY] New committer Ashutosh Parkhi (apache#13286)
[COMMUNITY] New committer Ashutosh Parkhi
Commit SHA: de8a79d
[TIR][Arith] Use TryCompare to narrow inequalities if possible (apache#13024)
Prior to this commit, the result of TryCompare would only be used if it could definitively prove a conditional to be either true or false. For example, if it is known that `0 <= i`, a conditional of `i <= 0` would be left as-is.
This commit introduces rewrite rules to preferentially simplify into more restrictive conditions. Using the same example, if it is known that `0 <= i`, a conditional of `i <= 0` would be simplified into `i == 0`. Similarly, if it is known that `0 <= i`, a conditional of `i != 0` would be simplified into `0 < i`.
Because this change does not introduce significant overhead, as the results of `RewriteSimplifier::Impl::TryCompare` are already available, this change is enabled for all use cases and does not require a call to `RewriteSimplifier::SetEnabledExtensions`.
Commit SHA: ccb7d07
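The two examples in this commit message can be reproduced with a minimal Python sketch. It only models conditions of the form `i <op> c` together with a single known fact `lower <= i`; the `narrow` function and its tuple return value are hypothetical and far simpler than TVM's `RewriteSimplifier`.

```python
def narrow(op, c, known_lower_bound):
    """Narrow the condition `i <op> c`, given the known fact `known_lower_bound <= i`."""
    if known_lower_bound == c:
        if op == "<=":
            # c <= i and i <= c together imply i == c.
            return ("==", c)
        if op == "!=":
            # c <= i and i != c together imply c < i.
            return (">", c)
    # No narrowing rule applies; leave the condition unchanged.
    return (op, c)


print(narrow("<=", 0, 0))
print(narrow("!=", 0, 0))
print(narrow("<=", 5, 0))
```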
[build][hexagon] remove unused variable (apache#13291)
Remove unused member variable in the `SimulatorRPCChannel` class. Fixes a clang warning.
Christian Convey authored Nov 4, 2022
Commit SHA: e860884
[BugFix][Pattern] Fixed a crash when AltPattern and FunctionPattern are used nested (apache#13278)
The PatternGroup does not check whether the FunctionPattern is matched while processing the FunctionPattern, but when a FunctionPattern is nested with an AltPattern, the FunctionPattern may not be matched, resulting in a crash when looking up matched nodes. This commit adds a check when handling FunctionPattern to fix this crash.
Commit SHA: 6da298b
[build][tir] suppress -Woverloaded-virtual warning (apache#13267)
- Address a (valid) warning from clang-15.0.3 regarding the `tvm::tir::DataTypeRewriter` class. - Make some class methods `protected` rather than `public` to better reflect authors' intent.
Christian Convey authored Nov 4, 2022
Commit SHA: dec74cb
[Tensorize] Add logs to comparator to make debugging tensorize failures easier (apache#13285)
* [TIR][Tensorize] Add error logs to IR comparator to display what caused tensorization to fail
* lint issues
Commit SHA: be44e9c
Commits on Nov 5, 2022
[Hexagon] Lint tests part 2 (apache#13271)
* Hexagon test lint part 2 * fix import * fix global variable * fix import issue * fix import * fix exception error * address comments
Commit SHA: 62fadac
[TE] Make `elem_offset` of the buffers created by `te.extern` a variable to avoid crash (apache#13297)
* make elem_offset of the buffers created by te.extern a variable
Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
* add test
* fix te extern create_prim_func test
Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
Commit SHA: 56878fa
Commit SHA: 1e79364
[TIR] Preserve loop annotation after loop partitioning (apache#13292)
Preserve loop annotations when the loop gets partitioned. Also bind the loop region info to the analyzer, because in some cases a partition condition could not be solved due to an unknown (but trivial) loop region.
Commit SHA: 732e34f
[FIX] Handle matmul where one inner dimension is unknown (apache#13287)
Unify the two inner dimensions in the type checker so if one is unknown it will be filled in.
Tristan Konolige authored Nov 5, 2022
Commit SHA: b51c491
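Unifying the two inner dimensions of a matmul can be sketched as follows. This is a simplified stand-in for what a type checker does, assuming 2-D shapes given as tuples with `None` marking an unknown dimension; `unify_dim` and `infer_matmul` are hypothetical names, not Relay functions.

```python
def unify_dim(a, b):
    """Unify two dimensions, where None means unknown."""
    if a is None:
        return b
    if b is None:
        return a
    if a != b:
        raise ValueError(f"incompatible inner dimensions: {a} vs {b}")
    return a


def infer_matmul(lhs_shape, rhs_shape):
    """Return the resolved (lhs, rhs, output) shapes of a 2-D matmul."""
    m, k_lhs = lhs_shape
    k_rhs, n = rhs_shape
    k = unify_dim(k_lhs, k_rhs)  # a known side fills in the unknown one
    return (m, k), (k, n), (m, n)


print(infer_matmul((2, None), (3, 4)))
```

If both inner dimensions are known and differ, the sketch raises an error instead of silently picking one.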
Commits on Nov 6, 2022
[DOCS][TVMC] Use correct argument to reuse tuning records (apache#13302)
Update tvmc tutorial code to use correct argument for reusing tuning records. Specifically, current code uses tuning_records, which is meant for saving the generated tuning results, not reusing prior results. We should use prior_records instead.
Commit SHA: f2a7403
Commits on Nov 7, 2022
[Hexagon] Fix Hexagon external libs check (apache#13257)
When building tvm runtime with hexagon we face the below error if USE_HEXAGON_EXTERNAL_LIBS is not defined. This happens because USE_HEXAGON_EXTERNAL_LIBS=OFF is defined as the default in CMakeLists.txt. The modified condition can check for all cases including undefined variable, empty string and OFF.
CMake Error at cmake/modules/Hexagon.cmake:203 (message):
Invalid use of USE_HEXAGON_EXTERNAL_LIBS=OFF; USE_HEXAGON_EXTERNAL_LIBS only supports absolute paths and git repository urls
Call Stack (most recent call first):
CMakeLists.txt:477 (include)
Commit SHA: 60e2c98
[Relay][Op] Add support for large index fp16 mean and var (apache#13289)
Add support for large index fp16 mean and var.
Josh Fromm authored Nov 7, 2022
Commit SHA: dd257e4
[Bugfix][Runtime] Fix sched_setaffinity in Android (apache#13158)
* fix sched_setaffinity error on Android
* fix sched_setaffinity error on Android
* fix sched_setaffinity error on Android
* clang format
* add ndk api version macro
* clang format
Commit SHA: 6b238c4
[Torch] Fix advanced indexing with boolean mask (apache#13306)
* [Torch] Fix advanced indexing with boolean mask * add comment
Commit SHA: e398d16
Commit SHA: ce777fd
Commit SHA: b16a64d
Commits on Nov 8, 2022
[Frontend][PaddlePaddle] Add test case for interpolate op convert func… (apache#13277)
Add test case for interpolate op convert function apache#13247
Commit SHA: 904ae77
[BugFix][Driver] Correctly propagate simple-mode flag in LowerSchedule (apache#13311)
Currently one version of `tvm::LowerSchedule` doesn't pass along the input `simple_mode` flag, which causes it to default back to `false`. This commit fixes it by passing along the input flag.
Commit SHA: f869118
[microTVM] Fix RPC session close on runtime side (apache#13310)
Currently, the RPC session on C/C++ side does not know if the session was closed on Python side which causes extra read/write on transport while the session is already closed. This commit reuses the Hexagon approach in microTVM to shutdown the RPC session.
Commit SHA: e43841d
[Hexagon] [runtime] Move lock/unlock to HexagonHtp temporarily (apache#13318)
Move lock/unlock to HexagonHtp temporarily
Commit SHA: b807613
[TIR] Add thread sync if access index doesn't depend on thread index (apache#13314)
This PR updates `src/tir/transforms/thread_storage_sync.cc` to insert a storage sync if the access index doesn't depend on the innermost thread index, i.e., is constant with respect to the innermost thread id. This fixes an accuracy problem on the model https://github.com/pytorch/benchmark/tree/main/torchbenchmark/models/timm_efficientdet
Commit SHA: c898dc6
[ETHOSN] Consolidate target string usage (apache#13159)
* [ETHOSN] Consolidate target string usage Removes support for a deprecated target string. The deprecation warning has been around for a couple of releases now so it should be safe to remove. The target to use moving forward is: `ethos-n -variant=n78 ...` Refactored direct use of a driver stack target string in the testing infrastructure to use the same string we expect users to provide. This simplified some of the code in codegen and hopefully avoids confusion in the future.
Commit SHA: 79093a1
[Adreno][Textures] Fix static memory planner (apache#13253)
* [Adreno][Textures] Fix static memory planner Fix memory reusage in static memory planner. * Move token allocators to separate file * Add test on TokenAllocator2d * Apply comments and fix CI
Commit SHA: be30238
Fixup libtorch backend build (apache#13320)
Add clang-format disable for the header to prevent reordering. The Torch header file needs to be put at the end since torch's dlpack is a little different from tvm's.
Signed-off-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
Commit SHA: bf77e79
[TVMScript] Hide trailing return type if None (apache#13308)
Because the majority of TIR PrimFuncs operate on buffers, write their outputs to an output parameter, and do not return a value, the `-> None` in the function signature becomes visual noise. This commit removes printing of the return type in cases where the PrimFunc has no return value.
Commit SHA: 15752e4
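The printing rule described in apache#13308 amounts to omitting the return annotation when there is no return value. A minimal sketch, assuming a hypothetical `format_signature` helper and parameter strings that are made up for illustration (this is not the TVMScript printer):

```python
def format_signature(name, params, ret_type=None):
    """Build a function header, omitting the return type when there is none."""
    signature = f"def {name}({', '.join(params)})"
    if ret_type is not None:
        signature += f" -> {ret_type}"
    return signature + ":"


# Hypothetical parameter annotations, for illustration only.
print(format_signature("main", ["A: Buffer", "B: Buffer"]))
print(format_signature("scale", ["x: float32"], ret_type="float32"))
```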
[OpenCL][unit tests] Fix opencl cpp unit tests (apache#13254)
* [OpenCL][unit tests] Fix opencl cpp unit tests
After some changes in Hexagon, running the cpp opencl tests leads to the following error:
```
pluggy.manager.PluginValidationError: unknown hook 'pytest_configure_node' in plugin <module 'tvm.contrib.hexagon.pytest_plugin'
```
Added `pytest_plugin` for OpenCL CPP tests to avoid this error and to process gtest arguments.
* Fix failure when the gtest_args option was already added
* Move `gtest_args` definition to the main testing plugin
Commit SHA: 750ba9f
[microTVM][CRT] Add memory size as project option (apache#13313)
* Add memory size as project option * cleanup * address comments * address comments
Commit SHA: 16bb1a6
Commits on Nov 9, 2022
[TIR] Remove redundant add in vnni/arm intrin (apache#13319)
* [TIR] Remove redundant add in vnni intrin * Update arm intrin Co-authored-by: Ubuntu <ubuntu@ubuntu.com>
Commit SHA: 36b1c5c
Commit SHA: 244bceb
Commit SHA: 65dbee7
[AOT] Add CreateExecutorMetadata analysis pass (apache#13250)
AOT requires the ExecutorCodegenMetadata object to be populated containing various pieces of information about the compiled module. This commit adds a separate analysis pass to create the metadata + some tests for the new pass. In order to collect the device information correctly, AOTLowerMain is extended to attach the device info as a function attribute.
Commit SHA: 0e395c3
[microTVM][CRT][DOCS] Add a PyTorch tutorial for microTVM with CRT (apache#13324)
This commit adds a tutorial to compile and run a PyTorch model using microTVM, the AOT host-driven executor, and C runtime (CRT).
Commit SHA: fbe174b
[ci] Update Jenkins readme to match new directory structure (apache#13333)
Update Jenkins readme to match new directory structure
Commit SHA: 999eee8
[MetaSchedule] Fix the order of applying `AutoInline` in `ScheduleUsingAnchorTrace` (apache#13329)
* index on concat-fusion-fix: 3ffe5b1 fix te extern create_prim_func test
* Apply AutoInline to the last block after all other blocks are processed
* Do not require CanReverseComputeInline to be true when CanComputeInline is false
* add comment
* add test
* cpplint
Commit SHA: 8453c9c
[MetaSchedule] Add JSON Database Validation Scripts (apache#12948)
* Add validation scripts. * Fix testing script. * Fix lint. * Fix lint. * Fix inputs. * Fix lint. * Fix lint. * Add timer func. * Fix ci. * Address comments. * Add total time statistics. * Fix lint.
Commit SHA: 5dc4186
Commits on Nov 10, 2022
[QNN, ONNX] Extension of QLinearMatMul in ONNX front-end for all ranks of input tensors (apache#13322)
* QLinearMatMul was extended for all ranks of a and b
* CI test for QLinearMatMul was implemented (onnx front-end)
* fix after black check
* numpy type fix
* fix weight scale and zero point, output type
* fix after pylint
* resolve different input types in tests
* skip resolved TODO
* update covering of QLinearMatMul by tests
* pylint fixes
* skip test of QLinearMatMul on CUDA
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Commit SHA: b4b90d7
[TIR] Check producer predicate in `ReverseComputeInline` (apache#13338)
* [TIR] Disallow reverse inline into a producer with non-trivial predicate
* add test
* Allow cases where the producer predicate can be implied by the new predicate of the inlined block
* remove unused variable
* update comment in test to reflect the change in ReverseComputeInline
Commit SHA: 6d9d213
[TOPI] Fix conv2d transpose for small channel (apache#13341)
* [TOPI] Fix conv2d transpose for small channel * black
Commit SHA: a16a890
[Minor][Testing] Consolidate IRs into corresponding functions (apache#13339)
We moved most of the IR definitions into their corresponding testing methods.
Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
Commit SHA: 1228104
[CPP_RPC][ANDROID] Fix cpp_rpc build failure (apache#13305)
* cpp_rpc build failure for Android devices with NDK version < 23
* Make environment variable ANDROID_NDK_MAJOR optional.
Co-authored-by: Siva Rama Krishna Reddy B <sivb@blr-ubuntu-ripper.qualcomm.com>
Commit SHA: a0dcab2
[Hexagon] Make allocate_hexagon_array a hexagon contrib API (apache#13336)
Make 'allocate_hexagon_array' a hexagon contrib API
Commit SHA: 3a30df6
[microNPU] Fixed MergeConstants pass on striped networks (apache#13281)
This PR fixes the bug in MergeConstants pass on striped networks on Ethos-U NPU. The issue was caused by _DivideConstants_ pass which is introducing new mod parameters and changing their order. So ethosu_write parameter in some cases is moved from the end of the list to the middle. E.g. from: `[ethos-u_0_i0, p1, p2, p3, p4, p5, p6, ethosu_write]` To: `[ethos-u_0_i0, p1, p2, ethosu_write, placeholder, placeholder, placeholder, placeholder, placeholder, placeholder, placeholder, placeholder]` Updated version of the _GetArgsToMergeWithoutArgsNotInConstDict_ and _MakeNewConstDict_ methods in passes.cc can now correctly modify const_dict according to the new parameter list.
Commit SHA: 54bd5e1
[TVMC] Global pass context for compile and tune (apache#13309)
* [TVMC] Global pass context for compile and tune
Comes as a followup from conversations in apache#13216. By making the pass context a global value for both `compile` and `tune` commands, we can ensure the pass context is exactly as the user expected and also test components such as `convert_graph_layout` under a pass context suitable for testing (e.g. add instruments). With this change, it becomes the user's responsibility to ensure the PassContext they select is suitable for the passes that will be run. By default, `opt_level` remains as 3, so current workflows that do not alter the pass context from the command line / TVMC Python API should not be affected.
Change-Id: I7a601daf6fbe664f77bce1b45efeb7ca29f621b3
* fix vitis-ai test and typo
Change-Id: I04f5bd031ae4717825f42e373bcb0e1e2c1c9d90
Commit SHA: 23ade0c
[TIR] Update ReductionIterNotIndexOutputBuffer to check BlockRealizeN… (apache#13301)
* [TIR] Update ReductionIterNotIndexOutputBuffer to check BlockRealizeNodes match_buffer statements when validating writes
* Add test to verify that tensorized blocks are properly validated
* update to take into account all match buffer regions.
* lint
Commit SHA: 7cd203d
[Docker]Refactor timezone script and NRF installation (apache#13342)
This PR refactors timezone setup to a separate script that docker/install/ubuntu_install_core.sh
Also, it adds a script to install NRF, which is reused in both the cortexm docker and the RVM installation path.
Commit SHA: c66bb00
[TIR][Arith] Fix divisor checking in `TryConstFold` (apache#13348)
Fix denominator checking in `TryConstFold`.
Commit SHA: 3a639a4
[MetaSchedule][Minor] Fix Typo in ApplyCustomRule Schedule Rule (apache#13353)
* Fix typo.
* Add regression test.
Commit SHA: b582cd1
Commits on Nov 11, 2022
-
[MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload (apache#13334)
* [MetaSchedule] Add a new schedule rule to inline all scalar constants
* add doc
* reorg
* identify constant block by its structure, not by name
Commit 93fdf83
-
[MetaSchedule][Minor] Allow Zero Run Time In Benchmarking Result (apache#13354)
This PR introduces a check that keeps records with a run time of zero out of the cost model's training data. When working on microTVM, the run time of some successful runs is so small that it is recorded as zero. A run time of 0 would break the XGBoost model because it implies an infinite running speed in GFLOPs. A regression test was also added.
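A minimal sketch of the idea, assuming each benchmark record carries a list of run times in seconds: throughput divides by the run time, so zero-time records are filtered out before training. The helper names `gflops` and `usable_for_training` are hypothetical and are not MetaSchedule's API.

```python
from typing import List, Optional


def gflops(flop_count: float, run_sec: float) -> float:
    """Throughput in GFLOPs; undefined when run_sec is 0."""
    return flop_count / run_sec / 1e9


def usable_for_training(run_secs: Optional[List[float]]) -> bool:
    """Accept a record only if it has measurements and a positive mean run time."""
    if not run_secs:
        return False  # failed run or no measurements
    mean = sum(run_secs) / len(run_secs)
    return mean > 0.0  # a zero mean would imply infinite throughput


# Hypothetical records: a normal run, a run recorded as zero, and a failed run.
records = [[1.2e-3, 1.1e-3], [0.0, 0.0], None]
kept = [r for r in records if usable_for_training(r)]
```

Only the first record survives the filter, so `gflops` is never called with a zero run time.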
Commit f950b11
-
[Bugfix][TIR] Patch for PR#13269 to support Python 3.10 (apache#13350)
There seems to be some inconsistency across Python versions that makes PR apache#13269 fail on Python 3.10. This patch fixes the issue. Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Commit 6d68aff
-
[MetaSchedule] Unannotate `schedule_rule` if corresponding schedule func is not found (apache#13346)
Commit a156636
-
Commit f3eb239
-
[MetaSchedule] Fuse loops around shared to global store block in `MultiLevelTilingTensorCore` (apache#13357)
* Fuse shared to global store loops in MultiLevelTilingTensorCore
* update test
Commit 5364e5a
-
[TIR][Schedule] Make consistent implementation for GetProducers() & GetConsumers() (apache#13344)
Currently there are two versions of the `GetConsumers()` and `GetProducers()` implementations. Make them consistent to avoid possible bugs when there are WAR dependencies.
Commit 4532712
-
Commit f9ed60a
-
[TIR] Make syntax of AST nodes different than ops (apache#13358)
As part of the effort toward more formal TIR semantics, we want to differentiate more explicitly between TIR AST nodes (defined in `tir/expr.h`) and TIR ops (defined in `tir/op.h`). The naming convention is:
- A lowercased method, for example `tvm.tir.mul`, is a TIR op, which is eagerly constant-folded, i.e. `mul(1, 2)` returns `2` immediately rather than creating an AST node.
- A capitalized callable, for example `Mul`, creates an AST node without constant folding.
This PR makes this behavior more explicit by printing `T.Mul(a, b)` directly when `a` and `b` are both constants, rather than sugaring it into `mul(a, b)` or `a * b`, so that the difference between an op and an AST node is clear.
Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
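The op versus AST node distinction can be illustrated without TVM. The following is a toy model in plain Python, not the TVM implementation: the classes `IntImm`, `Var`, and `Mul` and the function `mul` only mirror the names used in the commit message, and their definitions here are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class IntImm:
    """Toy integer constant."""
    value: int


@dataclass(frozen=True)
class Var:
    """Toy symbolic variable."""
    name: str


@dataclass(frozen=True)
class Mul:
    """Toy AST node: always records the multiplication, never folds."""
    a: "Expr"
    b: "Expr"


Expr = Union[IntImm, Var, Mul]


def mul(a: Expr, b: Expr) -> Expr:
    """Toy op: folds eagerly when both operands are constants."""
    if isinstance(a, IntImm) and isinstance(b, IntImm):
        return IntImm(a.value * b.value)
    return Mul(a, b)


folded = mul(IntImm(1), IntImm(2))  # the op folds to a constant
node = Mul(IntImm(1), IntImm(2))    # the node stays a Mul
```

In this model, `folded` is a constant while `node` keeps both operands, which is the difference the printer change is meant to make visible.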
Commit ce0e9ab
-
[FQ2I] Add cast back to input data type after AvgPool2d (apache#13332)
[FQ2I] Add cast back to output data type after AvgPool2d
This commit fixes the following issue: for the sequence qnn.dequantize -> avg_pool2d -> conv2d -> qnn.quantize, the FQ2I pass unconditionally inserts qnn.requantize (or cast) to int32 before AvgPool2d. As a result, the fake quantized qnn.conv2d receives int32 input, which qnn.conv2d does not allow (it supports only uint8/int8/int16).
This commit adds a cast back to the output data type after AvgPool2d, which preserves input dtype == output dtype for this op.
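The dtype flow described above can be traced with a toy model. This sketch only tracks dtype strings through a list of steps; it does not use Relay or the real FQ2I pass, and `fq2i_avg_pool_steps`, `conv2d_accepts`, and `CONV2D_ALLOWED` are hypothetical names.

```python
from typing import List, Tuple

# Input dtypes the commit message lists as supported by qnn.conv2d.
CONV2D_ALLOWED = {"uint8", "int8", "int16"}


def fq2i_avg_pool_steps(in_dtype: str, cast_back: bool) -> List[Tuple[str, str]]:
    """Return (op, result dtype) pairs for the rewritten avg_pool2d."""
    steps = [("cast", "int32"), ("avg_pool2d", "int32")]
    if cast_back:
        # The fix: restore the original dtype after pooling.
        steps.append(("cast", in_dtype))
    return steps


def conv2d_accepts(steps: List[Tuple[str, str]]) -> bool:
    """Check whether the dtype produced by the last step is a valid conv2d input."""
    return steps[-1][1] in CONV2D_ALLOWED


before_fix = conv2d_accepts(fq2i_avg_pool_steps("int8", cast_back=False))
after_fix = conv2d_accepts(fq2i_avg_pool_steps("int8", cast_back=True))
```

Without the trailing cast the pooled tensor stays int32 and the following conv2d rejects it; with the cast, the op's input and output dtypes match again.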
Commit 5ffcfd9
-
[IRBuilder][Minor] Add intrinsics like `T.int32x4` (apache#13361)
This PR adds all common TIR intrinsics like `T.int32x4`, `T.floatx4`. Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
Commit 8897983
Commits on Nov 12, 2022
-
[TIR][Schedule] Fix cache_read loc detecting and region_cover checking (apache#13345)
Fix 2 issues of cache related primitives:
* Fix region_cover checking for cache related primitives
* Fix CacheLocDetector for nested SeqStmt
Co-authored-by: Min Chen <chen.min@intellif.com>
Commit 3877117
-
[TVMScript] Reorganize the folder structure (apache#12496)
This PR introduces some minor restructuring of the `python/tvm/script` folder structure to make it more convenient for future upstreaming. Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
Commit b20b7c4
Commits on Nov 13, 2022
-
[ci] Assert some tests are not skipped in the CI (apache#12915)
In this PR, the skipped tests script will also check if tests in the `required_tests_to_run.json` have not been skipped. If there are skipped tests, they will be added to the returned comment. I am not entirely sure where it's best to place the `required_tests_to_run` file, so I left it in `tvm/ci/scripts/`. I am happy to take suggestions. Aims to prevent situations such as apache#12529
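The check described above amounts to intersecting the required tests with the skipped ones. The sketch below assumes, purely for illustration, that `required_tests_to_run.json` is a flat JSON list of test IDs; the actual file layout and the function name `find_skipped_required` are not taken from the PR.

```python
import json
from typing import List, Set


def find_skipped_required(required_json: str, skipped: Set[str]) -> List[str]:
    """Return the required test IDs that appear in the skipped set, sorted."""
    required = json.loads(required_json)  # assumed shape: ["test_id", ...]
    return sorted(test for test in required if test in skipped)


# Hypothetical test IDs, used only to exercise the helper.
required_json = '["tests/a.py::test_one", "tests/b.py::test_two"]'
skipped = {"tests/b.py::test_two", "tests/c.py::test_three"}
violations = find_skipped_required(required_json, skipped)
```

Any entries in `violations` would then be appended to the comment the script returns.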
Commit b8384d1
Commits on Nov 14, 2022
-
[CI] Separate the ci scripts into Github and Jenkins scripts (apache#13368)
This PR is a duplicate of apache#12940 and apache#12941. For some reason, I am unable to reopen apache#12940.
Commit 5a767d0
-
[TIR][Bugfix] Fix AXIS_SEPARATORS in tir.Schedule.transform_layout (apache#13326)
Previously, the block SREF reuse only included a single step of changes and would have an incorrect mapping if multiple sequential changes to the TIR block occurred. This could happen if a `BufferStore` was updated, followed by replacement of `Block` iter vars/values. This commit tracks the Block replacements across each usage, to ensure the SREF instances remain valid.
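One way to picture tracking replacements across several steps is to follow a chain of old-to-new mappings until it ends. This is a generic plain-Python sketch of that idea, not the TVM implementation; `resolve_latest` and the string block names are hypothetical.

```python
from typing import Dict


def resolve_latest(block: str, replaced_by: Dict[str, str]) -> str:
    """Follow the replacement chain from block to its most recent version."""
    seen = set()
    while block in replaced_by:
        if block in seen:
            raise ValueError("cycle in replacement map")
        seen.add(block)
        block = replaced_by[block]
    return block


# Two sequential changes to the same block: v0 -> v1 -> v2.
replaced_by = {"block_v0": "block_v1", "block_v1": "block_v2"}
latest = resolve_latest("block_v0", replaced_by)
```

Looking up only one step would map `block_v0` to the stale `block_v1`; following the whole chain reaches `block_v2`.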
Commit b6fae9b
-
[ci] Fix Jenkins quoting (apache#13380)
Merging apache#13368 caused CI to pass but run more than it needed to due to some failures in determination. This fixes the interpolation to use `"` which should correctly pass through the variables Co-authored-by: driazati <driazati@users.noreply.github.com>
Commit 68f51e6
-
[CI] Do not merge before running CI on main (apache#13372)
This PR does not merge `main` if CI is running already on `main`. It aims to avoid a case where a race happens between two subsequent commits, and one of them merges the other. Fixes apache#12392.
Commit 41a2243
Commits on Nov 15, 2022
-
Commit 3aa16f7
-
Commit 647be2b
-
[TFLite] Enable int64 biases for int16 quantized operators (apache#12042)
Commit 034dc67
-
Commit 2d3a5b5
-
Commit 1e7e790
-
Commit 8b27975
-
Commit a91e052
-
Commit 4f20a25
-
Commit cb5e183
-
Commit 7c682ac