Torchdynamo tuning script #9
Commits on Aug 18, 2022
-
[MetaSchedule] Handle deserializing empty string RVs in trace (apache#12481)
* trace.cc * add tests * remove assert * add proper test * lint * lint
Commit fb07351
-
[HEXAGON][TOPI] This PR adjusts schedules so >64 length vector loads/stores are not generated at LLVM level. This is a workaround for an instruction selection issue in the current version of LLVM for Hexagon (apache#12471)
Commit 436c17f
-
[COMMUNITY] Adam Straw -> Reviewer (apache#12480)
Krzysztof Parzyszek authored Aug 18, 2022
Commit e140a27
-
Commit aa97f4a
-
[TVMScript] IRBuilder, IRBuilderFrame base class (apache#12482)
* [TVMScript] IRBuilder, IRBuilderFrame base class This PR introduces basic data structures of the generic IRBuilder across the codebase. IRBuilder is a general-purpose IRBuilder that can be used in TIR, Relax and any other vendor-specific dialects; IRBuilderFrame is where contextual information is stored in the IRBuilder. * fix linter * Update include/tvm/script/ir_builder/base.h Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Commit 250b68e
-
Commit da7675c
-
Commit a96bda4
-
[HEXAGON] Auto-vectorization (fp16) for v68 (apache#12397)
* Auto-vectorization (fp16) for v68 * use tvm.testing.main in fp16 test of tanh_slice op
Commit 88928a4
-
[TIR] [bfloat16] add bfloat16 promotion for CallNode (apache#12370)
* add bfloat16 promotion for CallNode * add softmax to bfloat16 build test
Commit efd7c45
-
[CMSIS-NN] Re-use CPU Target Parser (apache#12320)
Previously, `CMSISNNFlags` was derived using logic specific to the external code generator; this change converts the external code generator options into a `Target`.
Commit d1e6f39
-
[Target] Only append default keys if target doesn't have any yet (apache#12474)
* [Target] Only append default keys if target doesn't have any yet This allows target parsers to provide their own target keys. Without this change, the default keys would always be appended, which may or may not be desirable. * Add "cpu" to ARM CPU keys * Add "cpu" to the keys in the mprofile target parser * Restore the mprofile cpptest, since the "cpu" key is back * So the -device attribute is actually needed...
Krzysztof Parzyszek authored Aug 18, 2022
Commit 6def53a
-
[ci][tvmbot] Search more users when checking usernames (apache#12491)
To figure out a user's association with the repo, this code previously searched the associations in the repo filtered by the relevant username. However, GitHub doesn't return only the exact match, so we instead have to collect many results and search through all of them. Co-authored-by: driazati <driazati@users.noreply.github.com>
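The exact-match search described above can be sketched as follows. This is only an illustration, not the tvmbot code: the `login` and `association` field names, the sample data, and the helper `find_exact_user` are all assumptions made for the example.

```python
from typing import Iterable, Optional


def find_exact_user(results: Iterable[dict], username: str) -> Optional[dict]:
    """Return the entry whose login equals `username` exactly, or None."""
    for entry in results:
        if entry.get("login") == username:
            return entry
    return None


# Hypothetical results: a fuzzy search for "alice" also returns near matches,
# so the caller has to scan all of them for the exact username.
search_results = [
    {"login": "alice-bot", "association": "NONE"},
    {"login": "alice", "association": "MEMBER"},
    {"login": "malice", "association": "CONTRIBUTOR"},
]

match = find_exact_user(search_results, "alice")
```

Taking the first search result would have picked `alice-bot` here, which is the kind of mismatch the commit message describes.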
Commit 72b0f5e
Commits on Aug 19, 2022
-
Commit 5d17e24
-
Commit c0d440d
-
[microTVM] Add config space to dense_dsp schedule (apache#12444)
* add config space * lint * lint
Commit 8b3401c
-
[TOPI]fix scatterND large shape problem (apache#12200)
* fix scatterND large shape problem * fix thread pool alloca * add scatternd unit test * update with comment * Empty Co-authored-by: wrongtest <wrongtest0@gmail.com>
Commit 41be1b4
-
Commit 9d6039b
-
[Fix] Fix some typos (apache#11503)
Fix some typos in src/. Co-authored-by: driazati <driazati@users.noreply.github.com>
Commit bdcfa01
-
Commit c83ee08
Commits on Aug 20, 2022
-
[Relay][Layout] Add FInferCorrectLayout for L2 norm layout transform. (apache#12497)
* [Relay][Layout] FInferCorrectLayout for L2 norm layout change. * [Relay][Layout] Test for L2 norm layout transform. * [Relay][Layout] Re-edit test to add multi-dimensional axis list. * Fix cpplint errors * Use clang-format-10 rules. * replace uint with size_t.
Commit 1985c01
-
Commit eb31123
-
[TIR][Schedule][UX] Beautify TIR Trace Printing (apache#12507)
Following apache#12197, this PR introduces `Schedule.show()`, which improves the user experience in the following two aspects:
- Python syntax highlighting
- Outputs a schedule function instead of standalone instructions so that it's easier to follow.
To demonstrate this change:
- Before `Schedule.show()` is introduced: (screenshot: https://user-images.githubusercontent.com/22515877/185713487-03722566-1df7-45c7-a034-c1460d399681.png)
- After this change: (screenshot: https://user-images.githubusercontent.com/22515877/185713564-c54f3a9d-cd52-4709-a8b8-d8a61361e611.png)
Commit 3b3443b
-
Commit 125c9ca
-
[MetaSchedule] Migrate MemoryDatabase to C++ (apache#12514)
This PR migrates the existing MemoryDatabase, which is implemented in Python at the moment, to C++. The original intent of having an in-memory database that does not persist on disk was merely for testing, but as time went on, we found it useful in production workflows, and thus decided to migrate it to C++ for potentially better performance.
Commit 8ee4b60
-
Commit 92355f2
Commits on Aug 21, 2022
-
[TVMScript] Printer entry point (apache#12462)
This PR: - Adds an entry point for the TVMScript Unified Printer - Adds a helper object class `RootNodeContainer` to provide an injection point for the actual printer implementation to add specialized logic on the root node to print. Tracking issue: apache#11912
Commit cc769fd
Commits on Aug 22, 2022
-
[TVMScript] Printer: add boolean operators to OperationDoc (apache#12518)
This PR adds boolean operators to OperationDoc. This is needed by the TIR expression printing because it has `tir::And` and `tir::Or`. Tracking issue: apache#11912
Commit 2629065
-
Commit e9aad35
-
[ETHOSN] Remove support for older versions of the driver stack (apache#12347)
Removes support for driver stack versions older than 22.05 (semantic 3.0.1). Additionally, changes the integration to make version checks using semantic versioning rather than the previous year.month versioning method.
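A semantic-version check of the kind mentioned above could look roughly like this. It is a minimal sketch, not the Ethos-N integration code: `parse_semver`, `is_supported`, and `MIN_SUPPORTED` are hypothetical names, and only the 3.0.1 threshold comes from the commit message.

```python
from typing import Tuple


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints for numeric comparison."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


# Minimum supported version taken from the commit message (semantic 3.0.1).
MIN_SUPPORTED = (3, 0, 1)


def is_supported(version: str) -> bool:
    """True if `version` is at least the minimum supported semantic version."""
    return parse_semver(version) >= MIN_SUPPORTED


newer_by_tuple = parse_semver("3.10.0") > parse_semver("3.9.0")
newer_by_string = "3.10.0" > "3.9.0"
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where "3.10.0" sorts before "3.9.0".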
Commit 7c318d7
-
[TIR] Support AllocateConst nodes in TensorIR scheduling flow (apache#12489)
* [TIR] Support AllocConstantNode in CreatePrimFunc * Handle AllocConstantNode in LeafBlockRemovalPlan * Properly handle AllocConstNode in BufferAllocationLocator * handle AllocateConst in EstimateFlops * remove NDArray printing * doc update * add test * cpplint * Removed dependency on link-params attribute from target * Restored NDArray printing to unbreak test
Commit 8146a9b
-
[ONNX] Fix test to disable default ONNX frontend constant folding (apache#12532)
In the TVM ONNX frontend, constants are folded by default, which makes `test_load_model__onnx` fail because it is looking for "params" that were already converted into constants. This patch fixes the test to disable constant folding so that we can assert that "params" in the model are present as expected.
Commit 48a8cbd
-
[CI] Set test dependency on "transformers" package with pytest.importorskip (apache#12528)
`test_meta_schedule_integration_extract_from_bert_base` depends on the `transformers` package, which is not currently installed in our Docker images. When running this test currently, it fails with an ImportError. This patch makes this dependency explicit and causes the test to be skipped when the dependency is not installed. `test_meta_schedule_integration_extract_from_bert_base` is part of the integration tests, which currently run only on the AArch64 and CPU images (neither of which has torch installed in the live CI system at the moment), so this is another issue to be understood/fixed.
Commit 3896756
-
[MicroTVM] expose project options in autotuning (apache#12479)
* expose project_options in autotuning * address comment * address comment Co-authored-by: Mohamad <mkatanbaf@users.noreply.github.com>
Commit 40fd43e
-
[TIR][Schedule] Support for specific consumer block targeting in cache_read (apache#12505)
* Add optional consumer blocks to cache_read. * remove comments * Fully functional * Add test for consumer targeting. * Formatting. * Add missing parameter comment. * Fix comments * Simplify type of consumer_blocks in python. * Change how consumer_blocks is printed in python.
Josh Fromm authored Aug 22, 2022
Commit 3f56851
-
Commit c1bd022
-
[ci] xfail failing ethosu codegen tests (apache#12508)
This adds a testing utility so we can mark parameter combinations as xfail without having to manually match each parameter from the name into the code. The param strings here come directly from CI logs as in https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-12389/5/pipeline Co-authored-by: driazati <driazati@users.noreply.github.com>
Commit 4d7e7a8
-
[CI] Add alexnet and googlenet caffe model to request hook (apache#12510)
This PR intends to move the alexnet and googlenet caffe models from the old link to s3, therefore getting rid of the flakiness in `caffe/test_forward.py` introduced by external url timeouts. Fixes apache#12465
Yuanjing Shi authored Aug 22, 2022
Commit 66a31e9
-
[LLVM] Add "cl-opt" attribute to target_kind "llvm" (apache#12440)
* [LLVM] Add "cl-opt" attribute to target_kind "llvm" Add LLVMTargetInfo class that can be used to query the LLVM configuration without forcing an LLVMTarget to be created. There is no programmatic way to obtain the actual type of an LLVM option. The type is necessary to obtain the value of the option, hence it must be provided as a part of the option string. See src/target/llvm/target_kind.cc for more information about the syntax. * Fix lowercasing of bool value string * Use std::optional instead of std::pair<..., bool> * Treat malformed options as fatal errors * Fix linter * More unit tests for option parsing, have one case per test * Remove "option ignored" from fatal error messages
Krzysztof Parzyszek authored Aug 22, 2022
Commit e5e05fe
-
[BugFix][UMA] Fix order issue in uma_lower (apache#12447)
There was a flaw in uma_lower (see issue apache#12410) that led in some cases to a different argument ordering of the cached_func and the Relay function. This results in an incorrect lowering of the primfunc and eventually, in some cases, a wrong result or a run-time error. This commit adds code to correct the described misbehavior and a unit test case to check this end-to-end functionality with a TFLITE model.
Commit 902343a
-
[TIR] Add pass to check for out of bounds memory access (apache#12352)
* [TIR] Add pass to check for out of bounds memory access This is a conservative static analysis that checks to see if any out of bounds array access occurs. It is not enabled by default. * formatting * manually construct local irmodule * update comment * fix bug in int_set
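The idea of a conservative bounds check can be sketched in a few lines. This is not the TIR pass itself: the function `may_access_out_of_bounds` and the interval representation are assumptions made purely for illustration.

```python
from typing import Tuple


def may_access_out_of_bounds(index_range: Tuple[int, int], extent: int) -> bool:
    """Conservatively flag an access whose inclusive index range [lo, hi]
    is not fully contained in [0, extent - 1]."""
    lo, hi = index_range
    return lo < 0 or hi >= extent


# A[i + 1] with 0 <= i < 16 touches indices 1..16; a buffer of 16 elements
# only has valid indices 0..15, so the access is flagged.
flagged = may_access_out_of_bounds((1, 16), 16)

# A[i] with 0 <= i < 16 stays within the buffer.
in_bounds = may_access_out_of_bounds((0, 15), 16)
```

Because the check works on index ranges rather than individual values, it can report accesses that never actually occur at run time, which is what "conservative" means here.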
Tristan Konolige authored Aug 22, 2022
Commit 1e399fa
Commits on Aug 23, 2022
-
Commit 8e95bba
-
check for CMSIS_PATH in project generation (apache#12547)
Co-authored-by: Mohamad <mkatanbaf@users.noreply.github.com>
Commit 3bd1681
-
[microTVM] Rework evaluate_model_accuracy into a more generic helper function (apache#12539)
* Add workaround for apache#12538 * Rework evaluate_model_accuracy into predict_labels_aot
Commit 5cef6bf
-
[microTVM] Replace static fixtures with parameterization (apache#12530)
* Replace microTVM static fixtures with parameterization * [microTVM] Only perform parameterization when fixture is present * Reformat with black * Fix Cortex-M tests * Add docstring to pytest_generate_tests * Remove trailing space from docstring
Commit 58f2139
-
[docs] Add CI contribution instructions (apache#12551)
This PR documents the steps for introducing a new CI docker image, which we've been doing a lot lately.
Commit e252d7f
-
[ACL] Adjust mobilenet test for Keras 2.9 (apache#12541)
In Keras 2.7, one "reshape" operator was removed from the Mobilenet model, making our test, which verifies the number of operators, incorrect. This patch adjusts the operator count so that it is in line with the changes in Keras. For reference, the change in the keras repo was done in hash b6abfaed132 "Remove unnecessary reshape layer in MobileNet architecture".
Commit d26bf80
-
[COMMUNITY] @konturn -> Reviewer (apache#12543)
Co-authored-by: Leandro Nunes <leanun01@e123855.arm.com>
Commit 3983a47
-
Fix TFLite 2.9 tests (apache#12130)
This PR fixes the tests that will break when we update TFLite to version 2.9. We will update the TensorFlow and TFLite versions to 2.9 so that we can benefit from improvements in packaging to support multiple platforms and operating systems.
Nicola Lancellotti authored Aug 23, 2022
Commit 383bd41
-
[CMSIS-NN] Pad fusion with QNN Conv2D (apache#12353)
Pass that fuses nn.pad and qnn.conv2d for CMSIS-NN target.
Commit 52779f1
-
[CI][AArch64] Skip libgomp failures in integration tests (apache#12554)
Some integration tests are failing when running on CI machines that have torch installed (validated only on AArch64 for now), with an error message related to libgomp, similar to the one below:
OSError: /.../dist-packages/torch/lib/libgomp-d22c30c5.so.1: cannot allocate memory in static TLS block
As part of enabling the integration tests on AArch64, I'm marking these tests as skipped, so that tests can start executing and don't regress while we take time to investigate these specific failures.
Commit d271678
-
[ETHOSN] Fix requantize output conversion (apache#12540)
Fixes a small issue when converting the output information to the support library API. The `requantize_info` output datatype needed updating with the output datatype from the relay function to ensure the graph is compiled correctly by the support library. Included a test to prevent regression in the future.
Commit ff46fa1
-
[Relay] Add Rsqrt to SimplifyExpr (apache#12363)
* Add Rsqrt to SimplifyExpr * fix unit tests
Matthew Brookhart authored Aug 23, 2022
Commit dd7ae2d
-
[AutoTVM] Add support for text buffers to ApplyHistoryBest (apache#12521)
Currently, AutoTVM's ApplyHistoryBest class does not support loading tuning logs from memory. This is a pet peeve of mine, as it requires you to work with a tempfile whenever writing autotuning tests. This is also just strange, as the rest of AutoTVM has support for text buffers (e.g. tvm.autotvm.callback.log_to_file supports passing in a text buffer, letting us write to but not read from them).
Additionally, ApplyHistoryBest handles input arguments very unintuitively. Before this PR, it allowed users to pass string filepaths, a list of string filepaths, or an Iterable (such as a list) of input and result tuples. However, it did not support taking in StringIO objects as mentioned above, nor pathlib.Path objects, nor combinations of a filepath and an Iterable of tuples.
In a perfect world, we would change ApplyHistoryBest to take as input a path-like object, file-like object, or an Iterable of input and result tuples (similar to what ApplyGraphBest takes as an argument). However, this would break the existing functionality to take as input a list of filepaths.
To be backwards compatible while fixing this issue, this pull request defines a new type inside dispatcher.py:
    Records = Union[
        Union[str, bytes, Path],  # Path-like objects
        TextIOBase,  # File-like objects
        Iterable[Tuple[MeasureInput, MeasureResult]],
    ]
It then rewrites ApplyHistoryBest.load so it takes the following arguments:
    def load(self, records: Union[Records, Iterable[Records]]):
This PR also adds unit tests for this new functionality, and fixes a relevant bug in tests/micro/common/test_autotune.py in which a StringIO object was passed to apply_history_best, causing it to appear to pass but not actually read any data.
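To make the `Records` idea concrete, here is a standalone sketch of how path-like, file-like, and in-memory inputs might be normalized into one stream of records. It does not use TVM and is not the code from dispatcher.py: `iter_records`, `parse_lines`, the `Record` stand-in type, and the tab-separated log format are all assumptions for illustration.

```python
import io
import os
from pathlib import Path
from typing import Iterable, Iterator, Tuple, Union

# Stand-in for (MeasureInput, MeasureResult); plain strings keep the sketch runnable.
Record = Tuple[str, str]
Records = Union[str, bytes, Path, io.TextIOBase, Iterable[Record]]


def parse_lines(lines: Iterable[str]) -> Iterator[Record]:
    """Parse a hypothetical tab-separated log format: '<input>\\t<result>'."""
    for line in lines:
        line = line.strip()
        if line:
            inp, res = line.split("\t")
            yield (inp, res)


def iter_records(records) -> Iterator[Record]:
    """Yield records from a path, a text buffer, tuples, or a mix of these."""
    if isinstance(records, (str, bytes, os.PathLike)):
        # Path-like object: open the file and parse it line by line.
        with open(records, "r") as handle:
            yield from parse_lines(handle)
    elif isinstance(records, io.TextIOBase):
        # File-like object such as io.StringIO: read directly from memory.
        yield from parse_lines(records)
    else:
        # Iterable: items are either ready-made tuples or nested Records.
        for item in records:
            if isinstance(item, tuple):
                yield item
            else:
                yield from iter_records(item)


from_buffer = list(iter_records(io.StringIO("conv2d\t1.5\ndense\t0.3\n")))
mixed = list(iter_records([io.StringIO("conv2d\t1.5\n"), [("dense", "0.3")]]))
```

Checking for path-like and file-like objects before falling back to generic iteration matters, because strings and open files are themselves iterable and would otherwise be misread as collections of records.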
Commit da5836f
-
[skip ci][ci] Mark more ethosu tests with xfail (apache#12560)
See apache#12511 for context. Since more parameterizations are popping up as failed, this disables whole tests rather than specific combinations of parameters.
Commit 1d71c1b
-
[CI] Remove Vela from ci_cpu (apache#12533)
While the dependencies for microNPU and CMSIS-NN moved into ci_cortexm, Vela is still installed in ci_cpu. As a result, some of the microNPU tests outside of the test_ethosu folder are failing, since they use the presence of Vela to decide whether to skip the test. This change will * remove Vela from ci_cpu * remove unnecessary PATH update
Commit 99b9b74
-
[ETHOSN] Add support for special indices of Reshape (apache#12556)
This PR adds support for the special indices values of the reshape operator for the Arm(R) Ethos(TM)-N NPU.
Nicola Lancellotti authored Aug 23, 2022
Commit 4d104e5
-
[MicroTVM] add heap-size to project options (apache#12390)
* heap-size is added to project options * change stm32l4r5zi recommended heap size * change stm32l4r5zi recommended heap size * addressing comments * addressing comments * addressing comments Co-authored-by: Mohamad <mkatanbaf@users.noreply.github.com>
Commit 8c23469
-
Replace std::result_of (deprecated in C++17) with std::invoke_result, NFC (apache#12562)
Krzysztof Parzyszek authored Aug 23, 2022
Commit 13ebbfb
-
Add using directives for otherwise hidden virtual functions, NFC (apache#12561)
This silences warning
```
warning: 'foo' hides overloaded virtual functions [-Woverloaded-virtual]
```
typically caused by overriding only some overloads of `VisitExpr_` from a set defined in the base class.
Krzysztof Parzyszek authored Aug 23, 2022
Commit 8174d08
Commits on Aug 24, 2022
-
[Target] Remove deprecated parameters from target (apache#12416)
* remove deprecated parameters in target * lint * fix cpp tests fix * remove more configs in test files * address comments * fix error * fix hexagon * fix micro tutorial * fix integration tests * fix hexagon * lint * fix unittest * fix readme * fix assert executor in target * address comments * fix tutorials * fix hexagon target * fix tutorial * fix for tutorials * hexagon
Commit c15cc5e
-
[PyTorch][Fix] Fix for numerically unstable logsigmoid (apache#12563)
* Fix numerical instability for log sigmoid Fix numerical instability for log sigmoid in pytorch frontend * update * add test for overflow check * merging two tests
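For context, the kind of instability at issue can be reproduced in plain Python. The sketch below compares a naive log-sigmoid with a standard numerically stable rewrite; the stable formula is a well-known identity and is an assumption here, not necessarily the exact expression used in the PyTorch frontend fix.

```python
import math


def log_sigmoid_naive(x: float) -> float:
    """Direct translation of log(1 / (1 + exp(-x))); exp(-x) overflows for very negative x."""
    return math.log(1.0 / (1.0 + math.exp(-x)))


def log_sigmoid_stable(x: float) -> float:
    """Equivalent form min(x, 0) - log1p(exp(-|x|)); the exponent is never positive."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


try:
    log_sigmoid_naive(-1000.0)
    naive_overflowed = False
except OverflowError:
    # math.exp(1000.0) exceeds the range of a double.
    naive_overflowed = True

stable_value = log_sigmoid_stable(-1000.0)
```

The stable form only ever exponentiates a non-positive number, so the intermediate value stays in [0, 1] regardless of how large the input is in magnitude.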
Commit 5778261
-
[microNPU] Force compute_cycles_hint to be interpreted as an int64 value (apache#12558)
`compute_cycles` can be the size of an int64 value; however, it seems that when that value is attached to the IR as a pragma from Python, it is interpreted as an `int` rather than an `int64_t`. This commit adds an explicit cast to ensure the value is interpreted correctly. The reason these values started appearing very large and randomly is still yet to be solved, although the hope is that this fix will unblock CI. Change-Id: Idcdd7d37af1acd665590c87624446a025b50eb3d
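The effect of storing a large cycle count in a narrower integer can be illustrated with `ctypes`. This is only a demonstration of the truncation, assuming `int` is 32 bits wide; the value used is made up and the snippet does not touch the TVM IR.

```python
import ctypes

# Hypothetical cycle count that needs more than 32 bits.
compute_cycles = 2**40 + 123

# ctypes integer types do no overflow checking: the value is silently
# truncated to the low 32 bits when stored in a 32-bit int.
as_int32 = ctypes.c_int32(compute_cycles).value

# A 64-bit integer holds the value unchanged.
as_int64 = ctypes.c_int64(compute_cycles).value
```

Here the 32-bit interpretation keeps only the low bits of the value, which is why an explicit 64-bit cast is needed to preserve it.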
Commit e468dc2
-
[CI][CMSIS-NN] Running tests parallel using pytest-xdist (apache#12557)
Introducing -n auto for CMSIS-NN tests to run them in parallel with pytest-xdist. This is needed because of additional parameterization done over cpu variants. Change-Id: I02e1b37ead0b0a562b5b1b2dacfeb3fdd7cc1ce3
Commit 90b2f0d
-
[ETHOSN] Add support for resize (apache#12535)
This commit adds support for the `resize` operator for Arm(R) Ethos(TM)-N NPU.
Nicola Lancellotti authored Aug 24, 2022
Commit 989e5a1
-
[TIR][CompactBufferAllocation] Improve upperbound estimation of buffer compaction (apache#12527)
This change makes some minor updates to the region estimator used by buffer compaction:
- Add and distinguish among `EstimateRegionStrictBound`, `EstimateRegionLowerBound` and `EstimateRegionUpperBound`. Originally we only had `EstimateRegionLowerBound`, which arguably implements strict bound estimation. This change adds the `upper` and `strict` versions for the places where we actually want them.
- When estimating upper bounds (e.g. in buffer compaction), try to estimate each dimension independently when the accesses are dependent and `EstimateRegionLowerBound` is expected to fail. For example, `A[i, i], 3 < i < 16` fails in `EstimateRegionLowerBound`, which checks that the indices are independent, but we can still make a best effort by invoking strict bound analysis on each dimension individually.
- If `range->extent == 1` for `EvalSet(range, dom)`, invoke `EvalSet(range->min, dom)` instead. For example, `EvalSet([k*k, k*k+1), dom_k)` results in [-inf, +inf] due to a limitation of the current algorithm, but `EvalSet(k*k, dom_k)` results in a range, which makes more sense.
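The per-dimension fallback described above can be sketched for the `A[i, i], 3 < i < 16` example. This is a simplified illustration restricted to affine indices of a single variable; `affine_bounds` and `region_upper_bound` are hypothetical helpers, not the TVM arithmetic analyzer.

```python
from typing import List, Tuple


def affine_bounds(coef: int, offset: int, lo: int, hi: int) -> Tuple[int, int]:
    """Inclusive bounds of coef * i + offset for an integer i in [lo, hi]."""
    a = coef * lo + offset
    b = coef * hi + offset
    return (min(a, b), max(a, b))


def region_upper_bound(
    indices: List[Tuple[int, int]], lo: int, hi: int
) -> List[Tuple[int, int]]:
    """Bound each dimension on its own, ignoring that the dimensions share `i`."""
    return [affine_bounds(coef, offset, lo, hi) for coef, offset in indices]


# A[i, i] with 3 < i < 16, i.e. integer i in [4, 15]; each index is 1 * i + 0.
region = region_upper_bound([(1, 0), (1, 0)], 4, 15)
extents = [high - low + 1 for low, high in region]
```

The result is a 12 x 12 box even though only the 12 diagonal elements are touched: treating the dimensions independently over-approximates the region, which is acceptable for an upper bound.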
Commit 1ec2c36
-
[Target] Replace IsaAnalyzer with Target Features (apache#12322)
This is a cleanup to use the new `target.features` instead of `IsaAnalyzer`.
Commit 592148a
-
[CI] Set test python.contrib.test_onnx.test_resize as xfail (apache#12568)
`python.contrib.test_onnx.test_resize` is failing due to a numerical accuracy issue, reported in apache#12567. This patch marks that test as an xfail, so that other tests can be enabled, while this one is investigated separately.
Commit 6e79f64
-
[ETHOSN] Support multiply conversion to depthwise (apache#12403)
Multiply can be supported when offloaded to the NPU by a conversion to a depthwise convolution operation. This is only supported when the multiply operation has a single variable input with the other being a constant of shape [1, ..., C]. This commit adds a new pass "ConvertEquivalents" (name subject to change) to handle this conversion before codegen.
Commit a0fe74b
-
[TIR] Expose Vector-related API in Python (apache#12571)
This PR exposes the following TIR operation in python: - `vectorlow`: tested [here](https://github.com/apache/tvm/blob/592148abf6866a41eefa736efca067d42f5aea86/python/tvm/tir/tensor_intrin/arm_cpu.py#L62) - `vectorhigh`: tested [here](https://github.com/apache/tvm/blob/592148abf6866a41eefa736efca067d42f5aea86/python/tvm/tir/tensor_intrin/arm_cpu.py#L79) - `vectorcombine`: add new unittest Co-Authored-By: yongwww <yongcale@gmail.com>
Commit 038523e
-
[Hexagon] Add support to run on multiple devices (apache#12504)
* working in parallel using worker * creating launchers per test and clean up * clean up * ci change to distribute tests * ci work with any number of devices * fix running on simulator * adding function docstring * fix android_serial_number to always return a list of string * lint issue * fix internal error when skipping tests while android serial number is not set * lint issue
Commit bf65b39
-
[Hexagon] Fix missing pytest import (apache#12565)
* Add pytest * lint
Commit f53ee0c
-
[TOPI][Hexagon] Implement quantized avgpool (apache#12340)
* [TOPI][Hexagon] Implement quantized avgpool * Fix pylint errors * Needed to adjust input padding for int8 buffer layout * Fix formatting issue * Add unit test for fixed-point conversion utility function Also, address review comments. * Remove pytest.skip for test_avg_pool2d_slice.py to enable on-target testing * Fix formatting issue * Update python/tvm/topi/hexagon/utils.py Co-authored-by: Christian Convey <christian.convey@gmail.com> * Update comments and error messages * Address review comments * Import Tuple from typing * Address pylint error Co-authored-by: Christian Convey <christian.convey@gmail.com>
Commit 1afd059
Commits on Aug 25, 2022
-
[microTVM] Fix `build` directory exists error (apache#12575)
When building a project from an existing project directory using `tvm.micro.project.GeneratedProject.from_directory`, an error was raised if the build directory already existed.
Commit 17989e8
-
[MicroTVM] fix compile error when the compiler implements char as unsigned (apache#12519)
When compiling TVM with micro using a compiler that implements `char` as unsigned (such as arm-linux-gcc), there is an error:
`src/runtime/crt/graph_executor/load_json.c:218:12: error: result of comparison of constant -1 with expression of type 'char' is always false [-Werror,-Wtautological-constant-out-of-range-compare]`
` if (ch == EOF || ch == '\r' || ch == '\n') {`
The signedness of plain `char` is implementation-defined, so it is better to specify explicitly that it is signed here.
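To make the failure mode concrete, here is a minimal Python sketch that mimics the comparison with `ctypes` fixed-width integers. It only illustrates the C semantics and is not code from the patch; the value `EOF = -1` and the helper name `is_terminator` are assumptions made for this example.

```python
import ctypes

EOF = -1  # assumed value; C only guarantees that EOF is a negative int


def is_terminator(char_type, raw):
    """Store `raw` in a char-like ctypes integer, then compare as the C code does."""
    ch = char_type(raw).value  # value after truncation to the char type
    return ch == EOF or ch == ord("\r") or ch == ord("\n")


# Signed char: -1 survives the round trip, so the EOF check can succeed.
signed_hit = is_terminator(ctypes.c_byte, EOF)

# Unsigned char: -1 is stored as 255, so `ch == EOF` can never be true.
unsigned_hit = is_terminator(ctypes.c_ubyte, EOF)

print(signed_hit, unsigned_hit)
```

With an unsigned `char` the EOF branch is dead code, which is what the compiler warning reports.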
Commit b8fbfe2
-
[TIR] Expose `shift_left` and `shift_right` to Python (apache#12584)
This PR exposes the following TIR operations in Python:
- `shift_left`: tested [here](https://github.com/apache/tvm/blob/1afd0593956066635ee49297b731726c9218c91c/tests/python/unittest/test_tir_transform_simplify.py#L487)
- `shift_right`: add new unittest
Co-authored-by: yongwww <yongcale@gmail.com>
Commit cd8fd91
-
[MetaSchedule] Add software pipeline in CUDA tensor core auto tensorization (apache#12544)
cc @Hzfengsy @junrushao @junrushao1994 @masahi @spectrometerHBH
Commit 9aac161
-
[TIR] Expose WMMA-related TensorCore builtins (apache#12589)
This PR exposes the following TIR operations in Python:
- `tvm_load_matrix_sync`: tested [here](https://github.com/apache/tvm/blob/cd8fd9121deb22b078c9fe73cd8a554e6e7a0e15/tests/python/unittest/test_tvmscript_roundtrip.py#L711)
- `tvm_store_matrix_sync`: tested [here](https://github.com/apache/tvm/blob/cd8fd9121deb22b078c9fe73cd8a554e6e7a0e15/tests/python/unittest/test_tvmscript_roundtrip.py#L913)
- `tvm_mma_sync`: tested [here](https://github.com/apache/tvm/blob/cd8fd9121deb22b078c9fe73cd8a554e6e7a0e15/tests/python/unittest/test_tvmscript_roundtrip.py#L860)
- `tvm_bmma_sync`: add new unittest
- `tvm_fill_fragment`: tested [here](https://github.com/apache/tvm/blob/cd8fd9121deb22b078c9fe73cd8a554e6e7a0e15/tests/python/unittest/test_tvmscript_roundtrip.py#L571)
Co-authored-by: yongwww <yongcale@gmail.com>
cc: @junrushao
cc @Hzfengsy @junrushao1994
Commit b387384
-
[PyTorch] Add aten::new_empty (apache#12591)
This PR adds `aten::new_empty`, which is used by models such as `hf_Longformer`. cc: @masahi
Yuanjing Shi authored Aug 25, 2022
Commit 40bdea8
-
[CI] Install xgboost in Hexagon image (apache#12592)
Needed for apache#12587 @mehrdadh cc @Mousius @areusch @driazati @gigiblender
Commit fb7cf97
-
[microTVM][Zephyr] Add recommended heap size for NRF and qemu_x86 (apache#12585)
This PR sets the recommended heap size for the qemu_x86 and NRF boards to fix memory size with models like VWW using the AoT host-driven executor.
Commit cc19cdd
-
[CI] Assert some unittests are not skipped in CI (apache#12436)
This PR adds a script that does a diff of skipped tests between the latest successful build on the main and the current branch. Then, it posts a comment with the report on the open PR. apache#11670
Commit 56b7c8a
-
[DOC] fix code-block error in debugging TVM part (apache#12597)
The code block in the Debugging TVM part was not showing up. This fixes it.
Commit 61c034a
-
[CI] github_cc_reviewers: Catch all exceptions so all reviewers can be processed (apache#12578)
In a recent change, `github.post` throws `RuntimeError` instead of `HTTPError` when the requested reviewer isn't a project collaborator. This prevents other reviewers from being added to the PR, for example, https://github.com/apache/tvm/runs/8001367110?check_suite_focus=true. This PR changes the caller to catch any exception so the execution won't be interrupted.
Co-authored-by: driazati <9407960+driazati@users.noreply.github.com>
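The pattern described above can be sketched as follows. This is a hypothetical illustration, not the actual `github_cc_reviewers` script: `add_reviewers`, `fake_post`, and the reviewer names are made up for the example.

```python
def add_reviewers(reviewers, post):
    """Request each reviewer independently and collect failures instead of aborting."""
    added, failed = [], []
    for reviewer in reviewers:
        try:
            post(reviewer)
            added.append(reviewer)
        except Exception as err:  # deliberately broad: one bad reviewer must not stop the loop
            failed.append((reviewer, str(err)))
    return added, failed


def fake_post(reviewer):
    # Hypothetical stand-in for the GitHub request made by the real script.
    if reviewer == "not-a-collaborator":
        raise RuntimeError(f"{reviewer} is not a collaborator")


added, failed = add_reviewers(["alice", "not-a-collaborator", "bob"], fake_post)
print(added, failed)
```

Catching `Exception` per reviewer, rather than one specific error type, means a change in the exception raised by the request helper no longer stops the remaining reviewers from being processed.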
Commit b547106
-
[microNPU] Remove xfail from tests relating to apache#12511 (apache#12570)
Removes the xfail marking from tests previously marked as xfail, since the issue has now been resolved.
Commit 399f2e9
-
[ETHOSN] Support conversion of add to depthwise (apache#12531)
In similar fashion to the conversion of mul to depthwise, this commit converts add when one input is a constant of shape [1, ..., n] to a depthwise convolution. If neither input is a constant, the add is offloaded naturally like before. The addition testing has been improved to use pytest features.
Commit f7c1436
-
[F2QI] Fix a rounding error on AvgPool when input and output affine scales differ (apache#12577)
cc @sfvaroglu @AndrewZhaoLuo
Matthew Brookhart authored Aug 25, 2022
Commit 21db1eb
-
[CUDA][CodeGen] Fix cuda codegen's fp16 inf literal (apache#12581)
* Fix cuda codegen's fp16 inf literal
* add relay testcase
Commit bb00a15
-
[ci] Default to n=2 for test parallelism (apache#12414)
* Revert "[skip ci] Revert "[ci] Default to n=2 for test parallelism (apache#12376)" (apache#12413)"
  This reverts commit 478b672.
* [ci] Default to n=2 for test parallelism
  This is attempt #2 of apache#12376, which was reverted in apache#12413. The changes in `plugin.py` should keep all the tests on the same node so sporadic failures don't happen due to scheduling.
Co-authored-by: driazati <driazati@users.noreply.github.com>
Commit 01fcdfc
-
[Runtime] Change default alignment to 64 bytes. (apache#12586)
* Change default alignment to 64 bytes.
* Run dlpack test a few times.
* Update alignment in tests.
* Revert mma alignment change.
* Change default printing of buffer.
* Change crt runtime default allocation.
Josh Fromm authored Aug 25, 2022
Commit 8d60b3c
-
Commit 5db38ba
-
[skip ci][Community] Wuwei Lin -> PMC (apache#12605)
[Community] Wuwei Lin -> PMC
Commit a9f7c32
Commits on Aug 26, 2022
-
[TOPI][Bugfix] Make semantics of empty `axis` in `squeeze` consistent with Relay (apache#12596)
* Fix empty axis of `squeeze` in TOPI.
* Add test case for `squeeze` with empty `axis`.
* Add LLVM target for `test_squeeze`.
Commit 3224817
-
[TIR] Expose Memory Copy-Related PTX Builtins (apache#12611)
* Expose Memory Copy-Related PTX Builtins
  This PR exposes the following TIR operations in Python:
  `ptx_ldmatrix`: tested
  `ptx_cp_async`: tested
  `ptx_commit_group`: tested
  `ptx_wait_group`: tested
  Co-authored-by: yongwww <yongcale@gmail.com>
* apply code review suggestion
Co-authored-by: yongwww <yongcale@gmail.com>
Commit 4f431c8
-
[TIR][Schedule] enhance compute_at and reverse_compute_at primitive to choose possible position (apache#12450)
The current TIR `compute_at` primitive computes a block at its closest consumers. When a block has multiple producers, whichever producer is computed at later is placed behind the others. For some special hardware, however, we usually want to keep a certain order regardless of whether a producer is computed at early or late. For example, block A and block B are producers of block C; block A is computed at block C first and block B is computed at block C later. We want the result to be block B -> block A -> block C under some loop var.
Commit e02f2f9
-
[SimplifyExpr] Add simplify for dq->arg funcs (apache#12580)
* add simplify for dq->arg funcs
* add comments, fix lint
* move comments to the right spots
Matthew Brookhart authored Aug 26, 2022
Commit d171b4a
-
[Hexagon] Initial support for meta schedule tuning (apache#12587)
Enables AutoTVM-style, template-based tuning for Hexagon.
To run compiled code on Hexagon, we need to use the Hexagon `Session` object https://github.com/apache/tvm/blob/dc522a6ff65b68532cd1bba43827cd981114df2c/python/tvm/contrib/hexagon/session.py#L35 in the metaschedule `RPCRunner`. But for an RPC "session", `RPCRunner` expects an instance of `RPCSession`, https://github.com/apache/tvm/blob/53fe5966823eee4e011d7228bceab3c82c1d9caa/python/tvm/rpc/client.py#L32, to be created and used by various customizable functions. Since `RPCSession` and Hexagon `Session` have slightly different APIs, we cannot use `RPCRunner` with customizable functions directly. So I introduced an alternative implementation of `RPCRunner` for Hexagon.
The test is disabled for the simulator since `HexagonLauncherSimulator` is not pickle-able due to its `multiprocessing.Process` attribute: https://github.com/apache/tvm/blob/c97895e0ffb512e73c89de7cdee9846f052244fc/python/tvm/contrib/hexagon/build.py#L614
Output log from tuning `vrmpy` dense (included in the test):
```
 ID | Name |      FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Terminated
--------------------------------------------------------------------------------------------------------------
  0 | main | 150994944 |      1 |       380.3399 |     397.0000 |              397.0000 |     32 |
--------------------------------------------------------------------------------------------------------------
```
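As a quick sanity check on the log above, the reported speed follows from the FLOP count and the latency. The short sketch below redoes that arithmetic; the function name `gflops` is just an illustrative helper, and only the two numbers taken from the log come from the source.

```python
def gflops(flop, latency_us):
    """Convert a FLOP count and a latency in microseconds into GFLOPS."""
    seconds = latency_us * 1e-6
    return flop / seconds / 1e9


# Values copied from the "FLOP" and "Latency (us)" columns of the tuning log.
speed = gflops(150_994_944, 397.0)
print(f"{speed:.4f}")
```

The result agrees with the "Speed (GFLOPS)" column to four decimal places.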
Commit d87fa85
-
[TIR] More hygienic TVM_SREF macros (apache#12607)
Previously, the `TVM_SREF_TO_BLOCK`, `TVM_SREF_TO_FOR`, and `TVM_TYPE_AS` macros required both the input and output variables. The input variable name is useful for improving the error message returned, but the output variable name isn't necessary for this functionality, and prevents the macro from being used as part of an expression.
* Generate an immediately-invoked lambda expression to allow for an independently-scoped `result` variable.
* Use parentheses around the input argument, in case the sref is the result of an expression.
* Update all call sites to remove the macro argument providing the first argument.
Commit 49b3c72
-
[CI] Update Hexagon image to install boost (apache#12613)
The new image has xgboost installed, which I need for apache#12587 Validated in https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/ci-docker-staging/279/pipeline
Commit 2e83e03
-
Replace '> >' in templates with >>, NFC (apache#12615)
The problem with greedy lexing of >> as an operator was solved in C++11, and now templates no longer require spaces between >'s.
Krzysztof Parzyszek authored Aug 26, 2022
Commit 23e7944
-
[Hexagon] Asynchronous DMA support (apache#12411)
Adds asynchronous DMA support through the Hexagon User DMA engine, with unit tests to validate basic functionality. Asynchronous DMA support here means the ability to "kick off" a number of DMAs asynchronously using the Copy API and then to Poll for or Wait on a number of "in flight" (not done) DMAs. This enables future testing and development for asynchronous memory copy on Hexagon. For now, Hexagon DMA support remains synchronous in nature through the existing hexagon_user_dma_1d_sync interface, which uses the asynchronous-capable HexagonUserDMA class in a synchronous way, calling Copy and Wait back to back for each request.
* use ring buffer to store DMA descriptors
* add RingBuffer class; used by HexUserDMA to store descriptors
* add test to overflow the HexagonUserDMA ring buffer
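The idea of tracking "in flight" transfers in a ring buffer can be illustrated with a small toy model. This is a hedged Python sketch, not the C++ `RingBuffer` from the PR: the class, its method names, and the string "descriptors" are all hypothetical.

```python
class RingBuffer:
    """Fixed-capacity FIFO of in-flight items (toy stand-in for DMA descriptors)."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0   # index of the oldest in-flight item
        self._count = 0  # number of in-flight items

    def in_flight(self):
        return self._count

    def push(self, item):
        """Add an item; return False when the buffer is full (overflow)."""
        if self._count == len(self._slots):
            return False
        tail = (self._head + self._count) % len(self._slots)
        self._slots[tail] = item
        self._count += 1
        return True

    def retire_done(self, is_done):
        """Free slots from the oldest end while those items report as done."""
        while self._count and is_done(self._slots[self._head]):
            self._slots[self._head] = None
            self._head = (self._head + 1) % len(self._slots)
            self._count -= 1


ring = RingBuffer(capacity=2)
done = set()
ok1 = ring.push("copy-0")
ok2 = ring.push("copy-1")
overflow = ring.push("copy-2")  # rejected: both slots are still in flight
done.add("copy-0")              # pretend the first transfer finished
ring.retire_done(lambda item: item in done)
ok3 = ring.push("copy-2")       # accepted now that a slot was freed
print(ok1, ok2, overflow, ok3, ring.in_flight())
```

In this model a poll corresponds to `in_flight()` and a wait corresponds to calling `retire_done` until enough slots are free; how the real engine reports completion is outside the scope of this sketch.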
Commit 7f1856d
Commits on Aug 27, 2022
-
[MetaSchedule][UX] Make `Database` with-able (apache#12520)
`ApplyHistoryBest` right now plays a role as the database adaptor to query inside the database. In fact, the logic could be simplified and users only have to deal with `Database` instead of this extra object.
- [x] Add `EnterWithScope`/`ExitWithScope`/`Current` to Database
- [x] Migrate `te_filter_func` => "tir_filter" in Relay's pass context
- [x] Migrate `f_take_tuning_record` => "Database.query_tuning_record"
- [x] Migrate `TECompiler` to use `Database`
- [x] Remove apply-history-best
Next PR:
- Migrate `f_direct_dispatch` (potentially unify with `apply_fixed_schedule`?)
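What "with-able" means in practice can be sketched with a plain Python context manager that keeps a stack of active scopes. This is a toy model under stated assumptions, not the MetaSchedule `Database` class: `ToyDatabase` and its `current()` method are hypothetical names that loosely mirror the `EnterWithScope`/`ExitWithScope`/`Current` trio listed above.

```python
class ToyDatabase:
    """Hypothetical stand-in showing the enter/exit/current scoping pattern."""

    _scope_stack = []  # the innermost active database is the last element

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        ToyDatabase._scope_stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb):
        ToyDatabase._scope_stack.pop()
        return False  # do not swallow exceptions raised inside the scope

    @staticmethod
    def current():
        """Return the innermost active database, or None outside any scope."""
        stack = ToyDatabase._scope_stack
        return stack[-1] if stack else None


outside_before = ToyDatabase.current()
with ToyDatabase("outer"):
    with ToyDatabase("inner"):
        innermost = ToyDatabase.current().name
    after_inner = ToyDatabase.current().name
outside_after = ToyDatabase.current()
print(outside_before, innermost, after_inner, outside_after)
```

Code that needs tuning records can then ask for the current database instead of receiving a separate adaptor object, which is the simplification the PR description argues for.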
Commit 370abe6
-
[TIR] Expose MMA-related PTX builtins (apache#12623)
Expose MMA-related PTX builtins
This PR exposes the following TIR operations in Python:
`ptx_mma`: tested
`ptx_mma_sp`: tested
`mma_store`: add new unittest
`mma_fill`: add new unittest
Co-authored-by: yongwww <yongcale@gmail.com>
Commit 5344128
Commits on Aug 29, 2022
-
[MetaSchedule] Introduce `ScheduleFnDatabase` (apache#12626)
Following apache#12520, this PR introduces `ScheduleFnDatabase`, a mocked database to allow injecting handcrafted schedules provided by a schedule function. The schedule function comes with the following signature:
```python
def schedule_fn(
    sch: tir.Schedule,
) -> bool:
    task_name = sch.mod.attrs["task_name"]
    # ^^^ provides an optional name of the task queried
    ...
```
This mocked database helps incorporate the existing testing utility `apply_fixed_schedule` more formally into the MetaSchedule-Relay build pipeline, and allows further extension to Relax with the same interface. Next, as another follow-up, we will introduce ConcatDatabase, which allows mixing multiple databases, including the mocked one and ones from JSON files.
Commit 648a29a
-
[Refactor] Replace std::tie with structured bindings (apache#12610)
* [Refactor] Replace std::tie with structured bindings
  With C++17 enabled in apache#12337, using structured bindings to replace cases where `std::tie` is used to define local variables.
* Added missing header for <optional>
* Silenced unused variable warnings after structured bindings
  This is a bug in gcc version 7, resolved in gcc 8. While gcc version 7 is used for CI, we'll need to silence unused variable warnings resulting from using only part of a structured binding. More information: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81767
Commit 3d41ac3
-
[QNN] Align output_scale/zero_point of sigmoid to Torch (apache#12624)
* [QNN] Align output_scale/zero_point of sigmoid to Torch
Commit c5c99a4
-
Commit 0de2219
-
[ci] Don't update Jenkinsfile timestamp on image updates (apache#12621)
The timestamp in the Jenkinsfile is there to prevent post-merge conflicts from different PRs that edit the templates merging non-sequentially. This is not an issue when a line is edited in place though, which is often the case when Docker image tags are updated. This PR makes it so the timestamp is not updated in these cases which should reduce merge conflicts on these types of PRs.
Commit c31a762
-
[Utils] Handled Callable in tir.schedule._type_checker (apache#12633)
Previously, `Callable` was handled as an atomic type. This worked when it was included as last element of a `Union[]` annotation with no subtypes, but raised an error for other use cases, including `Optional[Callable]`. This commit adds explicit checks for `Callable` type annotations to validate whether the argument is callable, but doesn't recursively validate the signature of the callable object, because lambda functions cannot have type annotations. (https://peps.python.org/pep-3107/#lambda)
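The kind of check described above can be sketched with the standard `typing` helpers. This is a simplified illustration, not the code in `tir.schedule._type_checker`: the function name `matches` and the small set of annotations it supports are assumptions made for the example.

```python
import collections.abc
from typing import Any, Callable, Optional, Union, get_args, get_origin


def matches(value, annotation):
    """Return True if `value` is acceptable for a small subset of annotations."""
    if annotation is Any:
        return True
    origin = get_origin(annotation)
    if origin is Union:  # also covers Optional[X], which is Union[X, None]
        return any(matches(value, arg) for arg in get_args(annotation))
    if annotation is Callable or origin is collections.abc.Callable:
        # Only check callability; the signature is not validated, since
        # lambdas cannot carry type annotations.
        return callable(value)
    if annotation is type(None):
        return value is None
    return isinstance(value, annotation)


results = [
    matches(len, Optional[Callable]),
    matches(None, Optional[Callable]),
    matches(3, Optional[Callable]),
    matches(lambda x: x, Callable[[int], int]),
    matches(3, Union[int, Callable]),
]
print(results)
```

Treating `Callable` as its own case, rather than falling through to `isinstance`, is what lets it appear anywhere inside a `Union` or `Optional`.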
Commit 74988d3
Commits on Aug 30, 2022
-
[TIR] Improved error messages for PrimExpr operator overloads (apache#12638)
Previously, type-checks in boolean operators on `PrimExpr` would state that the type is incorrect, but further investigation would be required in order to determine what expression caused the error. After this commit, error messages for these type checks include the expression that was used, and the dtype of that expression.
Commit 9e88723
-
[ci] Move non-task CI scripts into ci/ folder (apache#12609)
[CI] Update Hexagon image to install boost (apache#12613) The new image has xgboost installed, which I need for apache#12587 Validated in https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/ci-docker-staging/279/pipeline Co-authored-by: masahi <masahi129@gmail.com>
Commit 5287d8f
-
[TVMScript] support float inf, -inf and nan in TVMScript parser and printer (apache#12618)
* support float inf, -inf and nan in TVMScript parser and printer
* address comment and fix lint
* use type_extensions.Literal
* address comments
* fix win build
* remove template
Yuanjing Shi authored Aug 30, 2022
Commit 58ee935
-
[microTVM][ARM-DSP] Fix pool schedule (apache#12653)
When I built keyword spotting ONNX model, there was an issue with the pool schedule because certain schedules like broadcast and elemwise do not have input tensors.
Commit b44f134
-
[microTVM]Fix test util functions (apache#12641)
* Fix test utils
* Update python/tvm/micro/testing/utils.py
Co-authored-by: driazati <9407960+driazati@users.noreply.github.com>
Commit d421e32
-
[Hexagon] Expose gtest output through runtime exception (apache#12502)
Expose Hexagon gtest output in CI by raising it as a runtime exception rather than printing it to stdout.
Commit 1c32798
-
[microTVM][Zephyr] Add missing CMSIS-NN source files to cmake file (apache#12642)
This PR adds missing CMSIS-NN source files to the Zephyr cmake template file for models like keyword spotting, anomaly detection, VWW and image classification.
Commit 775520c
-
[ci] Add mechanism for trust on certain CI scripts (apache#12604)
This makes it so changes to certain files from users not listed in `CONTRIBUTING.md` are not tested in CI. This is necessary since these scripts run on the baremetal EC2 instances and not inside Docker containers, so they can affect other builds and potentially grab Jenkins secrets. This checks out the version from the upstream for the listed files after running `git checkout`. Tested in CI: [positive](https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-12604/6/pipeline/) and [negative](https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-12604/9/pipeline/)
Commit caf326f
Commits on Aug 31, 2022
-
[MetaSchedule] Complete NCHW Conv2D Winograd Kernel Scheduling (apache#12648)
* Complete winograd scheduling.
* Fix test.
Commit f7cc992
-
Commit f114d55
-
Commit c2824a8
-
[ETHOSN] Improve inferring new shape of the Reshape operator (apache#12594)
Fixes the case when the reshape has more than 4 dims. While this cannot be offloaded to the NPU, the check was previously producing an error that prevented further compilation. The correct behavior is to ensure the check returns False and not offload the reshape.
Nicola Lancellotti authored Aug 31, 2022
Commit acbbd9f
-
[TIR][TVMScript] Update printer / parser to make T.allocate return buffer var (apache#12412)
* Updated TVMScript syntax of `T.allocate` to return buffer var.
* Added syntax sugar for `T.decl_buffer`. When the `data` field is not specified, `data` will be implicitly created via an `Allocate` stmt.
* Updated the existing test cases. Most test cases can be updated by changing `T.allocate` to `T.decl_buffer`. `T.allocate` in some tests is updated to `T.allocate` + `T.buffer_decl`, to maintain the legacy behavior of allocation and implicit buffer declaration (will be followed up in a future PR to adopt `T.decl_buffer`).
Commit 0c37454
-
[Torch][AArch64] Skip test_load_model___wrong_language__to_pytorch (apache#12660)
This patch makes test_load_model___wrong_language__to_pytorch skipped on AArch64 due to a bug that can be reproduced when enabling Integration Tests on machines with Torch installed in TVM. The error message seen is:
```
OSError: /usr/local/lib/python3.7/dist-packages/torch/lib/libgomp-d22c30c5.so.1: cannot allocate memory in static TLS block
```
While the test needs further investigation, it is being set as skipped so that other tests can be enabled without regressing, and to allow time for the investigation. This relates to the issue described in apache#10673.
Commit d54c065
-
[ci] Add linter for PR title and body (apache#12367)
* [skip ci][ci] Fix Jenkinsfile (apache#12387)
  This got out of date after merging apache#12178
  Co-authored-by: driazati <driazati@users.noreply.github.com>
* Address comments
Co-authored-by: driazati <driazati@users.noreply.github.com>
Commit a399e6c
Commits on Sep 1, 2022
-
[TIR] Allow string/buffer arguments to Schedule cache_read/write (apache#12661)
Previously, the argument needed to be an integer specifying the index into the read/write regions of a block. Now, the argument can be a string specifying the name of the buffer, or the Buffer object itself. This is a follow-up from apache#11624.
Commit c6516a5
-
[ETHOSN] Fix tests pylint errors (apache#12649)
This PR fixes pylint errors in tests/python/contrib/test_ethosn as reported in issue apache#11414.
Nicola Lancellotti authored Sep 1, 2022
Commit aa6c712
-
[Relay] Extract intermediate node by its expression ID (apache#12646)
[Relay] Extract Intermediate Expr by relay expr ID for analysis
modify doc comments
Co-authored-by: Bin Li <binli1@amd.com>
Commit 38ba8c0
-
[Hexagon] Implement fixed_point_multiply op through intrinsics. (apache#12659)
This commit adds a high-performance implementation of the fixed_point_multiply operation based on Hexagon intrinsics for the vmpye/vmpyo instructions.
Benchmarking of the 'fixed_point_multiply' op with a (1,8,56,56,32) input tensor on Qualcomm SM8350:
* default implementation: 10.06 ms
* optimized implementation: 1.42 ms
* speedup: about 7x
Please note that this introduces a small rounding error for some corner cases with a negative shift argument (the same as for ARM CPU, see PR#5980). This is because we are rounding twice instead of only once:
* original q_multiply_shift: round(x*y*2^-s)
* hexagon q_multiply_shift: round(round(x*y)*2^-s)
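The double-rounding effect mentioned above can be reproduced with plain integer arithmetic. The sketch below is an illustration of the two rounding formulas only, not the Hexagon intrinsic implementation; it assumes a Q31 fixed-point multiplier (so `x*y` is scaled by 2^-31 before the extra right shift `s`) and round-half-up rounding, and all function names are hypothetical.

```python
def round_shift(value, bits):
    """Right-shift by `bits` (bits >= 1) with round-half-up."""
    return (value + (1 << (bits - 1))) >> bits


def multiply_round_once(x, y, s):
    # round(x * y * 2^-(31 + s)): a single rounding step at the end
    return round_shift(x * y, 31 + s)


def multiply_round_twice(x, y, s):
    # round(round(x * y * 2^-31) * 2^-s): round after the multiply, then again after the shift
    return round_shift(round_shift(x * y, 31), s)


# A corner case: x * y * 2^-31 is exactly 0.5, so the first rounding bumps it to 1.
x = y = 1 << 15
once = multiply_round_once(x, y, 1)
twice = multiply_round_twice(x, y, 1)

# A case away from the rounding boundary, where both approaches agree.
big = 1 << 20
agree = multiply_round_once(big, big, 1) == multiply_round_twice(big, big, 1)
print(once, twice, agree)
```

The two results differ only when the intermediate product lands on a rounding boundary, which matches the description of the error as affecting a few corner cases.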
Commit 038f15b
-
[MetaSchedule] Fix autoinline for single const consumer block (apache#12668)
fix autoinline and add test
Yuanjing Shi authored Sep 1, 2022
Commit 32f9a5f
-
Add methods to get and set late-bound constants. (apache#12664)
* Add methods to read and restore late-bound constants on Executable.
* Add bindings for new functions
* Cleanup
* Fix function name
* Add tests for python API to access new load/save functions
* Add another test for python API to access new load/save functions where there are no constants
Commit effcd22
-
[Adreno] Change compute/schedule for ToMixedPrecision pass (apache#12537)
Commit e814f79
-
[hexagon][tests] re-enable maxpool hardware test (apache#12676)
- Re-enable test_max_pool2d_slice.py when run on Hexagon hardware (as opposed to hexagon-sim). This is now safe because apache#11928 has been fixed.
Christian Convey authored Sep 1, 2022
Commit 54786bb
-
Commit 50dad0d
-
[MetaSchedule] Introduce `Union` and `OrderedUnion` in Database (apache#12628)
Following up apache#12520 and apache#12626, this PR introduces two database classes: `UnionDatabase` and `OrderedUnionDatabase`, both of which allow users to organically compose multiple databases together, so that the high-level IR (Relay, Relax) could select the best tuning records according to running time or a preferred order given by users.
To each query, `UnionDatabase` returns the best record among all the databases given; instead, `OrderedUnionDatabase` returns the record from the first database that responds to the query.
Used together, users may specify complicated dispatching patterns. The examples below demonstrate the use cases of and difference between `UnionDatabase` and `OrderedUnionDatabase`.
Assumption:
* db1, db2 do not have tuning records for the target workload.
* Each of db3, db4, db5 has tuning records r3, r4, r5 for the target workload respectively.
```python
### Case 1. `UnionDatabase`:
merged_db = ms.database.UnionDatabase(
    db1,  # no record
    db2,  # no record
    db3,  # has r3
    db4,  # has r4
)
# returns the better one between r3 and r4
merged_db.query_tuning_record(..., target_workload)

### Case 2. `OrderedUnionDatabase`
merged_db = ms.database.OrderedUnionDatabase(
    db1,  # no record
    db2,  # no record
    db3,  # has r3
    db4,  # has r4
)
# returns r3
merged_db.query_tuning_record(..., target_workload)

### Case 3. Mix-use scenario
merged_db = ms.database.UnionDatabase(
    db1,  # no record
    db2,  # no record
    db3,  # has r3
    ms.database.OrderedUnionDatabase(  # returns r4
        db4,  # has r4
        db5,  # has r5
    ),
)
# returns the better one between r3 and r4
merged_db.query_tuning_record(..., target_workload)

### Case 4. Another mix-use scenario
merged_db = ms.database.UnionDatabase(
    db1,  # no record
    db2,  # no record
    db3,  # has r3
    ms.database.UnionDatabase(  # returns the better one between r4 and r5
        db4,  # has r4
        db5,  # has r5
    ),
)
# returns the best one among r3, r4 and r5
merged_db.query_tuning_record(..., target_workload)

### Case 5. Yet another mix-use scenario
merged_db = ms.database.OrderedUnionDatabase(
    db1,  # no record
    db2,  # no record
    ms.database.UnionDatabase(  # returns the better one between r3 and r4
        db3,  # has r3
        db4,  # has r4
    ),
    db5,  # has r5
)
# returns the better one between r3 and r4
merged_db.query_tuning_record(..., target_workload)
```
Co-authored-by: sunggg <49998730+sunggg@users.noreply.github.com>
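The five cases above can be checked against a small self-contained model. This is a toy sketch of the two lookup policies, not the MetaSchedule implementation: `ToyDatabase`, `ToyUnion`, `ToyOrderedUnion`, the `query()` method, and the latency numbers are all hypothetical, and "best" is assumed to mean lowest latency.

```python
class ToyDatabase:
    """Hypothetical database holding at most one (name, latency) record."""

    def __init__(self, record=None):
        self.record = record

    def query(self):
        return self.record


class ToyUnion:
    """Returns the fastest record among all member databases."""

    def __init__(self, *dbs):
        self.dbs = dbs

    def query(self):
        records = [r for r in (db.query() for db in self.dbs) if r is not None]
        return min(records, key=lambda r: r[1]) if records else None


class ToyOrderedUnion:
    """Returns the record from the first member database that has one."""

    def __init__(self, *dbs):
        self.dbs = dbs

    def query(self):
        for db in self.dbs:
            record = db.query()
            if record is not None:
                return record
        return None


db1, db2 = ToyDatabase(), ToyDatabase()  # no records
db3 = ToyDatabase(("r3", 3.0))           # latencies are made-up numbers
db4 = ToyDatabase(("r4", 2.0))
db5 = ToyDatabase(("r5", 1.0))

case1 = ToyUnion(db1, db2, db3, db4).query()
case2 = ToyOrderedUnion(db1, db2, db3, db4).query()
case3 = ToyUnion(db1, db2, db3, ToyOrderedUnion(db4, db5)).query()
case4 = ToyUnion(db1, db2, db3, ToyUnion(db4, db5)).query()
case5 = ToyOrderedUnion(db1, db2, ToyUnion(db3, db4), db5).query()
print(case1, case2, case3, case4, case5)
```

Because both combinators expose the same `query()` interface as a plain database, they nest freely, which is what makes the mixed scenarios in cases 3 to 5 possible.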
Commit eecb7fd
Commits on Sep 2, 2022
Commit 8ca8f24
[COMMUNITY] Yaxing Cai -> Reviewer (apache#12683)
Please join me in welcoming Yaxing Cai (@cyx-6) as a new reviewer in TVM. Yaxing has brought the PackedFunc into the TVM object system ([RFC-051](apache/tvm-rfcs#51)), and designed and implemented the new parser infrastructure for TVMScript and meta-programming ([RFC-079](apache/tvm-rfcs#79)).
- [Commits History](https://github.com/apache/tvm/commits?author=cyx-6)
- [Code Review](https://github.com/apache/tvm/pulls?q=reviewed-by%3Acyx-6+)
Commit 4acddb1
[PyTorch] Fix aten::arange for pytorch (apache#12681)
fix arange for pytorch nightly 20220815
Yuanjing Shi authored Sep 2, 2022
Commit b2d6600
[MetaSchedule][UX] Convenient Object Creation (apache#12643)
This PR introduces a set of `.create` methods making it easier to create MetaSchedule objects. For example:

```python
ms.database.JSONDatabase(...)
ms.database.create("json")
ms.runner.RPCRunner(...)
ms.runner.create("rpc")
```

Besides, this PR allows `JSONDatabase` to be created via `work_dir`:

```python
db = ms.database.create("json", work_dir="/path/to/db/")
db = ms.database.create(work_dir="/path/to/db/")  # or even simpler
```
Commit bb56f2a
[ETHOSN] Fix some more pylint issues (apache#12675)
Fixing a few more pylint issues caught when using pylint==2.9.3. Change-Id: Ie7ca61e1a8083a40e0ffccf1418192966884707a
Commit 445a14f
[ETHOSN] Add support for concatenate with negative axis (apache#12686)
Supports offloading concatenate with a negative axis to the NPU. In addition, parameterized the concatenate unit tests.
Commit 0549a08
[ci][tvmbot] Trigger GitHub Actions after merging (apache#12361)
This fixes the issue where merging from GitHub Actions (i.e. with the default `GITHUB_TOKEN`) doesn't trigger post-merge GitHub Actions on the commit it creates in `main`. Instead, these jobs are triggered manually by a call to the Actions API after the merge has taken place.

This also updates the tvmbot testing code (and by extension some of the other CI testing code) to remove the fixtures for each test in favor of constructing them from a single sample at runtime. This makes it a lot easier to add new tests, to see what is different between each data sample, and to clean up the testing anti-patterns that were there before (e.g. `run()` instead of `pytest.mark.parametrize`; none of the tests in `test_ci.py` have changed).

Tested in driazati#36, which ran https://github.com/driazati/tvm/actions/runs/2881047903
Commit 7c7b0f7
[AutoTVM][Testing] Add `tune_relay` scripts (apache#12685)
Example:

```bash
python -m tvm.autotvm.testing.tune_relay \
  --workload bert_base \
  --input-shape '[1,64]' \
  --target "llvm" \
  --num-trials 800 \
  --rpc-host 192.168.6.66 \
  --rpc-port 4445 \
  --rpc-key 3090ti \
  --work-dir /logs/autotvm-bert_base \
  --cache-dir /cache-workloads \
  --graph-tuner True \
  --cpu-flush True \
  --backend graph
```
Commit 0cbf3aa
[ci] Add tests for PR linter (apache#12680)
This adds some checks for the current usages of the PR linter and fixes the case where the script would error uncleanly when a PR body was `null`.
Commit 4ed6564
[Adreno] Define memory_info for global.texture* (apache#12647)
There are now many warnings in the tuning process about undefined memory information when using textures. A definition is required as textures* are tagged.
Commit 2734d04
[Web][Emscripten] Update EMCC C++ standard to C++17 (apache#12693)
As a follow-up to apache#12337, updating the EMCC flags from `-std=c++14` to `-std=c++17`.
Commit 28cad58
Commits on Sep 5, 2022
[ETHOSN] Use pytest parameterization for integration tests (apache#12688)
Commit 5dcf622
Commits on Sep 6, 2022
[Apps] Pin android_camera TensorFlow/Keras dependency version (apache#12710)
At the moment, android camera is installing the latest TF and Keras, which is causing the following issue in CI:

```
File ".../keras/dtensor/lazy_variable.py", line 26, in <module>
    from tensorflow.python.trackable import base as trackable
ModuleNotFoundError: No module named 'tensorflow.python.trackable'
```

This patch pins both to their last known working versions: TF 2.9.1 and Keras 2.9.
Commit b3edb6e
[Hexagon][Runtime] Better support for 2-tier memory (apache#12574)
- Introduce 'global.ddr' memory scope:
  - Like 'global', this allocates memory from the Hexagon SoC's DDR memory.
  - Like 'global.vtcm', the specified tensor shape must be 1d or 2d, where 2d indicates Hexagon's "indirect tensor" (i.e., discontiguous) allocation scheme.
- Change memory-alignment strategy to always be 2048-byte aligned on Hexagon. (This can be refined in the future, but for now it ensures all allocations meet the strictest alignment requirements for any Hexagon operations.)
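As a small illustration of what "always 2048-byte aligned" means for an allocation's start address, here is a hedged sketch in plain Python. It is not the Hexagon runtime code; `align_up` and `is_aligned` are hypothetical helpers, and only the 2048-byte figure comes from the commit message.

```python
HEXAGON_ALIGNMENT = 2048  # bytes, the figure stated in the commit message


def align_up(address: int, alignment: int = HEXAGON_ALIGNMENT) -> int:
    """Round address up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (address + alignment - 1) & ~(alignment - 1)


def is_aligned(address: int, alignment: int = HEXAGON_ALIGNMENT) -> bool:
    """True if address is a multiple of alignment."""
    return address % alignment == 0


# A raw address of 5000 would be bumped to the next 2048-byte boundary.
aligned = align_up(5000)
```

Using one strict alignment everywhere trades some wasted bytes for the guarantee that any operation can consume any buffer.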
Christian Convey authored Sep 6, 2022
Commit 832cffa
[TIR][StorageRewrite] Allow in-place buffer reuse of non-flat memory (apache#12655)
* [TIR][StorageRewrite] Allow in-place buffer reuse of non-flat memory

  Previously, shared buffer use was entirely disabled for non-flat memory, since the existing checks for shared memory assume flat 1-d spaces. This was enforced in `FindAlloc` and validated in `PrepareNewAlloc`. The validation in `PrepareNewAlloc` could trigger if the buffer sharing was due to an in-place operation, and not through the `FindAlloc` function.

  In-place operations do not require N-d packing, nor do they introduce ambiguity in how different code generators may interpret non-flat physical indices. Therefore, this commit relaxes the validation in `PrepareNewAlloc`, allowing buffer reuse of non-flat buffers for in-place operations.

* Update new StorageRewrite with correct allocate/buffer_decl usage
Commit 744649e
Commit d4201a9
[Hexagon] Add optimized schedule for nn.pad (apache#12714)
Motivation: In quantized models, the nn.pad operation is typically not fused with QNN ops and lives as a standalone operation. In this case it uses the default injective schedule for the Hexagon target, which is not well optimized (based on analysis of real models such as ResNet50 INT8).

What was done: A new schedule for the Pad operation was implemented to replace the default injective schedule. For the Hexagon target, the injective schedule fuses all axes and vectorizes on 128/64/32 (depending on dtype). This works fine for Add, Sub, etc., but not for Pad. The new optimized schedule performs these steps (fusion + vectorization) only if the last tensor dimension is divisible by 128/64/32 (depending on dtype). This was done only for Hexagon; for other targets (x86, cuda, etc.) there are no changes and the default injective schedule is used.

Benchmark results on Snapdragon 888:

4d NHWC layout with ((0, 0), (1, 1), (1, 1), (0, 0)) padding, "uint8" dtype:

| shape | default schedule, ms | optimized schedule, ms | speedup |
|-------------------|----------------------|------------------------|-------------------|
| (1, 112, 112, 32) | 10.03 | 0.2 | 50.1x |
| (1, 56, 56, 128) | 0.099 | 0.085 | ~1x (no speedup) |

4d NCHW layout with ((0, 0), (0, 0), (1, 1), (1, 1)) padding, "uint8" dtype:

| shape | default schedule, ms | optimized schedule, ms | speedup |
|-------------------|----------------------|------------------------|-------------------|
| (1, 128, 56, 56) | 10.96 | 1.38 | 7.9x |
| (1, 32, 126, 126) | 1.66 | 1.58 | ~1x (no speedup) |
| (1, 32, 128, 128) | 13.98 | 2.66 | 5.25x |

5d NCHWc layout with ((0, 0), (0, 0), (1, 1), (1, 1), (0, 0)) padding, "uint8" dtype:

| shape | default schedule, ms | optimized schedule, ms | speedup |
|-------------------|----------------------|------------------------|-------------------|
| (1, 4, 56, 56, 32) | 6.39 | 0.29 | 22x |
| (1, 56, 56, 128) | 0.15 | 0.15 | ~1x (no speedup) |

Summary: For some input tensors we get up to a 50x speedup; for others performance is the same. No performance degradations were detected.
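The divisibility rule in the message can be sketched as a small predicate. This is an illustration under stated assumptions, not the schedule's actual code: the helper name `use_fused_vectorized_pad` is hypothetical, and the dtype-to-lanes table assumes a 128-byte vector divided by the element size, which the message only summarises as "128/64/32 (depending on dtype)". The message also does not say whether the check inspects the input or the padded tensor, so the sketch simply takes whichever shape the caller passes.

```python
from typing import Sequence

# Assumed mapping: 128-byte vector width divided by element size in bytes.
ASSUMED_LANES = {"uint8": 128, "int8": 128, "float16": 64, "float32": 32}


def use_fused_vectorized_pad(shape: Sequence[int], dtype: str) -> bool:
    """Return True when fusion + vectorization would be applied under the rule
    'last tensor dimension is divisible by the vector length for this dtype'."""
    lanes = ASSUMED_LANES.get(dtype)
    if lanes is None or not shape:
        return False  # unknown dtype or scalar: fall back to the non-vectorized path
    return shape[-1] % lanes == 0


divisible = use_fused_vectorized_pad((1, 56, 56, 128), "uint8")      # 128 % 128 == 0
not_divisible = use_fused_vectorized_pad((1, 112, 112, 32), "uint8")  # 32 % 128 != 0
```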
Commit 141b17b
[TVMC] Run module once by default (apache#12713)
* [TVMC] Run module once by default

  Currently, executing `tvmc run module.tar` runs the input model twice. For benchmarking this is to be expected, as the first run is used to prime caches etc. before taking a measurement. However, this is unintuitive as a default, especially when benchmarking is not always intended. This commit therefore changes the default `tvmc run module.tar` to a single run.

  After inspection, the double run is down to the use of the `.benchmark()` method, which runs (1 + repeat * number) executions in total. This means that at least two runs are required (i.e. when repeat=1, number=1). From the CLI point of view, benchmarking is only necessary when `--print-time` has been set. From the Python interface point of view, benchmarking is always run, but this may not always be necessary.

  This commit uses the `.run()` method to execute the model a single time by default. From the CLI this is used when `--print-time` is set to False, whereas from the Python interface it is used when `benchmark=False`. Otherwise, the `.benchmark()` method is used as before.

  Complementary to this change, the `repeat`, `number` and `end_to_end` parameters are only used when either `--print-time` or `benchmark` is set to True, and the documentation has been updated to indicate this.

  Change-Id: I18a38a9d430d660264f7fce5caf0779aa059fed3

* Improve documentation with number of executions when benchmarking

  Change-Id: Iecf557594420fcc9f3abcec5ce7d952db2c94271
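The execution counts discussed above follow directly from the (1 + repeat * number) formula. A minimal sketch of that arithmetic, assuming nothing about TVMC internals beyond what the message states; `total_executions` is a hypothetical helper, not a TVMC function:

```python
def total_executions(benchmark: bool, repeat: int = 1, number: int = 1) -> int:
    """Count model executions: a single run when not benchmarking, otherwise
    one warm-up run plus repeat * number timed runs."""
    if repeat < 1 or number < 1:
        raise ValueError("repeat and number must be at least 1")
    if not benchmark:
        return 1
    return 1 + repeat * number


plain_run = total_executions(benchmark=False)                   # new default behaviour
minimal_benchmark = total_executions(benchmark=True)            # repeat=1, number=1
larger_benchmark = total_executions(True, repeat=3, number=10)
```

This makes the motivation concrete: even the smallest benchmark configuration executes the model twice, which is why a separate single-run path is needed for the default.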
Commit da48e13
Commits on Sep 7, 2022
[Docs] Add Commit Message Guideline (apache#12689)
This commit adds the Commit Message Guideline text to Apache TVM documentation in ./docs/contribute/pull_request.rst, under section 'Submit a Pull Request', below subsection 'Guidelines', as a subsection named “Commit Message Guideline”. The text in the second-last item in subsection 'Guidelines' that mentions PR tags is also updated to refer to this guideline. This documentation will help guide contributors on how to write good commit messages when submitting code / creating Pull Requests, in accordance with RFC-0088: https://github.com/apache/tvm-rfcs/blob/main/rfcs/0088-commit-message-guideline.md
Commit 85bf80c
[TIR] Fix pragma_loop_partition_hint attrs should check its value (apache#12699)
Currently, LoopPartition doesn't check the value of the attribute key "pragma_loop_partition_hint". Whether pragma_loop_partition_hint is set to True or False, the result is the same, which is confusing when debugging. This PR makes LoopPartition check the value of the pragma_loop_partition_hint attribute.
Commit 6cd31e7
Commit 291dd2f
[ETHOSN] Add support for transpose convolution (apache#12674)
Adds support for offloading transpose convolution with an optional bias to the NPU. Co-authored-by: Samuel Panijel <samuel.panijel@arm.com> Co-authored-by: Leo Blonk <leo.blonk@arm.com>
Commit b55ffcd
[microTVM][Zephyr] Enable -O2 optimization on build by default (apache#12718)
* add speed optimization flag
* trigger
* add exception for qemu_riscv64
Commit ff9a530
Commit 269d536
[Build] Update C++ standard to C++17 for AOT, iOS, VTA (apache#12712)
Follow-up from apache#12337 and apache#12693, updating a few additional locations that specified C++14.
Commit 2622ac9
[TVMScript] IRBuilder methods for `IRModule` (apache#12694)
* IRBuilder methods for `IRModule`

  This PR introduces IRBuilder methods for `IRModule`.

  Co-authored-by: yongwww <yongcale@gmail.com>

* apply code review suggestion

  Co-authored-by: yongwww <yongcale@gmail.com>
Commit 010c662
[TFLite][CI] Update TensorFlow dependency to 2.9.1 (apache#12131)
This updates the TF version to be used in TVM CI to 2.9.1, which brings improvements so that more platforms are supported by official packages. When building TFLite, an update to CMake was also required, which is updated now to 3.18.4. ethos-u-vela dependency is also updated, from version 3.2.0 to 3.4.0 so that it is closer to the TensorFlow version being proposed here. This PR updates the Docker images scripting to install TF and TFLite. Change-Id: I290085f0c018ad57606f1295494c19ff6e1af2dd
Commit bee5627
[ci] Add onnx model to S3 (apache#12716)
Addresses this CI failure on `main`: https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/4235/pipeline/ Co-authored-by: driazati <driazati@users.noreply.github.com>
Commit 7f788dc
[ci] Re-balance shards (apache#12473)
Replace '> >' in templates with >>, NFC (apache#12615) The problem with greedy lexing of >> as an operator was solved in C++11, and now templates no longer require spaces between >'s. Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com>
Commit 546a7da
Commits on Sep 8, 2022
[TIR] Add unroll_loop_with_partition_hint_no_interval attr in LoopPartitionConfig to unroll loop (apache#12631)
Commit abb2aa0
[OpenCLML] CLML Profiling fixes corresponding to OpenCL Timer recent … (apache#12711)
* [OpenCLML] CLML Profiling fixes corresponding to OpenCL Timer recent changes.
* [OpenCLML] Review comments.
* review comment
Commit 6be04d7
Commit 62bdc91
[Relay] Change when int8 operations are converted to int16 on Arm (apache#12671)
Currently, Relay QNN uses its `helper_no_fast_int8_hw_legalization` to convert most `int8` convolution and dense operations into `int16` ones on Arm. This currently occurs on Arm chips except for `v8.2a` chips with `dotprod` support.

However, this behavior means that `int8` operations are replaced with `int16` ones on Cortex-M chips. On these chips `int16` is substantially slower: while it saves a few sign extension operations, it doubles the number of memory loads we need to perform.

This PR changes when `helper_no_fast_int8_hw_legalization` is used on Arm, and instead makes **not** doing this replacement the standard. We will only do this replacement if we are on a chip with ASIMD support but without `v8.2a` and `dotprod`. This ensures that Cortex-M microcontrollers do not have `int8` operations turned into `int16` ones.

I have also verified that this does, in fact, improve performance for some common models. For example, MobileNet_v1_0.25 on the Cortex-M4 saw a 10% performance improvement compared to before this change. Accuracy does not seem to be affected.
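The new rule can be read as a simple predicate over target features. The sketch below is an interpretation of the message, not the Relay QNN code: the function name and the three boolean flags are hypothetical, and "without `v8.2a` and `dotprod`" is taken to mean "lacking the fast `int8` path that `v8.2a` chips with `dotprod` provide".

```python
def should_convert_int8_to_int16(has_asimd: bool, is_v8_2a: bool, has_dotprod: bool) -> bool:
    """Decide whether int8 conv/dense ops are legalized to int16.

    Assumed reading of the commit message: convert only on chips that have
    ASIMD but lack the v8.2a + dotprod fast int8 path.
    """
    has_fast_int8 = is_v8_2a and has_dotprod
    return has_asimd and not has_fast_int8


# Cortex-M style target: no ASIMD, so int8 is left untouched.
cortex_m = should_convert_int8_to_int16(has_asimd=False, is_v8_2a=False, has_dotprod=False)
# ASIMD-capable chip without dotprod: int8 is still widened to int16.
asimd_no_dotprod = should_convert_int8_to_int16(has_asimd=True, is_v8_2a=False, has_dotprod=False)
# v8.2a chip with dotprod: fast int8 hardware is available, so no conversion.
v8_2a_dotprod = should_convert_int8_to_int16(has_asimd=True, is_v8_2a=True, has_dotprod=True)
```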
Commit cd99ca6
[CI][AArch64] Mark tests to be skipped due to torch crash (apache#12730)
Some integration tests are not being run on CI due to the configuration of the machine with onnx and torch not calling the integration tests script. This patch skips two more tests failing with the error message below:

```
"OSError: /.../torch/lib/libgomp-d22c30c5.so.1: cannot allocate memory in static TLS block"
```
Commit 2d36e46
[MetaSchedule] Mark two tests as xfail (apache#12733)
This patch marks two tests as xfail for further investigation:
* test_meta_schedule_integration_extract_from_resnet_with_filter_func
* test_meta_schedule_integration_extract_from_resnet
Commit 4f4bc26
[Test] Add tvm.testing.requires_libtorch (apache#12737)
Create a specific test dependency that maps to USE_LIBTORCH, which is disabled by default and is independent of torch being installed on the underlying machine. Without it, tests cause problems on machines that have torch installed but where TVM is built with USE_LIBTORCH OFF. Mark tests.python.contrib.test_libtorch_ops.test_backend with this new decorator.
Commit ed63012
[TIR] Handle axis_separators during FlattenBuffer (apache#12652)
* [TIR] Moved tir.FlattenBuffer to occur before tir.LowerOpaqueBlock

  For buffers with more than one physical axis, the `axis_separators` are required in order to know which groups of logical axes to fuse into each physical axis. The implementation in `tir.FlattenBuffer` assumed that all buffers were being flattened to a single physical axis. Because `tir.LowerOpaqueBlock` replaces the `BlockNode::alloc_buffers` with `Allocate` nodes, `tir.FlattenBuffer` no longer has access to the axis separators and performs inconsistent flattening for `Allocate` as opposed to `BufferLoad`/`BufferStore`. This was introduced in apache#12172, which decoupled the lowering/flattening steps.

  The commit reorders the `tir.FlattenBuffer` to occur before `tir.LowerOpaqueBlock`, to make use of the axis separators. Any `Allocate` nodes that exist at that point (e.g. from hand-written schedules) are still flattened to 1-d physical buffers, but the `BlockNode::alloc_buffers` are flattened according to the axis separators.

* Add unit test to validate non-flat memory after tvm.lower

* Explicitly write T.reads for test on BufferRegion updates

* Update incorrect docstring for test

* Use DeclBuffer information in FlattenBuffer

  The DeclBuffer node can be inserted during LowerOpaqueBlock, then provide the missing Buffer information required to flatten the allocation.

* Use T.allocate in unit tests

  With the insertion of `DeclBuffer` nodes, `LowerOpaqueBlock` no longer needs to be before `FlattenBuffer`, and has been moved back to its original position. Reverting the tests to use `T.allocate` instead of `T.alloc_buffer` more closely represents the functions as they are being lowered.

* Fix usage of T.decl_buffer in updated tests

* Update LowerOpaqueBuffer to expect the DeclBuffer nodes

* Strip DeclBuffer annotation in FlattenBuffer

  The DeclBuffer annotations aren't yet supported in all passes. This restricts them to being introduced in LowerOpaqueBuffer, then immediately removed in FlattenBuffer.

* Strip out all DeclBuffer nodes in FlattenBuffer

* Update unit tests to remove expectation of DeclBuffer nodes
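To make the role of `axis_separators` concrete, the following sketch computes a physical shape from a logical shape by fusing each group of logical axes into one physical axis. It is a plain-Python illustration, not the `tir.FlattenBuffer` pass; `physical_shape` is a hypothetical helper, and it assumes each separator is the index of the first logical axis of a new physical axis.

```python
from math import prod
from typing import List, Sequence


def physical_shape(shape: Sequence[int], axis_separators: Sequence[int] = ()) -> List[int]:
    """Fuse each group of logical axes, delimited by axis_separators,
    into a single physical axis whose extent is the product of the group."""
    bounds = [0, *axis_separators, len(shape)]
    if any(lo > hi for lo, hi in zip(bounds, bounds[1:])):
        raise ValueError("axis_separators must be sorted and within range")
    return [prod(shape[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


flat = physical_shape([2, 3, 4, 5])           # no separators: one flat axis
two_d = physical_shape([2, 3, 4, 5], [2])     # axes (0, 1) and (2, 3) fused separately
```

Flattening without the separators always yields the first form, which is the inconsistency the commit describes for `Allocate` nodes that had lost that information.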
Commit b2bd434
[TIR] Update region min/extent in ReplaceBufferMutator (apache#12725)
Prior to this commit, `ReplaceBufferMutator` only checks `BufferRegionNode::buffer` to determine if a `BufferRegion` needs to be replaced, and doesn't check the `BufferRegionNode::region`. As a result, updating `T.reads(A[B[i]])` would fail to replace `B`. This commit checks `BufferRegionNode::region` for buffer usage to resolve this issue.
Commit 299ca26
Move static array initialization into a function to avoid link errors (apache#12678)
* Move static array initialization into a function to avoid link errors
* Fix line length
Commit 64031d5
Commits on Sep 9, 2022
[TIR, Schedule] Check consumer in-bound and covered in reverse_compute_inline (apache#12717)
* [TIR, Schedule] Generate consumer-in-bound predicate after reverse_compute_inline
* Check consumer block iters are covered
* fix lint
Commit 89ce171
[ci][docker] Use CMake 3.20.0 for cortexm (apache#12744)
The Zephyr project builds require 3.20.0 to work correctly Co-authored-by: driazati <driazati@users.noreply.github.com>
Commit 1c5ffc6
Commit cb08a12
[CI] Update Docker images to bring TF 2.9 and integration tests (apache#12738)
[CI] Update Docker images to tag 20220908-060034-62bdc91b1

Updates all Docker images to tag 20220908-060034-62bdc91b1, to update TensorFlow/TFLite/Keras to 2.9, and cascaded dependencies such as numpy. Updates ethos-u-vela to 3.4.0. It also brings ONNX and PyTorch to ci_arm, to enable integration tests to be run in CI.

Standardises the minimum CMake version required in CI to be 3.18.4, fixing apps/microtvm/zephyr_cmsisnn to require this version.

Finally, adds a new import error in the tutorials documentation which doesn't affect the final result. The new warning added is 'absl:Found untraced functions such as _jit_compiled_convolution_op'
Commit 90fb79b
Aligned CMSIS-NN SHA in TVM to CMSIS top of tree (apache#12723)
Aligned CMSIS-NN SHA in TVM to top of tree of CMSIS.
- Aligned buffer size APIs to CMSIS implementations.
- Updated the tests to match new CMSIS context buffer sizes.
- This change needs updates to cortex-m docker image.

Change-Id: I13f1ad29fe0ef02f08660eca4c818b5d66145ffc
Commit 7596964
[microtvm][Zephyr] Add project overlay to overwrite device tree configs (apache#12741)
* add nucleo overlay
Commit 1d32c40
[TVMScript] Base IRBuilder methods for `PrimFunc` (apache#12745)
Base IRBuilder methods for `PrimFunc`

This PR introduces base IRBuilder methods for `PrimFunc`.

Co-authored-by: yongwww <yongcale@gmail.com>
Commit 8bd81e6
[TVMScript][TIR] Clarify scope of BlockNode::iter_vars (apache#12726)
Previously, it was ambiguous whether `BlockNode::iter_vars` were in-scope for `BlockRealizeNode::predicate`. `ConvertBlocksToOpaque` treated them as in-scope, and applied a mapping from `iter_vars` to `iter_values`. Similarly, TVMScript printing places `T.where` statements below the `T.axis` statements, where `T.axis` definitions are in scope. However, `BlockRealizeNode::SEqualReduce` and `BlockRealizeNode::SHashReduce` do not visit the block and `iter_vars` until after visiting the predicate, placing the `iter_vars` out of scope. This commit updates the printing of `T.where` to be above `T.axis`, and updates `ConvertBlocksToOpaque` to report an error if the predicate contains references to `BlockNode::iter_vars`. After this commit, these three usages all consistently treat `BlockNode::iter_vars` as out of scope for `BlockRealizeNode::predicate`.
Commit 14999f8
[OpenCL] Enable OpenCL for GPU tests (apache#12490)
* Add opencl target in test build script
* Fix fp16 test and compile test for opencl
* fix lint
* Fix relay OpenCL texture tests
* Fix lint
* Enable relay OpenCL tests
* Fix opencl relay texture tests
* fix lint
* Remove OpenCL gtest variable
* Fix unbound variable
* Skip tests that are not supported in CI
* fix lint
* Add path for opencl gtest directory
* Fix opencl gtests include directory
* Enable OpenCL googletest. Fix bug in opencl timer test
* testing fix for build cpp tests
* update googletest git version for opencl tests build
* update cmakelist
* Update CMakeList
* Update CMakeList
* Disable opencl googletests
* update OpenCL.cmake
* fix OpenCL.cmake
* Apply comments. Remove xfail decorator for opencl tests. Now specific tests are skipped in the environment script
* minor code changes
* apply comments
* apply comment
* skip test in ci by decorator
* fix pytest skipif warnings
* Fix skipif for opencl gtests
Commit 574794e
[Frontend][Paddle] Fix op in paddle didn't transmit layout information (apache#12658)
[Frontend][Paddle] Fix adaptive_avg_pool2d in paddle didn't transmit layout information
Commit b21bf66
[TIR][Arith] Add more strict checking in imm construction and folding. (apache#12515)
* Add more strict check in tir imm construction and folding.
* fix bool-compare compile error
* fix some illegal imm construction in testcases
* do not test i64 overflow behaviour because it is not consistent on cython and ctypes
* fix float32 testcase
* auto-inferred dtype should be int64 when value exceeds int32 range
* add floatimm range check for fp16 and fp32
* add more folding testcases and fix store fp32 folding result to double
* fix i386 fp16 cases
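Two of the items above, the int64 auto-inference and the stricter range checks, can be illustrated with a short sketch. This is plain Python under stated assumptions rather than the TIR implementation: `infer_int_dtype` and `check_int_range` are hypothetical helpers, and raising `OverflowError` is an illustrative choice of error type.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def infer_int_dtype(value: int) -> str:
    """Pick int32 when the value fits, int64 when it exceeds the int32 range."""
    if INT32_MIN <= value <= INT32_MAX:
        return "int32"
    if INT64_MIN <= value <= INT64_MAX:
        return "int64"
    raise OverflowError(f"{value} does not fit in int64")


def check_int_range(value: int, bits: int, signed: bool = True) -> None:
    """Raise OverflowError if value is not representable in the given integer width."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    if not lo <= value <= hi:
        raise OverflowError(f"{value} is out of range for {bits}-bit integer")


small = infer_int_dtype(42)
large = infer_int_dtype(2**31)  # one past INT32_MAX
```

Rejecting out-of-range immediates at construction time surfaces the error where the constant is written instead of later during folding.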
Commit 029fa46
[TOPI][Hexagon] Add test and schedule for uint8 resize2d (apache#12559)
* [TOPI][Hexagon] Add test and schedule for uint8 resize2d * Fix correctness issue * Reformat * Remove cubic from testing * Remove unnecessary else
Commit 4c05656
[TOPI][Hexagon] Implement quantized elementwise for hexagon (apache#12606)
* [TOPI][Hexagon] Add test and schedule for uint8 resize2d
* Fix correctness issue
* Reformat
* [TOPI][Hexagon] Implement quantized elementwise
* Reformat
* Address review comments
* Reformat
* Revert
* Address review comments
Commit 2eed663
Commits on Sep 10, 2022
[ETHOSN] Update driver stack version to 22.08 (apache#12650)
Updates the driver stack used by the NPU to the latest released version (semantic version 3.1.0), while maintaining backwards compatibility for the previous version 22.05 (semantic 3.0.1) during the migration period. In addition, support for split is re-introduced as this is now supported in 22.08. Change-Id: I86bce3469f0b8ad52e66461ae055dec6717b3527
Commit 76f91b4
Commits on Sep 12, 2022
Commit 286fade
[TVMScript] Base IRBuilder methods for `Block` (apache#12748)
This PR introduces base IRBuilder methods for `Block`.

Co-authored-by: yongwww <yongcale@gmail.com>
Commit 4c863fc
[MetaSchedule] Fix typo of compare between GlobalVar and str (apache#12704)
fix typo of compare between GlobalVar and str
Commit a63d03a
[CI] Always install into a python venv in ci containers (apache#12663)
This PR changes all ci_ to install TVM Python dependencies in a virtualenv separate from the system Python dependencies. Sets the stage for adding the poetry-based dependency generator to the CI container build process. * Always install into a python venv in ci containers. * Respect Dockerfile ENV PATH modifications in docker/bash.sh lookups.
Configuration menu - View commit details
-
Copy full SHA for a047e02 - Browse repository at this point
Copy the full SHA a047e02View commit details -
[Hexagon] Add Hand written HVX conv2d (apache#12204)
* [Hexagon] Add Hand written HVX conv2d Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com> * Address review comments Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com> * Add some more comments and a file rename * Add gtest unit tests for blockize/deblockize * Add gtest unit tests fp16 utils Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com>
Configuration menu - View commit details
-
Copy full SHA for b22b872 - Browse repository at this point
Copy the full SHA b22b872View commit details -
[TFLite] Support quantized GREATER op in TFLite frontend (apache#12754)
Support GREATER quantization operation conversion as part of issue apache#9187 Continuation of apache#11519.
Configuration menu - View commit details
-
Copy full SHA for 1222398 - Browse repository at this point
Copy the full SHA 1222398View commit details -
[Hexagon] Validate 2-d physical shapes for TIR-derived schedules (apa…
…che#12662) Previously, the test cases only tested TE-based schedules. This commit runs the same tests for equivalent TIR-based schedules as well. This is intended to catch Hexagon-specific regressions, such as the one resolved in apache#12652.
Configuration menu - View commit details
-
Copy full SHA for 9671aee - Browse repository at this point
Copy the full SHA 9671aeeView commit details -
[AutoTVM] Fix `None` feature in AutoTVM tuning (apache#12760)
This PR introduces a couple of fixes to make AutoTVM work more robustly: - Fixed a very rare case in which `None` could pop up in AutoTVM features; - Fixed a misuse of `ARGS` in the testing script; - Fixed the filename for caching.
Configuration menu - View commit details
-
Copy full SHA for 4d27664 - Browse repository at this point
Copy the full SHA 4d27664View commit details -
[MetaSchedule][Test] Migrate AddRFactor to SEqual (apache#12758)
This PR migrates the usage of `check_trace` to `check_sketch`, which prefers structural equality of TIRs instead of string equality of traces.
Configuration menu - View commit details
-
Copy full SHA for a23b71c - Browse repository at this point
Copy the full SHA a23b71cView commit details
Commits on Sep 13, 2022
-
[MetaSchedule][Test] Migrate `check_trace` to `check_sketch` (apache#12764)
* Migrate AutoBind * Migrate RandomComputeLocation * Migrate CrossThreadReduction * Migrate ParallelVectorizeUnroll
Configuration menu - View commit details
-
Copy full SHA for ef784d6 - Browse repository at this point
Copy the full SHA ef784d6View commit details -
[Hexagon] Create tests to showcase vtcm loading capabilities on Hexag…
…on. (apache#12667) * [Hexagon] Increase max buffer size for tvm_rpc_android to 1GB. * [Hexagon] Make errors more clear when unable to allocate VTCM buffers and throw an error to fail early. * [Hexagon] Add mem_copy_DLTensor to enable directly calling DMA for mem copies. * [Hexagon] Add new tests as examples of the performance to expect when copying data to VTCM. * [Hexagon] Reduce rpc max size. * [Hexagon] Fix test_parallel_hvx_load_vtcm.py test output to be human readable. * Comment out tests that only work on 8Gen1 HDKs to get CI to pass
Configuration menu - View commit details
-
Copy full SHA for 8058423 - Browse repository at this point
Copy the full SHA 8058423View commit details -
Configuration menu - View commit details
-
Copy full SHA for 64635b7 - Browse repository at this point
Copy the full SHA 64635b7View commit details
Commits on Sep 14, 2022
-
[FQ2I] Quantized constant bias (apache#12666)
* support fp32 constants in quantized bias add * add a test * clean up comment * assert the bias is floating point as well as constant before requantizing
Matthew Brookhart authoredSep 14, 2022 Configuration menu - View commit details
-
Copy full SHA for ab8fe34 - Browse repository at this point
Copy the full SHA ab8fe34View commit details -
[Hybrid] Fix handling AST subscription for Python3.9 (apache#12769)
Fixes apache#9955; this is covered by the existing test case `tests/python/relay/test_op_level3.py::test_unique`
Configuration menu - View commit details
-
Copy full SHA for 91bd9a3 - Browse repository at this point
Copy the full SHA 91bd9a3View commit details -
[AOT] Add AOTLowerMain pass to lower a Relay main into TIR (apache#12550)
Configuration menu - View commit details
-
Copy full SHA for f7f2cda - Browse repository at this point
Copy the full SHA f7f2cdaView commit details -
[OpenCLML] More ops and network coverage (apache#12762)
Added operators pooling (avg, max), binary operators (add, subtract, multiply, min, max) and concat. The Clip operator with min=0 and max=6 is remapped to relu6 to take advantage of CLML acceleration without sub-graphing it to the fallback path. Added new test cases for the operators listed above, and also end-to-end network test cases for Resnet50 & InceptionV3. CLML supports an FP16 arithmetic mode, which gives a significant performance boost over FP32. This PR enhances FP16 usage based on the operator datatype in the Relay graph. Co-authored-by: Krishna Raju quic_kvegiraj@quicinc.com Co-authored-by: Shwetank Singh quic_shwesing@quicinc.com
Configuration menu - View commit details
-
Copy full SHA for 2aa0d1f - Browse repository at this point
Copy the full SHA 2aa0d1fView commit details -
[Relay][TE] Use Relay parameter name to generated TE tensor name (apa…
…che#10516) * [Relay][TE] Use Relay parameter name to generated TE tensor name Previously, the TE placeholders representing relay function parameters were all named `"placeholder"`, which could be difficult to follow when debugging larger functions.
Configuration menu - View commit details
-
Copy full SHA for a408493 - Browse repository at this point
Copy the full SHA a408493View commit details -
[CI] Set USE_CMSISNN and USE_ETHOSU off in task_config_build_cpu.sh (a…
…pache#12456) The dependencies for these have moved into ci_cortexm Docker image, so there is not much point in building them for ci_cpu as we can't run the associated tests.
Configuration menu - View commit details
-
Copy full SHA for a0cbefb - Browse repository at this point
Copy the full SHA a0cbefbView commit details -
[TVMScript] IRBuilder methods for `PrimFunc` (apache#12755)
This PR introduces remaining IRBuilder methods for `PrimFunc`. Co-authored-by: yongwww <yongcale@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for 3d7439e - Browse repository at this point
Copy the full SHA 3d7439eView commit details -
[TIR][Meta-Schedule] Tuple-reduction scheduling support (apache#11639)
[TIR][MetaSchedule] Support Tuple Reduction This PR improves our TIR scheduling primitives/transformations (rfactor & cross-thread reduction) designed for reduction operators, so that they can be applied to blocks of tuple-reduction.
Configuration menu - View commit details
-
Copy full SHA for 421ff76 - Browse repository at this point
Copy the full SHA 421ff76View commit details -
Fixed pylint issues after moving to venv in ci_lint docker (apache#12775)
The following change introduced installing Python dependencies inside virtual environments: apache#12663. Prior to this fix, a different version of Python was being picked up that didn't catch the issues fixed in this commit. Change-Id: Ie290d9474a799311e07d293fa1b8299326b11661
Configuration menu - View commit details
-
Copy full SHA for 296565a - Browse repository at this point
Copy the full SHA 296565aView commit details -
[microTVM][Zephyr] Fix PLL freq. in overlay for nucleo_l4r5zi board (a…
…pache#12756) * [microTVM][Zephyr] Fix PLL freq. in overlay for nucleo_l4r5zi board Commit 1d32c40 ("Add project overlay to overwrite device tree configs") added an overlay for setting the 'clock-frequency' property of node 'rcc' to 120 MHz; however, to effectively change the PLL frequency that drives the core, it is also necessary to overlay the attributes for the 'pll' node. This commit does that. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org> * Remove div-p and div-q properties from overlay Remove div-p and div-q properties from the overlay file since values for these properties will be inherited from the 'pll' that is overlaid. Since currently microTVM does not use any subsystem which relies on clocks associated to either P or Q params, these params can be left unchanged for now. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Configuration menu - View commit details
-
Copy full SHA for e5adb83 - Browse repository at this point
Copy the full SHA e5adb83View commit details
Commits on Sep 15, 2022
-
[Arith][Refactor] Return Optional<PrimExpr> from TryConstFold (apache…
…#12784) Prior to this commit, the templated `TryConstFold` utility returned an undefined `PrimExpr` to represent a failure to perform constant folding. This commit makes this explicit by returning `Optional<PrimExpr>` instead.
Configuration menu - View commit details
-
Copy full SHA for 397cf87 - Browse repository at this point
Copy the full SHA 397cf87View commit details -
[TIR, Schedule] Add schedule primitive PadEinsum (apache#12750)
* [TIR, Schedule] Add schedule primitive PadEinsum Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> * lint * [TIR] Fix producer indices check in PadEinsum * address comments * simplify lambda expr * fix Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Configuration menu - View commit details
-
Copy full SHA for 1f8b5de - Browse repository at this point
Copy the full SHA 1f8b5deView commit details -
[Arith] Simplify nested if_then_else (apache#12749)
[Arith] Simplify nested if_then_else Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Configuration menu - View commit details
-
Copy full SHA for 9b10425 - Browse repository at this point
Copy the full SHA 9b10425View commit details -
[Docker][CI][RISC-V] Build riscv-isa-sim (spike) in ci_riscv Docker i…
…mage to enable RISC-V unit testing (apache#12534) * Remove CSI-NN from ci_cortexm docker image * [Docker] [RISC-V] Split up CSI-NN2 installation script into several files [Docker] [RISC-V] move gcc toolchain installation out of csi-nn2 script [Docker] [RISC-V] move qemu installation out of csi-nn2 script * use updated version of qemu * [Docker] [RISC-V] Install newlib (baremetal) gcc toolchain * [Docker] [RISC-V] Install spike simulator * [Docker] move initialization of timezone and DEBIAN_FRONTEND to ubuntu_install_core.sh script
Configuration menu - View commit details
-
Copy full SHA for f5517d4 - Browse repository at this point
Copy the full SHA f5517d4View commit details -
[Target] Print deprecation warning before canonicalisation in build m…
…odule (apache#12747) Hopefully fixes apache#12742. The warning should only be printed when a user passes `target_host`. Currently, if the user passes `None` as `target_host`, it is processed by `canon_target_map_and_host`, which seems to always produce a `target_host` and thus triggers the warning despite the user doing nothing wrong.
Configuration menu - View commit details
-
Copy full SHA for c900250 - Browse repository at this point
Copy the full SHA c900250View commit details -
[ci] Add retries to docker push (apache#12773)
This should mitigate failures like in https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/4274/pipeline. This also moves the `retry` function to a script now that we have PR apache#12604. Co-authored-by: driazati <driazati@users.noreply.github.com>
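The retry idea described above can be sketched as follows. This is a minimal Python stand-in, not the CI's actual helper (which the commit says lives in a script); the name `retry`, its parameters, and the injectable `sleep` are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    attempts: int = 3,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds or `attempts` calls have failed.

    Hypothetical helper: the last failure is re-raised unchanged so the
    caller still sees the original error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the final error
            sleep(delay_s)  # wait before the next try
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a flaky push would be wrapped as `retry(push_image)`, where `push_image` is whatever callable performs the push.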
Configuration menu - View commit details
-
Copy full SHA for c00ce57 - Browse repository at this point
Copy the full SHA c00ce57View commit details -
[ci][docker] Always build cmake from source (apache#12774)
This should fix some version drift in the current cmake versions in the Docker containers (currently running all of 3.10, 3.16, 3.18, and 3.20) Co-authored-by: driazati <driazati@users.noreply.github.com>
Configuration menu - View commit details
-
Copy full SHA for 111a88d - Browse repository at this point
Copy the full SHA 111a88dView commit details -
[ci] Remove author check from ping bot (apache#12788)
This has been working fine for a while, this code opens it up so it's not limited to the authors in apache#9983. Co-authored-by: driazati <driazati@users.noreply.github.com>
Configuration menu - View commit details
-
Copy full SHA for 5b43c62 - Browse repository at this point
Copy the full SHA 5b43c62View commit details -
Configuration menu - View commit details
-
Copy full SHA for afad20d - Browse repository at this point
Copy the full SHA afad20dView commit details -
[TVMScript] IRBuilder methods for `For` (apache#12786)
This PR introduces remaining IRBuilder methods for `For`. Co-authored-by: yongwww <yongcale@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for 6a05184 - Browse repository at this point
Copy the full SHA 6a05184View commit details -
[TVMScript] Fix parse minimal i32 literal for tir script (apache#12772)
This change tries to fix an issue due to apache#12515. Previously the logic for `-2147483648` was `parse(-literal)` = `-parse(literal)`, and all integer literals were converted to i32 (whether or not the literal value actually overflowed). Since apache#12515, parsing `2147483648` results in an i64-typed integer rather than i32, so `-2147483648` becomes an i64 integer too, which is not reasonable.
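The dtype problem described above can be illustrated with a small sketch. This is not TVM's parser; `literal_dtype`, `parse_negated_naive`, and `parse_negated_folded` are hypothetical names, and the only assumption is that a literal gets the narrowest of i32/i64 that holds its value.

```python
from typing import Tuple

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def literal_dtype(value: int) -> str:
    """Pick the narrowest dtype (int32 or int64) that holds `value`."""
    return "int32" if INT32_MIN <= value <= INT32_MAX else "int64"


def parse_negated_naive(magnitude: int) -> Tuple[int, str]:
    # Type the positive literal first, then negate: the negated value
    # inherits the dtype chosen for the magnitude.
    return -magnitude, literal_dtype(magnitude)


def parse_negated_folded(magnitude: int) -> Tuple[int, str]:
    # Fold the negation first, then choose the dtype from the final value.
    value = -magnitude
    return value, literal_dtype(value)
```

With these definitions, `parse_negated_naive(2147483648)` types the result as `int64` because `2147483648` does not fit in i32, while `parse_negated_folded(2147483648)` types it as `int32`, since `-2147483648` is exactly the minimum i32 value.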
Configuration menu - View commit details
-
Copy full SHA for 9a3b3dd - Browse repository at this point
Copy the full SHA 9a3b3ddView commit details
Commits on Sep 16, 2022
-
[community] Fix outdated contributor GitHub usernames (apache#12799)
A couple of these names were linking to 404 pages; this PR updates them to their current counterparts. Co-authored-by: driazati <driazati@users.noreply.github.com>
Configuration menu - View commit details
-
Copy full SHA for c96cc11 - Browse repository at this point
Copy the full SHA c96cc11View commit details -
[TIR] Add extra simplification in region cover analysis (apache#12800)
Added an extra simplification step to eliminate false-negative cases.
Configuration menu - View commit details
-
Copy full SHA for e6525a3 - Browse repository at this point
Copy the full SHA e6525a3View commit details -
[MetaSchedule] Enable Clone Function for Task-Level Classes (apache#1…
…2796) This PR introduces a clone function for each of the task-level MetaSchedule classes for convenient class deep copying. - [x] ScheduleRule - [x] Postproc - [x] Mutator - [x] SpaceGenerator - [x] SearchStrategy - [x] TuneContext
Configuration menu - View commit details
-
Copy full SHA for 02c2eae - Browse repository at this point
Copy the full SHA 02c2eaeView commit details -
[MetaSchedule][Test] MLT uses SEqual tests (apache#12805)
This PR finishes migration from `check_trace` (string-based equality check on TIR trace) to `check_sketch` (SEqual-based equality check on TIR). Here, we split multi-level-tiling into 3 files: - Plain multi-level tiling without any intrinsics - Multi-level tiling with intrinsics like VNNI, DP4a - Multi-level tiling with TensorCore which comes with different handling Besides, we cleaned up the testing folder and removed several methods that are no longer useful for unittests.
Configuration menu - View commit details
-
Copy full SHA for 77d0a28 - Browse repository at this point
Copy the full SHA 77d0a28View commit details -
[TVMScript] IRBuilder methods for `Axis` (apache#12808)
This PR introduces remaining IRBuilder methods for `Axis`. Co-authored-by: yongwww <yongcale@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for c0d2734 - Browse repository at this point
Copy the full SHA c0d2734View commit details -
[ci][docker] Fix nightly Docker tests (apache#12804)
These were broken due to this missing guard: https://ci.tlcpack.ai/job/docker-images-ci/job/docker-image-run-tests/223/console Co-authored-by: driazati <driazati@users.noreply.github.com>
Configuration menu - View commit details
-
Copy full SHA for 9b17f34 - Browse repository at this point
Copy the full SHA 9b17f34View commit details -
[MetaSchedule][Minor]Fix Random State Fork in TuneContext Clone Funct…
…ion (apache#12811) Fix random state fork in TuneContext Clone function.
Configuration menu - View commit details
-
Copy full SHA for 6b3be49 - Browse repository at this point
Copy the full SHA 6b3be49View commit details -
Fix for import requests and import caffe failures (apache#12813)
Virtual environments were recently introduced in the Docker images, which was a great contribution to localize errors: apache#12663. In this fix, the link to caffe is created inside this virtual env instead of being added to the system path of Python. This fix also removes importing the `requests` package where it is not needed. Fixes apache#12663
Configuration menu - View commit details
-
Copy full SHA for 8f8b6d8 - Browse repository at this point
Copy the full SHA 8f8b6d8View commit details -
[Hexagon] Reduce the number of tests run for VTCM testing in order to… (apache#12783)
[Hexagon] Reduce the number of tests run for VTCM testing in order to speed up CI.
Configuration menu - View commit details
-
Copy full SHA for 43d9a3b - Browse repository at this point
Copy the full SHA 43d9a3bView commit details -
[Hexagon] [runtime] Protect access to global HexagonBufferManager map (…
…apache#12807) * Protect access to global buffer manager map * Fix lint
Configuration menu - View commit details
-
Copy full SHA for 7c96e25 - Browse repository at this point
Copy the full SHA 7c96e25View commit details -
[ci] Fix docs push (apache#12810)
This was missing a repo checkout and failing as in https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/4302/pipeline. This also adds in the changes from apache#12719: Fixes apache#12600. The original solution there doesn't actually fix the issue; there would need to be some job queue that could make sure to reject old pushes. Since this case is pretty rare, the next commit that comes along and builds will generally fix everything up, so we can ignore failures that happen on `push`es.
Configuration menu - View commit details
-
Copy full SHA for 5d0a167 - Browse repository at this point
Copy the full SHA 5d0a167View commit details -
[ci] Add bot to post welcome comment (apache#12695)
This would post the comment that the tests bot and the docs comment bot uses straightaway when a PR is posted. This will contain links to generic info about posting PRs (and obviate the `.github/PULL_REQUEST_TEMPLATE.md`) as well as dynamic info about the specific PR (filled in later by the respective bots). This would make things like the auto-cc bot more transparent since it would have a link to the relevant issue. Tested live here: driazati#21 (comment)
Configuration menu - View commit details
-
Copy full SHA for e037ae4 - Browse repository at this point
Copy the full SHA e037ae4View commit details -
[Testing] Add decorator tvm.testing.requires_cuda_compute_version (ap…
…ache#12778) * [Testing] Add decorator tvm.testing.requires_cuda_compute_version Previously, individual unit tests would call `tvm.contrib.nvcc.get_target_compute_version` and return early. This was repeated boilerplate in many tests, and incorrectly reported a test as `PASSED` if the required infrastructure wasn't present. This commit introduces `tvm.testing.requires_cuda_compute_version`, a decorator that checks the CUDA compute version and applies `pytest.mark.skipif`. If required infrastructure isn't present, a test will be reported as `SKIPPED`. * requires_cuda_compute_version skips test when no GPU is present
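The skip-instead-of-early-return pattern described above can be sketched as follows. This is not the `tvm.testing` implementation: the commit uses `pytest.mark.skipif`, whereas this stand-in uses the standard library's `unittest.skipIf` so it runs without extra packages. `detect_compute_version` and `requires_compute_version` are hypothetical names.

```python
import unittest
from typing import Optional, Tuple


def detect_compute_version() -> Optional[Tuple[int, int]]:
    """Hypothetical probe; a real one would query the GPU.

    Returns None here to model a machine without a GPU.
    """
    return None


def requires_compute_version(major: int, minor: int = 0):
    """Skip the decorated test unless the detected version is >= (major, minor)."""
    found = detect_compute_version()
    too_old = found is None or found < (major, minor)
    reason = f"requires compute version >= {major}.{minor}, found {found}"
    return unittest.skipIf(too_old, reason)


class ExampleTest(unittest.TestCase):
    @requires_compute_version(8)
    def test_needs_new_gpu(self):
        self.fail("never runs without a suitable GPU")
```

Because the check lives in the decorator, the test runner records the test as skipped with a reason, rather than counting an early `return` as a pass.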
Configuration menu - View commit details
-
Copy full SHA for aded9d4 - Browse repository at this point
Copy the full SHA aded9d4View commit details -
[Hexagon] Add debug option to hexagon pytest (apache#12795)
* add debug option to hexagon pytest * address comment
Configuration menu - View commit details
-
Copy full SHA for bb80f19 - Browse repository at this point
Copy the full SHA bb80f19View commit details -
[Hexagon] [runtime] Improve runtime resource management (apache#12727)
* First pass at improving runtime resource management * Add unit test * Fix lint and clang format errors * Disable resource reset for simulator * Moved acquire/release calls to session object, separate buffer managers for non-runtime (static) and runtime (dynamic). * Fix lint errors * Fix lint errors * Improve robustness of session shutdown * Fix lint * Address feedback * Only allow call to Acquire in a clean state * Use a pointer to indicate the "active" manager
Configuration menu - View commit details
-
Copy full SHA for 38f53e8 - Browse repository at this point
Copy the full SHA 38f53e8View commit details
Commits on Sep 17, 2022
-
[TVMScript] IRBuilder methods for `Block` (apache#12815)
This PR introduces remaining IRBuilder methods for `Block`. Co-authored-by: yongwww <yongcale@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for 41b65a3 - Browse repository at this point
Copy the full SHA 41b65a3View commit details -
[TIR] Support pattern matching argmax/argmin generated by TOPI (apach…
…e#12827) This PR introduces two reducers to the TIR reduction part, so that rfactor and cross-thread reduction can be applied to functions that contain argmax/argmin computation generated by TOPI.
Configuration menu - View commit details
-
Copy full SHA for 2cae905 - Browse repository at this point
Copy the full SHA 2cae905View commit details -
[TIR] Construct the inverse in SuggestIndexMap (apache#12797)
Computing the inverse mapping requires arithmetic analysis which is not guaranteed to cover all cases. We provide the pre-defined inverse index map instead.
Configuration menu - View commit details
-
Copy full SHA for 91cce56 - Browse repository at this point
Copy the full SHA 91cce56View commit details -
[BugFix][TIR] Fix Buffer LCA Detector (apache#12819)
Prior to this PR, the LCA detector of buffers in TIR didn't take buffer memory scopes and GPU hierarchy into consideration. A consequent issue is that, when an intermediate buffer is in global memory, TIR's lowering passes don't necessarily allocate the intermediate buffer outside all `blockIdx`. As a result, the global intermediate buffer is allocated under a GPU thread block, which is illegal. This PR fixes this issue by fixing the LCA detector, making it aware of the buffer memory scopes and GPU hierarchy. With this fix, the global intermediate buffers are all allocated outside `blockIdx`.
Configuration menu - View commit details
-
Copy full SHA for e92f5d4 - Browse repository at this point
Copy the full SHA e92f5d4View commit details -
[TVMScript] Add more helper functions to the printer infra (apache#12829)
This PR is split from apache#12492, to make the necessary updates to the printer infra for future PRs of TIR printer. Tracking issue: apache#11912 Co-authored-by: Greg Bonik <gbonik@octoml.ai>
Configuration menu - View commit details
-
Copy full SHA for 1ecf084 - Browse repository at this point
Copy the full SHA 1ecf084View commit details
Commits on Sep 18, 2022
-
[MetaSchedule] Relax conditions of rule Cross-Thread Reduction (apach…
…e#12825) This PR relaxes the conditions of Meta-Schedule schedule rule CrossThreadReduction. The rules are previously a bit over-strict, and some workloads with small reduction loop length are unable to be optimized by cross-thread reduction automatically. In this PR, we relax the rules so that such workloads can be optimized.
Configuration menu - View commit details
-
Copy full SHA for d1871a6 - Browse repository at this point
Copy the full SHA d1871a6View commit details -
[TVMScript] IRBuilder methods for `Stmt` (apache#12830)
This PR introduces IRBuilder methods for `Assert`, `Let`, `Realize`, `Evaluate`, `LaunchThread`, `EnvThread`. Co-authored-by: yongwww <yongcale@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for b2c5add - Browse repository at this point
Copy the full SHA b2c5addView commit details -
[TVMScript] IRBuilder methods for `Stmt` (apache#12831)
This PR introduces IRBuilder methods for `allocate`, `Let`, `allocate_const`, `attr`, `While`, `If/Then/Else`, `decl_buffer`, `buffer_store`, `prefetch`. Co-authored-by: yongwww <yongcale@gmail.com>
Configuration menu - View commit details
-
Copy full SHA for 052e702 - Browse repository at this point
Copy the full SHA 052e702View commit details
Commits on Sep 19, 2022
-
[Frontend][TFLite] fix detection_postprocess's non_max_suppression_at…
…trs["force_suppress"] (apache#12593) * [Frontend][TFLite] Fix detection_postprocess's non_max_suppression_attrs["force_suppress"] TVM only supports the detection_postprocess operator when use_regular_nms is false, in which case the TFLite NMS implementation suppresses boxes that exceed the threshold regardless of class. For the TVM and TFLite results to be consistent, we need to set force_suppress to True. * [Frontend][TFLite] Fix detection_postprocess's non_max_suppression_attrs[force_suppress] Added a test case that reproduces inconsistent results between TVM and TFLite when force_suppress is false; setting force_suppress to true gives the correct result.
Configuration menu - View commit details
-
Copy full SHA for 60cf692 - Browse repository at this point
Copy the full SHA 60cf692View commit details -
[TIR] Implement API for padded layout transformations (apache#12720)
Implementation of API in `tvm.tir.schedule` for layout transformations with padding, as part of apache#12261, item "Insert pad value into generated TIR, using `tir::if_then_else`, `builtin::assume`, and `builtin::undef`". Following the RFC discussion in apache/tvm-rfcs#77 (comment) and apache/tvm-rfcs#77 (comment), this commit preferentially rewrites the loops that surround a padded transformation where possible, in order to express padding in terms of `tir::if_then_else`.
Configuration menu - View commit details
-
Copy full SHA for 2af9b90 - Browse repository at this point
Copy the full SHA 2af9b90View commit details -
Configuration menu - View commit details
-
Copy full SHA for f417555 - Browse repository at this point
Copy the full SHA f417555View commit details
Commits on Sep 20, 2022
-
Configuration menu - View commit details
-
Copy full SHA for 0123e1a - Browse repository at this point
Copy the full SHA 0123e1aView commit details -
Configuration menu - View commit details
-
Copy full SHA for be0a16c - Browse repository at this point
Copy the full SHA be0a16cView commit details
Commits on Sep 21, 2022
-
Configuration menu - View commit details
-
Copy full SHA for 23a6658 - Browse repository at this point
Copy the full SHA 23a6658View commit details
Commits on Sep 22, 2022
-
Configuration menu - View commit details
-
Copy full SHA for 058b8ee - Browse repository at this point
Copy the full SHA 058b8eeView commit details
Commits on Sep 23, 2022
-
Configuration menu - View commit details
-
Copy full SHA for 03d630f - Browse repository at this point
Copy the full SHA 03d630fView commit details