
merge to newest code #2

Merged
merged 166 commits on Aug 3, 2021

Conversation

jiangjiajun
Owner

Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.

Hzfengsy and others added 30 commits July 1, 2021 14:46
…8381)

After fix a66186b, I saw that the same fix is needed for
depthwise_conv2d for Intel graphics. The removed code was never used
and is identical to the code in cuda/depthwise_conv2d.py, so we can
fall back to the CUDA implementation when it becomes necessary.
* fix type relation for batch_matmul

* fix lint
* Fix np.int and np.float usage in the tree.

Newer versions of numpy give loads of warnings that suggest
that np.int and np.float will be deprecated. CI uses pytest
and these warning logs clog memory for testing and make it
slower.
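
A minimal sketch of the replacement this refers to (the dtype choices are illustrative):

```
import numpy as np

# np.int and np.float are deprecated aliases for the builtin types, so the
# builtins (or an explicitly sized dtype) are used instead.
a = np.zeros(4, dtype=int)        # instead of dtype=np.int
b = np.ones(4, dtype=float)       # instead of dtype=np.float
c = np.arange(4, dtype=np.int64)  # when a specific width is intended
```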

* Fix formatting
* rename _update_target and document its function

* make tvm.build return OperatorModule to return multiple outputs

* allow retrieving the var names used in TIR repr

* add Operator Model Library Format and test

* Add pathlib convenience functions to utils.TempDirectory.

* fix tests

* black format

* git-clang-format

* pylint fixes

* add asf header

* change memory map to make more sense, fix tests

* address giuseros comments

* align GetVarName with future TypedPackedFunc

* fix test

* clang-format

* rev model library format to v4 (bad merge)
Remove warning about macOS support from tutorial
* add stm32l4r5zi_nucleo

* add parameter for test qemu

* file type check

* fix test

* change order

* revert
* fix weight shape in torch.mm conversion

* Revert "fix weight shape in torch.mm conversion"

This reverts commit a1a8fd3.

* [Torch] remove unused conversion
* [Arith] Inverse affine map

* [Arith] Inverse affine map

* Update iter_affine_map.h

* Update iter_affine_map.h

* Update iter_affine_map.py

* Topology order visit

* doc

* fix

* address comments

* lint

* remove print
* Support test aten::flip

* Support aten::flip
* rename resize to resize2d

* refactor resize_2d

* Add resize1d op, normalize attribute names across ops

* normalize resize3d to match the API of 1D and 2D

* fix lint

* fix relay tests from API change

* refactor topi tests, docs

* fix method naming in framework frontends

fix more frontend issues

* refactor resize tests to reuse components, add more coordinate transform modes to tests

* add cubic resize reference kernel and tests, add relay tests for resize1d

* fix pylint

* fix test typo
* [fix] Broken link in apps for wasm-standalone

* [fix] Broken link in apps for wasm-standalone

* [CI] Manual trigger for CI
Co-authored-by: Jackson Hsieh <chengpi@amazon.com>
In a similar vein to previous pull requests, this replaces deprecated
use of np.bool and np.int from numpy with bool and int.

https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
* [ONNX] Wrap 'If' if it has multiple outputs

Without this wrapper, an assertion in from_onnx() will fail with the
error message showing "Number of output mismatch".

* [ONNX] Test If nodes with multiple output tensors

* Fix formatting issues
* Fix AttributeError when TEST_DATA_ROOT_PATH is set

Initiate a Path object from TEST_DATA_ROOT_PATH to fix the error:
AttributeError: 'str' object has no attribute 'mkdir'
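
A minimal sketch of the fix described here (the fallback path is a placeholder):

```
import os
from pathlib import Path

# Wrapping the environment variable in a Path provides .mkdir(); a bare str
# does not have that attribute, which caused the AttributeError above.
root = Path(os.environ.get("TEST_DATA_ROOT_PATH", "/tmp/tvm_test_data"))
root.mkdir(parents=True, exist_ok=True)
```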

* [DOCS] Add docs for Pass Instrument

 - Add a tutorial about how to use pass instrument.
 - Add related sections in Pass Infrastructure documents.

* Fix ir.rst, the length of separator.

* Fix unused local name

* Fix linting errors

* Fix linting errors

* Fix linting errors

* Address code-review feedbacks

* Fix linting

* Fix the order of tutorial.

* Add exception handling. Address feedbacks.

* Fix CI error -- clearing instruments in global pass_ctx

* Clarify section hierarchy.

* Emphasize to use decorator instead of subclassing

* Add a sentence to explain Pass Instrument. Fix typo.

* Shrink python docs a little.

* Fix tag name.

* Address feedbacks.
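
A minimal sketch of the decorator-based style the tutorial emphasizes, assuming the method and argument names from the pass-infrastructure docs (treat the details as assumptions, not the tutorial's exact code):

```
import tvm
from tvm import relay
from tvm.ir.instrument import pass_instrument

@pass_instrument
class PassCounter:
    """Counts how many passes run inside the instrumented PassContext."""

    def __init__(self):
        self.count = 0

    def run_before_pass(self, mod, pass_info):
        self.count += 1

x = relay.var("x", shape=(2, 2), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x], x + x))

counter = PassCounter()
with tvm.transform.PassContext(opt_level=3, instruments=[counter]):
    relay.build(mod, target="llvm")
print(counter.count)
```
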
Duplicate the CompileEngine interface.

Refactor the graph_runtime_codegen to invoke the new LowerTE pass

More changes

Things appear to be working

Some tracing to get Relay code to flow through too.

Disable some assertions as exp.

Tweak printing for now

Fix a few bugs: (#13)

1. Don't add relay main function to list of lowered TIR functions
2. Don't skip visiting call to relay function in graph runtime codegen

Remove debug prints.

Start refactoring

Split out shared data structures

Fix implicit duplicate decl of IsDynamic

Clean up handling of name + global prim fn

Clean up the code and debug issue introduced by previous hack

Clean up the debugging

Do C++ lint clean up

Update src/relay/backend/graph_executor_codegen.cc

Co-authored-by: Chris Sullivan <csullivan@octoml.ai>

Clean up handling of external functions

Add more error messages

More clean up

Update src/runtime/graph_executor/graph_executor.cc

Co-authored-by: Chris Sullivan <csullivan@octoml.ai>

Update src/runtime/graph_executor/graph_executor.cc

Co-authored-by: Chris Sullivan <csullivan@octoml.ai>

Update src/relay/backend/te_compiler.h

Co-authored-by: Haichen Shen <shenhaichen@gmail.com>

Update src/relay/backend/te_compiler.h

Co-authored-by: Haichen Shen <shenhaichen@gmail.com>

Fix

CR

More CR

Format

Fix lowering path for C++

Fix tests

Remove unnecessary change

Clean up a few more things

CI fix

Fix the default context

Fix

Fix broken test cases

Update

Fix

WIP

Clean up storage data structures

WIP

WIP

Fix build errors

Remove TVMLower

Fix lint

Lint again

fix black

Move UpdateMainWorkspaceSize into te_compiler.cc

Fix link errors

Formatting

Change UpdateMainWorkspaceSize to return Map<String, FunctionInfo>

Workaround for GCC 5 error caused by enums in maps (GCC 5 is on i386 CI)

Testing how functions should be named

Lint

Change how function metadata is updated

Attempt to update aot_executor_codegen to use new StaticMemoryPlan instead of storage_device_map

Pass memory plan through LowerTE into UpdateMainWorkspaceSize so that we don't need to run GraphPlanMemory an extra time

Fix return in UpdateMainWorkspaceSize

Lint

Try to fix UpdateMainWorkspaceSize

Fix construction of static memory plan

Clean up code while debugging

Adding UpdateWorkspaceSize back

Add closure + call to UpdateFunctionMetadata (WIP)

UpdateFunctionMetadata builds; weird error with device ctx map though. Not sure if it came from this change or something else

Add some debugging of UpdateMainWorkspaceSize

Starting to move UpdateFunctionMetadata call to use process_fn infra

What target should be passed to UpdateFunctionMetadata?

UpdateFunctionMetadata is not working

Added some comments about UpdateFunctionMetadata for Jared

Fix the creation of function metadata

Try another stab at cleaning up the information

Fix

Port StorageInfo and StaticMemoryPlan data structure (#8297)

Restoring reshape opt

Fix tests

Caught a nasty typo from Lily, Map::Set does not mutate

Format

Disable stupid Google style warning

Rebase cleanup

Formatting

Add docstring for storage info

Black

Post rebase fix

Remove prints

Disable assert that doesn't make sense for now

Fix lint

Add copying attrs from relay node to graph node; still need to figure out how to do this in the case of global vars

Work with Lily to fix graph attrs

Try to figure out where extra arguments are coming from; fix merge

passes the profiling test

Clean up

Fix profile test

Remove debugging

Add attributes for BYOC uTVM case

Format

Dumb typo

Another fix for byoc

Format

Fix last 3 failing tests

Format

Fix final two test cases

Format

Fix lint

Fix again

Fix

Fix auto scheduler code

Fix issue

Address CR comment

Format

Co-authored-by: Jared Roesch <roeschinc@gmail.com>
When dilation is larger than 1 in conv2d with NHWC
layout, the ordering of indexes used to access the data array
in the convolution computation is incorrect.

'data_vec' is defined as

lambda n, oho, owo, kh, kw, ic, ohi, owi:

But accessed as

data_vec[n, oho, owo, kh, kw, ohi, owi, ic]

This patch fixes the order of indexes and modifies the test
so that it is suitable for running on an AArch64 CPU.
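
A minimal sketch of the layout contract being fixed (shapes and names are illustrative, not the actual schedule):

```
import tvm
from tvm import te

# data_vec is declared with axes (n, oho, owo, kh, kw, ic, ohi, owi), so any
# read must use that same ordering; the bug read it as (..., kh, kw, ohi, owi, ic).
n, oho, owo, kh, kw, ic, ohi, owi = 1, 4, 4, 3, 3, 8, 2, 2
data_vec = te.placeholder((n, oho, owo, kh, kw, ic, ohi, owi), "float32", "data_vec")
read_back = te.compute(
    (n, oho, owo, ic, ohi, owi),
    lambda b, ho, wo, c, hi, wi: data_vec[b, ho, wo, 0, 0, c, hi, wi],
    name="read_back",
)
```
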
* [Relay] Add support of conv2d with NHWC for Mali

Added template schedule for conv2d NHWC reusing similar strategy
as for NCHW layout. The schedule is also added to the
corresponding test that can be run to verify correctness.

* [Relay] Fix issue from pylint in conv2d for Mali
With either the ci_lint docker image, or the matched version of
pylint==2.4.4, I got two lint errors running locally that didn't show
up in the CI.  Fixing them.

Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
- Some ops (e.g. view) call infer_value when converting a model into Relay IR.
- If LLVM is not enabled, this leads to a segmentation fault.

Co-authored-by: kueitang <kueitang@qti.qualcomm.com>
* [Bug] Fix x86 dense schedule extern ops

* more

* lint
Matthew Brookhart and others added 25 commits July 30, 2021 09:49
* Otherwise, stale pytest-results could appear in builds.
* Fix storage_access not visiting else branch

* fix conflict with #8516 in the test

* update thread sync test following #8516 update
* add flag

* fix and test

* format

* fix memory memory_align function

* fix and address comments

* format

* fix crt aot test

* comments

* fix test

* trigger

* trigger

* trigger

* trigger

* trigger

Co-authored-by: Mehrdad Hessar <mhessar@ip-172-31-20-199.us-west-2.compute.internal>
* [Vulkan] Rewrote PointerValueTypeRewrite transform

In C-style codegen, pointer types can be freely cast between scalar
and vectorized types (e.g. `float16x4* <-> float16*`).  In SPIR-V,
these are separate types, and no such casting is allowed.  This was
previously handled by having a special-case for `Ramp(base, stride=1,
lanes)` in the codegen.  That method didn't cover all possible cases,
including Broadcast nodes used as indices.

PointerValueTypeRewrite previously re-wrote the AllocateNode and
parameter pointer types, but didn't update the Load/Store node.  This
change tracks which variables can be updated to a vectorized type, and
then updates all references to those.  This includes removing the
`RampNode`, as the vectorization is then included as part of the
variable type.

* [StorageRewrite] Updates as recommended in review.

- Added explicit TODO(Lunderberg) for follow-ups

- Pass `checker.info_map_` instead of `checker` to
  `VectorTypeRewriter`

* [Vulkan] Allow for pointer rewrites that change base type.

A single memory allocation may have more than one type of data stored
within it.  This allows the PointerTypeRewrite pass to recognize if a
function only uses the pointer as a particular base type.  This wasn't
an issue in C-based codegen, but is required for Vulkan.  Since Vulkan
shaders do not permit type-casting, the cast must be done when passing
the pointer argument into the shader.

Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
* [TOPI][CUDA] Improve the performance of scatter_nd by:

1. Split into 2 kernels, one does the "Init" and another does the "Update".
   Thus they can have different Grid/Block configurations to better utilize
   SMs.
2. Use atomic_add instead of direct assignment, which avoids the race
   condition when multiple indices point to the same location of the output
   tensor. With this modification, it is now safe to use more CUDA threads
   to gain more parallelism.
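
A minimal sketch of the operator this schedule serves, assuming the relay-level scatter_nd signature with a mode argument (shapes are illustrative):

```
from tvm import relay

# With mode="add", duplicate indices accumulate into the same output slot,
# which is why the CUDA kernel described above needs atomic_add.
data = relay.var("data", shape=(4,), dtype="float32")
indices = relay.var("indices", shape=(1, 3), dtype="int64")
updates = relay.var("updates", shape=(3,), dtype="float32")
out = relay.scatter_nd(data, indices, updates, mode="add")
func = relay.Function([data, indices, updates], out)
```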

* Fix python code format.

* FIX: [TOPI][CUDA] Improve the performance of scatter_nd #8479

- Split ScatterND kernel into 2 sub-kernels using ib.new_scope()

- Replace ib.for_range() with blockIdx.y

- Using atomic_add when mode == "add"

- Keep threadIdx.x less than max_threads of GPU

* Comment added

* Add fallback implementation when "mode=add" meets int64

- Atomic_add from CUDA doesn't support int64 data type
- Change "ind{i}" to "ind%d" % i so that the names of relay.var display correctly

* Python format

* Fix line too long

* CI pass

* Empty, for CI pass

* Empty, for CI pass

* Empty, for CI pass

* Empty, for CI pass

* Empty, for CI pass

* Exchange blockIdx.x and blockIdx.y

* check for Vulkan or metal

* Fallback to previous algorithm when mode==update

* Update python/tvm/topi/cuda/scatter.py

Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>

* Assign TODO

* Swapping then and else block

Co-authored-by: wenxizhu <wenxizhu@tencent.com>
Co-authored-by: CaptainDuke <captainduke328@gmail.com>
Co-authored-by: Tristan Konolige <tristan.konolige@gmail.com>
* ccache

* ccache

Fix formatting

Add comment about nvcc

Change default to AUTO

More progress

Add auto as a mode

Disable ccache in CI

add-cache-to-cmake

Fix typo

* Fix rebase

* flaky test
This adds support for the external code generation tests to use AOT. As
part of this the existing logic in check_result was split out into
multiple functions, this allows selectively disabling those that aren't
supported such as JSON outputs not being supported in AOT. I've replaced
existing checks to skip tests with @pytest.mark.skipif macros as they've
been moved out of the `check_result` function.
* The fix to disable cache needs to run after pip is installed
* This is a quick follow-up fix after #8575
* [runtime] Remove unused parameter.

* fix build issue when TVM_CRT_DEBUG enabled
* Introduce --interface-api={c,packed} parameter

This introduces structures generated to provide a documented and stable user
friendly interface to a TVM generated model, as can be seen in the AOT
demo application:
```
struct tvmgen_default_inputs inputs = {
  .input_1 = input_data,
};
struct tvmgen_default_outputs outputs = {
  .output = output_data,
};
int ret_val = tvmgen_default_run(&inputs, &outputs, NULL, NULL);
```

To facilitate this, some other changes are included:
* Removed dependency on `aot_executor.{c,h}` in tests, pending the
discussion in the interface RFC as to whether we keep them.
* Moved creation of test DLTensor's into the AOT test utils, in future this
can be replaced by loading via the Python API or otherwise
* Introduce `parametrize_aot_options` which can be used to test
permutations of AOT which work together - for now this filters C
interface and packed operators
* Updated demo application to generate the header for demonstration
purposes; we should consider porting the demo application to Model
Library Format and using the toolchain in the Zephyr App via CMake
instead.

This patch builds upon the improvements @giuseros made to AOT testing
and name mangling from #8014

* Tweak metadata variable description and MLF target loop

* Remove direct usage of `relay::Var` in meta_data.h

This looks like the only place that could be causing the Windows CI failures, so trying to remove the additional header in meta_data.h

* Linting fix

* Post-rebase files fixing

These tests were somehow transmuted in transit; I've updated them to the
most recent variant of the test helpers.

* Strip back interface API to just inputs and outputs

This removes any speculative structures from the generated code and cleans up some of the documentation.

* Add header guards and tweak documentation
* Docker env for Arm® Ethos™-U55 Port

* Added Arm® Corstone™-300 Reference System for testing
* Added Arm® Ethos™-U driver stack
* Added installation of Arm® Vela.

Co-authored-by: Manupa Karunaratne <manupa.karunaratne@arm.com>

Change-Id: Ie3cc43943c876d95618a39887aa666da20bcb1e4

* Docker env for Arm® Ethos™-U55 Port

* Removes /opt/arm/cmake/bin from the path
* Parameterizes Arm® Ethos™-U55 driver stack version number

Change-Id: I2162b40f82241fd013643cbfa8847b60d7f4f5a1

* Docker env for Arm® Ethos™-U55 Port

* Adds ethosu as an extra to /python/gen_requirements.py

Change-Id: I2162b40f82241fd013643cbfa8847b60d7f4f5a1

* Docker env for Arm® Ethos™-U55 Port

* Added comment explaining why Vela version needs to be pinned to 2.1.1

Change-Id: I1ade280faa5274cca78899f4dae9e596b16fb5df
…8241)

* support scalars in quantize and requantize

* Add affine type support for ops with multiple outputs, use it in concat, move to header

* support new ops, refactor tests

* add more binary ops

fix pylint

fix black

black broke pylint

oops on black

* fix a typo in a branch and add a test that hits it

* improve comments
* instructions for M1 Mac

* typos

* above to below

* nits, link against python issue on github

* correct link

* more cleanup

* correct source

* address chrishoge suggestions

Co-authored-by: Andrew Zhao Luo <andrewzhaoluo@system76-pc.localdomain>
jiangjiajun merged commit 5b39c79 into jiangjiajun:main on Aug 3, 2021
junrushao pushed a commit that referenced this pull request Aug 26, 2021
* Add C++ API for computing type key from type index

* Try and isolate leak

* Rewrite the bindings to fix the ArgValue lifetime issue

There are still quite a few issues left to resolve in this patch, but I believe the runtime
changes stablize memory consumption as long as the parameters are only set once. ByteArray
also has some totally broken unsafe code which I am unsure of how it was introduced.

* Finish handling tvm-rt issues due to ArgValue lifetime

This patch further refactors the bindings to better handle the
lifetime issues introduced by detecting the argument memory leak.

* WIP memory leak

* There is issue using TVMCb function which is breaking refcount

* Fix fallout from the lifetime refactor

* Another tweak

* Follow up work from the memory leak, attempt to clean up ByteArray

* Add some todos for future work

* Fix doc string

* Clean up the changes

* Format
jiangjiajun pushed a commit that referenced this pull request Sep 6, 2021
…ter (apache#8835)

* # This is a combination of 2 commits.
# This is the 1st commit message:

Initial changes

# This is the commit message #2:

Target string -> Target object works!

* Fix remaining target strings

* fix bad rebase

* Fix typo

* 1 more bad rebase fix

* Lint

* typo

* Forgot to commit this

* Add TargetStrHash and Map<Target... to std::unordered_map<Target... conversion fn

* Passing most tests, yay

* remove some comments

* lint

* target-str-to-target-object

* Respond to change requests

Co-authored-by: Jared Roesch <roeschinc@gmail.com>
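
A minimal sketch of the direction these commits take: passing a structured Target object instead of a raw target string (the attribute shown is illustrative):

```
import tvm

target = tvm.target.Target("llvm -mcpu=skylake-avx512")
# The parsed object carries structured attributes that a plain string cannot.
print(target.kind.name)      # "llvm"
print(target.attrs["mcpu"])  # "skylake-avx512"
```
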
jiangjiajun added a commit that referenced this pull request Sep 22, 2021
* nll loss v1

* add converter

* decode strings in byte form

* decode variable length inputs

* make shapes correct

* unsqueeze

* proper weight handling

* simplify if statement

* fix tests

* add comment about tests

* delete extra file

* lint

* so cool

* Update CI Lint Image Version (apache#8841)

* Update CI Lint Image Version

* trigger

* [BUG] ToBasicBlockNormalForm immutability (apache#8778)

* ToBasicBlockNormalForm immutability

* better comment on ToBasicBlock

* refine comment of ToBasicBlockForm

* [GRAPH EXECUTOR,VM] Add benchmarking function to graph executor and vm (apache#8807)

* [GRAPH EXECUTOR,VM] Add benchmarking function to graph executor and vm

This new benchmarking function is just a convenience function for
calling time_evaluator on the underlying module. Hopefully this should
make it easier for users to get good benchmarks of their code.

* formatting

* import order

* more test, more comments, more precision

* fix tests

* add seconds descriptions to doc
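
A minimal sketch of how the new helper is meant to be used (the model and argument values are placeholders):

```
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Build a trivial model, then call the benchmark() convenience wrapper, which
# forwards to time_evaluator on the underlying module.
x = relay.var("x", shape=(1, 16), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x], x + relay.const(1.0)))
lib = relay.build(mod, target="llvm")

dev = tvm.cpu()
m = graph_executor.GraphModule(lib["default"](dev))
m.set_input("x", np.zeros((1, 16), dtype="float32"))
print(m.benchmark(dev, repeat=3, number=10))
```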

* Apply CPPLint to CRT Tests (apache#8844)

This one was a bit trickier as there was more usage of dynamic arrays and less safe casts. I've tried to minimise the changes to just those required to pass linting.

* [Relay][TOPI] Support of depthwise conv2d NHWC for Mali/Bifrost. (apache#8584)

* [Relay][TOPI] Support of depthwise conv2d NHWC for Mali/Bifrost.

Added initial tunable autotvm templates for depthwise conv2d with
NHWC layout for Mali and Bifrost.

* [Relay][TOPI] Misc fixes for depthwise conv2d Mali/Bifrost.

- Fix assert for Bifrost.
- Set reasonable default axis splits to avoid using tophub for NHWC.
- Fixed typo: arm cpu -> Mali.

* [Relay][TOPI] Fixed formatting in depthwise conv2d Mali/Bifrost.

* Support for CMSIS-NN in Corstone300 Makefile (apache#8831)

Change-Id: Ifc2305db4e11d1d15d45407287f8f0bea469100a

* [microtvm][Zephyr] Increase timeout to fix flaky tests (apache#8846)

* increase timeout

* trigger

* [AMP] Bump up tolerance on flaky test (apache#8850)

* bumpy up tol

* bumped tolerance up even more

* jostle ci

* [Hexagon] Rework tvm.target.hexagon() interface (apache#8823)

* [Hexagon] Rework tvm.target.hexagon() interface

Make the tvm.target.hexagon() function take most options as keyword
parameters. This will allow adding additional parameters without changing
the interface.

No changes are required to existing code, except for changing positional
parameters following the CPU version to keyword parameters, and updating
the names of the keyword parameters:
  sim_args  -> sim_options,
  llvm_args -> llvm_options,
although the old names will be accepted for the time being.
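
A minimal sketch of the reworked call (option values are placeholders; as noted above, the old sim_args/llvm_args spellings are still accepted for now):

```
import tvm

# Options are now passed as keyword arguments rather than positionally.
target = tvm.target.hexagon("v68", sim_options=None, llvm_options=None)
print(target)
```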

* formatting

* change ' to "

* Rename 'args' to 'config' for clarity

* Use 'strip' instead of 'replace'

* Restart build

* [Pattern matching] Add an option to rewrite the graph only once (apache#8843)

* [Pattern matching] Add an option to rewrite the graph only once

If the graph returned from the callback consists of the original
pattern, the rewriter will run in a loop, which is not always desired.
So this patch proposes an option to run the rewriter only once.

Change-Id: I85cf0a055b8961d52394f21c1e4d7aad0a7e1d06

* Make rewrite_once default to false

Change-Id: Idf6f01f254c403158883681e75c2a5978efbd2d0
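
A minimal sketch of the option (the pattern and callback are illustrative): the callback returns an expression that still matches its own pattern, so without rewrite_once=True the rewriter would keep firing on its result.

```
from tvm import relay
from tvm.relay.dataflow_pattern import DFPatternCallback, is_op, rewrite, wildcard

class ScaleLhsOfAdd(DFPatternCallback):
    """Rewrites add(x, y) -> add(x * 2, y); the result still matches 'add'."""

    def __init__(self):
        super().__init__(rewrite_once=True)
        self.lhs = wildcard()
        self.rhs = wildcard()
        self.pattern = is_op("add")(self.lhs, self.rhs)

    def callback(self, pre, post, node_map):
        lhs = node_map[self.lhs][0]
        rhs = node_map[self.rhs][0]
        return relay.add(relay.multiply(lhs, relay.const(2.0)), rhs)

x = relay.var("x", shape=(2,), dtype="float32")
y = relay.var("y", shape=(2,), dtype="float32")
out = rewrite(ScaleLhsOfAdd(), relay.add(x, y))
```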

* update gpu and cpu (apache#8853)

* VTA cmake change to include Verilator header for building tsim library (apache#8797)

* The VTA cmake file requires the Verilator include for the tsim target. VTA module.cc uses svOpenArrayHandle to send wide data through DPI

* Refactor Verilator check conditions

* Build TSIM only for the CPU target. The CPU target doesn't use -Werror to compile with Verilator. Jenkinsfile has tvm_multilib_tsim defined for the CPU build target.

* remove build/libvta_tsim.so from non tsim targeting builds

* Revert to enable TSIM build i386. Revert to -Werror in CPU config. Remove verilator CPP objects from cmake config for tsim and put them as include into vta module.cc to avoid Verilator compilation warnings

* [FIX] Bug fix for a floormod rewrite simplify rule (apache#8852)

* Update rewrite_simplify.cc

* Update test_arith_rewrite_simplify.py

* Update test_arith_rewrite_simplify.py

* Update test_arith_rewrite_simplify.py

* move rust lint script (apache#8726)

* [AMP] Disallow fp16 conversion for summation-like ops (apache#8810)

* [AMP] Disallow fp16 conversion for summation-like ops

* test only structural equality

* [TOPI] [Relay] Sparse Conv2d Implementation for 3x3 kernels (apache#8605)

* [topi] add spconv2d_3x3 nhwc

* [relay] sparse_conv2d: add kernel_size attr

* [relay] add strategy for spconv2d_3x3 nhwc

* [relay] pass to convert spconv2d with const args

* [relay] convert sparse conv2d pass fixes

* use array for sparse conv2d attr

* fixup 1x1 tests; new 3x3 tests

* extend repeat_interleave op for relay.Expr (apache#8839)

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>

* Change AOT from ExprVisitor to MixedModeVisitor (apache#8856)

This should allow better scalability for AOT when targeting larger networks.

* Add a PaddlePaddle Frontend (apache#8645)

* fix some problems for matmul

* fix some problems for matmul

* add alpha parameter for matmul

* remove unnecessary condition

* add TranslatedLayer which supports models loaded by jit.load

* add mul operator support

* Add padding mode support for conv/pool2d

* support 4 two-tuples

* add paddle test case

* add paddle conv2d  case

* update test_forward.py

* fix paddle convert_matmul

* add paddle multiply and matmul op test case

* add test case and fix bug

* delete import pandas

* add paddlepaddle tests

* modify the variable name of convert_reshape

* formatting

* formatting

* use black to format python code

* pylint check

* Remove fluid api

* black format

Co-authored-by: root <root@bjyz-sys-gpu-kongming3.bjyz.baidu.com>
Co-authored-by: wjj19950828 <wjjisloser@163.com>
Co-authored-by: heliqi <1101791222@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>

* [Runtime] add set_output_zero_copy (apache#8497)

* Update graph_executor.h

* Update graph_executor.cc

* modify zero copy UT add set input zero copy

* modify C style

* add runtime test

* relay build generates the json

Co-authored-by: hwstaff <hwstaff@hwstaffdeMacBook-Pro.local>

* [Hexagon] Change declaration order of unique_ptr objects to fix crash (apache#8859)

A crash occurs when automatically deleting an instance of
CodeGenHexagon because the LLVMContext object has already been
freed. Objects of both types are created using unique_ptr, but
the object managed by the LLVMContext unique_ptr is passed to
CodeGenHexagon object (not as a unique_ptr).

This crash is fixed by moving the declaration of the LLVMContext
object before the CodeGenHexagon object. I'm not sure if this
is the best way to fix this, but it does fix the crash. Also,
in other files, the LLVMContext object is always created first.

Co-authored-by: Cahoon, Brendon <bcahoon@quicinc.com>

* [Graph Executor, VM] Add end to end benchmarking of models (apache#8858)

Add benchmarking that includes the overhead of transferring inputs and
outputs to and from the device. This should give an accurate measurement
of the runtime a user would see when using the model. This is
accomplished by adding functions that run from inputs to return values
into the graph executor and the VM.

* [UnitTests] Expose TVM pytest helpers as plugin (apache#8532)

* [UnitTests] Expose TVM pytest helpers as plugin

Previously, pytest helper utilities such as automatic parametrization
of `target`/`dev`, or `tvm.testing.parameter` were only available for
tests within the `${TVM_HOME}/tests` directory.  This PR extracts the
helper utilities into an importable plugin, which can be used in
external tests (e.g. one-off debugging).

* [UnitTests] Refactor the plugin-specific logic out into plugin.py.

* [UnitTests] Moved marker definition out to global variable.
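
A minimal sketch of pulling the extracted helpers into an out-of-tree test suite (the plugin module path and decorator are what I would expect from this change; treat them as assumptions):

```
# conftest.py -- register the TVM pytest helpers for an external test suite.
pytest_plugins = ["tvm.testing.plugin"]

# test_example.py -- target/dev parametrization then works as in-tree.
import tvm.testing

@tvm.testing.parametrize_targets("llvm")
def test_device_exists(target, dev):
    assert dev.exist
```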

* Remove AOT Executor header from Arduino project (apache#8857)

* [Community] @mdw-octoml -> Reviewer (apache#8868)

* [TIR] Fix opaque access in buffer locator pass and match_buffer in region detector (apache#8855)

* init

* fix

* Update src/tir/transforms/plan_update_buffer_allocation_location.cc

Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>

* Update src/tir/transforms/plan_update_buffer_allocation_location.cc

Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>

* address

Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>

* [Autoscheduler] Configurable workload keys (apache#8862)

* change workload keys

* remove binary string comparison

* append the tuple not every integer

* clean up

* lint

* dump workload keys to dags

* fix things

* change some strings

* misc fixes, add tests

* jostle ci

* [Tutorial][Executor] Fix the usage of executors in tutorials (apache#8586)

* fix: executor usage for keras tutorial

* fix: executor usage for onnx tutorial

* [Tutorial][Executor] Fix executors in tutorials

* [Frontend][Onnx] Simplify onnx input since name accesses are not reliable. (apache#8867)

* Simplify onnx input since name accesses are no longer supported.

* move Celu importer.

* [TIR] GetBlockReadWriteRegion (apache#8875)

* [TIR] GetBlockReadWriteRegion

* Fix black issue

* Use constant reference for the interface

* Fix lint issue

* [RISCV] Add support for llvm parameter -mabi (-target-abi) (apache#8860)

* [Community] @manupa-arm -> Committer (apache#8870)

* adding Manupa to the contributors list

* re-trigger CI

* [RPC] Fix ios_rpc build (apache#8864)

* [Vulkan][Target] Added the driver name to the vulkan target string. (apache#8882)

Driver name (e.g. "NVIDIA", "radv", "AMD open-source driver") is read
from the `driverName` property in
[VkPhysicalDeviceDriverProperties](https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/VkPhysicalDeviceDriverProperties.html),
or is left as `"unknown_driver_name"` if the driver does not support
querying the driver name.

* [ONNX][TOPI] Support select_last_index for argmin/max (apache#8816)

* support select_last_index for argmin/max

* reverse conditions which were made by accident

* forward args in reduce.py

* make proper nodes for reduction ops

* remove complicated nested lambdas

* fix lambda capture for conversion

* forward more arguments

* forward more args

* enable onnx tests

* wrapping casts to remove ambiguity

* revert changes extraneous

* correct incorrect attrs being used for ops

* change attributes

* remove old impl

* register new attribute node

* clean up test

* reformat

* reformat

* coolio

* stable comparison

* casts to avoid ambiguity

* casting more

* correct arg passing

* fix broken input

* OneElementReduceAttrs --> ArgReduceAttrs

* reduce boilerplate

* change names

* remove log statement

* jostle ci

Co-authored-by: Andrew Zhao Luo <andrewzhaoluo@system76-pc.localdomain>
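
A minimal sketch of the new attribute at the relay level (the shape is illustrative):

```
from tvm import relay

# With ties in the input, select_last_index=True returns the index of the
# last occurrence of the extremum instead of the first.
data = relay.var("data", shape=(5,), dtype="float32")
out = relay.argmax(data, axis=0, select_last_index=True)
```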

* refactor optimize GEMM on CPU tutorial (apache#8825)

* refactor optimize GEMM on CPU tutorial

* fix lint errors

* fix more lint errors

* fix typo

* fix problem with redefinition of `k`
add TODO and comments around loop unrolling
clarify note on the array packing figure

* reword general description of array packing

* grab kaxis from compute definition

* remove duplicate comments on unrolling

* Change target string to Target object in the TE compiler and interpreter (apache#8835)

* # This is a combination of 2 commits.
# This is the 1st commit message:

Initial changes

# This is the commit message #2:

Target string -> Target object works!

* Fix remaining target strings

* fix bad rebase

* Fix typo

* 1 more bad rebase fix

* Lint

* typo

* Forgot to commit this

* Add TargetStrHash and Map<Target... to std::unordered_map<Target... conversion fn

* Passing most tests, yay

* remove some comments

* lint

* target-str-to-target-object

* Respond to change requests

Co-authored-by: Jared Roesch <roeschinc@gmail.com>

* [TensorIR][M2a] CacheRead/Write (apache#8863)

Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>

* [CI] make pre-commit hooks to run on every push instead of every commit (apache#8888)

* [TVMScript] Fix printing ForNode annotations (apache#8891)

* [1/10] CMSIS-NN graph partitioner for softmax (apache#8653)

* cmsis graph partitioner for softmax

Change-Id: I80ecd7bc5351f241b4674ef53b36e4398c8adb83

* Updated docstring in the partitioning function

Change-Id: Ieb4b623e5929cfdb6aa0235db64c825fac8d7055

* [microTVM][RVM] Add Arduino RVM (apache#8748)

* Functioning Arduino Vagrant VM

Begin building Arduino Vagrant VM

Mostly working Vagrant VM

Changes for debugging

Add ignored json file

Fix venv path

* Generalize parts of RVM for multiple platforms

cwd hack

Add unit tests from apps directory to task_python_microtvm.sh

Generalize parts of RVM for multiple platforms

* Add Vagrantfile lint exceptions

* Address PR comments

Address Mehrdad's PR comments

More PR comments

Documentation tweaks

Add dialout group to user

* Rerun tests

* Spresense fix

* Rerun CI tests

* Rerun tests

* sce loss example

* add comments, remove other tests

* lint

* lint

* jostle

* lint up

* jostle

* uncomment some tests

* proper return

* clean up

* lint

* minor merge errors

Co-authored-by: Andrew Zhao Luo <andrewzhaoluo@system76-pc.localdomain>
Co-authored-by: Mehrdad Hessar <mhessar@octoml.ai>
Co-authored-by: Jiawei Liu <jaway.liu@gmail.com>
Co-authored-by: Tristan Konolige <tkonolige@octoml.ai>
Co-authored-by: Christopher Sidebottom <chris.sidebottom@arm.com>
Co-authored-by: Anastasia Stulova <38433336+AnastasiaStulova@users.noreply.github.com>
Co-authored-by: Ashutosh Parkhi <86472128+ashutosh-arm@users.noreply.github.com>
Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com>
Co-authored-by: Elen Kalda <elen.kalda@arm.com>
Co-authored-by: Anton Sorokin <anton.a.sorokin@intel.com>
Co-authored-by: Chenfan <jcf94@outlook.com>
Co-authored-by: masahi <masahi129@gmail.com>
Co-authored-by: Tantalus13A98B5F <jsl_713@live.com>
Co-authored-by: Valery Chernov <black.chervi@gmail.com>
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: Jason <928090362@qq.com>
Co-authored-by: root <root@bjyz-sys-gpu-kongming3.bjyz.baidu.com>
Co-authored-by: wjj19950828 <wjjisloser@163.com>
Co-authored-by: heliqi <1101791222@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Swift.Sun <sunjiwei@yeah.net>
Co-authored-by: hwstaff <hwstaff@hwstaffdeMacBook-Pro.local>
Co-authored-by: Cahoon, Brendon <bcahoon@quicinc.com>
Co-authored-by: Lunderberg <Lunderberg@users.noreply.github.com>
Co-authored-by: Yizhi Liu <liuyizhi@apache.org>
Co-authored-by: Siyuan Feng <Hzfengsy@vip.qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Josh Fromm <jwfromm@octoml.ai>
Co-authored-by: Alexander Pivovarov <pivovaa@amazon.com>
Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>
Co-authored-by: Egor Churaev <egor.churaev@gmail.com>
Co-authored-by: Adam Straw <astraw@octoml.ai>
Co-authored-by: Lily Orth-Smith <lilyorthsmith@gmail.com>
Co-authored-by: Jared Roesch <roeschinc@gmail.com>
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Michalis Papadimitriou <mikepapadim@users.noreply.github.com>
Co-authored-by: Gavin Uberti <guberti@users.noreply.github.com>
jiangjiajun added a commit that referenced this pull request Oct 5, 2021
… only to `/docs` (apache#9031)

* Add script to look for changes in doc dir

* Modify Jenkinsfile

* Minor changes in scripts

* Working Jenkinsfile on selective stages on docs

* Pass groovy formatter on Jenkinsfile

* Implementation of relay_to_tir target hook (apache#8423)

This is the first new hook proposed in the Additional Target Hooks RFC. Longer
term the compilation should move to using `Target` proper, but this unblocks our current work whilst illustrating the eventual interface via `Target` in `src/relay/backend/contrib/example_target_hooks/relay_to_tir.cc`

Ideally the host target would be annotated onto the `IRModule` so as this `Pass` could use it instead of defaulting to C but this is fine for now.

* [CUDA] Fix dense tensorcore legalize type error when units is specified (apache#9030)

* Fix dense tensorcore legalize type error when units is specified

* revert black change due to different version from CI

* [ONNX] QLinearAveragePool and QLinearGlobalAveragePool contrib op (apache#9017)

* [ONNX] QLinearAveragePool and QLinearGlobalAveragePool contrib op

* Fix linter error for variable name and else after return

* Separate quantized avg_pool impl and add TODO for global_avg_pool

* Fix comment typo

* Fix line break in `setup.py` (apache#9029)

* [Onnx] Add SoftmaxCrossEntropyLoss (apache#8906)

* [Hexagon] Don't use {} initialization with FastRPC structures (apache#9033)

The data members in FastRPC structures aren't guaranteed to remain
in the same order. Replace aggregate initialization with direct,
member-by-member initialization.

* Test

* Minor checkstyle issue

* Test

* Test file

* Revert changes in unit tests

* Change script name

* Test

* Revert format on groovy file

* Remove test file

* Minor change in script

* Minor formatting changes

* Revert logic in conditions for changed files

Co-authored-by: Christopher Sidebottom <christopher.sidebottom@arm.com>
Co-authored-by: masahi <masahi129@gmail.com>
Co-authored-by: Anirudh Sundar <quic_sanirudh@quicinc.com>
Co-authored-by: Leandro Nunes <leandro.nunes@arm.com>
Co-authored-by: AndrewZhaoLuo <andrew.zhao.luo@gmail.com>
Co-authored-by: Andrew Zhao Luo <andrewzhaoluo@system76-pc.localdomain>
Co-authored-by: Mehrdad Hessar <mhessar@octoml.ai>
Co-authored-by: Jiawei Liu <jaway.liu@gmail.com>
Co-authored-by: Tristan Konolige <tkonolige@octoml.ai>
Co-authored-by: Christopher Sidebottom <chris.sidebottom@arm.com>
Co-authored-by: Anastasia Stulova <38433336+AnastasiaStulova@users.noreply.github.com>
Co-authored-by: Ashutosh Parkhi <86472128+ashutosh-arm@users.noreply.github.com>
Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com>
Co-authored-by: Elen Kalda <elen.kalda@arm.com>
Co-authored-by: Anton Sorokin <anton.a.sorokin@intel.com>
Co-authored-by: Chenfan <jcf94@outlook.com>
Co-authored-by: Tantalus13A98B5F <jsl_713@live.com>
Co-authored-by: Valery Chernov <black.chervi@gmail.com>
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
Co-authored-by: Jason <928090362@qq.com>
Co-authored-by: root <root@bjyz-sys-gpu-kongming3.bjyz.baidu.com>
Co-authored-by: wjj19950828 <wjjisloser@163.com>
Co-authored-by: heliqi <1101791222@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Swift.Sun <sunjiwei@yeah.net>
Co-authored-by: hwstaff <hwstaff@hwstaffdeMacBook-Pro.local>
Co-authored-by: Cahoon, Brendon <bcahoon@quicinc.com>
Co-authored-by: Lunderberg <Lunderberg@users.noreply.github.com>
Co-authored-by: Yizhi Liu <liuyizhi@apache.org>
Co-authored-by: Siyuan Feng <Hzfengsy@vip.qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Josh Fromm <jwfromm@octoml.ai>
Co-authored-by: Alexander Pivovarov <pivovaa@amazon.com>
Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>
Co-authored-by: Egor Churaev <egor.churaev@gmail.com>
Co-authored-by: Adam Straw <astraw@octoml.ai>
Co-authored-by: Lily Orth-Smith <lilyorthsmith@gmail.com>
Co-authored-by: Jared Roesch <roeschinc@gmail.com>
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Gavin Uberti <guberti@users.noreply.github.com>