Add documentation for Adreno deployment #22

Open
wants to merge 99 commits into
base: dbarinov/main

Commits on Nov 1, 2022

  1. [COMMUNITY] Jyotsna Verma -> Reviewer (apache#13251)

    adding Jyotsna to reviewers list
    tmoreau89 authored Nov 1, 2022
    6551b71
  2. 87f52af

Commits on Nov 2, 2022

  1. [MetaSchedule] Swap the order of RewriteTensorize and VerifyGPUCode to reduce tuning time (apache#13259)
    
    * [MetaSchedule] Swap the order of RewriteTensorize and VerifyGPUCode to
    reduce tuning time
    
    * add comment
    masahi authored Nov 2, 2022
    7536068
  2. [CI] Skip failing Caffe tests due to broken URL (apache#13228)

    See issue apache#13227.
    
    Co-authored-by: driazati <9407960+driazati@users.noreply.github.com>
    lhutton1 and driazati authored Nov 2, 2022
    84fadc4
  3. [TVMC] Apply constant folding when converting layout (apache#13216)

    This commit ensures that constant folding is applied when a desired
    layout is selected during compilation. It ensures that
    `layout_transform` operations are removed where possible so that
    pattern matching for BYOC backends can work effectively.
    
    A test has been added to check this regression.
    lhutton1 authored Nov 2, 2022
    4ecf303
  4. Apply group write permissions to Python virtual environment (apache#13252)

    This commit applies additional write permission to the "tvm-venv"
    group virtual environment. Currently, after entering a container from
    a newly built image, it doesn't seem possible to install/update Python
    packages. E.g. updating pip will give errors such as:
    ```
    $ pip install --upgrade pip
    ERROR: Could not install packages due to an OSError: [Errno 13]
    Permission denied: '/venv/apache-tvm-py3.7/bin/pip' Check the
    permissions.
    ```
    
    Enabling write access for this group fixes this as long as the
    current user is a member of the "tvm-venv" group.
    lhutton1 authored Nov 2, 2022
    c3c1454
  5. da4bb4a
  6. [Hexagon] Add pylint on tests (apache#13233)

    * [Hexagon] Tests pylint
    
    * fix error
    
    * Fix buffer name
    mehrdadh authored Nov 2, 2022
    d261fa8
  7. [build][relay][te][tir] remove unused vars / args (apache#13266)

    - Fix clang 15.0.3 '-Wunused-but-set-variable' and '-Wunused-lambda-capture' warnings by removing / commenting-out code.
    Christian Convey authored Nov 2, 2022
    404d95f
  8. [Frontend][Tensorflow2] Import graph_def to default graph before calling function_def_to_graph_def (apache#13260)
    
    [TF2] Import graph_def to default graph before calling function_def_to_graph_def
    apivovarov authored Nov 2, 2022
    ff6aaeb

Commits on Nov 3, 2022

  1. [Frontend][PaddlePaddle] Fix UnboundLocalError: local variable 'shape… (apache#13247)

    A local variable is referenced before assignment in the convert_interpolate function. I think the variable 'size' is the one that was meant to be referenced.
    woobinw authored Nov 3, 2022
    d998187
  2. [skip ci] Revert "[ci] Protect release branches (apache#13208)" (apache#13274)
    
    This reverts commit 5acf3f9.
    
    Reverting since this is causing some spam from the ASF Infra bot related
    to https://issues.apache.org/jira/browse/INFRA-23834. As in that issue
    the protections have been applied manually by ASF Infra so this revert
    shouldn't have any real effect
    driazati authored Nov 3, 2022
    e9ba986
  3. [Docs] Minimal dependencies for Fedora/CentOS (apache#13248)

    Minimal dependencies for Fedora/CentOS
    
    This commit indicates how to install a minimal set of
    dependencies for building Apache TVM on Fedora and
    CentOS. It supplements existing information for
    Ubuntu and macOS.
    bkmgit authored Nov 3, 2022
    f15afd2
  4. [build][doc] Fix clang doxygen warnings (apache#13270)

    Fix occurrences of clang's `-Wdocumentation-unknown-command` warning.
    Christian Convey authored Nov 3, 2022
    9df3a33
  5. [build][tir] fix clang redundant-move warning (apache#13268)

    Fix code to address a valid `-Wredundant-move` clang warning.
    Christian Convey authored Nov 3, 2022
    0d55312
  6. [ETHOSN] Inline non-compute-intensive partitions (apache#13092)

    * [ETHOSN] Inline non-compute-intensive partitions
    
    Adds a pass that analyzes functions partitioned for the NPU and inlines
    those that are deemed "non-compute-intensive" back to the main function
    so that they can be considered for other backends. The current heuristic
    for deciding that a function is non-compute-intensive is to check that
    none of the operations in the function are multiply-accumulate
    operations. This heuristic is not optimal; optimization is left for
    future exploration.
    
    This pass is inspired by the "IsComputeIntensiveGraph" pass in the
    TensorRT integration.
    
    Change-Id: I20c197702f5252f102cfc1e4b4635ab836aa7835
    
    * Address comments
    
    * 'inline_non_compute_intensive_partitions' -> 'is_inline_non_compute
    _intensive_partitions_enabled'.
    * remove no MAC operations.
    * fix network test.
    
    Change-Id: Ie1015b27f37e47544bed6f0aff819ee4649de579
    
    * Fix failing unit tests due to optimization
    
    Change-Id: I0ee0af071dc77c91e0ef0f6753506cb40d1d1859
    
    * Add future exploration suggestions
    
    Change-Id: Ie918d7f1059f032282f1f5eeffda38f4febcd59c
    lhutton1 authored Nov 3, 2022
    75921fb
  7. [ETHOSN] Throw error message when inference fails (apache#13022)

    * [ETHOSN] Throw error message when inference fails
    
    Previously the runtime would silently skip inference failures and return
    random values as the result. This can make spotting inference failures
    challenging. The runtime now throws a fatal error when inference did not
    complete successfully along with an error message that gives some
    details about the error that occurred.
    
    Change-Id: Iadb6da04ad1c906e3ec49959eb3da0978295aebf
    
    * Address comments
    
    * clarify test file brief
    * add test case for running status
    * add driver stack reference to WaitStatus class
    
    Change-Id: I792742892b761534904816135ae2ffcb3f028b2c
    lhutton1 authored Nov 3, 2022
    47da418
  8. [MetaSchedule] Fix Task Hanging in EvolutionarySearch (apache#13246)

    This PR introduces a new argument for EvolutionarySearch that limits the number of failures (defined as rounds in which no new candidate is generated) in the `SampleInitPopulation` stage. This keeps the task from hanging forever in special cases, e.g., when some postproc always fails. This should fix apache#12330.
    zxybazh authored Nov 3, 2022
    1d1db35
  9. [Bugfix][TIR] Fix version conflict with typing for Python 3.9 (apache#13269)

    The current type checker for TIR schedule had an issue with typing for Python 3.9.
    This simple patch fixes the problem.
    sunggg authored Nov 3, 2022
    215f0e2
  10. [MetaSchedule] Improve the script for TorchBench model tuning & benchmarking (apache#13255)
    
    This PR adds features to the `python/tvm/meta_schedule/testing/torchbench/run.py`.
    
    - Integrate with the TVM PyTorch integration to handle boolean tensor and unaligned memory.
    - Deduplicate collected tuning tasks to prevent thousands of tasks created by hundreds of subgraphs with similar structure.
    - Add option to cast model to float32, which is more stable numerically than float16 and prevents inaccurate results from many models.
    - Add option to choose search strategy in MetaSchedule.
    - Inspect output error if the actual output doesn't match the expectation. Also save the actual output and expected output for further analysis if needed.
    - Save subgraphs and their example input for debug purpose.
    - Print MetaSchedule profiling information at the end of execution.
    - Detach PyTorch tensor before exporting to dlpack.
    - Fix the sys path to avoid conflict with the `benchmarks` package installed by TorchBench dependency.
    - Trim all command line args passed in, in order to prevent breaking some TorchBench model that depends on args.
    - Empty cuda cache before starting the actual benchmark.
    yelite authored Nov 3, 2022
    b98b9f9
  11. [Relay] Add tensor rank check for nn.instance_norm (apache#13280)

    Add tensor rank check for `nn.instance_norm`.
    wzh99 authored Nov 3, 2022
    90ed632
  12. [Relay] Enhancement for fold_scale_axis and simplify_expr (apache#13275)

    Convert add(%1, %1) to multiply(%1, 2f); enhance fold_scale_axis to fold multiply(%1, 2f) into conv.
    
    Signed-off-by: Lei Wen <wenlei03@qiyi.com>
    Co-authored-by: Lei Wen <wenlei03@qiyi.com>
    leiwen83 and wenlei03 authored Nov 3, 2022
    b1a099b
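
The `add(%1, %1)` to `multiply(%1, 2f)` rewrite described in apache#13275 above can be illustrated outside of TVM. The following is a minimal sketch on a toy expression tree. The `Var`, `Const`, and `Call` classes and the `simplify_add_same` function are hypothetical names invented for this illustration; they are not Relay's API and not the implementation in that commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Call:
    op: str
    args: tuple  # tuple of Var / Const / Call nodes


def simplify_add_same(expr):
    """Rewrite add(x, x) into multiply(x, 2.0), working bottom-up."""
    if not isinstance(expr, Call):
        return expr  # variables and constants are left untouched
    # Simplify the arguments first so nested add(x, x) calls are rewritten too.
    args = tuple(simplify_add_same(a) for a in expr.args)
    if expr.op == "add" and len(args) == 2 and args[0] == args[1]:
        return Call("multiply", (args[0], Const(2.0)))
    return Call(expr.op, args)


x = Var("x")
print(simplify_add_same(Call("add", (x, x))))
```

Once the addition has the form of a multiplication by a constant, a scale-folding pass has a constant factor it could in principle absorb into a preceding convolution's weights, which is the motivation the commit message gives.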

Commits on Nov 4, 2022

  1. [skip-ci][COMMUNITY] New committer Ashutosh Parkhi (apache#13286)

    [COMMUNITY] New committer Ashutosh Parkhi
    tqchen authored Nov 4, 2022
    de8a79d
  2. [TIR][Arith] Use TryCompare to narrow inequalities if possible (apache#13024)
    
    Prior to this commit, the result of TryCompare would only be used if
    it could definitively prove a conditional to be either true or false.
    For example, if it is known that `0 <= i`, a conditional of `i <= 0`
    would be left as-is.
    
    This commit introduces rewrite rules to preferentially simplify
    into more restrictive conditions.  Using the same example, if it is
    known that `0 <= i`, a conditional of `i <= 0` would be simplified
    into `i == 0`.  Similarly, if it is known that `0 <= i`, a
    conditional of `i != 0` would be simplified into `0 < i`.
    
    Because this change does not introduce significant overhead, as the
    results of `RewriteSimplifier::Impl::TryCompare` are already
    available, this change is enabled for all use cases and does not
    require a call to `RewriteSimplifier::SetEnabledExtensions`.
    Lunderberg authored Nov 4, 2022
    ccb7d07
  3. [build][hexagon] remove unused variable (apache#13291)

    Remove unused member variable in the `SimulatorRPCChannel` class.
    Fixes a clang warning.
    Christian Convey authored Nov 4, 2022
    e860884
  4. [BugFix][Pattern] Fixed a crash when AltPattern and FunctionPattern are used nested (apache#13278)

    The PatternGroup does not check whether the FunctionPattern is matched
    while processing the FunctionPattern, but when FunctionPattern
    is nested with AltPattern, the FunctionPattern may not be matched,
    resulting in a crash when looking up matched nodes.
    This commit adds a check when handling FunctionPattern to fix this crash.
    liangW-intellif authored Nov 4, 2022
    6da298b
  5. [build][tir] suppress -Woverloaded-virtual warning (apache#13267)

    - Address a (valid) warning from  clang-15.0.3 regarding the
      `tvm::tir::DataTypeRewriter` class.
    
    - Make some class methods `protected` rather than `public`
      to better reflect authors' intent.
    Christian Convey authored Nov 4, 2022
    dec74cb
  6. [Tensorize] Add logs to comparator to make debugging tensorize failures easier (apache#13285)
    
    * [TIR][Tensorize] Add error logs to IR comparator to display what caused tensorization to fail
    
    * lint issues
    nverke authored Nov 4, 2022
    be44e9c
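
The narrowing behaviour described in the TryCompare commit above (apache#13024) can be sketched for the simple case of a variable `i` with a known lower bound. This is an illustration under stated assumptions only: the `narrow` function, its string-based output, and the restriction to `<=` and `!=` against an integer constant are all invented here and do not reflect the `RewriteSimplifier` implementation.

```python
def narrow(op: str, c: int, known_lower: int) -> str:
    """Simplify the condition `i <op> c` given the known fact `known_lower <= i`.

    Returns a string: "True"/"False" when the condition is decided,
    a narrower condition when one exists, otherwise the condition unchanged.
    """
    if op == "<=":
        if c < known_lower:
            return "False"        # i >= known_lower > c, so i <= c cannot hold
        if c == known_lower:
            return f"i == {c}"    # i >= c and i <= c leave only i == c
    elif op == "!=":
        if c < known_lower:
            return "True"         # i can never equal a value below its bound
        if c == known_lower:
            return f"{c} < i"     # i >= c and i != c mean i is strictly greater
    return f"i {op} {c}"          # nothing provable: leave the condition as-is


# The two examples from the commit message, with 0 <= i known:
print(narrow("<=", 0, 0))  # i == 0
print(narrow("!=", 0, 0))  # 0 < i
```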

Commits on Nov 5, 2022

  1. [Hexagon] Lint tests part 2 (apache#13271)

    * Hexagon test lint part 2
    
    * fix import
    
    * fix global variable
    
    * fix import issue
    
    * fix import
    
    * fix exception error
    
    * address comments
    mehrdadh authored Nov 5, 2022
    62fadac
  2. [TE] Make elem_offset of the buffers created by te.extern a variable to avoid crash (apache#13297)
    
    * make elem_offset of the buffers created by te.extern a variable
    
    Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
    
    * add test
    
    * fix te extern create_prim_func test
    
    Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
    masahi and Lunderberg authored Nov 5, 2022
    56878fa
  3. 1e79364
  4. [TIR] Preserve loop annotation after loop partitioning (apache#13292)

    Preserve loop annotations when the loop is partitioned. We also bind the loop region info to the analyzer, because in some cases a partition condition could not be solved due to an unknown (but trivial) loop region.
    wrongtest-intellif authored Nov 5, 2022
    732e34f
  5. [FIX] Handle matmul where one inner dimension is unknown (apache#13287)

    Unify the two inner dimensions in the type checker so if one is unknown
    it will be filled in.
    Tristan Konolige authored Nov 5, 2022
    b51c491
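
The idea in the matmul fix above (apache#13287), unifying the two inner dimensions so that an unknown one is filled in from the other, can be sketched in a few lines. This is a simplified illustration for 2-D shapes, not the Relay type checker: `unify`, `matmul_type`, and the use of `None` for an unknown dimension are assumptions made for this example.

```python
from typing import Optional, Tuple

Dim = Optional[int]  # None stands for an unknown dimension


def unify(a: Dim, b: Dim) -> Dim:
    """Return the single dimension both sides agree on, filling in unknowns."""
    if a is None:
        return b
    if b is None:
        return a
    if a != b:
        raise TypeError(f"inner dimensions do not match: {a} vs {b}")
    return a


def matmul_type(lhs: Tuple[Dim, Dim], rhs: Tuple[Dim, Dim]):
    """Infer (lhs shape, rhs shape, output shape) for a 2-D matmul."""
    (m, k_lhs), (k_rhs, n) = lhs, rhs
    k = unify(k_lhs, k_rhs)  # an unknown inner dimension takes the known value
    return (m, k), (k, n), (m, n)


# The left operand's inner dimension is unknown and is inferred from the right.
print(matmul_type((4, None), (8, 3)))
```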

Commits on Nov 6, 2022

  1. [DOCS][TVMC] Use correct argument to reuse tuning records (apache#13302)

    Update tvmc tutorial code to use correct argument for reusing tuning    
    records. Specifically, current code uses tuning_records, which is meant 
    for saving the generated tuning results, not reusing prior results. We  
    should use prior_records instead.
    StrongerXi authored Nov 6, 2022
    f2a7403

Commits on Nov 7, 2022

  1. [Hexagon] Fix Hexagon external libs check (apache#13257)

    When building tvm runtime with hexagon we face the below error if
    USE_HEXAGON_EXTERNAL_LIBS is not defined. This happens because
    USE_HEXAGON_EXTERNAL_LIBS=OFF is defined as the default in
    CMakeLists.txt. The modified condition can check for all cases including
    undefined variable, empty string and OFF
    
    CMake Error at cmake/modules/Hexagon.cmake:203 (message):
      Invalid use of USE_HEXAGON_EXTERNAL_LIBS=OFF; USE_HEXAGON_EXTERNAL_LIBS
      only supports absolute paths and git repository urls
    Call Stack (most recent call first):
      CMakeLists.txt:477 (include)
    quic-sanirudh authored Nov 7, 2022
    60e2c98
  2. [Relay][Op] Add support for large index fp16 mean and var (apache#13289)

    Add support for large index fp16 mean and var.
    Josh Fromm authored Nov 7, 2022
    dd257e4
  3. [Bugfix][Runtime] Fix sched_setaffinity in Android (apache#13158)

    * fix sched_setaffinity error on Android
    
    * fix sched_setaffinity error on Android
    
    * fix sched_setaffinity error on Android
    
    * clang format
    
    * add ndk api version macro
    
    * clang format
    Wanger-SJTU authored Nov 7, 2022
    6b238c4
  4. [Torch] Fix advanced indexing with boolean mask (apache#13306)

    * [Torch] Fix advanced indexing with boolean mask
    
    * add comment
    masahi authored Nov 7, 2022
    e398d16
  5. ce777fd
  6. b16a64d

Commits on Nov 8, 2022

  1. [Frontend][PaddlePaddle] Add test case for interpolate op convert func… (apache#13277)
    
    Add test case for interpolate op convert function apache#13247
    woobinw authored Nov 8, 2022
    904ae77
  2. [BugFix][Driver] Correctly propogate simple-mode flag in LowerSchedule (apache#13311)
    
    Currently one version of `tvm::LowerSchedule` doesn't pass along the input `simple_mode` flag, which causes it to default back to `false`. This commit fixes it by passing along the input flag.
    StrongerXi authored Nov 8, 2022
    f869118
  3. [microTVM] Fix RPC session close on runtime side (apache#13310)

    Currently, the RPC session on the C/C++ side does not know whether the
    session was closed on the Python side, which causes extra reads/writes
    on the transport after the session is already closed. This commit reuses
    the Hexagon approach in microTVM to shut down the RPC session.
    mehrdadh authored Nov 8, 2022
    e43841d
  4. [Hexagon] [runtime] Move lock/unlock to HexagonHtp temporarily (apache#13318)
    
    Move lock/unlock to HexagonHtp temporarily
    janetsc authored Nov 8, 2022
    b807613
  5. [TIR] Add thread sync if access index doesn't depend on thread index (apache#13314)

    This PR updates `src/tir/transforms/thread_storage_sync.cc` to insert a storage sync if the access index doesn't depend on the innermost thread index, i.e., is constant with respect to the innermost thread id.
    
    This fixes an accuracy problem on model https://github.com/pytorch/benchmark/tree/main/torchbenchmark/models/timm_efficientdet
    yelite authored Nov 8, 2022
    c898dc6
  6. [ETHOSN] Consolidate target string usage (apache#13159)

    * [ETHOSN] Consolidate target string usage
    
    Removes support for a deprecated target string. The deprecation warning
    has been around for a couple of releases now so it should be safe to
    remove. The target to use moving forward is: `ethos-n -variant=n78 ...`
    
    Refactored direct use of a driver stack target string in the testing
    infrastructure to use the same string we expect users to provide. This
    simplified some of the code in codegen and hopefully avoids confusion
    in the future.
    lhutton1 authored Nov 8, 2022
    79093a1
  7. [Adreno][Textures] Fix static memory planner (apache#13253)

    * [Adreno][Textures] Fix static memory planner
    
    Fix memory reuse in static memory planner.
    
    * Move token allocators to separate file
    
    * Add test on TokenAllocator2d
    
    * Apply comments and fix CI
    echuraev authored Nov 8, 2022
    be30238
  8. Fixup libtorch backend build (apache#13320)

    Add clang-format disable for header to prevent reordering.
    The Torch header file needs to be placed at the end, since torch's
    dlpack is a little different from tvm's.
    
    Signed-off-by: Lei Wen <wenlei03@qiyi.com>
    Co-authored-by: Lei Wen <wenlei03@qiyi.com>
    leiwen83 and wenlei03 authored Nov 8, 2022
    bf77e79
  9. [TVMScript] Hide trailing return type if None (apache#13308)

    Because the majority of TIR PrimFuncs operate on buffers, write
    their outputs to an output parameter, and do not return a value,
    the `-> None` in the function signature becomes visual noise.
    This commit removes printing of the return type in cases where
    the PrimFunc has no return value.
    Lunderberg authored Nov 8, 2022
    15752e4
  10. [OpenCL][unit tests] Fix opencl cpp unit tests (apache#13254)

    * [OpenCL][unit tests] Fix opencl cpp unit tests
    
    After some changes in Hexagon, the run of cpp opencl tests leads to the
    following error:
    ```
    pluggy.manager.PluginValidationError: unknown hook 'pytest_configure_node' in plugin <module 'tvm.contrib.hexagon.pytest_plugin'
    ```
    Added `pytest_plugin` for OpenCL CPP tests for avoiding this error and
    processing gtest arguments.
    
    * Fix failure when gtest_args option was already added
    
    * Move `gtest_args` definition to the main testing plugin
    echuraev authored Nov 8, 2022
    750ba9f
  11. [microTVM][CRT] Add memory size as project option (apache#13313)

    * Add memory size as project option
    
    * cleanup
    
    * address comments
    
    * address comments
    mehrdadh authored Nov 8, 2022
    16bb1a6
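
The printing rule from the TVMScript commit above (apache#13308), omitting the trailing return type when a function returns nothing, can be shown with a toy signature formatter. This is a sketch only: `format_signature` and the placeholder type strings are hypothetical and are not the TVMScript printer or its output format.

```python
from typing import List, Optional, Tuple


def format_signature(
    name: str,
    params: List[Tuple[str, str]],
    ret_type: Optional[str] = None,
) -> str:
    """Format a function header, hiding the return annotation when it is None."""
    args = ", ".join(f"{pname}: {ptype}" for pname, ptype in params)
    # Only print `-> <type>` when the function actually returns a value.
    suffix = "" if ret_type is None else f" -> {ret_type}"
    return f"def {name}({args}){suffix}:"


# A function that writes into an output buffer and returns nothing.
print(format_signature("main", [("A", "Buffer"), ("B", "Buffer")]))
# A function that does return a value keeps its annotation.
print(format_signature("get_size", [("A", "Buffer")], "int32"))
```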

Commits on Nov 9, 2022

  1. [TIR] Remove redundant add in vnni/arm intrin (apache#13319)

    * [TIR] Remove redundant add in vnni intrin
    
    * Update arm intrin
    
    Co-authored-by: Ubuntu <ubuntu@ubuntu.com>
    vinx13 and Ubuntu authored Nov 9, 2022
    36b1c5c
  2. 244bceb
  3. 65dbee7
  4. [AOT] Add CreateExecutorMetadata analysis pass (apache#13250)

    AOT requires the ExecutorCodegenMetadata object to be
    populated containing various pieces of information about
    the compiled module. This commit adds a separate analysis
    pass to create the metadata + some tests for the new pass.
    
    In order to collect the device information correctly,
    AOTLowerMain is extended to attach the device info as a
    function attribute.
    mbaret authored Nov 9, 2022
    0e395c3
  5. [microTVM][CRT][DOCS] Add a PyTorch tutorial for microTVM with CRT (apache#13324)
    
    This commit adds a tutorial to compile and run a PyTorch model using   
    microTVM, the AOT host-driven executor, and C runtime (CRT).
    mehrdadh authored Nov 9, 2022
    fbe174b
  6. [ci] Update Jenkins readme to match new directory structure (apache#13333)
    
    Update Jenkins readme to match new directory structure
    alanmacd authored Nov 9, 2022
    999eee8
  7. [MetaSchedule] Fix the order of applying AutoInline in `ScheduleUsingAnchorTrace` (apache#13329)
    
    * index on concat-fusion-fix: 3ffe5b1 fix te extern create_prim_func test
    
    * Apply AutoInline to the last block after all other blocks are processed
    
    * Do not require CanReverseComputeInline to be true when
    CanComputeInline is false
    
    * add comment
    
    * add test
    
    * cpplint
    masahi authored Nov 9, 2022
    8453c9c
  8. [MetaSchedule] Add JSON Database Validation Scripts (apache#12948)

    * Add validation scripts.
    
    * Fix testing script.
    
    * Fix lint.
    
    * Fix lint.
    
    * Fix inputs.
    
    * Fix lint.
    
    * Fix lint.
    
    * Add timer func.
    
    * Fix ci.
    
    * Address comments.
    
    * Add total time statistics.
    
    * Fix lint.
    zxybazh authored Nov 9, 2022
    5dc4186

Commits on Nov 10, 2022

  1. [QNN, ONNX] Extension of QLinearMatMul in ONNX front-end for all ranks of input tensors (apache#13322)
    
    * QLinearMatMul was extended for all ranks of a and b
    
    * CI test for QLinearMatMul was implemented (onnx front-end)
    
    * fix after black check
    
    * numpy type fix
    
    * fix weight scale and zero point, output type
    
    * fix after pylint
    
    * resolve different input types in tests
    
    * skip resolved TODO
    
    * update covering of QLinearMatMul by tests
    
    * pylint fixes
    
    * skip test of QLinearMatMul on CUDA
    
    Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
    vvchernov and vvchernov authored Nov 10, 2022
    b4b90d7
  2. [TIR] Check producer predicate in ReverseComputeInline (apache#13338)

    * [TIR] Disallow reverse inline into a producer with non-trivial predicate
    
    * add test
    
    * Allow cases where the producer predicate can be implied by the new
    predicate of the inlined block
    
    * remove unused variable
    
    * update comment in test to reflect the change in ReverseComputeInline
    masahi authored Nov 10, 2022
    6d9d213
  3. [TOPI] Fix conv2d transpose for small channel (apache#13341)

    * [TOPI] Fix conv2d transpose for small channel
    
    * black
    masahi authored Nov 10, 2022
    a16a890
  4. [Minor][Testing] Consolidate IRs into corresponding functions (apache#13339)

    We moved most of the IR definitions into their corresponding testing methods.
    
    Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
    junrushao and cyx-6 authored Nov 10, 2022
    1228104
  5. [CPP_RPC][ANDROID] Fix cpp_rpc build failure (apache#13305)

    * cpp_rpc build failure for Android devices with NDK version < 23
    
    * Make environment variable ANDROID_NDK_MAJOR optional.
    
    Co-authored-by: Siva Rama Krishna Reddy B <sivb@blr-ubuntu-ripper.qualcomm.com>
    srkreddy1238 and Siva Rama Krishna Reddy B authored Nov 10, 2022
    a0dcab2
  6. [Hexagon] Make allocate_hexagon_array a hexagon contrib API (apache#13336)
    
    Make 'allocate_hexagon_array' a hexagon contrib API
    csullivan authored Nov 10, 2022
    3a30df6
  7. [microNPU] Fixed MergeConstants pass on striped networks (apache#13281)

    This PR fixes the bug in MergeConstants pass on striped networks on Ethos-U NPU.
    
    The issue was caused by the _DivideConstants_ pass, which introduces new mod parameters and changes their order, so that in some cases the ethosu_write parameter is moved from the end of the list to the middle.
    E.g. from:
    `[ethos-u_0_i0, p1, p2, p3, p4, p5, p6, ethosu_write]`
    To:
    `[ethos-u_0_i0, p1, p2, ethosu_write, placeholder, placeholder, placeholder, placeholder, placeholder, placeholder, placeholder, placeholder]`
    
    The updated _GetArgsToMergeWithoutArgsNotInConstDict_ and _MakeNewConstDict_ methods in passes.cc now correctly modify const_dict according to the new parameter list.
    sergio-grovety authored Nov 10, 2022
    54bd5e1
  8. [TVMC] Global pass context for compile and tune (apache#13309)

    * [TVMC] Global pass context for compile and tune
    
    Comes as a followup from conversations in apache#13216. By making the pass
    context a global value for both `compile` and `tune` commands, we can
    ensure the pass context is exactly as the user expected and also
    test components such as `convert_graph_layout` under a pass context
    suitable for testing (e.g. add instruments). With this change, it
    becomes the user's responsibility to ensure the PassContext they
    select is suitable for the passes that will be run. By default,
    `opt_level` remains as 3 so current workflows that do not alter the pass
    context from the command line / TVMC Python API should not be affected.
    
    Change-Id: I7a601daf6fbe664f77bce1b45efeb7ca29f621b3
    
    * fix vitis-ai test and typo
    
    Change-Id: I04f5bd031ae4717825f42e373bcb0e1e2c1c9d90
    lhutton1 authored Nov 10, 2022
    23ade0c
  9. [TIR] Update ReductionIterNotIndexOutputBuffer to check BlockRealizeN… (

    apache#13301)
    
    * [TIR] Update ReductionIterNotIndexOutputBuffer to check BlockRealizeNodes match_buffer statements when validating writes
    
    * Add test to verify that tensorized blocks are properly validated
    
    * update to take into account all match buffer regions.
    
    * lint
    nverke authored Nov 10, 2022
    7cd203d
  10. [Docker]Refactor timezone script and NRF installation (apache#13342)

    This PR refactors timezone setup to a separate script that docker/install/ubuntu_install_core.sh
    It also adds a script to install NRF, which is reused in both the cortexm Docker and RVM installation paths.
    mehrdadh authored Nov 10, 2022
    c66bb00
  11. [TIR][Arith] Fix divisor checking in TryConstFold (apache#13348)

    Fix denominator checking in `TryConstFold`.
    wzh99 authored Nov 10, 2022
    3a639a4
  12. [MetaSchedule][Minor] Fix Typo in ApplyCustomRule Schedule Rule (apac…

    …he#13353)
    
    * Fix typo.
    
    * Add regression test.
    zxybazh authored Nov 10, 2022
    b582cd1

Commits on Nov 11, 2022

  1. [MetaSchedule] Improve inlining and VerifyGPUCode for quantized mod…

    …el workload (apache#13334)
    
    * [MetaSchedule] Add a new schedule rule to inline all scalar constants
    
    * add doc
    
    * reorg
    
    * identify constant block by its structure, not by name
    masahi authored Nov 11, 2022
    93fdf83
  2. [MetaSchedule][Minor] Allow Zero Run Time In Benchmarking Result (apa…

    …che#13354)
    
    This PR introduces a check that keeps records with a run time of zero out of the cost model's training data. When working on microTVM, the run time of some successful runs is so small that it is recorded as zero. A run time of 0 would break the XGBoost model because it implies an infinite running speed in GFLOPs. A regression test was also added.
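The reasoning in this message can be illustrated with a minimal sketch. It is a toy model in plain Python, not the MetaSchedule code; the names `throughput_gflops` and `usable_run_times` are hypothetical and chosen only for illustration. It shows why a zero run time has no finite throughput and how such records could be dropped before training.

```python
from typing import List, Optional


def throughput_gflops(flop_count: float, run_secs: float) -> Optional[float]:
    """Return throughput in GFLOPs, or None when the run time is not positive.

    Dividing by a zero run time would imply infinite throughput, so such
    measurements are reported as unusable instead.
    """
    if run_secs <= 0.0:
        return None
    return flop_count / run_secs / 1e9


def usable_run_times(flop_count: float, run_times: List[float]) -> List[float]:
    """Keep only the run times that yield a finite throughput (hypothetical helper)."""
    return [t for t in run_times if throughput_gflops(flop_count, t) is not None]
```

Filtering before the cost model sees the data keeps a single zero-valued measurement from distorting every other record.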
    zxybazh authored Nov 11, 2022
    f950b11
  3. [Bugfix][TIR] Patch for PR#13269 to support Python 3.10 (apache#13350)

    There appears to be some inconsistency across Python versions that makes PR apache#13269 fail on Python 3.10.
    This patch fixes the issue.
    
    Co-authored-by: Junru Shao <junrushao1994@gmail.com>
    sunggg and junrushao authored Nov 11, 2022
    6d68aff
  4. a156636
  5. f3eb239
  6. [MetaSchedule] Fuse loops around shared to global store block in `Mul…

    …tiLevelTilingTensorCore` (apache#13357)
    
    * Fuse shared to global store loops in MultiLevelTilingTensorCore
    
    * update test
    masahi authored Nov 11, 2022
    5364e5a
  7. [TIR][Schedule] Make consistent implementation for GetProducers() & G…

    …etConsumers() (apache#13344)
    
    Currently there are two implementations of `GetConsumers()` and `GetProducers()`. This PR makes them consistent to avoid possible bugs when there are WAR (write-after-read) dependencies.
    wrongtest-intellif authored Nov 11, 2022
    4532712
  8. f9ed60a
  9. [TIR] Make syntax of AST nodes different than ops (apache#13358)

    As part of the effort toward more formal TIR semantics, we want to
    differentiate more explicitly between TIR AST nodes (defined in
    `tir/expr.h`) and TIR ops (defined in `tir/op.h`).
    
    The naming convention is:
    - A lowercase method, for example `tvm.tir.mul`, denotes a TIR op, which
      is eagerly constant-folded, i.e. `mul(1, 2)` returns `2`
      immediately rather than creating an AST node.
    - A capitalized callable, for example `Mul`, creates an AST node
      without constant folding.
    
    This PR makes this behavior more explicit by printing `T.Mul(a, b)`
    directly when `a` and `b` are both constants, rather than sugaring it
    into `mul(a, b)` or `a * b`, so that the difference between an op and
    an AST node is clear.
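The op versus AST-node distinction described in this message can be sketched with a toy model. This is not TVM code and imports nothing from TVM: the `Mul` dataclass and the `mul` function below are illustrative stand-ins, with integers playing the role of constants and strings the role of variables.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Mul:
    """Toy AST node: records the multiplication without evaluating it."""
    a: "Expr"
    b: "Expr"


# Toy expression type: int = constant, str = variable name, Mul = AST node.
Expr = Union[int, str, Mul]


def mul(a: Expr, b: Expr) -> Expr:
    """Toy op: folds eagerly when both operands are constants."""
    if isinstance(a, int) and isinstance(b, int):
        return a * b  # constant-folded, no node is created
    return Mul(a, b)  # otherwise fall back to building a node


folded = mul(1, 2)    # the plain integer 2
node = Mul(1, 2)      # a Mul node that still holds both constants
```

In this model `mul(1, 2)` and `Mul(1, 2)` are different values, so a printer that rendered both as `a * b` would lose that difference.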
    
    Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
    junrushao and cyx-6 authored Nov 11, 2022
    ce0e9ab
  10. [FQ2I] Add cast back to input data type after AvgPool2d (apache#13332)

    [FQ2I] Add cast back to output data type after AvgPool2d
    
    This commit fixes the following issue:
    For the sequence qnn.dequantize -> avg_pool2d -> conv2d ->
    qnn.quantize, the FQ2I pass unconditionally inserts qnn.requantize (or
    a cast) to int32 before AvgPool2d. As a result, the fake quantized
    qnn.conv2d receives int32 input, which qnn.conv2d does not allow (it
    supports only uint8/int8/int16).
    
    This commit adds a cast back to the output data type after AvgPool2d,
    which preserves input dtype == output dtype for this op.
    ibsidorenko authored Nov 11, 2022
    5ffcfd9
  11. [IRBuilder][Minor] Add intrinsics like T.int32x4 (apache#13361)

    This PR adds all common TIR intrinsics like `T.int32x4`, `T.floatx4`.
    
    Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
    junrushao and cyx-6 authored Nov 11, 2022
    8897983

Commits on Nov 12, 2022

  1. [TIR][Schedule] Fix cache_read loc detecting and region_cover checking (

    apache#13345)
    
    Fix 2 issues of cache related primitives:
    *  Fix region_cover checking for cache related primitives
    *  Fix CacheLocDetector for nested SeqStmt
    
    Co-authored-by: Min Chen <chen.min@intellif.com>
    multiverstack-intellif and Min Chen authored Nov 12, 2022
    3877117
  2. [TVMScript] Reorganize the folder structure (apache#12496)

    This PR introduces some minor restructuring of the `python/tvm/script`
    folder structure to make it more convenient for future upstreaming.
    
    Co-authored-by: Yaxing Cai <caiyaxing666@gmail.com>
    junrushao and cyx-6 authored Nov 12, 2022
    b20b7c4

Commits on Nov 13, 2022

  1. [ci] Assert some tests are not skipped in the CI (apache#12915)

    In this PR, the skipped tests script will also check if tests in the `required_tests_to_run.json` have not been skipped. If there are skipped tests, they will be added to the returned comment. 
    
    I am not entirely sure where it's best to place the `required_tests_to_run` file, so I left it in `tvm/ci/scripts/`. I am happy to take suggestions.
    
    Aims to prevent situations such as apache#12529
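The check described in this message might look roughly like the sketch below. It assumes, purely for illustration, that `required_tests_to_run.json` holds a flat JSON list of test names; the real file layout and script are not shown here, and `required_but_skipped` is a hypothetical name.

```python
import json
from typing import Iterable, List


def required_but_skipped(required_json: str, skipped: Iterable[str]) -> List[str]:
    """Return the required tests that appear among the skipped tests.

    `required_json` is the text of the required-tests file, assumed here
    to be a JSON list of test names.
    """
    required = json.loads(required_json)
    skipped_set = set(skipped)
    return sorted(name for name in required if name in skipped_set)
```

Any names this returns are the ones that would be appended to the comment the script posts.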
    gigiblender authored Nov 13, 2022
    b8384d1

Commits on Nov 14, 2022

  1. [CI] Separate the ci scripts into Github and Jenkins scripts (apache#…

    …13368)
    
    This PR is a duplicate of apache#12940 and apache#12941. For some reason, I am unable to reopen apache#12940.
    gigiblender authored Nov 14, 2022
    5a767d0
  2. [TIR][Bugfix] Fix AXIS_SEPARATORS in tir.Schedule.transform_layout (a…

    …pache#13326)
    
    Previously, the block SREF reuse only included a single step of
    changes, and would have an incorrect mapping if multiple sequential
    changes to the TIR block occurred.  This could happen if a
    `BufferStore` was updated, followed by replacement of `Block` iter
    vars/values.  This commit tracks the Block replacements across each
    usage, to ensure the SREF instances remain valid.
    Lunderberg authored Nov 14, 2022
    b6fae9b
  3. [ci] Fix Jenkins quoting (apache#13380)

    Merging apache#13368 caused CI to pass but run more than it needed to due to
    some failures in determination. This fixes the interpolation to use `"`
    which should correctly pass through the variables
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Nov 14, 2022
    68f51e6
  4. [CI] Do not merge before running CI on main (apache#13372)

    This PR does not merge `main` if CI is running already on `main`. It aims to avoid a case where a race happens between two subsequent commits, and one of them merges the other.
    
    Fixes apache#12392.
    gigiblender authored Nov 14, 2022
    41a2243

Commits on Nov 15, 2022

  1. 3aa16f7
  2. 647be2b
  3. [TFLite] Enable int64 biases for int16 quantized operators (apache#12042

    )
    
    This enables int64 biases for quantized fully connected, requantize
    and transpose convolution in TFLite networks. It goes on top of existing
    int16 support for TFLite frontend.
    
    Add a test case using DS_CNN int16 quantized.
    leandron authored Nov 15, 2022
    034dc67
  4. 2d3a5b5
  5. 1e7e790
  6. 8b27975
  7. Fix docs

    dsbarinov1 committed Nov 15, 2022
    a91e052
  8. Fix docs v2

    dsbarinov1 committed Nov 15, 2022
    4f20a25
  9. Release

    dsbarinov1 committed Nov 15, 2022
    cb5e183
  10. ToMixedPrecision fix

    dsbarinov1 committed Nov 15, 2022
    7c682ac