Torchdynamo tuning script #9

Closed
wants to merge 698 commits into from

This pull request is big! We’re only showing the most recent 250 commits.

Commits on Aug 18, 2022

  1. [MetaSchedule] Handle deserializing empty string RVs in trace (apache#12481)
    
    * trace.cc
    
    * add tests
    
    * remove assert
    
    * add proper test
    
    * lint
    
    * lint
    AndrewZhaoLuo authored Aug 18, 2022
    fb07351
  2. [HEXAGON][TOPI] This PR adjusts schedules so >64 length vector loads/stores are not generated at LLVM level. This is a workaround for an instruction selection issue in current version of llvm for hexagon (apache#12471)

    arangasa authored Aug 18, 2022
    436c17f
  3. [COMMUNITY] Adam Straw -> Reviewer (apache#12480)

    Krzysztof Parzyszek authored Aug 18, 2022
    e140a27
  4. aa97f4a
  5. [TVMScript] IRBuilder, IRBuilderFrame base class (apache#12482)

    * [TVMScript] IRBuilder, IRBuilderFrame base class
    
    This PR introduces basic data structures of the generic IRBuilder
    across the codebase.
    
    IRBuilder is a general-purpose IRBuilder that can be used in TIR, Relax
    and any other vendor-specific dialects; IRBuilderFrame is where contextual
    information is stored in the IRBuilder.
    
    * fix linter
    
    * Update include/tvm/script/ir_builder/base.h
    
    Co-authored-by: Junru Shao <junrushao1994@gmail.com>
    cyx-6 and junrushao authored Aug 18, 2022
    250b68e
  6. da7675c
  7. a96bda4
  8. [HEXAGON] Auto-vectorization (fp16) for v68 (apache#12397)

    * Auto-vectorization (fp16) for v68
    
    * use tvm.testing.main in fp16 test of tanh_slice op
    avquicinc authored Aug 18, 2022
    88928a4
  9. [TIR] [bfloat16] add bfloat16 promotion for CallNode (apache#12370)

    * add bfloat16 promotion for CallNode
    
    * add softmax to bfloat16 build test
    yangulei authored Aug 18, 2022
    efd7c45
  10. [CMSIS-NN] Re-use CPU Target Parser (apache#12320)

    Previously, `CMSISNNFlags` was derived using logic specific to the external code generator; this change converts the external code generator options into a `Target`.
    Mousius authored Aug 18, 2022
    d1e6f39
  11. [Target] Only append default keys if target doesn't have any yet (apache#12474)
    
    * [Target] Only append default keys if target doesn't have any yet
    
    This allows target parsers to provide their own target keys. Without this
    change, the default keys would always be appended, which may or may not
    be desirable.
    
    * Add "cpu" to ARM CPU keys
    
    * Add "cpu" to the keys in the mprofile target parser
    
    * Restore the mprofile cpptest, since the "cpu" key is back
    
    * So the -device attribute is actually needed...
    Krzysztof Parzyszek authored Aug 18, 2022
    6def53a
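
The key-defaulting rule described in 6def53a can be sketched in a few lines. This is an illustration only: `finalize_keys` and its arguments are hypothetical names, not TVM's actual target API.

```python
from typing import List, Sequence


def finalize_keys(parsed_keys: Sequence[str], default_keys: Sequence[str]) -> List[str]:
    """Return the final target keys.

    Hypothetical helper mirroring the behaviour in the commit message:
    defaults are used only when the target has no keys yet.
    """
    if parsed_keys:
        # A target parser already supplied keys; leave them untouched.
        return list(parsed_keys)
    # No keys were provided, so fall back to the defaults.
    return list(default_keys)
```

Under this sketch, a parser that wants the "cpu" key has to add it itself, which matches the follow-up bullets that add "cpu" to the ARM CPU and mprofile parsers.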
  12. [ci][tvmbot] Search more users when checking usernames (apache#12491)

    To figure out a user's association with the repo, this code previously
    searched the associations in the repo filtered by the relevant username.
    However, GitHub doesn't return only the exact match, so we instead have to
    collect many results and search through all of them.
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Aug 18, 2022
    72b0f5e
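
The approach in 72b0f5e, collecting many search results and then looking for the exact username, might look roughly like the sketch below. The result shape (a list of dicts with a `login` field) and the helper name `find_exact_user` are assumptions made for illustration, not the tvmbot code.

```python
from typing import Dict, Iterable, Optional


def find_exact_user(results: Iterable[Dict[str, str]], username: str) -> Optional[Dict[str, str]]:
    """Return the entry whose login equals `username`, or None if absent.

    The search backend may return partial matches, so every result is checked.
    """
    for entry in results:
        if entry.get("login") == username:
            return entry
    return None
```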

Commits on Aug 19, 2022

  1. 5d17e24
  2. c0d440d
  3. [microTVM] Add config space to dense_dsp schedule (apache#12444)

    * add config space
    
    * lint
    
    * lint
    mehrdadh authored Aug 19, 2022
    8b3401c
  4. [TOPI]fix scatterND large shape problem (apache#12200)

    * fix scatterND large shape problem
    
    * fix thread pool alloca
    
    * add scatternd unit test
    
    * update with comment
    
    * Empty
    
    Co-authored-by: wrongtest <wrongtest0@gmail.com>
    chengven027 and wrongtest-intellif authored Aug 19, 2022
    41be1b4
  5. 9d6039b
  6. [Fix] Fix some typos (apache#11503)

    Fix some typos in src/.
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    Yulv-git and driazati authored Aug 19, 2022
    bdcfa01
  7. fix pytest (apache#12483)

    mehrdadh authored Aug 19, 2022
    c83ee08

Commits on Aug 20, 2022

  1. [Relay][Layout] Add FInferCorrectLayout for L2 norm layout transform. (apache#12497)
    
    * [Relay][Layout] FInferCorrectLayout for L2 norm layout change.
    
    * [Relay][Layout] Test for L2 norm layout transform.
    
    * [Relay][Layout] Re-edit test to add multi-dimensional axis list.
    
    * Fix cpplint errors
    
    * Use clang-format-10 rules.
    
    * replace uint with size_t.
    blackkker authored Aug 20, 2022
    1985c01
  2. eb31123
  3. [TIR][Schedule][UX] Beautify TIR Trace Printing (apache#12507)

    Following apache#12197, this PR introduces
    `Schedule.show()`, which improves the user experience in the following
    two aspects:
    - Python syntax highlighting
    - Outputs a schedule function instead of standalone instructions so that
    it's easier to follow.
    
    To demonstrate this change:
    - Before `Schedule.show()` is introduced:
    <img width="555" alt="image" src="https://user-images.githubusercontent.com/22515877/185713487-03722566-1df7-45c7-a034-c1460d399681.png">
    
    - After this change:
    <img width="583" alt="image" src="https://user-images.githubusercontent.com/22515877/185713564-c54f3a9d-cd52-4709-a8b8-d8a61361e611.png">
    junrushao authored Aug 20, 2022
    3b3443b
  4. 125c9ca
  5. [MetaSchedule] Migrate MemoryDatabase to C++ (apache#12514)

    This PR migrates the existing MemoryDatabase, which is implemented in
    Python at the moment, to C++. The original intent of having an in-memory
    database that does not persist on disk was merely for testing, but as
    time went on, we found it useful in production workflows, and thus decided
    to migrate it to C++ for potentially better performance.
    junrushao authored Aug 20, 2022
    8ee4b60
  6. 92355f2

Commits on Aug 21, 2022

  1. [TVMScript] Printer entry point (apache#12462)

    This PR:
    
    - Adds an entry point for the TVMScript Unified Printer
    - Adds a helper object class `RootNodeContainer` to provide an injection point for the actual printer implementation to add specialized logic on the root node to print.
    
    Tracking issue: apache#11912
    yelite authored Aug 21, 2022
    cc769fd

Commits on Aug 22, 2022

  1. [TVMScript] Printer: add boolean operators to OperationDoc (apache#12518)
    
    This PR adds boolean operators to OperationDoc. This is needed by the TIR expression printing because it has `tir::And` and `tir::Or`.
    
    Tracking issue: apache#11912
    yelite authored Aug 22, 2022
    2629065
  2. e9aad35
  3. [ETHOSN] Remove support for older versions of the driver stack (apache#12347)
    
    Removes support for driver stack versions older than 22.05
    (semantic 3.0.1). Additionally, changes the integration to make
    version checks using semantic versioning rather than the previous
    year.month versioning method.
    lhutton1 authored Aug 22, 2022
    7c318d7
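
The move to semantic-version checks in 7c318d7 can be illustrated with a small comparison helper. This is a sketch under assumptions: `parse_semver` and `is_supported` are hypothetical names, and only the minimum version 3.0.1 (driver stack 22.05) comes from the commit message.

```python
from typing import Tuple


def parse_semver(version: str) -> Tuple[int, int, int]:
    """Split a 'major.minor.patch' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


# Minimum supported version per the commit message: 22.05 (semantic 3.0.1).
MIN_SUPPORTED = parse_semver("3.0.1")


def is_supported(version: str) -> bool:
    """Compare numerically, component by component, rather than as strings."""
    return parse_semver(version) >= MIN_SUPPORTED
```

Comparing integer tuples avoids the string-ordering pitfall where "3.10.0" would sort before "3.9.0".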
  4. [TIR] Support AllocateConst nodes in TensorIR scheduling flow (apache#12489)
    
    * [TIR] Support AllocConstantNode in CreatePrimFunc
    
    * Handle AllocConstantNode in LeafBlockRemovalPlan
    
    * Properly handle AllocConstNode in BufferAllocationLocator
    
    * handle AllocateConst in EstimateFlops
    
    * remove NDArray printing
    
    * doc update
    
    * add test
    
    * cpplint
    
    * Removed dependency on link-params attribute from target
    
    * Restored NDArray printing to unbreak test
    masahi authored Aug 22, 2022
    8146a9b
  5. [ONNX] Fix test to disable default ONNX frontend constant folding (apache#12532)
    
    In the TVM ONNX frontend, constants are folded by default, which causes `test_load_model__onnx` to fail because it is looking for "params" that were already converted into constants.
    
    This patch fixes the test to disable constant folding so that we can assert that "params" in the model are present as expected.
    leandron authored Aug 22, 2022
    48a8cbd
  6. [CI] Set test dependency on "transformers" package with pytest.importorskip (apache#12528)
    
    `test_meta_schedule_integration_extract_from_bert_base` depends on the `transformers` package, which is not currently installed in our Docker images.
    
    When running this test currently, it fails with an ImportError. This patch makes this dependency explicit and causes the test to be skipped when the dependency is not installed.
    
    `test_meta_schedule_integration_extract_from_bert_base` is part of the integration tests, which currently run only on the AArch64 and CPU images (neither of which currently has torch installed in the live CI system), so this is another issue to be understood/fixed.
    leandron authored Aug 22, 2022
    3896756
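
For readers unfamiliar with the mechanism named in 3896756, here is a minimal sketch of `pytest.importorskip`. The test name and body are placeholders, not the actual TVM test; calling it inside the test function is one possible placement, chosen here so that only this test is skipped.

```python
import pytest


def test_extract_from_bert_base_sketch():
    # If "transformers" cannot be imported, pytest reports this test as
    # skipped instead of failing with an ImportError.
    transformers = pytest.importorskip("transformers")
    # Placeholder assertion: the real test would build and extract from a model.
    assert hasattr(transformers, "__name__")
```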
  7. [MicroTVM] expose project options in autotuning (apache#12479)

    * expose project_options in autotuning
    
    * address comment
    
    * address comment
    
    Co-authored-by: Mohamad <mkatanbaf@users.noreply.github.com>
    mkatanbaf and mkatanbaf authored Aug 22, 2022
    40fd43e
  8. [TIR][Schedule] Support for specific consumer block targeting in cache_read (apache#12505)
    
    * Add optional consumer blocks to cache_read.
    
    * remove comments
    
    * Fully functional
    
    * Add test for consumer targeting.
    
    * Formatting.
    
    * Add missing parameter comment.
    
    * Fix comments
    
    * Simplify type of consumer_blocks in python.
    
    * Change how consumer_blocks is printed in python.
    Josh Fromm authored Aug 22, 2022
    3f56851
  9. c1bd022
  10. [ci] xfail failing ethosu codegen tests (apache#12508)

    This adds a testing utility so we can mark parameter combinations as
    xfail without having to manually match each parameter from the name into
    the code. The param strings here come directly from CI logs as in
    https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-12389/5/pipeline
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Aug 22, 2022
    4d7e7a8
  11. [CI] Add alexnet and googlenet caffe model to request hook (apache#12510)
    
    This PR intends to move the alexnet and googlenet caffe models from the old link to s3, therefore getting rid of the flakiness in `caffe/test_forward.py` introduced by external url timeouts. 
    
    Fixes apache#12465
    Yuanjing Shi authored Aug 22, 2022
    66a31e9
  12. [LLVM] Add "cl-opt" attribute to target_kind "llvm" (apache#12440)

    * [LLVM] Add "cl-opt" attribute to target_kind "llvm"
    
    Add LLVMTargetInfo class that can be used to query the LLVM
    configuration without forcing an LLVMTarget to be created.
    
    There is no programmatic way to obtain the actual type of an LLVM
    option. The type is necessary to obtain the value of the option,
    hence it must be provided as a part of the option string.
    See src/target/llvm/target_kind.cc for more information about the
    syntax.
    
    * Fix lowercasing of bool value string
    
    * Use std::optional instead of std::pair<..., bool>
    
    * Treat malformed options as fatal errors
    
    * Fix linter
    
    * More unit tests for option parsing, have one case per test
    
    * Remove "option ignored" from fatal error messages
    Krzysztof Parzyszek authored Aug 22, 2022
    e5e05fe
  13. [BugFix][UMA] Fix order issue in uma_lower (apache#12447)

    There was a flaw in uma_lower (see issue apache#12410) that led in some cases to a different argument ordering of the cached_func and the Relay function. This results in an incorrect lowering of the primfunc and eventually, in some cases, a wrong result or a run-time error.
    
    This commit adds code to correct the described misbehavior and a unit test case to check this end-to-end functionality with a TFLITE model.
    MichaelJKlaiber authored Aug 22, 2022
    902343a
  14. [TIR] Add pass to check for out of bounds memory access (apache#12352)

    * [TIR] Add pass to check for out of bounds memory access
    
    This is a conservative static analysis that checks to see if any out of
    bounds array access occurs. It is not enabled by default.
    
    * formatting
    
    * manually construct local irmodule
    
    * update comment
    
    * fix bug in int_set
    Tristan Konolige authored Aug 22, 2022
    1e399fa
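
The idea of a conservative bounds check, as described in 1e399fa, can be illustrated with plain intervals. This is only a sketch under assumptions: `may_access_out_of_bounds` is a hypothetical helper that takes a precomputed inclusive index range, whereas the actual pass analyzes TIR.

```python
from typing import Tuple

Interval = Tuple[int, int]  # inclusive (min, max) of an index expression


def may_access_out_of_bounds(index_range: Interval, extent: int) -> bool:
    """Return True if any index in the range could fall outside [0, extent).

    Conservative: the range may over-approximate the indices actually used,
    so True means "possibly out of bounds", not "definitely out of bounds".
    """
    low, high = index_range
    return low < 0 or high >= extent
```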

Commits on Aug 23, 2022

  1. 8e95bba
  2. check for CMSIS_PATH in project generation (apache#12547)

    Co-authored-by: Mohamad <mkatanbaf@users.noreply.github.com>
    mkatanbaf and mkatanbaf authored Aug 23, 2022
    3bd1681
  3. [microTVM] Rework evaluate_model_accuracy into a more generic helper function (apache#12539)
    
    * Add workaround for apache#12538
    
    * Rework evaluate_model_accuracy into predict_labels_aot
    guberti authored Aug 23, 2022
    5cef6bf
  4. [microTVM] Replace static fixtures with parameterization (apache#12530)

    * Replace microTVM static fixtures with parameterization
    
    * [microTVM] Only perform parameterization when fixture is present
    
    * Reformat with black
    
    * Fix Cortex-M tests
    
    * Add docstring to pytest_generate_tests
    
    * Remove trailing space from docstring
    guberti authored Aug 23, 2022
    58f2139
  5. [docs] Add CI contribution instructions (apache#12551)

    This PR documents the steps for introducing a new CI docker image, which we've been doing a lot lately.
    areusch authored Aug 23, 2022
    e252d7f
  6. [ACL] Adjust mobilenet test for Keras 2.9 (apache#12541)

    In Keras 2.7, one "reshape" operator was removed from
    the Mobilenet model, making our test, which verifies the
    number of operators, incorrect.
    
    This patch adjusts the operator count so that it is in line
    with the changes in Keras. For reference, the change in
    keras repo was done in hash b6abfaed132 "Remove unnecessary
    reshape layer in MobileNet architecture".
    leandron authored Aug 23, 2022
    d26bf80
  7. [COMMUNITY] @konturn -> Reviewer (apache#12543)

    Co-authored-by: Leandro Nunes <leanun01@e123855.arm.com>
    leandron and Leandro Nunes authored Aug 23, 2022
    3983a47
  8. Fix TFLite 2.9 tests (apache#12130)

    This PR fixes the tests that would break when we update TFLite to
    the 2.9 version.
    
    We will update TensorFlow and TFLite versions to 2.9 so that we can
    benefit from improvements in packaging to support multiple platforms
    and Operating Systems.
    Nicola Lancellotti authored Aug 23, 2022
    383bd41
  9. [CMSIS-NN] Pad fusion with QNN Conv2D (apache#12353)

    Pass that fuses nn.pad and qnn.conv2d for CMSIS-NN target.
    ashutosh-arm authored Aug 23, 2022
    52779f1
  10. [CI][AArch64] Skip libgomp failures in integration tests (apache#12554)

    Some integration tests are failing when running in CI machines that
    have torch installed (validated only in AArch64 for now), with an
    error message related to libgomp, similar to the one below:
    
    OSError: /.../dist-packages/torch/lib/libgomp-d22c30c5.so.1: cannot
    allocate memory in static TLS block
    
    As part of enabling the integration tests in AArch64, I'm marking these
    tests as skipped, so that tests can start executing and don't regress
    while we take time to investigate these specific failures.
    leandron authored Aug 23, 2022
    d271678
  11. [ETHOSN] Fix requantize output conversion (apache#12540)

    Fixes a small issue when converting the output information to the support library API. The `requantize_info` output datatype needed updating with the output datatype from the relay function to ensure the graph is compiled correctly by the support library. Included a test to prevent regression in the future.
    lhutton1 authored Aug 23, 2022
    ff46fa1
  12. [Relay] Add Rsqrt to SimplifyExpr (apache#12363)

    * Add Rsqrt to SimplifyExpr
    
    * fix unit tests
    Matthew Brookhart authored Aug 23, 2022
    dd7ae2d
  13. [AutoTVM] Add support for text buffers to ApplyHistoryBest (apache#12521)
    
    Currently, AutoTVM's ApplyHistoryBest class does not support loading tuning logs from memory. This is a pet peeve of mine, as it requires you to work with a tempfile whenever writing autotuning tests. This is also just strange, as the rest of AutoTVM has support for text buffers (e.g. tvm.autotvm.callback.log_to_file supports passing in a text buffer, letting us write to but not read from them).
    
    Additionally, ApplyHistoryBest handles input arguments very unintuitively. Before this PR, it allowed users to pass string filepaths, a list of string filepaths, or an Iterable (such as a list) of input and result tuples. However, it did not support taking in StringIO objects as mentioned above, nor pathlib.Path objects, nor combinations of a filepath and an Iterable of tuples.
    
    In a perfect world, we would change ApplyHistoryBest to take as input a path-like object, file-like object, or an Iterable of input and result tuples (similar to what ApplyGraphBest takes as an argument). However, this would break the existing functionality to take as input a list of filepaths.
    
    To be backwards compatible, while fixing this issue, this pull request defines a new type inside dispatcher.py:
    
    Records = Union[
        Union[str, bytes, Path],  # Path-like objects
        TextIOBase,  # File-like objects
        Iterable[Tuple[MeasureInput, MeasureResult]],
    ]
    It then rewrites ApplyHistoryBest.load so it takes the following arguments:
    
    def load(self, records: Union[Records, Iterable[Records]]):
    This PR also adds unit tests for this new functionality, and fixes a relevant bug in tests/micro/common/test_autotune.py in which a StringIO object was passed to apply_history_best, causing it to appear to pass but not actually read any data.
    guberti authored Aug 23, 2022
    da5836f
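
To make the `Records` union from da5836f concrete, the sketch below shows how a single value might be sorted into one of its three branches. It is not the AutoTVM implementation: `MeasureInput` and `MeasureResult` are stand-in placeholders so the snippet runs without TVM, and `classify` is a hypothetical helper.

```python
import io
import os
from pathlib import Path
from typing import Iterable, Tuple, Union

# Stand-ins for tvm.autotvm's MeasureInput / MeasureResult (assumption:
# treated as opaque objects here so the sketch runs without TVM installed).
MeasureInput = object
MeasureResult = object

Records = Union[
    Union[str, bytes, Path],  # Path-like objects
    io.TextIOBase,  # File-like objects
    Iterable[Tuple[MeasureInput, MeasureResult]],
]


def classify(records: Records) -> str:
    """Return which branch of the Records union a single value falls into."""
    # Strings must be checked first: they are iterable, but they name a file.
    if isinstance(records, (str, bytes, os.PathLike)):
        return "path"
    if isinstance(records, io.TextIOBase):
        return "file"
    return "tuples"
```

Checking for path-like and file-like values before falling back to "iterable of tuples" matters because strings and open files are themselves iterable.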
  14. [skip ci][ci] Mark more ethosu tests with xfail (apache#12560)

    See apache#12511 for context. Since more parameterizations are popping up as
    failed, this disables whole tests rather than specific combinations of
    parameters.
    driazati authored Aug 23, 2022
    1d71c1b
  15. [CI] Remove Vela from ci_cpu (apache#12533)

    While the dependencies for microNPU and CMSIS-NN moved into ci_cortexm,
    Vela is still installed in ci_cpu. As a result, some of the microNPU tests outside of the
    test_ethosu folder are failing, since they use the presence of Vela to decide whether to skip the
    test.
    
    This change will
    * remove Vela from ci_cpu
    * remove unnecessary PATH update
    ekalda authored Aug 23, 2022
    99b9b74
  16. [ETHOSN] Add support for special indices of Reshape (apache#12556)

    This PR adds support for the special indices values of the reshape operator for the Arm(R) Ethos(TM)-N NPU.
    Nicola Lancellotti authored Aug 23, 2022
    4d104e5
  17. [MicroTVM] add heap-size to project options (apache#12390)

    * heap-size is added to project options
    
    * change stm32l4r5zi recommended heap size
    
    * change stm32l4r5zi recommended heap size
    
    * addressing comments
    
    * addressing comments
    
    * addressing comments
    
    Co-authored-by: Mohamad <mkatanbaf@users.noreply.github.com>
    mkatanbaf and mkatanbaf authored Aug 23, 2022
    8c23469
  18. Replace std::result_of (deprecated in C++17) with std::invoke_result, NFC (apache#12562)

    Krzysztof Parzyszek authored Aug 23, 2022
    13ebbfb
  19. Add using directives for otherwise hidden virtual functions, NFC (apache#12561)
    
    This silences the warning
    ```
    warning: 'foo' hides overloaded virtual functions [-Woverloaded-virtual]
    ```
    typically caused by overriding only some overloads of `VisitExpr_` from
    a set defined in the base class.
    Krzysztof Parzyszek authored Aug 23, 2022
    8174d08

Commits on Aug 24, 2022

  1. [Target] Remove deprecated parameters from target (apache#12416)

    * remove deprecated parameters in target
    
    * lint
    
    * fix cpp tests
    
    fix
    
    * remove more configs in test files
    
    * address comments
    
    * fix error
    
    * fix hexagon
    
    * fix micro tutorial
    
    * fix integration tests
    
    * fix hexagon
    
    * lint
    
    * fix unittest
    
    * fix readme
    
    * fix assert executor in target
    
    * address comments
    
    * fix tutorials
    
    * fix hexagon target
    
    * fix tutorial
    
    * fix for tutorials
    
    * hexagon
    mehrdadh authored Aug 24, 2022
    c15cc5e
  2. [PyTorch][Fix] Fix for numerically unstable logsigmoid (apache#12563)

    * Fix numerical instability for log sigmoid
    
    Fix numerical instability for log sigmoid in pytorch frontend
    
    * update
    
    * add test for overflow check
    
    * merging two tests
    crawlingcub authored Aug 24, 2022
    5778261
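
The instability that 5778261 refers to can be shown with scalar math. The stable form below is a standard identity, log(sigmoid(x)) = min(x, 0) - log1p(exp(-|x|)); whether the PyTorch frontend fix uses exactly this formulation is an assumption, and the function names are illustrative.

```python
import math


def log_sigmoid_naive(x: float) -> float:
    """Direct translation of log(1 / (1 + exp(-x))); overflows for very negative x."""
    return math.log(1.0 / (1.0 + math.exp(-x)))


def log_sigmoid_stable(x: float) -> float:
    """Stable form: exp() only ever sees a non-positive argument."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))
```

For moderate inputs the two agree; for an input such as -1000 the naive version overflows in `exp`, while the stable version returns approximately -1000.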
  3. [microNPU] Force compute_cycles_hint to be interpreted as an int64 value (apache#12558)
    
    `compute_cycles` can be the size of an int64 value, however it seems
    that when that value is attached to the IR as a pragma from Python,
    it is interpreted as an `int`, rather than `int64_t`. This commit adds
    an explicit cast to ensure the value is interpreted correctly.
    
    The reason these values started appearing very large and randomly is
    still yet to be solved, although the hope is that this fix will unblock
    CI.
    
    Change-Id: Idcdd7d37af1acd665590c87624446a025b50eb3d
    lhutton1 authored Aug 24, 2022
    e468dc2
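
The general hazard behind e468dc2, a 64-bit quantity being read as a narrower integer, can be demonstrated with fixed-width types from `ctypes`. This sketch assumes a 32-bit `int` and says nothing about how TVM attaches the pragma; `as_int32` and `as_int64` are illustrative helpers.

```python
import ctypes


def as_int32(value: int) -> int:
    """Interpret the value as a 32-bit signed integer (high bits are dropped)."""
    return ctypes.c_int32(value).value


def as_int64(value: int) -> int:
    """Interpret the value as a 64-bit signed integer."""
    return ctypes.c_int64(value).value


# A cycle count that needs more than 32 bits.
compute_cycles = 2**32 + 7
```

Here `as_int64(compute_cycles)` preserves the value, whereas `as_int32(compute_cycles)` silently wraps, which is why an explicit cast to the wider type is the safer choice.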
  4. [CI][CMSIS-NN] Running tests parallel using pytest-xdist (apache#12557)

    Introducing -n auto for CMSIS-NN tests to run them in
    parallel with pytest-xdist. This is needed because of
    additional parameterization done over cpu variants.
    
    Change-Id: I02e1b37ead0b0a562b5b1b2dacfeb3fdd7cc1ce3
    ashutosh-arm authored Aug 24, 2022
    90b2f0d
  5. [ETHOSN] Add support for resize (apache#12535)

    This commit adds support for the `resize` operator for
    Arm(R) Ethos(TM)-N NPU.
    Nicola Lancellotti authored Aug 24, 2022
    989e5a1
  6. [TIR][CompactBufferAllocation] Improve upperbound estimation of buffer compaction (apache#12527)
    
    This change makes some minor updates to the region estimator used by buffer compaction:
    - Add and clarify the distinction among `EstimateRegionStrictBound`, `EstimateRegionLowerBound` and `EstimateRegionUpperBound`.
       
      Originally we had only `EstimateRegionLowerBound`, which IMO actually implements strict bound estimation. Now `upper` and `strict` versions are added for the places where we actually want them.
    
    - When estimating upper bounds (e.g. in buffer compaction), try to estimate each dimension independently when the accesses are dependent and `EstimateRegionLowerBound` is expected to fail.
    
      E.g., `A[i, i], 3 < i < 16` fails via `EstimateRegionLowerBound`, which checks that the indices are independent. But we can still make a best effort by invoking strict bound analysis on each dimension individually.
    
    - If range->extent == 1 for `EvalSet(range, dom)`, invoke `EvalSet(range->min, dom)` instead.
      
      E.g., `EvalSet([k*k, k*k+1), dom_k)` results in [-inf, +inf] due to a limitation of the current algorithm, but `EvalSet(k*k, dom_k)` results in a range, which makes more sense.
    wrongtest-intellif authored Aug 24, 2022
    1ec2c36
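
The per-dimension idea from 1ec2c36 can be illustrated on the `A[i, i]` example. The sketch below bounds each index expression separately by brute-force enumeration over the loop domain; that enumeration is a stand-in for TVM's arithmetic analysis, and `bound_dimension` and `estimate_region_upper_bound` are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

Interval = Tuple[int, int]  # inclusive (min, max)


def bound_dimension(index: Callable[[int], int], dom: Interval) -> Interval:
    """Bound one index expression over the loop domain by enumeration."""
    values = [index(i) for i in range(dom[0], dom[1] + 1)]
    return min(values), max(values)


def estimate_region_upper_bound(
    indices: Sequence[Callable[[int], int]], dom: Interval
) -> List[Interval]:
    """Bound every dimension independently, ignoring dependence between them.

    The result over-approximates the touched region (for A[i, i] it covers
    the full square, not just the diagonal), which is safe for sizing buffers.
    """
    return [bound_dimension(index, dom) for index in indices]
```

For `3 < i < 16` the domain is the inclusive interval (4, 15), so both dimensions of `A[i, i]` are bounded by (4, 15).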
  7. [Target] Replace IsaAnalyzer with Target Features (apache#12322)

    This is a cleanup to use the new `target.features` instead of `IsaAnalyzer`.
    Mousius authored Aug 24, 2022
    592148a
  8. [CI] Set test python.contrib.test_onnx.test_resize as xfail (apache#1…

    …2568)
    
    `python.contrib.test_onnx.test_resize` is failing due to a numerical
    accuracy issue, reported in apache#12567. This patch marks that test as
    an xfail, so that other tests can be enabled, while this one is
    investigated separately.
    leandron authored Aug 24, 2022
    6e79f64
  9. [ETHOSN] Support multiply conversion to depthwise (apache#12403)

    Multiply can be supported when offloaded to the NPU by converting it to a depthwise convolution operation. This is only supported when the multiply operation has a single variable input, with the other input being a constant of shape [1, ..., C]. This commit adds a new pass, "ConvertEquivalents" (name subject to change), to handle this conversion before codegen.
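
The equivalence behind this conversion can be checked with plain Python lists. This is only a sketch under simplifying assumptions: it ignores quantization and the batch dimension, and `depthwise_conv2d` is a hypothetical reference implementation, not the Ethos-N or TVM code.

```python
def depthwise_conv2d(x, kernel):
    """Depthwise convolution with stride 1 and no padding.

    x is indexed as [H][W][C] and kernel as [KH][KW][C]. Each channel is
    convolved with its own kernel slice; channels are never summed together.
    """
    h, w, c = len(x), len(x[0]), len(x[0][0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            pixel = []
            for k in range(c):
                acc = 0
                for di in range(kh):
                    for dj in range(kw):
                        acc += x[i + di][j + dj][k] * kernel[di][dj][k]
                pixel.append(acc)
            row.append(pixel)
        out.append(row)
    return out


x = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]  # H=2, W=2, C=3
constant = [2, 0, -1]  # per-channel constant, playing the role of shape [1, ..., C]

# Elementwise multiply, broadcasting the constant over H and W.
multiplied = [[[v * constant[k] for k, v in enumerate(p)] for p in row] for row in x]

# The same constant used as a 1x1 depthwise kernel.
convolved = depthwise_conv2d(x, [[constant]])
print(multiplied == convolved)
```

With a 1x1 kernel the inner sums collapse to a single product per channel, which is why the two results coincide.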
    lhutton1 authored Aug 24, 2022
    a0fe74b
  10. [TIR] Expose Vector-related API in Python (apache#12571)

    This PR exposes the following TIR operations in Python:
    
    - `vectorlow`: tested [here](https://github.com/apache/tvm/blob/592148abf6866a41eefa736efca067d42f5aea86/python/tvm/tir/tensor_intrin/arm_cpu.py#L62)
    - `vectorhigh`: tested [here](https://github.com/apache/tvm/blob/592148abf6866a41eefa736efca067d42f5aea86/python/tvm/tir/tensor_intrin/arm_cpu.py#L79)
    - `vectorcombine`: add new unittest
    
    Co-Authored-By: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Aug 24, 2022
    038523e
  11. [Hexagon] Add support to run on multiple devices (apache#12504)

    * working in parallel using worker
    
    * creating launchers per test and clean up
    
    * clean up
    
    * ci change to distribute tests
    
    * ci work with any number of devices
    
    * fix running on simulator
    
    * adding function docstring
    
    * fix android_serial_number to always return a list of string
    
    * lint issue
    
    * fix internal error when skipping tests while android serial number is not set
    
    * lint issue
    farshidsp authored Aug 24, 2022
    bf65b39
  12. [Hexagon] Fix missing pytest import (apache#12565)

    * Add pytest
    
    * lint
    mehrdadh authored Aug 24, 2022
    f53ee0c
  13. [TOPI][Hexagon] Implement quantized avgpool (apache#12340)

    * [TOPI][Hexagon] Implement quantized avgpool
    
    * Fix pylint errors
    
    * Needed to adjust input padding for int8 buffer layout
    
    * Fix formatting issue
    
    * Add unit test for fixed-point conversion utility function
    
    Also, address review comments.
    
    * Remove pytest.skip for test_avg_pool2d_slice.py to enable on-target testing
    
    * Fix formatting issue
    
    * Update python/tvm/topi/hexagon/utils.py
    
    Co-authored-by: Christian Convey <christian.convey@gmail.com>
    
    * Update comments and error messages
    
    * Address review comments
    
    * Import Tuple from typing
    
    * Address pylint error
    
    Co-authored-by: Christian Convey <christian.convey@gmail.com>
    jverma-quic and cconvey authored Aug 24, 2022
    1afd059

Commits on Aug 25, 2022

  1. [microTVM] Fix build directory exists error (apache#12575)

    When you build a project from an existing project directory using `tvm.micro.project.GeneratedProject.from_directory`, it would raise an error if the build directory already existed.
    mehrdadh authored Aug 25, 2022
    17989e8
  2. [MicroTVM] fix compile error when the compiler implements char as uns…

    …igned (apache#12519)
    
    When compiling TVM with micro on a compiler that implements `char` as unsigned (such as arm-linux-gcc), there is an error:
    `src/runtime/crt/graph_executor/load_json.c:218:12: error: result of comparison of constant -1 with expression of type 'char' is always false [-Werror,-Wtautological-constant-out-of-range-compare]`
    `    if (ch == EOF || ch == '\r' || ch == '\n') {`
    This happens because the signedness of `char` is implementation-defined, so it is better to specify explicitly that it is signed.
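
The effect can be reproduced from Python with `ctypes`, which stores values into C integer types without overflow checking. This is an illustrative sketch only: it assumes `EOF` is -1 (as in the compiler message above) and models the two possible `char` signedness choices with `c_byte` and `c_ubyte`.

```python
import ctypes

EOF = -1  # value assumed from the compiler diagnostic above

# Model `signed char ch = EOF;` and `unsigned char ch = EOF;`.
as_signed = ctypes.c_byte(EOF).value
as_unsigned = ctypes.c_ubyte(EOF).value

print(as_signed, as_signed == EOF)      # the signed variant can still equal EOF
print(as_unsigned, as_unsigned == EOF)  # the unsigned variant wraps and never equals EOF
```

Because the unsigned variant can only hold non-negative values, a comparison against -1 can never succeed, which is what the `-Wtautological-constant-out-of-range-compare` warning reports.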
    Lucien0 authored Aug 25, 2022
    b8fbfe2
  3. [TIR] Expose shift_left and shift_right to Python (apache#12584)

    This PR exposes the following TIR operations in Python:
    
    - `shift_left`: tested [here](https://github.com/apache/tvm/blob/1afd0593956066635ee49297b731726c9218c91c/tests/python/unittest/test_tir_transform_simplify.py#L487)
    - `shift_right`: add new unittest
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Aug 25, 2022
    cd8fd91
  4. 9aac161
  5. b387384
  6. [PyTorch] Add aten::new_empty (apache#12591)

    This PR adds `aten::new_empty`, which is used by models like `hf_Longformer`.
    
    cc: @masahi
    Yuanjing Shi authored Aug 25, 2022
    40bdea8
  7. fb7cf97
  8. [microTVM][Zephyr] Add recommended heap size for NRF and qemu_x86 (ap…

    …ache#12585)
    
    This PR sets the recommended heap size for the qemu_x86 and NRF boards to fix memory size issues with models like VWW using the AoT host-driven executor.
    mehrdadh authored Aug 25, 2022
    cc19cdd
  9. [CI] Assert some unittests are not skipped in CI (apache#12436)

    This PR adds a script that diffs the skipped tests between the latest successful build on `main` and the current branch. It then posts a comment with the report on the open PR.
    
    apache#11670
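
The core of such a check is a set difference between two lists of skipped tests. The sketch below is a minimal illustration, assuming the skipped test names have already been extracted from each build; `newly_skipped`, `format_report` and the test names are hypothetical and do not reflect the actual CI script.

```python
from typing import Iterable, List


def newly_skipped(main_skipped: Iterable[str], branch_skipped: Iterable[str]) -> List[str]:
    """Return tests skipped on the branch that were not skipped on main."""
    return sorted(set(branch_skipped) - set(main_skipped))


def format_report(tests: List[str]) -> str:
    """Render the diff as a short comment body."""
    if not tests:
        return "No additional tests are skipped on this branch."
    lines = ["Tests skipped on this branch but not on main:"]
    lines += [f"- {name}" for name in tests]
    return "\n".join(lines)


# Hypothetical inputs for illustration.
main = ["test_a.py::test_gpu", "test_b.py::test_slow"]
branch = ["test_a.py::test_gpu", "test_b.py::test_slow", "test_c.py::test_new"]

report = format_report(newly_skipped(main, branch))
print(report)
```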
    gigiblender authored Aug 25, 2022
    56b7c8a
  10. [DOC] fix code-block error in debuggging TVM part (apache#12597)

    The code block in the "Debugging TVM" part is not showing up.
    
    This fixes it.
    huanmei9 authored Aug 25, 2022
    61c034a
  11. [CI] github_cc_reviewers: Catch all exceptions so all reviewers can b…

    …e processed (apache#12578)
    
    In a recent change, `github.post` throws `RuntimeError` instead of `HTTPError` when the requested reviewer isn't a project collaborator. This prevents other reviewers from being added to the PR, for example, https://github.com/apache/tvm/runs/8001367110?check_suite_focus=true.
    
    This PR changes the caller to catch any exception so the execution won't be interrupted.
    
    Co-authored-by: driazati <9407960+driazati@users.noreply.github.com>
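
The pattern described here, catching any exception per reviewer so that one failure does not stop the loop, can be sketched as follows. The names `add_reviewers` and `fake_post` are hypothetical stand-ins; the real script and the `github.post` signature are not shown in this commit message.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def add_reviewers(
    reviewers: Iterable[str],
    post: Callable[[str], None],
) -> Tuple[List[str], Dict[str, str]]:
    """Try to add every reviewer, recording failures instead of aborting."""
    added: List[str] = []
    failed: Dict[str, str] = {}
    for reviewer in reviewers:
        try:
            post(reviewer)
        except Exception as exc:  # deliberately broad: RuntimeError, HTTPError, ...
            failed[reviewer] = str(exc)
        else:
            added.append(reviewer)
    return added, failed


def fake_post(reviewer: str) -> None:
    """Stand-in for the request that adds a reviewer (assumption for this sketch)."""
    if reviewer == "outsider":
        raise RuntimeError(f"{reviewer} is not a collaborator")


added, failed = add_reviewers(["alice", "outsider", "bob"], fake_post)
print(added, failed)
```

Recording the failures rather than discarding them keeps the broad `except` from hiding problems: the caller can still log or report which reviewers could not be added.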
    yelite and driazati authored Aug 25, 2022
    b547106
  12. [microNPU] Remove xfail from tests relating to apache#12511 (apache#1…

    …2570)
    
    Removes the xfail marker from tests previously marked as xfail, since
    the issue has now been resolved.
    lhutton1 authored Aug 25, 2022
    399f2e9
  13. [ETHOSN] Support conversion of add to depthwise (apache#12531)

    In similar fashion to the conversion of mul to depthwise, this commit
    converts add when one input is a constant of shape [1, ..., n] to a
    depthwise convolution. If neither input is a constant, the add is
    offloaded naturally like before.
    
    The addition testing has been improved to use pytest features.
    lhutton1 authored Aug 25, 2022
    f7c1436
  14. 21db1eb
  15. [CUDA][CodeGen] Fix cuda codegen's fp16 inf literal (apache#12581)

    * Fix cuda codegen's fp16 inf literal
    
    * add relay testcase
    wrongtest-intellif authored Aug 25, 2022
    bb00a15
  16. [ci] Default to n=2 for test parallelism (apache#12414)

    * Revert "[skip ci] Revert "[ci] Default to n=2 for test parallelism (apache#12376)" (apache#12413)"
    
    This reverts commit 478b672.
    
    * [ci] Default to n=2 for test parallelism
    
    This is attempt #2 of apache#12376 which was reverted in apache#12413. The changes
    in `plugin.py` should keep all the tests on the same node so sporadic
    failures don't happen due to scheduling.
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Aug 25, 2022
    01fcdfc
  17. [Runtime] Change default alignment to 64 bytes. (apache#12586)

    * Change default alignment to 64 bytes.
    
    * Run dlpack test a few times.
    
    * Update alignment in tests.
    
    * Revert mma alignment change.
    
    * Change default printing of buffer.
    
    * Change crt runtime default allocation.
    Josh Fromm authored Aug 25, 2022
    8d60b3c
  18. 5db38ba
  19. [skip ci][Community] Wuwei Lin -> PMC (apache#12605)

    [Community] Wuwei Lin -> PMC
    yzhliu authored Aug 25, 2022
    a9f7c32

Commits on Aug 26, 2022

  1. [TOPI][Bugfix] Make semantics of empty axis in squeeze consistent…

    … with Relay (apache#12596)
    
    * Fix empty axis of `squeeze` in TOPI.
    
    * Add test case for `squeeze` with empty `axis`.
    
    * Add LLVM target for `test_squeeze`.
    wzh99 authored Aug 26, 2022
    3224817
  2. [TIR] Expose Memory Copy-Related PTX Builtins (apache#12611)

    * Expose Memory Copy-Related PTX Builtins
    
    This PR exposes the following TIR operations in Python:
    
    `ptx_ldmatrix`: tested
    `ptx_cp_async`: tested
    `ptx_commit_group`: tested
    `ptx_wait_group`: tested
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    
    * apply code review suggestion
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Aug 26, 2022
    4f431c8
  3. [TIR][Schedule] enhance compute_at and reverse_compute_at primitive t…

    …o choose possible position (apache#12450)
    
    The current TIR `compute_at` primitive computes a block at its closest consumers. When a block has multiple producers, whichever producer is computed at later ends up placed behind the others. For some special hardware, however, we usually want to keep a certain order regardless of whether a block is computed at earlier or later.
    E.g. block A and block B are producers of block C. Block A is computed at block C first and block B is computed at block C later. We want the result to be block B -> block A -> block C under some loop var.
    yincs-intellif authored Aug 26, 2022
    e02f2f9
  4. [SimplifyExpr] Add simplify for dq->arg funcs (apache#12580)

    * add simplify for dq->arg funcs
    
    * add comments, fix lint
    
    * move comments to the right spots
    Matthew Brookhart authored Aug 26, 2022
    d171b4a
  5. [Hexagon] Initial support for meta schedule tuning (apache#12587)

    Enables AutoTVM-style, template-based tuning for Hexagon.
    
    To run compiled code on Hexagon, we need to use Hexagon `Session` object https://github.com/apache/tvm/blob/dc522a6ff65b68532cd1bba43827cd981114df2c/python/tvm/contrib/hexagon/session.py#L35 in the metaschedule `RPCRunner`. But for RPC "session", `RPCRunner` expects an instance of `RPCSession`, https://github.com/apache/tvm/blob/53fe5966823eee4e011d7228bceab3c82c1d9caa/python/tvm/rpc/client.py#L32,  to be created and used by various customizable functions. 
    
    Since `RPCSession` and Hexagon `Session` have slightly different APIs, we cannot use `RPCRunner` with customizable functions directly. So I introduced an alternative implementation of `RPCRunner` for Hexagon.
    
    The test is disabled for simulator since `HexagonLauncherSimulator` is not pickle-able due to its `multiprocessing.Process` attribute: https://github.com/apache/tvm/blob/c97895e0ffb512e73c89de7cdee9846f052244fc/python/tvm/contrib/hexagon/build.py#L614
    
    
    Output log from tuning `vrmpy` dense (included in the test)
    
    ```
     ID | Name |      FLOP | Weight | Speed (GFLOPS) | Latency (us) | Weighted Latency (us) | Trials | Terminated
    --------------------------------------------------------------------------------------------------------------
      0 | main | 150994944 |      1 |       380.3399 |     397.0000 |              397.0000 |     32 |
    --------------------------------------------------------------------------------------------------------------
    ```
    masahi authored Aug 26, 2022
    d87fa85
  6. [TIR] More hygenic TVM_SREF macros (apache#12607)

    Previously, the `TVM_SREF_TO_BLOCK`, `TVM_SREF_TO_FOR`, and
    `TVM_TYPE_AS` macros required both the input and output variables.
    The input variable name is useful for improving the error message
    returned, but the output variable name isn't necessary for this
    functionality, and prevents the macro from being used as part of an
    expression.
    
    * Generate an immediately-invoked lambda expression to allow for an
      independently-scoped `result` variable.
    
    * Use parentheses around the input argument, in case the sref is
      the result of an expression.
    
    * Update all call sites to remove the macro argument providing the
      first argument.
    Lunderberg authored Aug 26, 2022
    49b3c72
  7. 2e83e03
  8. Replace '> >' in templates with >>, NFC (apache#12615)

    The problem with greedy lexing of >> as an operator was solved in
    C++11, and now templates no longer require spaces between >'s.
    Krzysztof Parzyszek authored Aug 26, 2022
    23e7944
  9. [Hexagon] Asynchronous DMA support (apache#12411)

    Adds asynchronous DMA support through the Hexagon User DMA engine, with unit tests to validate basic functionality. Asynchronous DMA support here means the ability to "kick off" a number of DMAs asynchronously using the Copy API and then to Poll for or Wait on a number of "in flight" (not done) DMAs. This enables future testing and development for asynchronous memory copy on Hexagon. For now, Hexagon DMA support remains synchronous in nature through the existing hexagon_user_dma_1d_sync interface, which uses the asynchronous-capable HexagonUserDMA class in a synchronous way: calling Copy and Wait back to back for each request.
    
    * use ring buffer to store DMA descriptors
    
    * add RingBuffer class; used by HexUserDMA to store descriptors
    
    * add test to overflow the HexagonUserDMA ring buffer
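
The idea of holding in-flight descriptors in a fixed-size ring, and of refusing new work when the ring is full, can be sketched as follows. This is an illustrative Python model under assumed semantics (oldest-first retirement, `push` returning `False` on overflow); the class and method names are hypothetical and do not mirror the C++ `RingBuffer` added by this commit.

```python
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class RingBuffer(Generic[T]):
    """Fixed-capacity FIFO; hypothetical stand-in for a descriptor ring."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots: List[Optional[T]] = [None] * capacity
        self._oldest = 0  # index of the oldest in-flight entry
        self._count = 0   # number of in-flight entries

    def in_flight(self) -> int:
        return self._count

    def push(self, item: T) -> bool:
        """Store `item`; return False instead of overwriting when full."""
        if self._count == len(self._slots):
            return False
        self._slots[(self._oldest + self._count) % len(self._slots)] = item
        self._count += 1
        return True

    def retire(self, is_done: Callable[[T], bool]) -> int:
        """Free entries from the oldest end while they report completion."""
        retired = 0
        while self._count and is_done(self._slots[self._oldest]):
            self._slots[self._oldest] = None
            self._oldest = (self._oldest + 1) % len(self._slots)
            self._count -= 1
            retired += 1
        return retired


ring = RingBuffer(capacity=2)
accepted = [ring.push(d) for d in ("dma0", "dma1", "dma2")]  # third push overflows
retired = ring.retire(lambda d: d == "dma0")                 # only dma0 has finished
accepted_after_retire = ring.push("dma2")                    # a slot is free again
print(accepted, retired, accepted_after_retire, ring.in_flight())
```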
    adstraw authored Aug 26, 2022
    7f1856d

Commits on Aug 27, 2022

  1. [MetaSchedule][UX] Make Database with-able (apache#12520)

    `ApplyHistoryBest` right now plays a role as the database adaptor to query inside the database.
    In fact, the logic could be simplified and users only have to deal with `Database` instead of this
    extra object.
    
    - [x] Add `EnterWithScope`/`ExitWithScope`/`Current` to Database
    - [x] Migrate `te_filter_func` => "tir_filter" in Relay's pass context
    - [x] Migrate `f_take_tuning_record` => "Database.query_tuning_record"
    - [x] Migrate `TECompiler` to use `Database`
    - [x] Remove apply-history-best
    
    Next PR:
    - Migrate `f_direct_dispatch` (potentially unify with `apply_fixed_schedule`?)
    junrushao authored Aug 27, 2022
    370abe6
  2. [TIR] Expose MMA-related PTX builtins (apache#12623)

    Expose MMA-related PTX builtins
    
    This PR exposes the following TIR operations in Python:
    
    `ptx_mma`: tested
    `ptx_mma_sp`: tested
    `mma_store`: add new unittest
    `mma_fill`: add new unittest
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Aug 27, 2022
    5344128

Commits on Aug 29, 2022

  1. [MetaSchedule] Introduce ScheduleFnDatabase (apache#12626)

    Following apache#12520, this PR introduces `ScheduleFnDatabase`, a mocked
    database to allow injecting handcrafted schedules provided by a schedule
    function.
    
    The schedule function comes with the following signature:
    
    ```python
    def schedule_fn(
      sch: tir.Schedule,
    ) -> bool:
      task_name = sch.mod.attrs["task_name"]
      # ^^^ provides an optional name of the task queried
      ...
    ```
    
    This mocked database helps incorporate the existing testing utility
    `apply_fixed_schedule` more formally into the MetaSchedule-Relay build
    pipeline, and allows further extension to Relax with the same interface.
    
    Next as another follow-up, we will introduce ConcatDatabase that allows
    mixing multiple databases, including the mocked and ones from JSON
    files.
    junrushao authored Aug 29, 2022
    648a29a
  2. [Refactor] Replace std::tie with structured bindings (apache#12610)

    * [Refactor] Replace std::tie with structured bindings
    
    With C++17 enabled in apache#12337, use structured bindings to
    replace cases where `std::tie` is used to define local variables.
    
    * Added missing header for <optional>
    
    * Silenced unused variable warnings after structured bindings
    
    This is a bug in gcc version 7, resolved in gcc 8.  While gcc version
    7 is used for CI, we'll need to silence unused variable warnings
    resulting from using only part of a structured binding.
    
    More information: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81767
    Lunderberg authored Aug 29, 2022
    3d41ac3
  3. [QNN] Align output_scale/zero_point of sigmoid to Torch (apache#12624)

    * [QNN] Align output_scale/zero_point of sigmoid to Torch
    
    * [QNN] Align output_scale/zero_point of sigmoid to Torch
    zhaoyang-star authored Aug 29, 2022
    c5c99a4
  4. 0de2219
  5. [ci] Don't update Jenkinsfile timestamp on image updates (apache#12621)

    The timestamp in the Jenkinsfile is there to prevent post-merge
    conflicts from different PRs that edit the templates merging
    non-sequentially. This is not an issue when a line is edited in place
    though, which is often the case when Docker image tags are updated. This
    PR makes it so the timestamp is not updated in these cases which should
    reduce merge conflicts on these types of PRs.
    driazati authored Aug 29, 2022
    c31a762
  6. [Utils] Handled Callable in tir.schedule._type_checker (apache#12633)

    Previously, `Callable` was handled as an atomic type.  This worked
    when it was included as last element of a `Union[]` annotation with no
    subtypes, but raised an error for other use cases, including
    `Optional[Callable]`.
    
    This commit adds explicit checks for `Callable` type annotations to
    validate whether the argument is callable, but doesn't recursively
    validate the signature of the callable object, because lambda
    functions cannot have type
    annotations. (https://peps.python.org/pep-3107/#lambda)
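
A simplified version of this kind of check can be written with the standard `typing` helpers. The sketch below is an assumption-laden illustration, not the code in `tir.schedule._type_checker`: `matches` is a hypothetical function that handles only `Callable`, `Union`/`Optional`, `None` and plain classes.

```python
import collections.abc
import typing
from typing import Any, Callable, Optional, Union


def matches(annotation: Any, value: Any) -> bool:
    """Return True if `value` is acceptable for `annotation` (simplified)."""
    origin = typing.get_origin(annotation)
    if annotation is Callable or origin is collections.abc.Callable:
        # Only check that the value can be called; the signature is not
        # validated, since lambdas cannot carry annotations.
        return callable(value)
    if origin is Union:  # also covers Optional[X], which is Union[X, None]
        return any(matches(arg, value) for arg in typing.get_args(annotation))
    if annotation is type(None):
        return value is None
    return isinstance(value, annotation)


print(matches(Optional[Callable], None))
print(matches(Optional[Callable], lambda x: x))
print(matches(Optional[Callable], 3))
print(matches(Union[int, Callable[[int], int]], len))
```

Handling `Callable` before the generic `Union` branch is what lets it appear anywhere inside a `Union[]` or `Optional[]`, rather than only as the last element.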
    Lunderberg authored Aug 29, 2022
    74988d3

Commits on Aug 30, 2022

  1. [TIR] Improved error messages for PrimExpr operator overloads (apache…

    …#12638)
    
    Previously, type-checks in boolean operators on `PrimExpr` would
    state that the type is incorrect, but further investigation would be
    required in order to determine what expression caused the error.
    After this commit, error messages for these type checks include the
    expression that was used, and the dtype of that expression.
    Lunderberg authored Aug 30, 2022
    9e88723
  2. [ci] Move non-task CI scripts into ci/ folder (apache#12609)

    [CI] Update Hexagon image to install boost (apache#12613)
    
    The new image has xgboost installed, which I need for apache#12587
    
    Validated in https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/ci-docker-staging/279/pipeline
    
    Co-authored-by: masahi <masahi129@gmail.com>
    driazati and masahi authored Aug 30, 2022
    5287d8f
  3. [TVMScript] support float inf, -inf and nan in TVMScript parser and p…

    …rinter (apache#12618)
    
    * support float inf, -inf and nan in TVMScript parser and printer
    
    * address comment and fix lint
    
    * use type_extensions.Literal
    
    * address comments
    
    * fix win build
    
    * remove template
    Yuanjing Shi authored Aug 30, 2022
    58ee935
  4. [microTVM][ARM-DSP] Fix pool schedule (apache#12653)

    When I built a keyword spotting ONNX model, there was an issue with the pool schedule because certain schedules like broadcast and elemwise do not have input tensors.
    mehrdadh authored Aug 30, 2022
    b44f134
  5. [microTVM]Fix test util functions (apache#12641)

    * Fix test utils
    * Update python/tvm/micro/testing/utils.py
    
    Co-authored-by: driazati <9407960+driazati@users.noreply.github.com>
    mehrdadh and driazati authored Aug 30, 2022
    d421e32
  6. [Hexagon] Expose gtest output through runtime exception (apache#12502)

    Expose Hexagon gtest output in CI by raising it as a runtime exception rather than printing it to stdout.
    adstraw authored Aug 30, 2022
    1c32798
  7. [microTVM][Zephyr] Add missing CMSIS-NN source files to cmake file (a…

    …pache#12642)
    
    This PR adds missing CMSIS-NN source files to Zephyr cmake template file for models like keyword spotting, anomaly detection, VWW and image classification.
    mehrdadh authored Aug 30, 2022
    775520c
  8. [ci] Add mechanism for trust on certain CI scripts (apache#12604)

    This makes it so changes to certain files from users not listed in
    `CONTRIBUTING.md` are not tested in CI. This is necessary since these
    scripts run on the baremetal EC2 instances and not inside Docker
    containers, so they can affect other builds and potentially grab Jenkins
    secrets. This checks out the version from the upstream for the listed
    files after running `git checkout`. Tested in CI: [positive](https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-12604/6/pipeline/) and [negative](https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-12604/9/pipeline/)
    driazati authored Aug 30, 2022
    caf326f

Commits on Aug 31, 2022

  1. [MetaSchedule] Complete NCHW Conv2D Winograd Kernel Scheduling (apach…

    …e#12648)
    
    * Complete winograd scheduling.
    
    * Fix test.
    zxybazh authored Aug 31, 2022
    f7cc992
  2. f114d55
  3. c2824a8
  4. [ETHOSN] Improve inferring new shape of the Reshape operator (apache#…

    …12594)
    
    Fixes the case when reshape is > 4 dims. While this cannot be offloaded to the NPU, the check was previously producing an error preventing further compilation. The correct behavior is to ensure the check returns False and not offload the reshape.
    Nicola Lancellotti authored Aug 31, 2022
    acbbd9f
  5. [TIR][TVMScript] Update printer / parser to make T.allocate return bu…

    …ffer var (apache#12412)
    
    * Updated TVMScript syntax of `T.allocate` to return buffer var.
    
    * Added syntax sugar for `T.decl_buffer`. When `data` field is not
      specified, `data` will be implicitly created via `Allocate` stmt.
      
    * Updated the existing test cases. Most test cases can be updated by
      changing `T.allocate` to `T.decl_buffer`. `T.allocate` in some tests
      are updated to `T.allocate` + `T.buffer_decl`, to maintain the
      legacy behavior of allocation and implicit buffer declaration (will
      be followed up in future PR to adopt `T.decl_buffer`).
    vinx13 authored Aug 31, 2022
    0c37454
  6. [Torch][AArch64] Skip test_load_model___wrong_language__to_pytorch (a…

    …pache#12660)
    
    This patch skips test_load_model___wrong_language__to_pytorch on
    AArch64 due to a bug that can be reproduced when enabling
    Integration Tests on machines with Torch installed in TVM.
    
    The error message seen is:
    
    ```
    OSError: /usr/local/lib/python3.7/dist-packages/torch/lib/libgomp-d22c30c5.so.1: cannot allocate memory in static TLS block
    ```
    
    While the test needs further investigation, it is being marked as
    skipped so that other tests can be enabled without regressing, and
    to allow time for the investigation.
    
    This relates to the issue described in apache#10673.
    leandron authored Aug 31, 2022
    d54c065
  7. [ci] Add linter for PR title and body (apache#12367)

    * [skip ci][ci] Fix Jenkinsfile (apache#12387)
    
    This got out of date after merging apache#12178
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    
    * Address comments
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Aug 31, 2022
    a399e6c

Commits on Sep 1, 2022

  1. [TIR] Allow string/buffer arguments to Schedule cache_read/write (apa…

    …che#12661)
    
    Previously, the argument needed to be an integer specifying the index
    into the read/write regions of a block.  Now, the argument can be a
    string specifying the name of the buffer, or the Buffer object itself.
    This is a follow-up from apache#11624.
    Lunderberg authored Sep 1, 2022
    c6516a5
  2. [ETHOSN] Fix tests pylint errors (apache#12649)

    This PR fixes pylint errors in tests/python/contrib/test_ethosn as reported in issue apache#11414.
    Nicola Lancellotti authored Sep 1, 2022
    aa6c712
  3. [Relay] Extract intermediate node by its expression ID (apache#12646)

    [Relay] Extract Intermediate Expr by relay expr ID for analysis
    
    modify doc comments
    
    Co-authored-by: Bin Li <binli1@amd.com>
    sisleyli and Bin Li authored Sep 1, 2022
    38ba8c0
  4. [Hexagon] Implement fixed_point_multiply op through intrinsics. (apac…

    …he#12659)
    
    This commit adds a high-performance implementation of the fixed_point_multiply
    operation based on Hexagon intrinsics for the vmpye/vmpyo instructions.
    
    Benchmarking of 'fixed_point_multiply' op with (1,8,56,56,32) input
    tensor on Qualcomm SM8350:
      * default implementation: 10.06 ms
      * optimized implementation: 1.42 ms
      * speedup: ~7x
    
    Please note that this introduces a small rounding error for some
    corner cases with a negative shift argument (the same as for ARM CPU, see
    PR#5980). This is because we round twice instead of only once:
      * original q_multiply_shift: round(x*y*2^-s)
      * hexagon q_multiply_shift: round(round(x*y)*2^-s)
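
The effect of rounding twice can be reproduced with exact rational arithmetic. The sketch below is only an illustration of the two formulas above, not the Hexagon implementation: the helper names are hypothetical, `s` is treated as a right-shift amount, and ties are assumed to round towards +infinity.

```python
from fractions import Fraction
from math import floor


def round_half_up(value):
    """Round to the nearest integer; ties go towards +infinity (assumed rule)."""
    return floor(value + Fraction(1, 2))


def single_rounding(x, y, s):
    """round(x*y*2^-s): one rounding step applied to the exact product."""
    return round_half_up(Fraction(x) * y / 2**s)


def double_rounding(x, y, s):
    """round(round(x*y)*2^-s): the product is rounded before the shift."""
    return round_half_up(Fraction(round_half_up(Fraction(x) * y), 2**s))


# x*y = 0.5 and s = 1: the exact result is 0.25, but rounding the
# product first turns it into 1, and 1/2 is then rounded up again.
x, y, s = 1, Fraction(1, 2), 1
print(single_rounding(x, y, s), double_rounding(x, y, s))
```

For most inputs the two functions agree; they differ only when the intermediate rounding pushes the value onto a tie.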
    ibsidorenko authored Sep 1, 2022
    038f15b
  5. [MetaSchedule] Fix autoinline for single const consumer block (apache#12668)
    
    fix autoinline and add test
    Yuanjing Shi authored Sep 1, 2022
    32f9a5f
  6. Add methods to get and set late-bound constants. (apache#12664)

    * Add methods to read and restore late-bound constants on Executable.
    
    * Add bindings for new functions
    
    * Cleanup
    
    * Fix function name
    
    * Add tests for python API to access new load/save functions
    
    * Add another test for python API to access new load/save functions where there are no constants
    rkimball authored Sep 1, 2022
    effcd22
  7. [Adreno] Change compute/schedule for ToMixedPrecision pass (apache#12537)
    
    * [Adreno] Change compute/schedule for ToMixedPrecision pass
    
    * Address CI fails
    
    * address PR comments
    
    * Fix AutoTVM flow
    elvin-n authored Sep 1, 2022
    e814f79
  8. [hexagon][tests] re-enable maxpool hardware test (apache#12676)

    - Re-enable test_max_pool2d_slice.py when run on Hexagon
      hardware (as opposed to hexagon-sim).
    
      This is now safe because apache#11928
      has been fixed.
    Christian Convey authored Sep 1, 2022
    54786bb
  9. 50dad0d
  10. [MetaSchedule] Introduce Union and OrderedUnion in Database (apache#12628)
    
    Following up on apache#12520 and apache#12626, this PR introduces two database classes,
    `UnionDatabase` and `OrderedUnionDatabase`, both of which allow users to
    compose multiple databases, so that the high-level
    IR (Relay, Relax) can select the best tuning records according to
    running time or a preferred order given by users.
    
    For each query, `UnionDatabase` returns the best record among all the
    given databases, whereas `OrderedUnionDatabase` returns the record from
    the first database that responds to the query.
    
    Used together, they let users specify complicated dispatching patterns.
    The examples below demonstrate the use cases of, and the difference between,
    `UnionDatabase` and `OrderedUnionDatabase`.
    
    Assumptions:
    * db1 and db2 do not have tuning records for the target workload.
    * db3, db4 and db5 have tuning records r3, r4 and r5, respectively,
    for the target workload.
    
    ```python
    ### Case 1. `UnionDatabase`
    merged_db = ms.database.UnionDatabase(
        db1, # no record
        db2, # no record
        db3, # has r3
        db4  # has r4
    )
    # returns the better one between r3 and r4
    merged_db.query_tuning_record(..., target_workload)
    
    ### Case 2. `OrderedUnionDatabase`
    merged_db = ms.database.OrderedUnionDatabase(
        db1, # no record
        db2, # no record
        db3, # has r3
        db4  # has r4
    )
    # returns r3
    merged_db.query_tuning_record(..., target_workload)
    
    ### Case 3. Mix-use scenario
    merged_db = ms.database.UnionDatabase(
        db1, # no record
        db2, # no record
        db3, # has r3
        ms.database.OrderedUnionDatabase( # returns r4
            db4,  # has r4
            db5,  # has r5
        )
    )
    # returns the better one between r3 and r4
    merged_db.query_tuning_record(..., target_workload)
    
    ### Case 4. Another mix-use scenario
    merged_db = ms.database.UnionDatabase(
        db1, # no record
        db2, # no record
        db3, # has r3
        ms.database.UnionDatabase( # returns the better one between r4 and r5
            db4,  # has r4
            db5,  # has r5
        )
    )
    # returns the best one among r3, r4 and r5
    merged_db.query_tuning_record(..., target_workload)
    
    ### Case 5. Yet another mix-use scenario
    merged_db = ms.database.OrderedUnionDatabase(
        db1, # no record
        db2, # no record
        ms.database.UnionDatabase( # returns the better one between r3 and r4
            db3, # has r3
            db4, # has r4
        ),
        db5,  # has r5
    )
    # returns the better one between r3 and r4
    merged_db.query_tuning_record(..., target_workload)
    ```
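
The two selection rules can also be modelled without TVM. The following is a toy sketch, not the MetaSchedule implementation: databases are plain dicts, the function names are hypothetical, and the run times attached to r3 and r4 are made-up values chosen so that r4 is the faster record.

```python
def union_query(databases, workload):
    """Return the fastest record found in any database, or None."""
    records = [db[workload] for db in databases if workload in db]
    return min(records, key=lambda rec: rec[1]) if records else None


def ordered_union_query(databases, workload):
    """Return the record from the first database that has one, or None."""
    for db in databases:
        if workload in db:
            return db[workload]
    return None


# Toy databases: dicts mapping a workload name to (record name, run time in ms).
db1, db2 = {}, {}
db3 = {"matmul": ("r3", 2.0)}
db4 = {"matmul": ("r4", 1.5)}

print(union_query([db1, db2, db3, db4], "matmul"))          # best of r3 and r4
print(ordered_union_query([db1, db2, db3, db4], "matmul"))  # first hit
```

The toy model does not support nesting; it only mirrors Case 1 and Case 2 above.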
    
    Co-authored-by: sunggg <49998730+sunggg@users.noreply.github.com>
    junrushao and sunggg authored Sep 1, 2022
    eecb7fd

Commits on Sep 2, 2022

  1. 8ca8f24
  2. [COMMUNITY] Yaxing Cai -> Reviewer (apache#12683)

    Please join me in welcoming Yaxing Cai (@cyx-6) as a new reviewer in TVM. Yaxing has brought the PackedFunc into TVM object system ([RFC-051](apache/tvm-rfcs#51)), designed and implemented the new parser infrastructure for TVMScript and meta-programming ([RFC-079](apache/tvm-rfcs#79))
    
    - [Commits History](https://github.com/apache/tvm/commits?author=cyx-6)
    - [Code Review](https://github.com/apache/tvm/pulls?q=reviewed-by%3Acyx-6+)
    Hzfengsy authored Sep 2, 2022
    4acddb1
  3. [PyTorch] Fix aten::arange for pytorch (apache#12681)

    fix arange for pytorch nightly 20220815
    Yuanjing Shi authored Sep 2, 2022
    b2d6600
  4. [MetaSchedule][UX] Convenient Object Creation (apache#12643)

    This PR introduces a set of `.create` methods making it easier to create
    MetaSchedule objects.
    
    For example:
    
    ```python
    ms.database.JSONDatabase(...)
    ms.database.create("json")
    
    ms.runner.RPCRunner(...)
    ms.runner.create("rpc")
    ```
    
    Besides, this PR allows `JSONDatabase` to be created via `work_dir`:
    
    ```python
    db = ms.database.create("json", work_dir="/path/to/db/")
    db = ms.database.create(work_dir="/path/to/db/")  # or even simpler
    ```
    junrushao authored Sep 2, 2022
    bb56f2a
  5. [ETHOSN] Fix some more pylint issues (apache#12675)

    Fixing a few more pylint issues caught when using pylint==2.9.3.
    
    Change-Id: Ie7ca61e1a8083a40e0ffccf1418192966884707a
    lhutton1 authored Sep 2, 2022
    445a14f
  6. [ETHOSN] Add support for concatenate with negative axis (apache#12686)

    Supports offloading concatenate with a negative axis to the NPU. In addition, parameterized the concatenate unit tests.
    lhutton1 authored Sep 2, 2022
    0549a08
  7. [ci][tvmbot] Trigger GitHub Actions after merging (apache#12361)

    This fixes the issue where merging from GitHub Actions (i.e. with the default `GITHUB_TOKEN`) doesn't trigger post merge GitHub Actions on the commit it creates in `main`. Instead these jobs are triggered manually by a call to the Actions API after the merge has taken place.
    
    This also updates the tvmbot testing code (and by extension some of the other CI testing code) to remove the per-test fixtures in favor of constructing them from a single sample at runtime. This makes it much easier to add new tests and to see what differs between data samples, and it cleans up the testing anti-patterns that were there before (e.g. `run()` instead of `pytest.mark.parametrize`). None of the tests in `test_ci.py` have changed.
    
    Tested in driazati#36 which ran https://github.com/driazati/tvm/actions/runs/2881047903
    driazati authored Sep 2, 2022
    7c7b0f7
  8. [AutoTVM][Testing] Add tune_relay scripts (apache#12685)

    Example:
    
    ```bash
    python -m tvm.autotvm.testing.tune_relay  \
           --workload bert_base               \
           --input-shape '[1,64]'             \
           --target "llvm"                    \
           --num-trials 800                   \
           --rpc-host 192.168.6.66            \
           --rpc-port 4445                    \
           --rpc-key 3090ti                   \
           --work-dir /logs/autotvm-bert_base \
           --cache-dir /cache-workloads       \
           --graph-tuner True                 \
           --cpu-flush True                   \
           --backend graph
    ```
    junrushao authored Sep 2, 2022
    0cbf3aa
  9. [ci] Add tests for PR linter (apache#12680)

    This adds some checks for the current usages of the PR linter and fixes the case where the script would error uncleanly when a PR body was `null`.
    driazati authored Sep 2, 2022
    4ed6564
  10. [Adreno] Define memory_info for global.texture* (apache#12647)

    There are now many warnings in the tuning process about undefined memory information when using textures. A definition is required as textures* are tagged.
    Icemist authored Sep 2, 2022
    2734d04
  11. [Web][Emscripten] Update EMCC C++ standard to C++17 (apache#12693)

    As a follow-up to apache#12337, updating
    the EMCC flags from `-std=c++14` to `-std=c++17`.
    Lunderberg authored Sep 2, 2022
    28cad58

Commits on Sep 5, 2022

  1. [ETHOSN] Use pytest parameterization for integration tests (apache#12688)
    
    Using pytest parameterization helps identify the particular parameter combinations that are failing for a given test. Additionally, it can be useful when parallelizing the tests. This commit makes sure that "trials" have been replaced by parameterization as well as completing a general cleanup.
    lhutton1 authored Sep 5, 2022
    5dcf622

Commits on Sep 6, 2022

  1. [Apps] Pin android_camera TensorFlow/Keras dependency version (apache#12710)
    
    At the moment, the android_camera app installs the latest TF and Keras,
    which causes the following issue in CI:
    
    ```
      File ".../keras/dtensor/lazy_variable.py", line 26, in <module>
        from tensorflow.python.trackable import base as trackable
    ModuleNotFoundError: No module named 'tensorflow.python.trackable'
    ```
    
    This patch pins both to their last known working versions:
    TF 2.9.1 and Keras 2.9.
    leandron authored Sep 6, 2022
    b3edb6e
  2. [Hexagon][Runtime] Better support for 2-tier memory (apache#12574)

    - Introduce 'global.ddr' memory scope:
      - Like 'global', this allocates memory from the Hexagon SoC's
        DDR memory.
      - Like 'global.vtcm', the specified tensor shape must be 1d
        or 2d, where 2d indicates Hexagon's "indirect tensor"
        (i.e., discontiguous) allocation scheme.
    
    - Change memory-alignment strategy to always be 2048-byte aligned
      on Hexagon.  (This can be refined in the future, but for now it
      ensures all allocations meet the strictest alignment requirements
      for any Hexagon operations.)
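
As a rough illustration of what a fixed 2048-byte alignment implies for sizes and addresses, the arithmetic can be sketched as follows. This is generic alignment math with hypothetical helper names, not code from the Hexagon runtime.

```python
HEXAGON_ALLOC_ALIGNMENT = 2048  # bytes, as stated in the commit message


def align_up(value, alignment=HEXAGON_ALLOC_ALIGNMENT):
    """Round value up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (value + alignment - 1) & ~(alignment - 1)


def is_aligned(address, alignment=HEXAGON_ALLOC_ALIGNMENT):
    """Check whether an address is a multiple of the alignment."""
    return address % alignment == 0


for size in (1, 2048, 2049):
    print(size, "->", align_up(size))
```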
    Christian Convey authored Sep 6, 2022
    832cffa
  3. [TIR][StorageRewrite] Allow in-place buffer reuse of non-flat memory (apache#12655)
    
    * [TIR][StorageRewrite] Allow in-place buffer reuse of non-flat memory
    
    Previously, shared buffer use was entirely disabled for non-flat
    memory, since the existing checks for shared memory assume flat 1-d
    spaces.  This was enforced in `FindAlloc` and validated in
    `PrepareNewAlloc`.  The validation in `PrepareNewAlloc` could trigger,
    if the buffer sharing was due to an in-place operation, and not
    through the `FindAlloc` function.
    
    In-place operations do not require N-d packing, nor do they introduce
    ambiguity in how different code generators may interpret non-flat
    physical indices.  Therefore, this commit relaxes the validation in
    `PrepareNewAlloc`, allowing buffer reuse of non-flat buffers for
    in-place operations.
    
    * Update new StorageRewrite with correct allocate/buffer_decl usage
    Lunderberg authored Sep 6, 2022
    744649e
  4. d4201a9
  5. [Hexagon] Add optimized schedule for nn.pad (apache#12714)

    Motivation:
    In quantized models, the nn.pad operation is typically not fused with QNN ops
    and lives as a standalone operation. In that case it uses the default injective
    schedule for the Hexagon target, which is not well optimized (based on
    analysis of real models such as ResNet50 INT8).
    
    What was done:
    A new schedule for the Pad operation replaces the default injective schedule.
    For the Hexagon target, the injective schedule fuses all axes and vectorizes
    by 128/64/32 (depending on dtype). This works fine for Add, Sub, etc., but not for Pad.
    The new schedule performs these steps (fusion + vectorization) only if the last tensor
    dimension is divisible by 128/64/32 (depending on dtype). The change applies only to Hexagon;
    other targets (x86, cuda, etc.) are unchanged and still use the default injective
    schedule.
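
The divisibility check described above can be sketched in a few lines. This is a hypothetical model, not the TOPI schedule: it assumes that 128/64/32 means 128 bytes divided by the element size, and that the dimension being tested is the last dimension of the padded output. Both are assumptions rather than statements from the commit.

```python
VECTOR_BYTES = 128  # assumption: 128/64/32 lanes = 128 bytes / element size


def padded_last_dim(shape, pad_width):
    """Size of the last dimension after applying (before, after) padding."""
    before, after = pad_width[-1]
    return shape[-1] + before + after


def fuses_and_vectorizes(shape, pad_width, itemsize):
    """True if the last (padded) dimension is a multiple of the lane count."""
    lanes = VECTOR_BYTES // itemsize
    return padded_last_dim(shape, pad_width) % lanes == 0


nhwc_pad = ((0, 0), (1, 1), (1, 1), (0, 0))
nchw_pad = ((0, 0), (0, 0), (1, 1), (1, 1))

# uint8 has an element size of 1 byte, so 128 lanes are required.
print(fuses_and_vectorizes((1, 112, 112, 32), nhwc_pad, 1))  # last dim 32
print(fuses_and_vectorizes((1, 56, 56, 128), nhwc_pad, 1))   # last dim 128
print(fuses_and_vectorizes((1, 32, 126, 126), nchw_pad, 1))  # last dim 126 + 2
```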
    
    Benchmark results on Snapdragon 888:
    
    4d NHWC layout with ((0, 0), (1, 1), (1, 1), (0, 0)) padding, "uint8" dtype:
    
    shape              | default schedule, ms | optimized schedule, ms |      speedup      |
    -------------------|----------------------|------------------------|-------------------|
    (1, 112, 112, 32)  |         10.03        |          0.2           |       50.1x       |
    (1, 56, 56, 128)   |         0.099        |          0.085         |  ~1x (no speedup) |
    
    4d NCHW layout with ((0, 0), (0, 0), (1, 1), (1, 1)) padding, "uint8" dtype:
    
    shape              | default schedule, ms | optimized schedule, ms |      speedup      |
    -------------------|----------------------|------------------------|-------------------|
    (1, 128, 56, 56)   |         10.96        |          1.38          |        7.9x       |
    (1, 32, 126, 126)  |          1.66        |          1.58          |  ~1x (no speedup) |
    (1, 32, 128, 128)  |         13.98        |          2.66          |       5.25x       |
    
    5d NCHWc layout with ((0, 0), (0, 0), (1, 1), (1, 1), (0, 0)) padding, "uint8" dtype:
    
    shape              | default schedule, ms | optimized schedule, ms |      speedup      |
    -------------------|----------------------|------------------------|-------------------|
    (1, 4, 56, 56, 32) |          6.39        |          0.29          |        22x        |
    (1, 56, 56, 128)   |          0.15        |          0.15          |  ~1x (no speedup) |
    
    Summary:
    For some input tensors we get up to a 50x speedup; for others, performance is unchanged.
    No performance degradations were detected.
    ibsidorenko authored Sep 6, 2022
    141b17b
  6. [TVMC] Run module once by default (apache#12713)

    * [TVMC] Run module once by default
    
    Currently, executing `tvmc run module.tar` runs the input model
    twice. For benchmarking this is to be expected, as the first run is used
    to prime caches etc. before taking a measurement. However, this is an
    unintuitive default, especially when benchmarking is not
    always intended. This commit therefore changes the default
    `tvmc run module.tar` to perform a single run.
    
    After inspection, this seems to be down to the use of the `.benchmark()`
    method which runs (1 + repeat * number) executions in total. This means
    that at least two runs are required (i.e. when repeat=1, number=1). It
    also seems that it is only necessary to benchmark the model when
    `--print-time` has been set from the CLI POV. From the python interface
    POV, benchmarking is always run, but this may not always be necessary.
    
    This commit makes use of the `.run()` method to singularly execute the
    model by default. From the CLI this will be used when `--print-time` is
    set to False whereas from the python interface this will be used when
    `benchmark=False`. Otherwise, the `.benchmark()` method will be used
    as before. Complementary to this change `repeat`, `number` and
    `end_to_end` parameters are only used when either `--print-time` or
    `benchmark` are set to True - and the documentation has been updated to
    indicate this.
    
    Change-Id: I18a38a9d430d660264f7fce5caf0779aa059fed3
    
    * improve documentation with number of executions when benchmarking
    
    Change-Id: Iecf557594420fcc9f3abcec5ce7d952db2c94271
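
The execution counts discussed in this commit message follow from the `1 + repeat * number` formula quoted above. The helper below is a hypothetical sketch of that arithmetic only; it does not call TVMC, and the function name and defaults are illustrative assumptions.

```python
def total_executions(benchmark, repeat=1, number=1):
    """Number of times the model runs for a single invocation.

    Benchmarking performs 1 + repeat * number executions, per the formula
    quoted in the commit message; otherwise the model is executed once.
    """
    if benchmark:
        return 1 + repeat * number
    return 1


print(total_executions(benchmark=False))                       # plain run
print(total_executions(benchmark=True))                        # smallest benchmark
print(total_executions(benchmark=True, repeat=3, number=10))   # larger benchmark
```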
    lhutton1 authored Sep 6, 2022
    da48e13

Commits on Sep 7, 2022

  1. [Docs] Add Commit Message Guideline (apache#12689)

    This commit adds the Commit Message Guideline text to Apache TVM
    documentation in ./docs/contribute/pull_request.rst, under section
    'Submit a Pull Request', below subsection 'Guidelines', as a subsection
    named “Commit Message Guideline”. The text in the second-last item in
    subsection 'Guidelines' that mentions PR tags is also updated to refer
    to this guideline.
    
    This documentation will help guide contributors on how to write good
    commit messages when submitting code / creating Pull Requests, in
    accordance with RFC-0088:
    
    https://github.com/apache/tvm-rfcs/blob/main/rfcs/0088-commit-message-guideline.md
    gromero authored Sep 7, 2022
    85bf80c
  2. [TIR] Fix pragma_loop_partition_hint attrs should check its value (apache#12699)
    
    Currently, LoopPartition doesn't check the value of the attribute key "pragma_loop_partition_hint". Whether pragma_loop_partition_hint is set to True or False, the result is the same, which is confusing when debugging.
    
    This PR makes LoopPartition check the value of the pragma_loop_partition_hint attribute key.
    yincs-intellif authored Sep 7, 2022
    6cd31e7
  3. 291dd2f
  4. [ETHOSN] Add support for transpose convolution (apache#12674)

    Adds support for offloading transpose convolution with an optional bias
    to the NPU.
    
    Co-authored-by: Samuel Panijel <samuel.panijel@arm.com>
    Co-authored-by: Leo Blonk <leo.blonk@arm.com>
    3 people authored Sep 7, 2022
    b55ffcd
  5. [microTVM][Zephyr] Enable -O2 optimization on build by default (apache#12718)
    
    * add speed optimization flag
    
    * trigger
    
    * add exception for qemu_riscv64
    mehrdadh authored Sep 7, 2022
    ff9a530
  6. [HEXAGON] [TOPI] Dequantize (apache#12677)

    dequantize op hexagon
    avquicinc authored Sep 7, 2022
    269d536
  7. [Build] Update C++ standard to C++17 for AOT, iOS, VTA (apache#12712)

    Follow-up from apache#12337 and
    apache#12693, updating a few additional
    locations that specified C++14.
    Lunderberg authored Sep 7, 2022
    2622ac9
  8. [TVMScript] IRBuilder methods for IRModule (apache#12694)

    * IRBuilder methods for `IRModule`
    
    This PR introduces IRBuilder methods for `IRModule`.
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    
    * apply code review suggestion
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Sep 7, 2022
    010c662
  9. [TFLite][CI] Update TensorFlow dependency to 2.9.1 (apache#12131)

    This updates the TF version to be used in TVM CI to 2.9.1,
    which brings improvements so that more platforms are supported by
    official packages.
    
    When building TFLite, an update to CMake was also required,
    which is updated now to 3.18.4.
    
    ethos-u-vela dependency is also updated, from version 3.2.0 to 3.4.0
    so that it is closer to the TensorFlow version being proposed here.
    
    This PR updates the Docker images scripting to install TF and TFLite.
    
    Change-Id: I290085f0c018ad57606f1295494c19ff6e1af2dd
    leandron authored Sep 7, 2022
    bee5627
  10. [ci] Add onnx model to S3 (apache#12716)

    Addresses this CI failure on `main`:
    https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/4235/pipeline/
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Sep 7, 2022
    7f788dc
  11. [ci] Re-balance shards (apache#12473)

    Replace '> >' in templates with >>, NFC (apache#12615)
    
    The problem with greedy lexing of >> as an operator was solved in
    C++11, and now templates no longer require spaces between >'s.
    
    Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com>
    driazati and Krzysztof Parzyszek authored Sep 7, 2022
    546a7da

Commits on Sep 8, 2022

  1. [TIR] Add unroll_loop_with_partition_hint_no_interval attr in LoopPartitionConfig to unroll loop (apache#12631)
    
    [TIR] Add unroll_loop_with_partition_hint_no_interval attr in LoopPartitionConfig
    to unroll loop
    yincs-intellif authored Sep 8, 2022
    abb2aa0
  2. [OpenCLML] CLML Profiling fixes corresponding to OpenCL Timer recent … (apache#12711)
    
    * [OpenCLML] CLML Profiling fixes corresponding to OpenCL Timer recent changes.
    
    * [OpenCLML] Review comments.
    
    * * review comment
    srkreddy1238 authored Sep 8, 2022
    6be04d7
  3. 62bdc91
  4. [Relay] Change when int8 operations are converted to int16 on Arm (apache#12671)
    
    Currently, Relay QNN uses its `helper_no_fast_int8_hw_legalization` to convert most `int8` convolution and dense operations into `int16` ones on Arm. This currently occurs on ARM chips except for `v8.2a` chips with `dotprod` support.
    
    However, this behavior means that `int8` operations are replaced with `int16` ones on Cortex-M chips. On these chips `int16` is substantially slower: while it saves a few sign extension operations, it doubles the number of memory loads we need to perform.
    
    This PR changes when `helper_no_fast_int8_hw_legalization` is used on Arm, and instead makes **not** doing this replacement the standard. We will only do this replacement if we are on a chip with ASIMD support but without `v8.2a` and `dotprod`. This ensures that Cortex-M microcontrollers do not have `int8` operations turned into `int16` ones.
    
    I have also verified that this does, in fact, improve performance for some common models. For example, MobileNet_v1_0.25 on the Cortex-M4 saw a 10% performance improvement, compared to before this change. Accuracy does not seem to be affected.
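
The new rule for when the `int8` to `int16` replacement happens can be written as a small predicate. This is a sketch of the condition as described above, with hypothetical names; it is not the Relay QNN legalization code, and the feature flags are supplied by hand rather than read from a TVM target.

```python
def converts_int8_to_int16(has_asimd, has_dotprod):
    """Apply the replacement only with ASIMD and without dotprod support."""
    return has_asimd and not has_dotprod


# Hypothetical feature sets for three kinds of Arm chips.
chips = {
    "microcontroller without ASIMD": (False, False),
    "ASIMD without dotprod": (True, False),
    "v8.2a with dotprod": (True, True),
}

for name, (asimd, dotprod) in chips.items():
    print(name, "->", converts_int8_to_int16(asimd, dotprod))
```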
    guberti authored Sep 8, 2022
    cd99ca6
  5. [CI][AArch64] Mark tests to be skipped due to torch crash (apache#12730)

    Some integration tests are not being run on CI due to the
    configuration of the machine with onnx and torch not calling
    the integration tests script.
    
    This patch skips two more tests failing with the error message
    below:
    
    ```
    "OSError: /.../torch/lib/libgomp-d22c30c5.so.1:
    cannot allocate memory in static TLS block"
    ```
    leandron authored Sep 8, 2022
    2d36e46
  6. [MetaSchedule] Mark two tests as xfail (apache#12733)

    This patch marks two tests as xfail for further investigation:
    * test_meta_schedule_integration_extract_from_resnet_with_filter_func
    * test_meta_schedule_integration_extract_from_resnet
    leandron authored Sep 8, 2022
    4f4bc26
  7. [Test] Add tvm.testing.requires_libtorch (apache#12737)

    Create a specific test dependency that maps to USE_LIBTORCH, which
    is disabled by default and is independent of whether torch is
    installed on the underlying machine. This mismatch causes problems on
    machines that have torch installed but where TVM is built with
    USE_LIBTORCH OFF.
    
    Mark tests.python.contrib.test_libtorch_ops.test_backend with
    this new decorator.
    leandron authored Sep 8, 2022
    ed63012
  8. [TIR] Handle axis_separators during FlattenBuffer (apache#12652)

    * [TIR] Moved tir.FlattenBuffer to occur before tir.LowerOpaqueBlock
    
    For buffers with more than one physical axis, the `axis_separators`
    are required in order to know which groups of logical axes to fuse
    into each physical axis.  The implementation in `tir.FlattenBuffer`
    assumed that all buffers were being flattened to a single physical
    axis.  Because `tir.LowerOpaqueBlock` replaces the
    `BlockNode::alloc_buffers` with `Allocate` nodes, `tir.FlattenBuffer`
    no longer has access to the axis separators and performs inconsistent
    flattening for `Allocate` as opposed to `BufferLoad`/`BufferStore`.
    This was introduced in apache#12172, which
    decoupled the lowering/flattening steps.
    
    The commit reorders the `tir.FlattenBuffer` to occur before
    `tir.LowerOpaqueBlock`, to make use of the axis separators.  Any
    `Allocate` nodes that exist at that point (e.g. from hand-written
    schedules) are still flattened to 1-d physical buffers, but the
    `BlockNode::alloc_buffers` are flattened according to the axis
    separators.
    
    * Add unit test to validate non-flat memory after tvm.lower
    
    * Explicitly write T.reads for test on BufferRegion updates
    
    * Update incorrect docstring for test
    
    * Use DeclBuffer information in FlattenBuffer
    
    The DeclBuffer node can be inserted during LowerOpaqueBlock, then
    provide the missing Buffer information required to flatten the
    allocation.
    
    * Use T.allocate in unit tests
    
    With the insertion of `DeclBuffer` nodes, `LowerOpaqueBlock` no longer
    needs to be before `FlattenBuffer`, and has been moved back to its
    original position.  Reverting the tests to use `T.allocate` instead of
    `T.alloc_buffer` more closely represents the functions as they are
    being lowered.
    
    * Fix usage of T.decl_buffer in updated tests
    
    * Update LowerOpaqueBuffer to expect the DeclBuffer nodes
    
    * Strip DeclBuffer annotation in FlattenBuffer
    
    The DeclBuffer annotations aren't yet supported in all passes.  This
    restricts them to being introduced in LowerOpaqueBuffer, then
    immediately removed in FlattenBuffer.
    
    * Strip out all DeclBuffer nodes in FlattenBuffer
    
    * Update unit tests to remove expectation of DeclBuffer nodes
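
The role of `axis_separators` described in the first item above can be illustrated on shapes alone. The sketch below is a simplified model with a hypothetical helper, not the `tir.FlattenBuffer` pass: it assumes each separator is the index of the first logical axis of a new physical axis, and that the logical axes in each group are fused by multiplying their extents.

```python
from math import prod


def physical_shape(logical_shape, axis_separators):
    """Fuse groups of logical axes, split at axis_separators, into physical axes."""
    bounds = [0, *axis_separators, len(logical_shape)]
    return [prod(logical_shape[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


logical = [2, 3, 4, 5]
print(physical_shape(logical, []))      # flat 1-d buffer
print(physical_shape(logical, [2]))     # two physical axes
print(physical_shape(logical, [1, 3]))  # three physical axes
```

Without separators every buffer collapses to a single physical axis, which is the assumption the original `tir.FlattenBuffer` implementation made.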
    Lunderberg authored Sep 8, 2022
    b2bd434
  9. [TIR] Update region min/extent in ReplaceBufferMutator (apache#12725)

    Prior to this commit, `ReplaceBufferMutator` only checks
    `BufferRegionNode::buffer` to determine if a `BufferRegion` needs to
    be replaced, and doesn't check the `BufferRegionNode::region`.  As a
    result, updating `T.reads(A[B[i]])` would fail to replace `B`.
    
    This commit checks `BufferRegionNode::region` for buffer usage to
    resolve this issue.
    Lunderberg authored Sep 8, 2022
    299ca26
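To illustrate the fix in apache#12725, here is a minimal Python sketch, not TVM's C++ implementation. The `Load` and `Region` classes and the `replace_*` helpers are hypothetical stand-ins for a buffer load, a `BufferRegion`, and the mutator. The only point is that replacement has to recurse into the region's index expressions as well as the region's own buffer.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Load:
    """Toy stand-in for a buffer load such as B[i]."""
    buffer: str
    index: "Index"


# An index is a loop variable name, a constant, or a nested load.
Index = Union[str, int, Load]


@dataclass(frozen=True)
class Region:
    """Toy stand-in for a buffer region such as A[B[i]]."""
    buffer: str
    indices: Tuple[Index, ...]


def replace_index(index: Index, buffer_map: Dict[str, str]) -> Index:
    """Rewrite buffer names inside an index expression, recursively."""
    if isinstance(index, Load):
        new_buffer = buffer_map.get(index.buffer, index.buffer)
        return Load(new_buffer, replace_index(index.index, buffer_map))
    return index


def replace_region(region: Region, buffer_map: Dict[str, str]) -> Region:
    """Rewrite both the region's buffer and any buffers used in its indices."""
    new_buffer = buffer_map.get(region.buffer, region.buffer)
    new_indices = tuple(replace_index(i, buffer_map) for i in region.indices)
    return Region(new_buffer, new_indices)


reads = Region("A", (Load("B", "i"),))  # models T.reads(A[B[i]])
updated = replace_region(reads, {"B": "B_new"})
```

A mutator that only compared `region.buffer` against the map would leave `reads` untouched here, which is the failure the commit message describes.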
  10. Move static array initialization into a function to avoid link errors (apache#12678)
    
    * Move static array initialization into a function to avoid link errors
    
    * Fix line length
    rkimball authored Sep 8, 2022
    64031d5

Commits on Sep 9, 2022

  1. [TIR, Schedule] Check consumer in-bound and covered in reverse_compute_inline (apache#12717)
    
    * [TIR, Schedule] Generate consumer-in-bound predicate after reverse_compute_inline
    
    * Check consumer block iters are covered
    
    * fix lint
    vinx13 authored Sep 9, 2022
    89ce171
  2. [ci][docker] Use CMake 3.20.0 for cortexm (apache#12744)

    The Zephyr project builds require CMake 3.20.0 to work correctly.
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Sep 9, 2022
    1c5ffc6
  3. cb08a12
  4. [CI] Update Docker images to bring TF 2.9 and integration tests (apache#12738)
    
    [CI] Update Docker images to tag 20220908-060034-62bdc91b1
    
    Updates all Docker images to tag 20220908-060034-62bdc91b1, to
    update TensorFlow/TFLite/Keras to 2.9, and cascaded dependencies
    such as numpy. Updates ethos-u-vela to 3.4.0.
    
    It also brings ONNX and PyTorch to ci_arm, to enable Integration
    tests to be run in CI.
    
    Standardises the minimum CMake version required in CI to be 3.18.4,
    fixing apps/microtvm/zephyr_cmsisnn to require this version.
    
    Finally, adds a new import error in the tutorials documentation
    which doesn't affect the final result. The new warning added is
    'absl:Found untraced functions such as _jit_compiled_convolution_op'
    leandron authored Sep 9, 2022
    90fb79b
  5. Aligned CMSIS-NN SHA in TVM to CMSIS top of tree (apache#12723)

    Aligned CMSIS-NN SHA in TVM to top of tree of CMSIS.
    
    - Aligned buffer size APIs to CMSIS implementations.
    - Updated the tests to match new CMSIS context buffer sizes.
    - This change needs updates to cortex-m docker image.
    
    Change-Id: I13f1ad29fe0ef02f08660eca4c818b5d66145ffc
    ashutosh-arm authored Sep 9, 2022
    7596964
  6. 1d32c40
  7. [TVMScript] Base IRBuilder methods for PrimFunc (apache#12745)

    Base IRBuilder methods for `PrimFunc`
    
    This PR introduces base IRBuilder methods for `PrimFunc`.
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Sep 9, 2022
    8bd81e6
  8. [TVMScript][TIR] Clarify scope of BlockNode::iter_vars (apache#12726)

    Previously, it was ambiguous whether `BlockNode::iter_vars` were
    in-scope for `BlockRealizeNode::predicate`.  `ConvertBlocksToOpaque`
    treated them as in-scope, and applied a mapping from `iter_vars` to
    `iter_values`.  Similarly, TVMScript printing places `T.where`
    statements below the `T.axis` statements, where `T.axis` definitions
    are in scope.  However, `BlockRealizeNode::SEqualReduce` and
    `BlockRealizeNode::SHashReduce` do not visit the block and `iter_vars`
    until after visiting the predicate, placing the `iter_vars` out of
    scope.
    
    This commit updates the printing of `T.where` to be above `T.axis`,
    and updates `ConvertBlocksToOpaque` to report an error if the
    predicate contains references to `BlockNode::iter_vars`.  After this
    commit, these three usages all consistently treat
    `BlockNode::iter_vars` as out of scope for
    `BlockRealizeNode::predicate`.
    Lunderberg authored Sep 9, 2022
    14999f8
  9. [OpenCL] Enable OpenCL for GPU tests (apache#12490)

    * Add opencl target in test build script
    
    * Fix fp16 test and compile test for opencl
    
    * fix lint
    
    * Fix relay OpenCL texture tests
    
    * Fix lint
    
    * Enable relay OpenCL tests
    
    * Fix opencl relay texture tests
    
    * fix lint
    
    * Remove OpenCL gtest variable
    
    * Fix unbound variable
    
    * Skip tests that are not supported in CI
    
    * fix lint
    
    * Add path for opencl gtest directory
    
    * Fix opencl gtests include directory
    
    * Enable OpenCL googletest. Fix bug in opencl timer test
    
    * testing fix for build cpp tests
    
    * update googletest git version for opencl tests build
    
    * update cmakelist
    
    * Update CMakeList
    
    * Update CMakeList
    
    * Disable opencl googletests
    
    * update OpenCL.cmake
    
    * fix OpenCL.cmake
    
    * Apply comments. Remove xfail decorator for opencl tests. Now specific tests are skipped in the environment script
    
    * minor code changes
    
    * apply comments
    
    * apply comment
    
    * skip test in ci by decorator
    
    * fix pytest skipif warnings
    
    * Fix skipif for opencl gtests
    valmat07 authored Sep 9, 2022
    574794e
  10. [Frontend][Paddle] Fix op in paddle didn't transmit layout information (apache#12658)
    
    [Frontend][Paddle] Fix adaptive_avg_pool2d in paddle didn't transmit layout information
    blackkker authored Sep 9, 2022
    b21bf66
  11. [TIR][Arith] Add more strict checking in imm construction and folding. (apache#12515)
    
    * Add more strict check in tir imm construction and folding.
    
    * fix bool-compare compile error
    
    * fix some illegal imm construction in testcases
    
    * do not test i64 overflow behaviour because it is not consistent on cython and ctypes
    
    * fix float32 testcase
    
    * auto-inferred dtype should be int64 when value exceeds int32 range
    
    * add floatimm range check for fp16 and fp32
    
    * add more folding testcases and fix store fp32 folding result to double
    
    * fix i386 fp16 cases
    wrongtest-intellif authored Sep 9, 2022
    029fa46
  12. [TOPI][Hexagon] Add test and schedule for uint8 resize2d (apache#12559)

    * [TOPI][Hexagon] Add test and schedule for uint8 resize2d
    
    * Fix correctness issue
    
    * Reformat
    
    * Remove cubic from testing
    
    * Remove unnecessary else
    trahman-quic authored Sep 9, 2022
    4c05656
  13. [TOPI][Hexagon] Implement quantized elementwise for hexagon (apache#12606)
    
    * [TOPI][Hexagon] Add test and schedule for uint8 resize2d
    
    * Fix correctness issue
    
    * Reformat
    
    * [TOPI][Hexagon] Implement quantized elementwise
    
    * Reformat
    
    * Address review comments
    
    * Reformat
    
    * Revert
    
    * Address review comments
    trahman-quic authored Sep 9, 2022
    2eed663

Commits on Sep 10, 2022

  1. [ETHOSN] Update driver stack version to 22.08 (apache#12650)

    Updates the driver stack used by the NPU to the latest released version
    (semantic version 3.1.0), while maintaining backwards compatibility for
    the previous version 22.05 (semantic 3.0.1) during the migration period.
    In addition, support for split is re-introduced as this is now supported
    in 22.08.
    
    Change-Id: I86bce3469f0b8ad52e66461ae055dec6717b3527
    lhutton1 authored Sep 10, 2022
    76f91b4

Commits on Sep 12, 2022

  1. 286fade
  2. [TVMScript] Base IRBuilder methods for Block (apache#12748)

    This PR introduces base IRBuilder methods for `Block`.
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Sep 12, 2022
    4c863fc
  3. [MetaSchedule] Fix typo of compare between GlobalVar and str (apache#12704)
    
    fix typo of compare between GlobalVar and str
    wrongtest-intellif authored Sep 12, 2022
    a63d03a
  4. [CI] Always install into a python venv in ci containers (apache#12663)

    This PR changes all ci_ containers to install TVM Python dependencies in a
    virtualenv separate from the system Python dependencies.
    
    Sets the stage for adding the poetry-based dependency
    generator to the CI container build process.
    
    * Always install into a python venv in ci containers.
    * Respect Dockerfile ENV PATH modifications in
    docker/bash.sh lookups.
    areusch authored Sep 12, 2022
    a047e02
  5. [Hexagon] Add Hand written HVX conv2d (apache#12204)

    * [Hexagon] Add Hand written HVX conv2d
    
    Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com>
    
    * Address review comments
    
    Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com>
    
    * Add some more comments and a file rename
    
    * Add gtest unit tests for blockize/deblockize
    
    * Add gtest unit tests fp16 utils
    
    Co-authored-by: Krzysztof Parzyszek <kparzysz@quicinc.com>
    quic-sanirudh and Krzysztof Parzyszek authored Sep 12, 2022
    b22b872
  6. [TFLite] Support quantized GREATER op in TFLite frontend (apache#12754)

    Support GREATER quantization operation conversion as part of issue apache#9187. Continuation of apache#11519.
    dchauhan-arm authored Sep 12, 2022
    1222398
  7. [Hexagon] Validate 2-d physical shapes for TIR-derived schedules (apache#12662)
    
    Previously, the test cases only tested TE-based schedules.  This
    commit runs the same tests for equivalent TIR-based schedules as
    well.  This is intended to catch Hexagon-specific regressions, such as
    the one resolved in apache#12652.
    Lunderberg authored Sep 12, 2022
    9671aee
  8. [AutoTVM] Fix None feature in AutoTVM tuning (apache#12760)

    This PR introduces a couple of fixes to make AutoTVM work more
    robustly:
    - Fixed a very rare case where `None` could pop up in AutoTVM features;
    - Fixed a misuse of `ARGS` in the testing script;
    - Fixed the filename for caching.
    junrushao authored Sep 12, 2022
    4d27664
  9. [MetaSchedule][Test] Migrate AddRFactor to SEqual (apache#12758)

    This PR migrates the usage of `check_trace` to `check_sketch`,
    which prefers structural equality of TIRs instead of string equality
    of traces.
    junrushao authored Sep 12, 2022
    a23b71c

Commits on Sep 13, 2022

  1. [MetaSchedule][Test] Migrate check_trace to check_sketch (apache#12764)
    
    * Migrate AutoBind
    
    * Migrate RandomComputeLocation
    
    * Migrate CrossThreadReduction
    
    * Migrate ParallelVectorizeUnroll
    junrushao authored Sep 13, 2022
    ef784d6
  2. [Hexagon] Create tests to showcase vtcm loading capabilities on Hexagon. (apache#12667)
    
    * [Hexagon] Increase max buffer size for tvm_rpc_android to 1GB.
    
    * [Hexagon] Make errors more clear when unable to allocate VTCM buffers and throw an error to fail early.
    
    * [Hexagon] Add mem_copy_DLTensor to enable directly calling DMA for mem copies.
    
    * [Hexagon] Add new tests as examples of the performance to expect when copying data to VTCM.
    
    * [Hexagon] Reduce rpc max size.
    
    * [Hexagon] Fix test_parallel_hvx_load_vtcm.py test output to be human readable.
    
    * Comment out tests that only work on 8Gen1 HDKs to get CI to pass
    nverke authored Sep 13, 2022
    8058423
  3. 64635b7

Commits on Sep 14, 2022

  1. [FQ2I] Quantized constant bias (apache#12666)

    * support fp32 constants in quantized bias add
    
    * add a test
    
    * clean up comment
    
    * assert the bias is floating point as well as constant before requantizing
    Matthew Brookhart authored Sep 14, 2022
    ab8fe34
  2. [Hybrid] Fix handling AST subscription for Python3.9 (apache#12769)

    Fixes apache#9955. This is covered by the existing test case `tests/python/relay/test_op_level3.py::test_unique`.
    vinx13 authored Sep 14, 2022
    91bd9a3
  3. [AOT] Add AOTLowerMain pass to lower a Relay main into TIR (apache#12550)
    
    This is a pass refactored out of the AOTExecutorCodegen. Instead of
    combining all of the functionality of the AOTExecutorCodegen into a
    single monolithic pass, this pass only handles the lowering of the
    Relay main function into TIR. Tests for the pass are included.
    mbaret authored Sep 14, 2022
    f7f2cda
  4. [OpenCLML] More ops and network coverage (apache#12762)

    Added pooling operators (avg, max), binary operators (add, subtract, multiply, min, max) and concat.
    The clip operator with min=0 and max=6 is remapped to relu6 to take advantage of CLML acceleration
    without sub-graphing it to the fallback path.
    
    Added new test cases for the operators listed above, and end-to-end network test cases for Resnet50
    & InceptionV3.
    
    CLML supports an FP16 arithmetic mode, which gives a significant performance boost over FP32. This PR
    enhances FP16 usage based on the operator datatype in the Relay graph.
    
    Co-authored-by: Krishna Raju quic_kvegiraj@quicinc.com
    Co-authored-by: Shwetank Singh quic_shwesing@quicinc.com
    srkreddy1238 authored Sep 14, 2022
    2aa0d1f
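The clip-to-relu6 remapping mentioned in apache#12762 can be sketched in plain Python. This is a toy model under stated assumptions, not the CLML codegen: operators are represented as dictionaries, and the keys `name`, `a_min`, and `a_max` as well as the function `remap_clip_to_relu6` are hypothetical names chosen for illustration.

```python
from typing import Dict, Union

Attr = Union[str, float]


def remap_clip_to_relu6(op: Dict[str, Attr]) -> Dict[str, Attr]:
    """Rewrite clip(min=0, max=6) as relu6; leave every other op unchanged."""
    if op.get("name") == "clip" and op.get("a_min") == 0.0 and op.get("a_max") == 6.0:
        return {"name": "relu6"}
    return op


def clip(x: float, a_min: float, a_max: float) -> float:
    """Reference clip: clamp x into [a_min, a_max]."""
    return max(a_min, min(x, a_max))


def relu6(x: float) -> float:
    """Reference relu6: min(max(x, 0), 6)."""
    return min(max(x, 0.0), 6.0)
```

The rewrite is safe because the two reference functions agree on every input when the clip bounds are exactly 0 and 6; any other bounds fall through unchanged.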
  5. [Relay][TE] Use Relay parameter name to generate TE tensor name (apache#10516)
    
    * [Relay][TE] Use Relay parameter name to generate TE tensor name
    
    Previously, the TE placeholders representing relay function parameters
    were all named `"placeholder"`, which could be difficult to follow
    when debugging larger functions.
    Lunderberg authored Sep 14, 2022
    a408493
  6. [CI] Set USE_CMSISNN and USE_ETHOSU off in task_config_build_cpu.sh (apache#12456)
    
    The dependencies for these have moved into ci_cortexm Docker
    image, so there is not much point in building them for ci_cpu as we
    can't run the associated tests.
    ekalda authored Sep 14, 2022
    a0cbefb
  7. [TVMScript] IRBuilder methods for PrimFunc (apache#12755)

    This PR introduces remaining IRBuilder methods for `PrimFunc`.
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Sep 14, 2022
    3d7439e
  8. [TIR][Meta-Schedule] Tuple-reduction scheduling support (apache#11639)

    [TIR][MetaSchedule] Support Tuple Reduction
    
    This PR improves our TIR scheduling primitives/transformations (rfactor & cross-thread reduction)
    designed for reduction operators, so that they can be applied to blocks of tuple-reduction.
    MasterJH5574 authored Sep 14, 2022
    421ff76
  9. Fixed pylint issues after moving to venv in ci_lint docker (apache#12775)
    
    The following change introduced installing Python dependencies inside
    virtual environments: apache#12663.
    Prior to this fix, a different version of Python was being
    picked up that didn't catch the issues fixed in this commit.
    
    Change-Id: Ie290d9474a799311e07d293fa1b8299326b11661
    ashutosh-arm authored Sep 14, 2022
    296565a
  10. [microTVM][Zephyr] Fix PLL freq. in overlay for nucleo_l4r5zi board (apache#12756)
    
    * [microTVM][Zephyr] Fix PLL freq. in overlay for nucleo_l4r5zi board
    
    Commit 1d32c40 ("Add project overlay to overwrite device tree configs")
    added an overlay for setting the 'clock-frequency' property of node 'rcc' to
    120 MHz. However, to effectively change the PLL frequency that drives
    the core, it is also necessary to overlay the attributes for the 'pll'
    node. This commit does that.
    
    Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
    
    * Remove div-p and div-q properties from overlay
    
    Remove div-p and div-q properties from the overlay file since values for
    these properties will be inherited from the 'pll' that is overlaid.
    
    Since currently microTVM does not use any subsystem which relies on
    clocks associated to either P or Q params, these params can be left
    unchanged for now.
    
    Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
    gromero authored Sep 14, 2022
    e5adb83

Commits on Sep 15, 2022

  1. [Arith][Refactor] Return Optional<PrimExpr> from TryConstFold (apache#12784)
    
    Prior to this commit, the templated `TryConstFold` utility returned an
    undefined `PrimExpr` to represent a failure to perform constant
    folding.  This commit makes this explicit by returning
    `Optional<PrimExpr>` instead.
    Lunderberg authored Sep 15, 2022
    397cf87
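The idea behind apache#12784, signalling "could not fold" through an explicit optional return type, can be sketched in Python. This is an analogy under stated assumptions, not the C++ `TryConstFold` template: the toy expression type `Expr` and the functions `try_const_fold_add` and `simplify_add` are hypothetical.

```python
from typing import Optional, Tuple, Union

# Toy IR: an int is a constant, a str is a symbolic variable name.
Expr = Union[int, str]


def try_const_fold_add(a: Expr, b: Expr) -> Optional[int]:
    """Return the folded constant, or None when folding is not possible."""
    if isinstance(a, int) and isinstance(b, int):
        return a + b
    return None


def simplify_add(a: Expr, b: Expr) -> Union[int, Tuple[str, Expr, Expr]]:
    """Use the folded value when available, else build an unevaluated add node."""
    folded = try_const_fold_add(a, b)
    # Compare against None explicitly: a folded result of 0 is still a success.
    if folded is not None:
        return folded
    return ("add", a, b)
```

Callers have to check for the empty case before using the result, which is what the explicit return type is meant to encourage.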
  2. [TIR, Schedule] Add schedule primitive PadEinsum (apache#12750)

    * [TIR, Schedule] Add schedule primitive PadEinsum
    
    Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
    
    * lint
    
    * [TIR] Fix producer indices check in PadEinsum
    
    * address comments
    
    * simplify lambda expr
    
    * fix
    
    Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
    vinx13 and spectrometerHBH authored Sep 15, 2022
    1f8b5de
  3. [Arith] Simplify nested if_then_else (apache#12749)

    [Arith] Simplify nested if_then_else
    
    Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
    vinx13 and spectrometerHBH authored Sep 15, 2022
    9b10425
  4. [Docker][CI][RISC-V] Build riscv-isa-sim (spike) in ci_riscv Docker image to enable RISC-V unit testing (apache#12534)
    
    * Remove CSI-NN from ci_cortexm docker image
    
    * [Docker] [RISC-V] Split up CSI-NN2 installation script into several files
    
    [Docker] [RISC-V] move gcc toolchain installation out of csi-nn2 script
    
    [Docker] [RISC-V] move qemu installation out of csi-nn2 script
    
    * use updated version of qemu
    
    * [Docker] [RISC-V] Install newlib (baremetal) gcc toolchain
    
    * [Docker] [RISC-V] Install spike simulator
    
    * [Docker] move initialization of timezone and DEBIAN_FRONTEND to ubuntu_install_core.sh script
    PhilippvK authored Sep 15, 2022
    f5517d4
  5. [Target] Print deprecation warning before canonicalisation in build module (apache#12747)
    
    Hopefully fixes apache#12742. The warning should only be printed when a user passes `target_host`. Currently, if the user passes `None` as `target_host`, it is processed by `canon_target_map_and_host`, which seems to always produce a `target_host` and thus triggers the warning despite the user doing nothing wrong.
    Mousius authored Sep 15, 2022
    c900250
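The ordering issue described in apache#12747 can be shown with a small Python sketch. It is illustrative only: `canonicalize` is a hypothetical stand-in for `canon_target_map_and_host`, assumed here to always fill in a host, and `build` is a toy function rather than TVM's build module.

```python
import warnings
from typing import Optional, Tuple


def canonicalize(target: str, target_host: Optional[str]) -> Tuple[str, str]:
    """Toy stand-in: always yields some host, even when none was passed."""
    return target, target_host if target_host is not None else "llvm"


def build(target: str, target_host: Optional[str] = None) -> Tuple[str, str]:
    """Warn about target_host before canonicalisation fills in a default."""
    if target_host is not None:
        # Checked on the user's own argument, so a default host never triggers it.
        warnings.warn("target_host is deprecated", DeprecationWarning)
    return canonicalize(target, target_host)
```

If the check ran after `canonicalize`, the host would never be `None` and every call would warn, which matches the behaviour reported in the entry above.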
  6. [ci] Add retries to docker push (apache#12773)

    This should mitigate failures like in
    https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/4274/pipeline.
    This also moves the `retry` function to a script now that we have
    PR apache#12604.
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Sep 15, 2022
    c00ce57
  7. [ci][docker] Always build cmake from source (apache#12774)

    This should fix some version drift in the current cmake versions in the
    Docker containers (currently running all of 3.10, 3.16, 3.18, and 3.20)
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Sep 15, 2022
    111a88d
  8. [ci] Remove author check from ping bot (apache#12788)

    This has been working fine for a while; this change opens it up so it's
    not limited to the authors in apache#9983.
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Sep 15, 2022
    5b43c62
  9. afad20d
  10. [TVMScript] IRBuilder methods for For (apache#12786)

    This PR introduces remaining IRBuilder methods for `For`.
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Sep 15, 2022
    6a05184
  11. [TVMScript] Fix parse minimal i32 literal for tir script (apache#12772)

    This change tries to fix an issue due to apache#12515.
    
    Previously, the logic for `-2147483648` was `parse(-literal)` = `-parse(literal)`, and all integer literals were converted to i32 (whether or not the literal value actually overflowed).
    
    Since apache#12515, parsing `2147483648` results in an i64-typed integer rather than i32, so `-2147483648` becomes an i64 integer too, which is not reasonable.
    wrongtest-intellif authored Sep 15, 2022
    9a3b3dd
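The literal-typing problem in apache#12772 can be reproduced with a short Python sketch. This is not the TVMScript parser: the helpers `infer_int_dtype`, `parse_negated_naive`, and `parse_negated_folded` are hypothetical, and the rule "int32 if the value fits, else int64" is taken from the description of apache#12515 above.

```python
from typing import Tuple

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def infer_int_dtype(value: int) -> str:
    """Return "int32" when the value fits in 32 bits, else "int64"."""
    return "int32" if INT32_MIN <= value <= INT32_MAX else "int64"


def parse_negated_naive(literal: int) -> Tuple[int, str]:
    """Type the positive literal first, then negate: parse(-x) = -parse(x)."""
    dtype = infer_int_dtype(literal)
    return -literal, dtype


def parse_negated_folded(literal: int) -> Tuple[int, str]:
    """Fold the sign into the literal before inferring the dtype."""
    value = -literal
    return value, infer_int_dtype(value)
```

With the naive order, `2147483648` is typed as int64 before the negation is applied, so the smallest i32 value ends up as int64. Folding the sign first keeps it in int32.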

Commits on Sep 16, 2022

  1. [community] Fix outdated contributor GitHub usernames (apache#12799)

    A couple of names were linking to 404 pages; this PR updates them to
    their current counterparts.
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Sep 16, 2022
    c96cc11
  2. [TIR] Add extra simplification in region cover analysis (apache#12800)

    Added extra simplify step to eliminate false negative cases.
    vinx13 authored Sep 16, 2022
    e6525a3
  3. [MetaSchedule] Enable Clone Function for Task-Level Classes (apache#12796)
    
    This PR introduces a clone function for each of the task-level MetaSchedule classes for convenient class deep copying.
    
    - [x] ScheduleRule
    - [x] Postproc
    - [x] Mutator
    - [x] SpaceGenerator
    - [x] SearchStrategy
    - [x] TuneContext
    zxybazh authored Sep 16, 2022
    02c2eae
  4. [MetaSchedule][Test] MLT uses SEqual tests (apache#12805)

    This PR finishes migration from `check_trace` (string-based equality
    check on TIR trace) to `check_sketch` (SEqual-based equality check on
    TIR). Here, we split multi-level-tiling into 3 files:
    - Plain multi-level tiling without any intrinsics
    - Multi-level tiling with intrinsics like VNNI, DP4a
    - Multi-level tiling with TensorCore which comes with different handling
    
    Besides, we cleaned up the testing folder and removed several methods
    that are no longer useful for unittests.
    junrushao authored Sep 16, 2022
    77d0a28
  5. [TVMScript] IRBuilder methods for Axis (apache#12808)

    This PR introduces remaining IRBuilder methods for `Axis`.
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Sep 16, 2022
    c0d2734
  6. [ci][docker] Fix nightly Docker tests (apache#12804)

    These were broken due to this missing guard:
    https://ci.tlcpack.ai/job/docker-images-ci/job/docker-image-run-tests/223/console
    
    Co-authored-by: driazati <driazati@users.noreply.github.com>
    driazati and driazati authored Sep 16, 2022
    9b17f34
  7. [MetaSchedule][Minor]Fix Random State Fork in TuneContext Clone Function (apache#12811)
    
    Fix random state fork in TuneContext Clone function.
    zxybazh authored Sep 16, 2022
    6b3be49
  8. Fix for import requests and import caffe failures (apache#12813)

    Virtual environments were recently introduced in the
    Docker images, which was a great contribution to
    localize errors: apache#12663. In this fix, the link to caffe is
    created inside this virtual env instead of adding it
    to the system path of Python. This fix also removes
    imports of the requests package where not needed.
    
    Fixes apache#12663
    ashutosh-arm authored Sep 16, 2022
    8f8b6d8
  9. [Hexagon] Reduce the number of tests run for VTCM testing in order to… (apache#12783)
    
    [Hexagon] Reduce the number of tests run for VTCM testing in order to speed up CI.
    nverke authored Sep 16, 2022
    43d9a3b
  10. [Hexagon] [runtime] Protect access to global HexagonBufferManager map (apache#12807)
    
    * Protect access to global buffer manager map
    
    * Fix lint
    janetsc authored Sep 16, 2022
    7c96e25
  11. [ci] Fix docs push (apache#12810)

    This was missing a repo checkout and failing as in
    https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/4302/pipeline.
    This also adds in the changes from apache#12719:
    
    Fixes apache#12600. The original solution there doesn't actually fix the
    issue, there would need to be some job queue that could make sure to
    reject old pushes. Since this case is pretty rare, generally the next
    commit that comes along and builds will fix everything up so we can
    ignore failures that happen on `push`es.
    driazati authored Sep 16, 2022
    5d0a167
  12. [ci] Add bot to post welcome comment (apache#12695)

    This would post the comment that the tests bot and the docs comment bot
    use straight away when a PR is posted. This will contain links to
    generic info about posting PRs (and obviate the
    `.github/PULL_REQUEST_TEMPLATE.md`) as well as dynamic info about the
    specific PR (filled in later by the respective bots). This would make
    things like the auto-cc bot more transparent since it would have a link
    to the relevant issue.
    
    Tested live here: driazati#21 (comment)
    driazati authored Sep 16, 2022
    e037ae4
  13. [Testing] Add decorator tvm.testing.requires_cuda_compute_version (apache#12778)
    
    * [Testing] Add decorator tvm.testing.requires_cuda_compute_version
    
    Previously, individual unit tests would call
    `tvm.contrib.nvcc.get_target_compute_version` and return early.  This
    was repeated boilerplate in many tests, and incorrectly reported a
    test as `PASSED` if the required infrastructure wasn't present.
    
    This commit introduces `tvm.testing.requires_cuda_compute_version`, a
    decorator that checks the CUDA compute version and applies
    `pytest.mark.skipif`.  If required infrastructure isn't present, a
    test will be reported as `SKIPPED`.
    
    * requires_cuda_compute_version skips test when no GPU is present
    Lunderberg authored Sep 16, 2022
    aded9d4
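
The skip-instead-of-early-return pattern described in this commit can be sketched without TVM or a GPU. The example below is a minimal stand-in, not the TVM implementation: it uses the standard library's `unittest.skipIf` in place of `pytest.mark.skipif`, and `detect_compute_version` and `requires_compute_version` are hypothetical names introduced for illustration.

```python
import unittest


def detect_compute_version():
    """Hypothetical probe: returns (major, minor), or None when no GPU is present.

    A real check would query the device; this stand-in always reports no GPU.
    """
    return None


def requires_compute_version(major, minor=0):
    """Skip the decorated test unless the detected version is at least (major, minor)."""
    found = detect_compute_version()
    unsupported = found is None or found < (major, minor)
    reason = f"requires compute version >= {major}.{minor}, found {found}"
    return unittest.skipIf(unsupported, reason)


class TensorCoreTest(unittest.TestCase):
    @requires_compute_version(8)
    def test_needs_sm80(self):
        self.fail("never runs when the requirement is unmet")


def run_example():
    """Run the test case and return the result object for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TensorCoreTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Because the condition is attached as a skip marker rather than checked inside the test body, the runner records the test as skipped instead of passed when the requirement is not met.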
  14. [Hexagon] Add debug option to hexagon pytest (apache#12795)

    * add debug option to hexagon pytest
    
    * address comment
    mehrdadh authored Sep 16, 2022
    bb80f19
  15. [Hexagon] [runtime] Improve runtime resource management (apache#12727)

    * First pass at improving runtime resource management
    
    * Add unit test
    
    * Fix lint and clang format errors
    
    * Disable resource reset for simulator
    
    * Moved acquire/release calls to session object, separate buffer managers for non-runtime (static) and runtime (dynamic).
    
    * Fix lint errors
    
    * Fix lint errors
    
    * Improve robustness of session shutdown
    
    * Fix lint
    
    * Address feedback
    
    * Only allow call to Acquire in a clean state
    
    * Use a pointer to indicate the "active" manager
    janetsc authored Sep 16, 2022
    38f53e8

Commits on Sep 17, 2022

  1. [TVMScript] IRBuilder methods for Block (apache#12815)

    This PR introduces the remaining IRBuilder methods for `Block`.
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Sep 17, 2022
    41b65a3
  2. [TIR] Support pattern matching argmax/argmin generated by TOPI (apache#12827)
    
    This PR introduces two reducers to the TIR reduction part, so that rfactor and cross-thread reduction can be applied to functions that contain argmax/argmin computation generated by TOPI.
    MasterJH5574 authored Sep 17, 2022
    2cae905
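
To illustrate why an argmax reducer makes rfactor-style splitting possible, here is a minimal plain-Python sketch. It is not the TIR reducer from this PR: the combiner works on `(index, value)` pairs, the tie-breaking rule (smaller index wins) is an assumption made for this sketch, and `argmax_combine`, `argmax`, and `argmax_rfactor` are hypothetical names.

```python
from functools import reduce


def argmax_combine(lhs, rhs):
    """Combine two (index, value) pairs, keeping the pair with the larger value.

    Ties go to the smaller index (an assumption of this sketch), which keeps
    the combiner associative and commutative.
    """
    (li, lv), (ri, rv) = lhs, rhs
    if lv > rv or (lv == rv and li < ri):
        return lhs
    return rhs


# Identity element: loses against any pair holding a finite value.
IDENTITY = (-1, float("-inf"))


def argmax(pairs):
    """Sequential reduction over (index, value) pairs."""
    return reduce(argmax_combine, pairs, IDENTITY)


def argmax_rfactor(pairs, num_parts):
    """Reduce interleaved slices independently, then combine the partial results."""
    partials = [argmax(pairs[p::num_parts]) for p in range(num_parts)]
    return argmax(partials)


data = [3.0, 9.0, 1.0, 9.0, 4.0, 2.0]
pairs = list(enumerate(data))
```

Because the combiner is associative and commutative, `argmax_rfactor(pairs, n)` agrees with `argmax(pairs)` for any number of parts, which is the property that factoring a reduction or distributing it across threads relies on.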
  3. [TIR] Construct the inverse in SuggestIndexMap (apache#12797)

    Computing the inverse mapping requires arithmetic analysis which is not guaranteed to cover all cases. We provide the pre-defined inverse index map instead.
    vinx13 authored Sep 17, 2022
    91cce56
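
The idea of supplying the inverse alongside the forward map, instead of deriving it by analysis, can be sketched in plain Python. The mapping below (splitting an axis by a factor of 4) and the names `forward`, `inverse`, and `round_trips` are hypothetical and chosen only for illustration; they are not what `SuggestIndexMap` produces.

```python
def forward(i, j):
    """Hypothetical layout map: split axis i by 4 and move the remainder innermost."""
    return (i // 4, j, i % 4)


def inverse(io, j, ii):
    """Inverse written out by hand next to the forward map, not derived from it."""
    return (io * 4 + ii, j)


def round_trips(extent_i, extent_j):
    """Check that inverse(forward(x)) == x over the whole iteration domain."""
    return all(
        inverse(*forward(i, j)) == (i, j)
        for i in range(extent_i)
        for j in range(extent_j)
    )
```

Since whoever constructs the forward map already knows its structure, writing the inverse at the same time avoids depending on a solver that may not cover every case.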
  4. [BugFix][TIR] Fix Buffer LCA Detector (apache#12819)

    Prior to this PR, the LCA detector of buffers in TIR didn't take buffer memory scopes and GPU hierarchy into consideration. A consequent issue is that, when an intermediate buffer is in global memory, TIR's lowering passes don't necessarily allocate the intermediate buffer outside all `blockIdx`. As a result, the global intermediate buffer is allocated under a GPU thread block, which is illegal.
    
    This PR fixes this issue by making the LCA detector aware of buffer memory scopes and the GPU hierarchy. With this fix, the global intermediate buffers are all allocated outside `blockIdx`.
    MasterJH5574 authored Sep 17, 2022
    e92f5d4
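
A rough sketch of the scope-aware behaviour described above, using a toy loop tree rather than TIR. This is not TVM's detector: `Node`, `ancestors`, `lowest_common_ancestor`, and `allocation_site` are hypothetical, and only the global-memory case from the commit message is modelled.

```python
class Node:
    """A loop or block in a simplified loop nest.

    `thread_tag` is a string such as "blockIdx.x" or "threadIdx.x", or None.
    """

    def __init__(self, name, parent=None, thread_tag=None):
        self.name = name
        self.parent = parent
        self.thread_tag = thread_tag


def ancestors(node):
    """Path from `node` up to the root, starting with `node` itself."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def lowest_common_ancestor(nodes):
    """Deepest node that lies on the root path of every node in `nodes`."""
    common = ancestors(nodes[0])
    for node in nodes[1:]:
        seen = set(ancestors(node))
        common = [n for n in common if n in seen]
    return common[0]


def allocation_site(access_sites, scope):
    """LCA of all accesses; a global buffer is hoisted above every blockIdx loop."""
    site = lowest_common_ancestor(access_sites)
    if scope == "global":
        for n in ancestors(site):
            if n.thread_tag and n.thread_tag.startswith("blockIdx"):
                site = n.parent
    return site


# Toy nest: root -> blockIdx.x loop -> threadIdx.x loop -> producer / consumer.
root = Node("root")
bx = Node("bx", root, "blockIdx.x")
tx = Node("tx", bx, "threadIdx.x")
producer = Node("producer", tx)
consumer = Node("consumer", tx)
```

With this toy nest, a plain LCA places the buffer at `tx`, inside the thread block, whereas `allocation_site([producer, consumer], "global")` moves it to `root`, outside the `blockIdx` loop.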
  5. [TVMScript] Add more helper functions to the printer infra (apache#12829)
    
    This PR is split from apache#12492, to make the necessary updates to the printer infra for future PRs of TIR printer.
    
    Tracking issue: apache#11912
    
    Co-authored-by: Greg Bonik <gbonik@octoml.ai>
    yelite and gbonik authored Sep 17, 2022
    1ecf084

Commits on Sep 18, 2022

  1. [MetaSchedule] Relax conditions of rule Cross-Thread Reduction (apache#12825)
    
    This PR relaxes the conditions of the Meta-Schedule schedule rule CrossThreadReduction. The rule was previously a bit over-strict, so some workloads with a small reduction loop length could not be optimized by cross-thread reduction automatically. In this PR, we relax the conditions so that such workloads can be optimized.
    MasterJH5574 authored Sep 18, 2022
    d1871a6
  2. [TVMScript] IRBuilder methods for Stmt (apache#12830)

    This PR introduces IRBuilder methods for `Assert`, `Let`, `Realize`, `Evaluate`, `LaunchThread`, `EnvThread`.
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Sep 18, 2022
    b2c5add
  3. [TVMScript] IRBuilder methods for Stmt (apache#12831)

    This PR introduces IRBuilder methods for
    `allocate`, `Let`, `allocate_const`, `attr`, `While`, `If/Then/Else`, `decl_buffer`, `buffer_store`, `prefetch`.
    
    Co-authored-by: yongwww <yongcale@gmail.com>
    cyx-6 and yongwww authored Sep 18, 2022
    052e702

Commits on Sep 19, 2022

  1. [Frontend][TFLite] fix detection_postprocess's non_max_suppression_attrs["force_suppress"] (apache#12593)
    
    * [Frontend][TFLite]fix detection_postprocess's non_max_suppression_attrs["force_suppress"]
    
    TVM only supports the detection_postprocess operator when
    use_regular_nms is false. In that mode, TFLite's NMS suppresses boxes
    that exceed the threshold regardless of their class, so for the
    results of TVM and TFLite to be consistent, we need to set
    force_suppress to True.
    
    * [Frontend][TFLite]fix detection_postprocess's non_max_suppression_attrs[force_suppress]
    
    Added a test case that reproduces the inconsistent results between TVM and TFLite
    when force_suppress is false; setting force_suppress to true gives matching results.
    czh978 authored Sep 19, 2022
    60cf692
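
The effect of `force_suppress` can be shown with a small greedy NMS in plain Python. This is a simplified sketch under stated assumptions, not the TVM or TFLite implementation: detections are `(class_id, score, box)` tuples with boxes as `(x1, y1, x2, y2)`, and the helper names `iou` and `nms` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold, force_suppress):
    """Greedy NMS over (class_id, score, box) tuples; returns the kept detections.

    With force_suppress=False, a kept box only suppresses boxes of its own class.
    With force_suppress=True, it suppresses overlapping boxes of any class.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        cls, _, box = det
        suppressed = any(
            (force_suppress or k[0] == cls) and iou(k[2], box) > iou_threshold
            for k in kept
        )
        if not suppressed:
            kept.append(det)
    return kept


detections = [
    (0, 0.9, (0.0, 0.0, 10.0, 10.0)),
    (1, 0.8, (1.0, 1.0, 11.0, 11.0)),  # different class, heavy overlap
]
```

Here the two boxes overlap heavily but belong to different classes, so the number of surviving detections depends on the flag, which is the kind of mismatch the commit addresses.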
  2. [TIR] Implement API for padded layout transformations (apache#12720)

    Implementation of API in `tvm.tir.schedule` for layout transformations
    with padding, as part of apache#12261,
    item "Insert pad value into generated TIR, using `tir::if_then_else`,
    `builtin::assume`, and `builtin::undef`".
    
    Following the RFC discussion in
    apache/tvm-rfcs#77 (comment) and
    apache/tvm-rfcs#77 (comment),
    this commit preferentially rewrites the loops that surround a padded
    transformation where possible, in order to express padding in terms of
    `tir::if_then_else`.
    Lunderberg authored Sep 19, 2022
    2af9b90
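
Expressing padding through a conditional select can be sketched in plain Python. The code below is an illustration only and does not use the `tvm.tir.schedule` API: `if_then_else` is a local stand-in for a select expression, and `transform_with_padding` is a hypothetical helper that reshapes a 1-D buffer into rows of `factor` elements.

```python
def if_then_else(cond, then_value, else_value):
    """Plain-Python stand-in for a select expression."""
    return then_value if cond else else_value


def transform_with_padding(buf, factor, pad_value):
    """Reshape a 1-D buffer to [ceil(n / factor), factor], padding the tail."""
    n = len(buf)
    outer = (n + factor - 1) // factor
    return [
        [
            # Python evaluates both branches eagerly, so the read index is
            # clamped to stay in bounds for the padded positions.
            if_then_else(
                io * factor + ii < n,
                buf[min(io * factor + ii, n - 1)],
                pad_value,
            )
            for ii in range(factor)
        ]
        for io in range(outer)
    ]
```

For a 14-element buffer and `factor=4`, the last row holds two real elements followed by two copies of `pad_value`; the condition `io * factor + ii < n` is what separates real data from padding.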
  3. f417555

Commits on Sep 20, 2022

  1. Add missing args

    yelite committed Sep 20, 2022
    0123e1a
  2. Add todo

    yelite committed Sep 20, 2022
    be0a16c

Commits on Sep 21, 2022

  1. 23a6658

Commits on Sep 22, 2022

  1. 058b8ee

Commits on Sep 23, 2022

  1. 03d630f