Torchdynamo tuning script #9
Closed
…#12481) * trace.cc * add tests * remove assert * add proper test * lint * lint
…stores are not generated at the LLVM level. This is a workaround for an instruction selection issue in the current version of LLVM for Hexagon (apache#12471)
* [TVMScript] IRBuilder, IRBuilderFrame base class This PR introduces the basic data structures of the generic IRBuilder across the codebase. IRBuilder is a general-purpose IRBuilder that can be used in TIR, Relax and any other vendor-specific dialects; IRBuilderFrame is where contextual information is stored in the IRBuilder. * fix linter * Update include/tvm/script/ir_builder/base.h Co-authored-by: Junru Shao <junrushao1994@gmail.com>
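A minimal, hypothetical sketch of the builder/frame pattern this commit describes. The real classes live in C++ (`include/tvm/script/ir_builder/base.h`); the Python names and methods below are illustrative only, not TVM's actual API.

```python
class Frame:
    """Holds contextual information while a region of IR is being built."""
    def __init__(self, builder):
        self.builder = builder

    def __enter__(self):
        self.builder.frames.append(self)   # push context on entry
        return self

    def __exit__(self, *exc):
        self.builder.frames.pop()          # pop context on exit


class IRBuilder:
    """General-purpose builder; dialects push/pop frames to record context."""
    def __init__(self):
        self.frames = []

    def current_frame(self):
        return self.frames[-1] if self.frames else None


# Usage: nested `with` blocks mirror the nesting of the IR being constructed.
ib = IRBuilder()
with Frame(ib) as outer:
    with Frame(ib) as inner:
        assert ib.current_frame() is inner
    assert ib.current_frame() is outer
```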
* Auto-vectorization (fp16) for v68 * use tvm.testing.main in fp16 test of tanh_slice op
* add bfloat16 promotion for CallNode * add softmax to bfloat16 build test
Previously, `CMSISNNFlags` was derived using logic specific to the external code generator; this change converts the external code generator options into a `Target`.
…che#12474) * [Target] Only append default keys if target doesn't have any yet This allows target parsers to provide their own target keys. Without this change, the default keys would always be appended, which may or may not be desirable. * Add "cpu" to ARM CPU keys * Add "cpu" to the keys in the mprofile target parser * Restore the mprofile cpptest, since the "cpu" key is back * So the -device attribute is actually needed...
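The key-merging rule described above boils down to: append default keys only when the parser produced no keys of its own. A plain-Python sketch (hypothetical helper name, not the actual C++ implementation in `src/target`):

```python
def merge_target_keys(parsed_keys, default_keys):
    """Return the target's keys, preferring keys provided by the target parser."""
    if parsed_keys:                # the parser already supplied keys; keep them as-is
        return list(parsed_keys)
    return list(default_keys)      # otherwise fall back to the kind's default keys

assert merge_target_keys(["arm_cpu", "cpu"], ["cpu"]) == ["arm_cpu", "cpu"]
assert merge_target_keys([], ["cpu"]) == ["cpu"]
```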
To figure out a user's association with the repo, this code previously searched the repo's associations filtered by the relevant username. GitHub doesn't return only the exact match, though, so we instead have to collect many results and search through all of them. Co-authored-by: driazati <driazati@users.noreply.github.com>
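A hypothetical sketch of the exact-match lookup described above; `fetch_page` stands in for whatever paginated GitHub API call the CI script uses, and only the filtering logic is the point here:

```python
def find_user_association(username, fetch_page):
    """Walk paginated results and return the entry whose login matches exactly."""
    page = 1
    while True:
        results = fetch_page(page)
        if not results:
            return None                      # exhausted all pages without a match
        for entry in results:
            if entry["login"] == username:   # exact match, not a prefix/substring hit
                return entry
        page += 1
```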
* add config space * lint * lint
* fix scatterND large shape problem * fix thread pool alloca * add scatternd unit test * update with comment * Empty Co-authored-by: wrongtest <wrongtest0@gmail.com>
Fix some typos in src/. Co-authored-by: driazati <driazati@users.noreply.github.com>
…apache#12497) * [Relay][Layout] FInferCorrectLayout for L2 norm layout change. * [Relay][Layout] Test for L2 norm layout transform. * [Relay][Layout] Re-edit test to add multi-dimensional axis list. * Fix cpplint errors * Use clang-format-10 rules. * replace uint with size_t.
Following apache#12197, this PR introduces `Schedule.show()`, which improves the user experience in two ways: - Python syntax highlighting - Outputs a schedule function instead of standalone instructions so that it's easier to follow. To demonstrate this change: - Before `Schedule.show()` is introduced: <img width="555" alt="image" src="https://user-images.githubusercontent.com/22515877/185713487-03722566-1df7-45c7-a034-c1460d399681.png"> - After this change: <img width="583" alt="image" src="https://user-images.githubusercontent.com/22515877/185713564-c54f3a9d-cd52-4709-a8b8-d8a61361e611.png">
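A small usage sketch of `Schedule.show()`. The exact TVMScript buffer/loop syntax varies across TVM versions, so treat this as illustrative rather than canonical:

```python
import tvm
from tvm.script import tir as T

@T.prim_func
def add(A: T.Buffer((128,), "float32"), B: T.Buffer((128,), "float32")):
    for i in range(128):
        with T.block("B"):
            vi = T.axis.spatial(128, i)
            B[vi] = A[vi] + 1.0

sch = tvm.tir.Schedule(add)
block = sch.get_block("B")
(i,) = sch.get_loops(block)
sch.split(i, factors=[None, 32])
sch.show()   # pretty-prints the scheduled module and the trace as a schedule function
```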
This PR migrates the existing MemoryDatabase, which is currently implemented in Python, to C++. The original intent of having an in-memory database that does not persist on disk was merely for testing, but as time goes on we have found it useful in production workflows, and thus decided to migrate it to C++ for potentially better performance.
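A minimal usage sketch, assuming the Python binding keeps the same interface after the C++ migration; the record-construction calls are left as comments because their exact signatures differ across TVM versions:

```python
from tvm import meta_schedule as ms

db = ms.database.MemoryDatabase()   # lives purely in memory, nothing persisted to disk

# A typical testing/tuning flow commits workloads and tuning records, e.g.:
# workload = db.commit_workload(mod)
# db.commit_tuning_record(ms.database.TuningRecord(trace, workload, run_secs=[...]))
# best = db.get_top_k(workload, top_k=1)
```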
This PR: - Adds an entry point for the TVMScript Unified Printer - Adds a helper object class `RootNodeContainer` to provide an injection point for the actual printer implementation to add specialized logic on the root node to print. Tracking issue: apache#11912
This PR adds boolean operators to OperationDoc. This is needed by the TIR expression printing because it has `tir::And` and `tir::Or`. Tracking issue: apache#11912
…e#12347) Removes support for driver stack versions older than 22.05 (semantic 3.0.1). Additionally, changes the integration to make version checks using semantic versioning rather than the previous year.month versioning method.
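A generic illustration of the versioning switch described above; the real driver-stack query is vendor-specific, so the version strings here are stand-ins:

```python
def meets_minimum(version: str, minimum=(3, 0, 1)) -> bool:
    """Compare a dotted semantic version string against a minimum (major, minor, patch)."""
    parts = tuple(int(p) for p in version.split("."))
    return parts >= minimum

assert meets_minimum("3.0.1")
assert not meets_minimum("2.9.0")   # older stacks (previously identified by year.month) are rejected
```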
…#12489) * [TIR] Support AllocConstantNode in CreatePrimFunc * Handle AllocConstantNode in LeafBlockRemovalPlan * Properly handle AllocConstNode in BufferAllocationLocator * handle AllocateConst in EstimateFlops * remove NDArray printing * doc update * add test * cpplint * Removed dependency on link-params attribute from target * Restored NDArray printing to unbreak test
This PR introduces remaining IRBuilder methods for `Axis`. Co-authored-by: yongwww <yongcale@gmail.com>
These were broken due to this missing guard: https://ci.tlcpack.ai/job/docker-images-ci/job/docker-image-run-tests/223/console Co-authored-by: driazati <driazati@users.noreply.github.com>
…ion (apache#12811) Fix random state fork in TuneContext Clone function.
Recently, virtual environments were introduced in the docker images, which was a great contribution toward localizing errors: apache#12663. In this fix, the link to caffe is created inside this virtual env instead of adding it to Python's system path. This fix also removes importing the request package where it is not needed. Fixes apache#12663
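A hypothetical sketch of linking an externally built package into the active virtualenv instead of editing Python's global path; the caffe install location is a placeholder and the real change lives in the docker install scripts:

```python
import site
from pathlib import Path

caffe_build = Path("/opt/caffe/python/caffe")      # assumed external install location
site_packages = Path(site.getsitepackages()[0])    # the virtualenv's site-packages
link = site_packages / "caffe"
if not link.exists():
    link.symlink_to(caffe_build, target_is_directory=True)   # visible only inside the venv
```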
apache#12783) [Hexagon] Reduce the number of tests run for VTCM testing in order to speedup CI.
…apache#12807) * Protect access to global buffer manager map * Fix lint
This was missing a repo checkout and failing as in https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/4302/pipeline. This also adds in the changes from apache#12719: Fixes apache#12600. The original solution there doesn't actually fix the issue; there would need to be some job queue that could make sure to reject old pushes. Since this case is pretty rare, generally the next commit that comes along and builds will fix everything up, so we can ignore failures that happen on `push`es.
This would post the comment that the tests bot and the docs comment bot use straightaway when a PR is posted. This will contain links to generic info about posting PRs (and obviate the `.github/PULL_REQUEST_TEMPLATE.md`) as well as dynamic info about the specific PR (filled in later by the respective bots). This would make things like the auto-cc bot more transparent since it would have a link to the relevant issue. Tested live here: driazati#21 (comment)
…ache#12778) * [Testing] Add decorator tvm.testing.requires_cuda_compute_version Previously, individual unit tests would call `tvm.contrib.nvcc.get_target_compute_version` and return early. This was repeated boilerplate in many tests, and incorrectly reported a test as `PASSED` if the required infrastructure wasn't present. This commit introduces `tvm.testing.requires_cuda_compute_version`, a decorator that checks the CUDA compute version and applies `pytest.mark.skipif`. If required infrastructure isn't present, a test will be reported as `SKIPPED`. * requires_cuda_compute_version skips test when no GPU is present
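A usage sketch of the new decorator. The argument form is assumed to be a minimum major (and optional minor) compute version, matching how the commit describes it:

```python
import numpy as np
import tvm
import tvm.testing

@tvm.testing.requires_cuda_compute_version(8)   # reported as SKIPPED unless compute capability >= 8.0
def test_needs_ampere():
    dev = tvm.cuda(0)
    a = tvm.nd.array(np.zeros((16,), "float32"), dev)
    assert a.numpy().sum() == 0.0

if __name__ == "__main__":
    tvm.testing.main()
```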
* add debug option to hexagon pytest * address comment
* First pass at improving runtime resource management * Add unit test * Fix lint and clang format errors * Disable resource reset for simulator * Moved acquire/release calls to session object, separate buffer managers for non-runtime (static) and runtime (dynamic). * Fix lint errors * Fix lint errors * Improve robustness of session shutdown * Fix lint * Address feedback * Only allow call to Acquire in a clean state * Use a pointer to indicate the "active" manager
This PR introduces remaining IRBuilder methods for `Block`. Co-authored-by: yongwww <yongcale@gmail.com>
…e#12827) This PR introduces two reducers to the TIR reduction part, so that rfactor and cross-thread reduction can be applied to functions that contain argmax/argmin computations generated by TOPI.
Computing the inverse mapping requires arithmetic analysis, which is not guaranteed to cover all cases. We therefore allow a pre-defined inverse index map to be provided instead.
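A plain-Python illustration of the idea: instead of asking an analyzer to invert the forward mapping, the user supplies the inverse alongside it. TVM's actual API works with `tir.IndexMap` objects; the lambdas here are stand-ins:

```python
forward = lambda i: (i // 4, i % 4)      # tile one axis, e.g. a NC -> NC4c style split
inverse = lambda io, ii: io * 4 + ii     # hand-written inverse, no arithmetic analysis needed

for i in range(32):
    assert inverse(*forward(i)) == i     # the pair must round-trip exactly
```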
Prior to this PR, the LCA detector of buffers in TIR didn't take buffer memory scopes and the GPU hierarchy into consideration. A consequent issue is that, when an intermediate buffer is in global memory, TIR's lowering passes don't necessarily allocate the intermediate buffer outside all `blockIdx`. As a result, the global intermediate buffer is allocated under a GPU thread block, which is illegal. This PR fixes this issue by fixing the LCA detector, making it aware of buffer memory scopes and the GPU hierarchy. With this fix, the global intermediate buffers are all allocated outside `blockIdx`.
This PR is split from apache#12492, to make the necessary updates to the printer infra for future PRs of the TIR printer. Tracking issue: apache#11912 Co-authored-by: Greg Bonik <gbonik@octoml.ai>
…e#12825) This PR relaxes the conditions of the Meta-Schedule schedule rule CrossThreadReduction. The rules were previously a bit over-strict, and some workloads with small reduction loop lengths could not be optimized by cross-thread reduction automatically. In this PR, we relax the rules so that such workloads can be optimized.
This PR introduces IRBuilder methods for `Assert`, `Let`, `Realize`, `Evaluate`, `LaunchThread`, `EnvThread`. Co-authored-by: yongwww <yongcale@gmail.com>
This PR introduces IRBuilder methods for `allocate`, `Let`, `allocate_const`, `attr`, `While`, `If/Then/Else`, `decl_buffer`, `buffer_store`, `prefetch`. Co-authored-by: yongwww <yongcale@gmail.com>
…trs["force_suppress"] (apache#12593) * [Frontend][TFLite]fix detection_postprocess's non_max_suppression_attrs["force_suppress"] Since tvm only supports operators detection_postprocess use_regular_nms is false, which will suppress boxes that exceed the threshold regardless of the class when implementing NMS in tflite, in order for the results of tvm and tflite to be consistent, we need to set force_suppress to True. * [Frontend][TFLite]fix detection_postprocess's non_max_suppression_attrs[force_suppress] Added a test case that reproduces inconsistent results between tvm and tflite When the force_suppress is false,it will get a good result if you set the force_suppress as true
Implementation of API in `tvm.tir.schedule` for layout transformations with padding, as part of apache#12261, item "Insert pad value into generated TIR, using `tir::if_then_else`, `builtin::assume`, and `builtin::undef`". Following the RFC discussion in apache/tvm-rfcs#77 (comment) and apache/tvm-rfcs#77 (comment), this commit preferentially rewrites the loops that surround a padded transformation where possible, in order to express padding in terms of `tir::if_then_else`.
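A usage sketch of the padded layout transform on an intermediate buffer. Parameter names follow the RFC discussion referenced above and may differ slightly between TVM versions; the buffer shapes are arbitrary:

```python
import tvm
from tvm.script import tir as T

@T.prim_func
def before(A: T.Buffer((14,), "int32"), C: T.Buffer((14,), "int32")):
    B = T.alloc_buffer((14,), "int32")
    for i in range(14):
        with T.block("B"):
            vi = T.axis.spatial(14, i)
            B[vi] = A[vi] * 2
    for i in range(14):
        with T.block("C"):
            vi = T.axis.spatial(14, i)
            C[vi] = B[vi] + 1

sch = tvm.tir.Schedule(before)
# Pad the 14-element intermediate buffer up to 16 and tile it as (4, 4); the
# pad value 0 should appear in the lowered TIR via tir.if_then_else guards.
sch.transform_layout(
    block=sch.get_block("B"),
    buffer=("write", 0),
    index_map=lambda i: (i // 4, i % 4),
    pad_value=0,
)
print(sch.mod.script())
```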
yelite pushed a commit that referenced this pull request on Feb 17, 2023
* Relax pretty printer initial prototype * call into TVMScriptPrinter for PrimFuncs * most round-trip tests pass * address comments * implement relax.output syntax for dataflow block outputs * remove leftover comments * fix Var constructor on ShapeExpr annotation * fix DataflowVar as well