forked from apache/tvm
upate #8
Merged
* [Refactor] Avoid Override Generic Op Strategy in "hls.py" * Fix The Broken CI Test Cases
Set the number of cores for scripts and builds that run inside the RVM based on the number of cores specified for the VM. Currently Vagrant doesn't set the environment variable TVM_CI_NUM_CORES to the number of cores available in the VM it creates. As a consequence, the scripts and builds (like the ones used to build TVM and QEMU) that run inside the VM after it is created use the default of only 2 CPUs, and so don't use the full CPU resources of the VM when more than 2 cores are available. This commit sets TVM_CI_NUM_CORES equal to the number of cores available in the VM created by Vagrant, so the builds (which use that environment variable to determine how many CPUs to use) can use all the available CPUs, speeding up the builds. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
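A minimal sketch of how a build script might pick up the core count, assuming it reads `TVM_CI_NUM_CORES` and falls back to the default of 2 mentioned above. The helper name `ci_num_cores` is hypothetical and does not come from the TVM scripts.

```python
import os


def ci_num_cores(environ=None, default=2):
    """Return the number of cores a build should use (hypothetical helper).

    Falls back to `default` when TVM_CI_NUM_CORES is unset, empty,
    non-numeric, or not positive.
    """
    env = os.environ if environ is None else environ
    raw = env.get("TVM_CI_NUM_CORES", "")
    try:
        cores = int(raw)
    except ValueError:
        return default
    return cores if cores > 0 else default


if __name__ == "__main__":
    # Example: build the parallelism flag a make-based build could use.
    print(f"make -j{ci_num_cores()}")
```

Passing the environment as a mapping keeps the helper easy to exercise without touching the real process environment.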
- Move "from_device" argument definition from "vulkan" target to all targets. - Add device querying to TargetInternal::FromConfig, using "from_device" argument. If present, these have lower priority than explicitly-specified attributes, but higher priority than the default attribute values. - Add default no-op DeviceAPI::GetTargetProperty. Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
* [Runtime] Add graph_executor get_input_index API. In the graph_executor use case, the user can call set_input with an input index to set an input parameter, but there is no straightforward way to get the correct index from an input name. This adds a get_input_index API for that purpose. * Update python/tvm/contrib/graph_executor.py Co-authored-by: Cody Yu <comaniac0422@gmail.com> * Update python/tvm/contrib/graph_executor.py Co-authored-by: Cody Yu <comaniac0422@gmail.com> * Update src/runtime/graph_executor/graph_executor.cc Co-authored-by: Cody Yu <comaniac0422@gmail.com> * Update python/tvm/contrib/graph_executor.py Co-authored-by: Cody Yu <comaniac0422@gmail.com> Co-authored-by: Cody Yu <comaniac0422@gmail.com>
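The name-to-index lookup described in this commit can be illustrated without TVM. The class below is a hypothetical stand-in, not the graph executor itself; returning -1 for an unknown name is an assumption made for this sketch.

```python
class ToyExecutor:
    """Hypothetical stand-in for an executor with named, indexed inputs."""

    def __init__(self, input_names):
        self._input_names = list(input_names)
        self._values = [None] * len(self._input_names)

    def get_input_index(self, name):
        # Map an input name to its positional index; -1 if the name is unknown
        # (the sentinel value is an assumption of this sketch).
        try:
            return self._input_names.index(name)
        except ValueError:
            return -1

    def set_input(self, index, value):
        # Set an input by positional index.
        self._values[index] = value

    def get_input(self, index):
        return self._values[index]


executor = ToyExecutor(["data", "weight"])
idx = executor.get_input_index("weight")
executor.set_input(idx, [1.0, 2.0])
```

The point of the lookup is that callers who only know an input's name can still use the index-based setter.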
* [Target] Allow for spaces in target attributes. Some target parameters, such as the device_name on vulkan, have spaces in them. This prevented round-trips between string and Target objects, which can occur in some cases. * [Vulkan] Fixed "device_name" property querying. * [Target] Switched from escaped spaces to quoted spaces. Instead of -attr=value\ with\ spaces, will instead be written as -attr='value with spaces'. Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
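The quoted form `-attr='value with spaces'` can be round-tripped with shell-style tokenizing. This is only a sketch using Python's standard `shlex` module; the helpers `parse_attrs` and `format_attrs` and the device name are hypothetical and are not TVM's parser.

```python
import shlex


def parse_attrs(text):
    """Parse '-key=value' tokens, honoring single-quoted values with spaces."""
    attrs = {}
    for token in shlex.split(text):
        key, _, value = token.lstrip("-").partition("=")
        attrs[key] = value
    return attrs


def format_attrs(attrs):
    """Write attributes back out, quoting any value that contains whitespace."""
    parts = []
    for key, value in attrs.items():
        if any(ch.isspace() for ch in value):
            value = f"'{value}'"
        parts.append(f"-{key}={value}")
    return " ".join(parts)


original = "-device_name='Example GPU 9000' -max_num_threads=256"
parsed = parse_attrs(original)
round_tripped = format_attrs(parsed)
```

Values that themselves contain a single quote are not handled here; a real implementation would need an escaping rule for that case.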
* [AMP] Do not allow fp16 cast on arange inputs * add test * Add comment explaining the issue with fp16 "end"
Platform boards passed to base-box-tool.py need to be a subset of the platform boards supported by 'tests/micro/zephyr --microtvm-platforms='. Currently base-box-tool.py only accepts the 'stm32f746xx' ST board, which is not supported by 'tests/micro/zephyr --microtvm-platforms='. As a consequence, if one passes '--microtvm-platform=stm32f746xx' to base-box-tool.py, the 'tests/micro/zephyr' test will fail. This commit fixes it by adding two new platforms to base-box-tool ('stm32f746xx_nucleo' and 'stm32f746xx_disco'), which are supported by tests/micro/zephyr, and by removing the nonexistent 'stm32f746xx' platform. The new platform boards are quite similar and share the same USB VID and PID. Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
- Pass parameters through TVMRetValue as std::string instead of runtime::String - Remove escaping of spaces inside quotes for target attributes. Updated unit test to verify round-trip behavior. - Added missing "device_type" query for Vulkan. Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
We kill the rpc server in the del function. When a server co-exists with remote resources in the same function scope, the destruction order is not determined. This can cause the server to be destructed before the actual remote array. As a side effect, it can sometimes cause a test to time out while waiting on the socket.
* Fix support for linking to only libtvm_runtime; also ensures that the ResNet example uses the new support. * Fix build.rs to rebuild if the Python script changes Co-authored-by: Jared Roesch <roeschinc@gmail.com>
#8660) Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
* Add transpose support for tensorrt batch_matmul * Address PR comment * Refactor to add ONNX_DEFAULT_CONFIGS
* fix * fix * lint
* [TENSORIR] Add `from_legacy_te_schedule` attr to TE PrimFuncs The `from_legacy_te_schedule` attr marks PrimFuncs created from TE scheduling. Passes that only operate on TE scheduling check this attr and are a no-op if it is not found. If `from_legacy_te_schedule` is false or not set, then it is assumed that the PrimFunc is from TensorIR. Passes specific to TensorIR now check for the absence of this attr. * formatting * enable passes regardless of te or not
* Move flake8 to ci_lint This fixes the scenario where you lint with ci_lint but it can still fail in PR due to flake8 being injected only into the Mac build. * Disable flake8 until the docker changes have landed
* Add linear congruential engine. * Fix typo. * Minor fix. * Fix comments and intros. * Change to unsigned. * Minor comment fix. * Fix unsigned rand state to signed.
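For readers unfamiliar with the technique, a linear congruential engine advances its state as `state = (a * state + c) mod m`. The sketch below uses the well-known `minstd_rand` constants (a = 48271, c = 0, m = 2^31 - 1); whether this commit uses exactly these constants is an assumption, and the class name `ToyLCG` is hypothetical.

```python
class ToyLCG:
    """Minimal linear congruential generator (illustrative sketch only)."""

    MULTIPLIER = 48271
    INCREMENT = 0
    MODULUS = 2147483647  # 2**31 - 1

    def __init__(self, seed=1):
        state = seed % self.MODULUS
        # With a zero increment, a zero state would stay zero forever.
        self.state = state if state != 0 else 1

    def next(self):
        # Advance the state and return it as the next pseudo-random value.
        self.state = (self.MULTIPLIER * self.state + self.INCREMENT) % self.MODULUS
        return self.state


rng = ToyLCG(seed=1)
first = rng.next()
second = rng.next()
```

Because the whole generator state is a single integer, it is cheap to copy, fork, and serialize.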
* fuse dense sum * remove excess copying * dev LSTM in ONNX * alternative implementation of LSTM in onnx frontend. It is quicker than current one without tuning * LSTM_dev2 was implemented in onnx frontend * LSTM dev in pytorch frontend * LSTM cell implementation was transferred to common place. Unnecessary code was removed * lint fixes * Weights permutation for LSTM layer in onnx frontend * LSTM cell description was added * arguments and values were renamed. descriptions of some methods were added * LSTM output shape and activations input format were fixed in onnx frontend * empty. tvm-ci test * unbind method was transferred from onnx frontend to common.py * unbind method was transferred from pytorch frontend to common.py * lstm cell was transferred from op/layers.py to frontend/common.py * clean up weight dictionary initialization * fix pytorch frontend wrapper over unbind method * minor fix of comments * empty. tvm-ci test restart * empty. tvm-ci test restart Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
…d target (#8542) * [Onnx][UnitTests] Excluded additional onnx tests - The onnx tests `test_basic_convinteger`, `test_convinteger_with_padding`, `test_range_float_type_positive_delta_expanded`, and `test_range_int32_type_positive_delta_expanded` don't run correctly on CUDA targets, so they are added to the exclusion. - Parametrized over the relative directory name, rather than the full directory name. This improves readability of the pytest output, and keeps the same parametrized test name across different python versions. - Changed the target-specific skips to check the target kind, rather than the full target string. * [UnitTests] Apply correct requires_gpu() pytest marks for parametrized target Previously, the addition of tvm.testing._target_to_requirement pytest marks was handled by the parametrize_targets function. The _auto_parametrize_target function assumed that a unit test that was already parametrized had all markings needed. If a unit test was explicitly parametrized using @pytest.mark.parametrize, these marks would be missing. In most cases, this explicit use of @pytest.mark.parametrize('target', ...) should be avoided, but has value in the case of marking with multiple parameters with @pytest.mark.parametrize('target,other', ...). This use case isn't yet supported by the tvm.testing.parameters function. Therefore, if this occurs, detect it and add the appropriate marks. * [UnitTest] Bugfix, applying requires_* markers to parametrized targets. Initial implementation did work correctly with @tvm.testing.parametrize_targets. Also, went through all cases where "target" is used to parametrize on something other than a target string, and renamed. * [Onnx] Switched from using pytest.skip to tvm.testing.known_failing_targets After merging of the `tvm.testing.parametrize_targets` and `tvm.testing._auto_parametrize_target` code paths, `known_failing_targets` can be used in both cases.
* [Testing] Enable `Target` object as argument to _target_to_requirement Previously, tvm.testing._target_to_requirement required the argument to be a string. This commit allows it to be either a string or a `tvm.target.Target`. * [Testing] Auto-target parametrization, handle pytest ParameterSet If the unit test has already been parametrized with pytest.params to add parameter-specific marks, respect those existing marks. This can happen in some cases in the CI, uncertain yet what is causing them. Maybe pytest-xdist related, but there's some difficulty in reproducing it locally. Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
* add hex indicator to message * add pytest skip * trigger * trigger
* conv2d working, fixing conv2d_depthwise * Depthwise conv2d working. * Make convinteger work on cuda. * Simplify code and add tests. * Formatting. * Fixed fallback broadcasting. * Fix fallback broadcasting. * Formatting. * Fix lint * Merge with new test parameterization.
…#8529) * [Topi][Testing] Minor cleanup for python reference implementations - Use input dtype for dilate/conv2d accumulate in python impl. Previously, the python implementations of dilation and conv2d would use numpy default dtype in some cases, rather than the input data's dtype. - Added fallback for datatypes not supported by scipy.signal.convolve2d (e.g. float16). - Refactored to avoid duplication, use common get_pad_tuple functionality. * [Topi][UnitTests] Added float16 tests to test_topi_dense.py * [Topi][UnitTests] Added float16 to test_topi_conv2d_nchw.py * [Topi][Float16] Added float16 tests for depthwise conv2d. * [UnitTests] Explicitly set seed for float16 tests Intended to avoid flaky test failures later due to rounding errors. * [UnitTests] Fixed a few failing unit tests. - ref_data must be a test fixture, not acquired through request.getfixturevalue, in order to have the random_seed be known. - dilate_python's return value didn't follow `out_dtype`. - The test_topi_conv3d tests had the reference results computed in float64, due to dilate_python() not respecting the input data type. With the correct dtype, the tolerances needed to be slightly widened. Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
* Add Arduino CLI support to ci-qemu * Install latest version of Arduino SDK * Remove unnecessary --fix-missing * Tweak to clarify what URLs go with what * Retrigger CI * Temporarily replace buggy Spresense core
…ut (#8677) * add timeout * rename timeout and change timeout to a reasonable value * fix tests after project api merge * retrigger because of flaky test
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
* Fix Rust CI * Turn Rust CI back on
* [Docs] Added documentation on pytest target parametrization. Follow-up from #8542, to document existing features. * [Docs] Updated pytest parametrization documentation following review Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
* Fix obvious memory leak in function.rs * Update object pointer
GPU memory is only released once the PackedFunc for evaling the model is gced by Python. In CI we're noticing intermittent 'CUDA: Out of memory' failures while processing the tutorials, and tracing showed there was no gc happening between items. Not confident this will solve the problem but worth a try.
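The idea of forcing a collection between items can be sketched with the standard `gc` module. The function name `run_items` and its callback are hypothetical and are not the tutorial runner touched by this commit; the cyclic `Holder` object merely stands in for something that keeps a resource alive until Python collects it.

```python
import gc
import weakref


def run_items(items, run_one):
    """Run each item, forcing a garbage collection after every one."""
    results = []
    for item in items:
        results.append(run_one(item))
        # Collect unreachable objects (including reference cycles) now,
        # rather than waiting for the collector's own schedule.
        gc.collect()
    return results


class Holder:
    """Stand-in for an object that owns a resource and sits in a cycle."""

    def __init__(self):
        self.me = self  # reference cycle: freed only by the cycle collector


refs = []


def run_one(item):
    holder = Holder()
    refs.append(weakref.ref(holder))
    return item * 2


outputs = run_items([1, 2, 3], run_one)
```

After the loop, every weak reference in `refs` is dead, which shows that the per-item collection reclaimed the cyclic objects.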
* refactor host to qemu * remove unused variables * remove skip-build arg * fix microtvm test script
* [Docker] Refactor/clean-up of docker/bash.sh - Added detailed help message, displayed using `-h` or `--help`. - Optional flags handled using `getopt`, can now occur in any order. - `--mount` flag may occur more than once. - Switched from short arguments to docker-run to long arguments (e.g. `--volume` instead of `-v`). Short arguments are good shortcuts for interactive work, but can be more difficult to read in longer scripts. - Mount the `.tvm_test_data` folder, to avoid re-downloading test data already available in the host environment. * [Docker] docker/bash.sh CI fix Dash-prefixed arguments as part of the command now require prefixing with -- to separate them from arguments intended for docker/bash.sh * [Docker] docker/bash.sh, consistent quoting * [Docker] Added --repo-mount-point for docker/bash.sh * [Docker] Updated command-line parsing of docker/bash.sh - Maintained previous behavior, any unrecognized flags after the docker/bash.sh are part of the command, no -- is needed. (e.g. docker/bash.sh ci_gpu make -j2) - Reverted changes to Jenkinsfile to add a --, no longer needed. * [Docker] Fixed multi-argument commands * [Docker] docker/bash.sh check permissions before mounting ~/.tvm_test_data * [Docker] Consistent workplace directory in docker/bash.sh for Jenkins Some locations in the CI perform build commands outside of the build steps (e.g. tests/scripts/task_ci_setup.sh#L38), and cmake doesn't like it if the build directory changes. These should probably be moved into the build steps of the CI, and be packed in tvm_multilib in the Jenkinsfile, but for the meantime maintaining a consistent /workspace directory on all CI nodes allows cmake to run. * [Docker] Updated bash.sh for MacOS compatibility MacOS has an older version of bash that handles arrays slightly differently. All instances of array expansion `"${ARRAY[@]}"` should instead be written as `${ARRAY[@]+"${ARRAY[@]}"}`. Otherwise, `set -u` will erroneously complain about an undefined variable.
See https://stackoverflow.com/a/61551944 for details. Even though this is an older version of bash (observed in version 3.2.57), this is the last major version available under GPLv2 and is therefore the default version on MacOSX. At some point, the `docker/bash.sh` could be migrated to python for ease of maintenance/testing.
* [Docs][UnitTest] Updated target parametrization documentation The intended audience are developers writing unit tests, or debugging unit tests that have failed. Therefore, moving the recommended style to the top of the section, and the implementation details to the bottom. * Documentation updates as recommended by tkonolige
* Refactor AOT Test Utils parameters into object `compile_and_run` was getting quite complicated to understand as well as being mostly duplicated by `compile_and_run_multiple_models`. This patch pulls out some common parameters into a data class `AOTTestNetwork` which makes it clearer what each parameter is doing and provides documentation. * Rename Network -> Model and sizebytes -> size_bytes
* Convert AOT to TECompiler This removes the dependency on "compile_engine.h" from aot_executor_codegen.cc. This required a few changes to how AOT was operating: * AOT run_model is now based on the post lowering main_module * AOTOnDemandAllocator is run twice to ensure SIDs are updated post-lowering * Moved to using tec::UpdateFunctionMetadata Tests are passing, but would appreciate other validation 😸 * Clarify reasoning behind replanning memory later * Use main_func_info rather than bespoke logic in AOT This moves from using the bespoke AOT UpdateMainWorkspaceSize to the LoweredModule main_func_info property to unify with Graph executor codegen.
* clean up typerel * add layout transform when input is 3D * add test * update doc to clarify that only 2D input data is supported * add weight_layout attribute in dense * remove explicit layout transform from dense_alter_op.py * Add DensePackInferCorrectLayout to insert layout transform * relax type rel * revert type rel relax and add check on dim * introduce DensePackAttrs to avoid breaking dense op * try fixing arm compute lib test * Update tests/python/contrib/test_arm_compute_lib/test_dense.py Co-authored-by: lhutton1 <35535092+lhutton1@users.noreply.github.com> * formatting Co-authored-by: lhutton1 <35535092+lhutton1@users.noreply.github.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
* [UnitTest] Updated tolerances to avoid flaky unit test. The result was correct, but the atol was just small enough to trigger a CI error for a value that was close to zero in an unrelated PR at #8670. https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/PR-8670/16/pipeline/#step-236-log-1703 * Also updated 32-bit version of test_conv2d_nchw
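Why a small `atol` makes near-zero comparisons fragile can be seen from the NumPy-style closeness criterion `|actual - desired| <= atol + rtol * |desired|`: when `desired` is close to zero, the relative term vanishes and only `atol` is left. The sketch below uses hypothetical numbers, not the values from the failing CI run.

```python
def within_tolerance(actual, desired, rtol, atol):
    """NumPy-style closeness check: |actual - desired| <= atol + rtol * |desired|."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


# A reference value of exactly zero: the rtol term contributes nothing,
# so the outcome depends entirely on atol.
too_tight = within_tolerance(2e-6, 0.0, rtol=1e-5, atol=1e-6)
widened = within_tolerance(2e-6, 0.0, rtol=1e-5, atol=1e-5)

# Away from zero, the same absolute error is easily absorbed by rtol.
far_from_zero = within_tolerance(1.0 + 2e-6, 1.0, rtol=1e-5, atol=1e-6)
```

Here `too_tight` is false while `widened` and `far_from_zero` are true, which is the pattern behind a result that is correct yet still trips the check.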
* alternative chunk op was implemented in pytorch frontend. aten::unsafe_chunk was added to op map in pytorch frontend * chunk was replaced by new one in pytorch frontend. it is 2.5 times faster Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
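For reference, the splitting rule that a chunk operator follows in PyTorch can be sketched on a plain list: each chunk has size `ceil(n / chunks)`, the last chunk may be shorter, and fewer chunks than requested may come back. This is an illustration of the semantics only, not the frontend conversion code from this commit; the function name `chunk_list` is hypothetical.

```python
import math


def chunk_list(seq, chunks):
    """Split `seq` into at most `chunks` pieces of size ceil(len(seq) / chunks)."""
    if chunks <= 0:
        raise ValueError("chunks must be positive")
    # Guard against a zero step when the input is empty.
    size = max(1, math.ceil(len(seq) / chunks))
    return [seq[i:i + size] for i in range(0, len(seq), size)]


uneven = chunk_list([0, 1, 2, 3, 4], 2)
fewer = chunk_list([0, 1, 2, 3, 4, 5], 4)
```

In the second call only three pieces are produced, because pieces of size two already cover all six elements.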
This PR is part of the TensorIR upstreaming effort (#7527), which adds the one schedule primitive storage_align. Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com>
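As a rough illustration of what a storage-alignment primitive asks for, assume it requires the stride of an axis to satisfy `stride == k * factor + offset` for some integer `k`. The helper below computes the smallest such stride that is not below the original one; the name `aligned_stride` is hypothetical and the code is a sketch, not the TensorIR implementation.

```python
def aligned_stride(stride, factor, offset):
    """Smallest s >= stride such that s % factor == offset."""
    if factor <= 0 or not 0 <= offset < factor:
        raise ValueError("need factor > 0 and 0 <= offset < factor")
    # Python's % always returns a non-negative result for a positive modulus,
    # so this adds exactly the padding needed to reach the next valid stride.
    return stride + (offset - stride) % factor


padded = aligned_stride(128, factor=128, offset=1)
unchanged = aligned_stride(129, factor=128, offset=1)
```

Padding a row stride of 128 elements to 129 is a common way to avoid shared-memory bank conflicts, which is the typical motivation for this kind of primitive.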
jiangjiajun pushed a commit that referenced this pull request on Sep 22, 2021
* WIP support per-channel quantization * more WIP * More WIP * fix issue with per-channel bias_add * Fix fake quantize tests (#4) * Fixed fake quantize issues. * Formatting. * Cleanup unused imports * Fix real int8 tests. * Add Relu * One more little one (#5) * Fixed fake quantize issues. * Formatting. * Cleanup unused imports * Fix real int8 tests. * Fix requantize shape bug. * Non-working Per-channel Dense * Fix legalization for non spatial operators. (#6) * Fix legalization for non spatial operators. * Fix axis checks for end2end functionality. * fix axis normalization fix lint fix lint again * Per channel fq2i (#8) * WIP support per-channel quantization * more WIP * More WIP * fix issue with per-channel bias_add * Fix fake quantize tests (#4) * Fixed fake quantize issues. * Formatting. * Cleanup unused imports * Fix real int8 tests. * Add Relu * One more little one (#5) * Fixed fake quantize issues. * Formatting. * Cleanup unused imports * Fix real int8 tests. * Fix requantize shape bug. * Non-working Per-channel Dense * Fix legalization for non spatial operators. (#6) * Fix legalization for non spatial operators. * Fix axis checks for end2end functionality. * fix axis normalization fix lint fix lint again * Fix bug in requantize dimension expansion. * Format. Co-authored-by: Josh Fromm <jwfromm@octoml.ai> * respond to review comments respond to review comments Co-authored-by: Josh Fromm <jwfromm@octoml.ai>
jiangjiajun pushed a commit that referenced this pull request on Sep 22, 2021
* WIP support per-channel quantization * more WIP * More WIP * fix issue with per-channel bias_add * Fix fake quantize tests (#4) * Fixed fake quantize issues. * Formatting. * Cleanup unused imports * Fix real int8 tests. * Add Relu * One more little one (#5) * Fixed fake quantize issues. * Formatting. * Cleanup unused imports * Fix real int8 tests. * Fix requantize shape bug. * Non-working Per-channel Dense * Fix legalization for non spatial operators. (#6) * Fix legalization for non spatial operators. * Fix axis checks for end2end functionality. * fix axis normalization fix lint fix lint again * Per channel fq2i (#8) * WIP support per-channel quantization * more WIP * More WIP * fix issue with per-channel bias_add * Fix fake quantize tests (#4) * Fixed fake quantize issues. * Formatting. * Cleanup unused imports * Fix real int8 tests. * Add Relu * One more little one (#5) * Fixed fake quantize issues. * Formatting. * Cleanup unused imports * Fix real int8 tests. * Fix requantize shape bug. * Non-working Per-channel Dense * Fix legalization for non spatial operators. (#6) * Fix legalization for non spatial operators. * Fix axis checks for end2end functionality. * fix axis normalization fix lint fix lint again * Fix bug in requantize dimension expansion. * Format. Co-authored-by: Josh Fromm <jwfromm@octoml.ai> * respond to review comments * start dtos * wip depth_to_space * dtos ident Co-authored-by: Matthew <mbrookhart@octoml.ai> Co-authored-by: Josh Fromm <jwfromm@octoml.ai>
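Per-channel quantization, as opposed to per-tensor quantization, gives each channel its own scale and zero point. A minimal pure-Python sketch of the affine mapping `q = clamp(round(x / scale) + zero_point)` applied channel by channel follows; the function name, the int8 range, and the sample values are assumptions for illustration and are not taken from the commits.

```python
def quantize_per_channel(channels, scales, zero_points, qmin=-128, qmax=127):
    """Quantize each channel (a list of floats) with its own scale and zero point."""
    if not (len(channels) == len(scales) == len(zero_points)):
        raise ValueError("need one scale and one zero point per channel")
    quantized = []
    for values, scale, zero_point in zip(channels, scales, zero_points):
        # Affine quantization followed by clamping to the integer range.
        # Note: Python's round() rounds halves to the nearest even integer.
        quantized.append(
            [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]
        )
    return quantized


result = quantize_per_channel(
    channels=[[0.5, -1.0], [4.0, 1000.0]],
    scales=[0.5, 2.0],
    zero_points=[0, 0],
)
```

The second channel uses a coarser scale, and its out-of-range value saturates at the upper bound of the integer range.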
Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.