[LANG] Add tile operation #5
Merged
tqchen added a commit to tqchen/tvm that referenced this pull request on May 26, 2018
tqchen added a commit that referenced this pull request on May 29, 2018
tqchen added a commit to tqchen/tvm that referenced this pull request on Jul 6, 2018
tqchen pushed a commit to tqchen/tvm that referenced this pull request on Jul 12, 2018
tqchen pushed a commit that referenced this pull request on Jul 12, 2018
tqchen pushed a commit to tqchen/tvm that referenced this pull request on Aug 4, 2018
sergei-mironov referenced this pull request in sergei-mironov/tvm on Aug 8, 2018
sergei-mironov referenced this pull request in sergei-mironov/tvm on Aug 8, 2018
jroesch referenced this pull request in jroesch/tvm on Aug 29, 2018
Restores the tests which were lost in the repo port, and makes it possible for conv2d to typecheck, as well as some integration tests.
* Add back python tests that were missing from the repo move (history lost, sorry)
* Ensure shape evaluator doesn't trip up on type vars or type ids
* Add tests for shape evaluator when faced with type var or type id
* Use Strings as key for Attributes, repair tests and uses
* Add preliminary information for conv2d operator
* Add test of attr propagation
* Ensure attributes hash by string value rather than string pointer identity (tests still failing though, idk why)
* Adjust shape equality checks to use ordinary visitor, as nested type id information was not transferring
* Repair integration test by using std::unordered_map for attrs, add cases
* Add regression test for alpha-eq comparison of type IDs across nested shapes (was losing the type id equality map in alpha_eq
* Add more integration test variants
* Correct import names in tests
* Add clarifying comment to typechecker
* Missing paren in operators.py
* Correct python source directory for mypy
tmoreau89 pushed a commit to tmoreau89/tvm that referenced this pull request on Jan 2, 2019
* group conv pass all
* pass mobilenet
tmoreau89 pushed a commit to tmoreau89/tvm that referenced this pull request on Mar 22, 2019
* group conv pass all
* pass mobilenet
tmoreau89 pushed a commit to tmoreau89/tvm that referenced this pull request on Mar 22, 2019
* group conv pass all
* pass mobilenet
kevinthesun pushed a commit to kevinthesun/tvm that referenced this pull request on Jan 9, 2020
* dense
* add biforst; remove comments
* address comment
kevinthesun pushed a commit to kevinthesun/tvm that referenced this pull request on Feb 6, 2020
fix lint bitpack strategy bitserial_dense (apache#6) * update strategy * address comments fix a few topi test Dense strategy (apache#5) * dense * add biforst; remove comments * address comment Refactor x86 conv2d_NCHWc (apache#4) * Refactor x86 conv2d * Add x86 depthwise_conv2d_NCHWc * Add back topi x86 conv2d_nchw * Merge x86 conv2d_nchw and conv2d_NCHWc * Minor fix for x86 conv2d fix more strategy Add x86 conv2d_NCHWc_int8 strategy (apache#8) * Add x86 conv2d_NCHWc_int8 strategy * Remove contrib_conv2d_nchwc_int8 * Fix generic conv2d_NCHWc for int8 * Fix topi arm_cpu conv2d_NCHWc_int8 update x86 conv2d enable specify relay ops to be tuned for autotvm add cuda conv2d strategy add conv2d strategy for rocm add conv2d strategy for hls add conv2d strategy for arm cpu add conv2d strategy for mali add conv2d strategy for bifrost add conv2d strategy for intel graphics clean up and fix lint remove template keys from autotvm remove 2 in the func name address comments fix
kevinthesun pushed a commit to kevinthesun/tvm that referenced this pull request on Feb 11, 2020
kevinthesun pushed a commit to kevinthesun/tvm that referenced this pull request on Feb 12, 2020
merrymercy pushed a commit that referenced this pull request on Feb 24, 2020
* relay op strategy fix lint bitpack strategy bitserial_dense (#6) * update strategy * address comments fix a few topi test Dense strategy (#5) * dense * add biforst; remove comments * address comment Refactor x86 conv2d_NCHWc (#4) * Refactor x86 conv2d * Add x86 depthwise_conv2d_NCHWc * Add back topi x86 conv2d_nchw * Merge x86 conv2d_nchw and conv2d_NCHWc * Minor fix for x86 conv2d fix more strategy Add x86 conv2d_NCHWc_int8 strategy (#8) * Add x86 conv2d_NCHWc_int8 strategy * Remove contrib_conv2d_nchwc_int8 * Fix generic conv2d_NCHWc for int8 * Fix topi arm_cpu conv2d_NCHWc_int8 update x86 conv2d enable specify relay ops to be tuned for autotvm add cuda conv2d strategy add conv2d strategy for rocm add conv2d strategy for hls add conv2d strategy for arm cpu add conv2d strategy for mali add conv2d strategy for bifrost add conv2d strategy for intel graphics clean up and fix lint remove template keys from autotvm remove 2 in the func name address comments fix * fix bugs * lint * address comments * add name to op implement * Modify topi tests (#9) * Add pooling, reorg, softmax and vision * Add lrn * fix topi test * fix more topi test * lint * address comments * x * fix more tests & bugs * Modify more tests (#10) * Modify tests for bitserial_conv2d, bitserial_dense, bitserial_conv2d_rasp and bnn * Minor fix * More minor fix * fix more test * try to update vta using strategy * fix cpptest * x * fix rebase err * Fix two tests (#11) * change autotvm log format * lint * minor fix * try fix vta test * fix rebase err * tweak * tmp hack for vta pass * fix tutorial * fix * fix more tutorials * fix vta tutorial * minor * address comments * fix * address comments * fix cpptest * fix docs * change data structure name and api * address comments * lint * fix rebase err * updates * fix winograd test * fix doc * rebase * upgrade tophub version number * fix bug * re-enable vta tsim test after tophub is upgraded * fix vta test to use the correct args so the config can be found in 
tophub Co-authored-by: Yao Wang <kevinthesunwy@gmail.com>
tqchen pushed a commit to tqchen/tvm that referenced this pull request on Mar 29, 2020
tqchen pushed a commit to tqchen/tvm that referenced this pull request on Mar 29, 2020
* relay op strategy fix lint bitpack strategy bitserial_dense (apache#6) * update strategy * address comments fix a few topi test Dense strategy (apache#5) * dense * add biforst; remove comments * address comment Refactor x86 conv2d_NCHWc (#4) * Refactor x86 conv2d * Add x86 depthwise_conv2d_NCHWc * Add back topi x86 conv2d_nchw * Merge x86 conv2d_nchw and conv2d_NCHWc * Minor fix for x86 conv2d fix more strategy Add x86 conv2d_NCHWc_int8 strategy (apache#8) * Add x86 conv2d_NCHWc_int8 strategy * Remove contrib_conv2d_nchwc_int8 * Fix generic conv2d_NCHWc for int8 * Fix topi arm_cpu conv2d_NCHWc_int8 update x86 conv2d enable specify relay ops to be tuned for autotvm add cuda conv2d strategy add conv2d strategy for rocm add conv2d strategy for hls add conv2d strategy for arm cpu add conv2d strategy for mali add conv2d strategy for bifrost add conv2d strategy for intel graphics clean up and fix lint remove template keys from autotvm remove 2 in the func name address comments fix * fix bugs * lint * address comments * add name to op implement * Modify topi tests (apache#9) * Add pooling, reorg, softmax and vision * Add lrn * fix topi test * fix more topi test * lint * address comments * x * fix more tests & bugs * Modify more tests (apache#10) * Modify tests for bitserial_conv2d, bitserial_dense, bitserial_conv2d_rasp and bnn * Minor fix * More minor fix * fix more test * try to update vta using strategy * fix cpptest * x * fix rebase err * Fix two tests (apache#11) * change autotvm log format * lint * minor fix * try fix vta test * fix rebase err * tweak * tmp hack for vta pass * fix tutorial * fix * fix more tutorials * fix vta tutorial * minor * address comments * fix * address comments * fix cpptest * fix docs * change data structure name and api * address comments * lint * fix rebase err * updates * fix winograd test * fix doc * rebase * upgrade tophub version number * fix bug * re-enable vta tsim test after tophub is upgraded * fix vta test to use the correct 
args so the config can be found in tophub Co-authored-by: Yao Wang <kevinthesunwy@gmail.com>
prashantsail pushed a commit to prashantsail/incubator-tvm that referenced this pull request on May 15, 2020
support output tensors as function args, multiple functions
apivovarov pushed a commit to apivovarov/tvm that referenced this pull request on May 16, 2020
* relay op strategy fix lint bitpack strategy bitserial_dense (apache#6) * update strategy * address comments fix a few topi test Dense strategy (apache#5) * dense * add biforst; remove comments * address comment Refactor x86 conv2d_NCHWc (apache#4) * Refactor x86 conv2d * Add x86 depthwise_conv2d_NCHWc * Add back topi x86 conv2d_nchw * Merge x86 conv2d_nchw and conv2d_NCHWc * Minor fix for x86 conv2d fix more strategy Add x86 conv2d_NCHWc_int8 strategy (apache#8) * Add x86 conv2d_NCHWc_int8 strategy * Remove contrib_conv2d_nchwc_int8 * Fix generic conv2d_NCHWc for int8 * Fix topi arm_cpu conv2d_NCHWc_int8 update x86 conv2d enable specify relay ops to be tuned for autotvm add cuda conv2d strategy add conv2d strategy for rocm add conv2d strategy for hls add conv2d strategy for arm cpu add conv2d strategy for mali add conv2d strategy for bifrost add conv2d strategy for intel graphics clean up and fix lint remove template keys from autotvm remove 2 in the func name address comments fix * fix bugs * lint * address comments * add name to op implement * Modify topi tests (apache#9) * Add pooling, reorg, softmax and vision * Add lrn * fix topi test * fix more topi test * lint * address comments * x * fix more tests & bugs * Modify more tests (apache#10) * Modify tests for bitserial_conv2d, bitserial_dense, bitserial_conv2d_rasp and bnn * Minor fix * More minor fix * fix more test * try to update vta using strategy * fix cpptest * x * fix rebase err * Fix two tests (apache#11) * change autotvm log format * lint * minor fix * try fix vta test * fix rebase err * tweak * tmp hack for vta pass * fix tutorial * fix * fix more tutorials * fix vta tutorial * minor * address comments * fix * address comments * fix cpptest * fix docs * change data structure name and api * address comments * lint * fix rebase err * updates * fix winograd test * fix doc * rebase * upgrade tophub version number * fix bug * re-enable vta tsim test after tophub is upgraded * fix vta test to use the 
correct args so the config can be found in tophub Co-authored-by: Yao Wang <kevinthesunwy@gmail.com>
jcf94 added a commit to jcf94/tvm that referenced this pull request on Jun 22, 2020
* Add feature
* Add cost_model, meta_tile_rewrite_policy
* Add MetaTileRewritePolicy basic UT
tqchen pushed a commit that referenced this pull request on Jul 15, 2020
…generating (#5962) * Code migration Start (#1) * Init commit: Code migration Start * Add loop_state.cc/h * Add ComputeDAG basic test * Split transform_step out & Update more UTs (#3) * Split transform_step out * Update GetProducers & GetConsumers * Update UTs * Add UT for CacheReadWrite & Some bug fix * Add search_task, measure and serialization (#4) * Add FollowSplit & FollowFusedSplit tests * Update dag.InferBound & its UT * Add search_task, measure and serialization * Update Serialization UT * Add MetaTileRewritePolicy (#5) * Add feature * Add cost_model, meta_tile_rewrite_policy * Add MetaTileRewritePolicy basic UT * Basic Python API for State (#6) * Add Basic Python API for State * Add UTs for State * Add Python API: Measure & Task (#7) * Update the return value of state operation * Add task * Copy measure.py & utils.py * Fix LocalBuilder * Fix LocalRunner * Add ansor.auto_schedule() API; First AutoSchedule working version(#8) * Add basic Python support for ansor.auto_schedule * Update AutoSchedule API * Bug fix for get the attach point of a fused iter * Update UT after infer bug fix * Bug fix & Add python serialization API (#10) * Delete C++ UT hack since Python is ready * Add ndarray.non_empty * Update Serialization python API * Improve code style, python wrapper and test cases (#11) * Update c++ code style and unit test * Update python State wrapper and test cases * fix unit tests * Add RPCRunner & OpenCL/CUDA test (#12) * Add RPCRunner & OpenCL search test * Add CUDA search test * Add RPCRunner test * rebase to upstream/master * Add Ansor basic tutorial (#13) * Add basic tutorial * migrate feature extraction (#14) * Add XGBModel & RPCRunnerWarpper (#15) * Add XGBModel & RPCRunnerWarpper * Revert "Add Parallel Granularity Mutation" * Migrate workload_registry.py (#16) * add workload registry * update * update * add task scheduler (#17) * Add conv2d cuda tutorial with workload registry (#18) * add tune_test.py (the old tune_wkl.py) (#19) * add tune_test.py 
(the old tune_wkl.py) * update * fix measure * fix for gpu * Code refine for tune_test.py & Add a pre load callback (#20) * Bug fix for tutorials * Add PreLoadMeasuredStates * Add search_callback support for task tuner * Code refine for tune_test.py * Update * Update * Update * Update * Bug fix * Add python custom sketch rule (#21) * Add custom sketch rule * Bug fix * Ansor Relay Integration (without layout rewrite) (#22) * relay integration * Add tune_op_subgraph.py & Some code clean for tune_network.py (#23) * Add single op tune scripts * Add tune subgraph support * Merge all op & all subgraph to one file * Rename file * add explicit_unroll_max_extent (#25) * Add Index simplification & API update (#26) * Add vectorized cooperative_fetching test * Update math simplify for vectorized CF * File rename * Update tune_network * API update * Update PreLoadMeasuredStates & Some bug fix (#27) * Add a threading wrapper to fix the test bug * Set default TVM_USE_AUTO_SCHEDULER to false * Update PreLoadMeasuredStates callback * Add tensorize step for loop_state (#31) * Add tensorize step * State python api update (#33) * Start to update api * Add compute_dag to state * API update * kernel layout rewrite (#28) * kernel layout rewrite * remove some hacks * add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass * set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite * [cache flush] port cache flush to ansor (#32) * Improve relay integration (#34) * tmp checkpoint * Improve relay integration * Improve relay integration * Fix xgb error & Simplify dispatcher (#35) * Rename "MetaTileRewritePolicy" to "SketchPolicy". (#36) * Rename "MetaTileRewritePolicy" to "SketchPolicy". 
* Add a new class for auto_unroll_max_step, storage_offset in StageNode * fix tune_op_subgraph.py * rebase * Migrate all node::make to noderef's construct function (#37) * Start to move xxxnode::make to noderef() * Update * Update * Finish transform_step * Finish comute dag & auto schedule * Update * Update * Update * Update * Update * Code refine * Code refine * Code refine * Update * Update * Some lint fix & Recover the double constructor of tvm::PrimExpr (#39) * lint fix * clang-format-fix * pylint fix * Update * Recover the double constructor of tvm::PrimExpr * Fix pylint * pylint fix * pylint fix * Add MutateComputeLocation and MutateParallel in evolutionary search (#40) * Add MutateComputeLocation and MutateParallel in evolutionary search * fix lint * Improve loop state python API (stage_tensors -> stage_ops) (#41) * improve loop state python API (stage_tensors -> stage_ops) * fix * ComputeDAG bug fix & Add Custom TensorCore Matmul Example (#42) * Bug Fix * Sample example of Custom TensorCore Matmul * Rever Commits, Start to build minimum Ansor system * Code clean for minimum Ansor system * Bug fix & Delete AccessAnalyzer * Delete attachmap & Code clean * Doc update Update statenode::stages from vector to Array * Headfile update & Python doc update * clang-format fix * pylint fix * Update * Doc update * Update * Bug fix after code merge to the new master * clang-format fix * Update * Update * Update std::vector to Array; Update verbosity setting; Some commemts addressed * std::vector->Array & std::string->String * Add init_state to ComputeDAG * Update * Update some unordered_map to Map * clang-format fix * Comments addressed Delete ReplayAndInferBound Delete ReplaySteps & InferBoundCommon * Lint fix * Update * Update * Update * Update * Update * Update * Update * Update * Update * Rename ansor namespace to auto_schedule * Update * Rename ThreadPool to ParallelFor * Add parallel_for * Remove ThreadPool * Update python/tvm/auto_schedule/auto_schedule.py * 
trigger CI Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> Co-authored-by: Minmin Sun (孙敏敏) <minmin.smm@alibaba-inc.com> Co-authored-by: Zhao Wu <zhaowu@apache.org>
CloudManX pushed a commit to CloudManX/incubator-tvm that referenced this pull request on Sep 15, 2020
…generating (apache#5962) * Code migration Start (apache#1) * Init commit: Code migration Start * Add loop_state.cc/h * Add ComputeDAG basic test * Split transform_step out & Update more UTs (apache#3) * Split transform_step out * Update GetProducers & GetConsumers * Update UTs * Add UT for CacheReadWrite & Some bug fix * Add search_task, measure and serialization (apache#4) * Add FollowSplit & FollowFusedSplit tests * Update dag.InferBound & its UT * Add search_task, measure and serialization * Update Serialization UT * Add MetaTileRewritePolicy (apache#5) * Add feature * Add cost_model, meta_tile_rewrite_policy * Add MetaTileRewritePolicy basic UT * Basic Python API for State (apache#6) * Add Basic Python API for State * Add UTs for State * Add Python API: Measure & Task (apache#7) * Update the return value of state operation * Add task * Copy measure.py & utils.py * Fix LocalBuilder * Fix LocalRunner * Add ansor.auto_schedule() API; First AutoSchedule working version(apache#8) * Add basic Python support for ansor.auto_schedule * Update AutoSchedule API * Bug fix for get the attach point of a fused iter * Update UT after infer bug fix * Bug fix & Add python serialization API (apache#10) * Delete C++ UT hack since Python is ready * Add ndarray.non_empty * Update Serialization python API * Improve code style, python wrapper and test cases (apache#11) * Update c++ code style and unit test * Update python State wrapper and test cases * fix unit tests * Add RPCRunner & OpenCL/CUDA test (apache#12) * Add RPCRunner & OpenCL search test * Add CUDA search test * Add RPCRunner test * rebase to upstream/master * Add Ansor basic tutorial (apache#13) * Add basic tutorial * migrate feature extraction (apache#14) * Add XGBModel & RPCRunnerWarpper (apache#15) * Add XGBModel & RPCRunnerWarpper * Revert "Add Parallel Granularity Mutation" * Migrate workload_registry.py (apache#16) * add workload registry * update * update * add task scheduler (apache#17) * Add conv2d cuda tutorial 
with workload registry (apache#18) * add tune_test.py (the old tune_wkl.py) (apache#19) * add tune_test.py (the old tune_wkl.py) * update * fix measure * fix for gpu * Code refine for tune_test.py & Add a pre load callback (apache#20) * Bug fix for tutorials * Add PreLoadMeasuredStates * Add search_callback support for task tuner * Code refine for tune_test.py * Update * Update * Update * Update * Bug fix * Add python custom sketch rule (apache#21) * Add custom sketch rule * Bug fix * Ansor Relay Integration (without layout rewrite) (apache#22) * relay integration * Add tune_op_subgraph.py & Some code clean for tune_network.py (apache#23) * Add single op tune scripts * Add tune subgraph support * Merge all op & all subgraph to one file * Rename file * add explicit_unroll_max_extent (apache#25) * Add Index simplification & API update (apache#26) * Add vectorized cooperative_fetching test * Update math simplify for vectorized CF * File rename * Update tune_network * API update * Update PreLoadMeasuredStates & Some bug fix (apache#27) * Add a threading wrapper to fix the test bug * Set default TVM_USE_AUTO_SCHEDULER to false * Update PreLoadMeasuredStates callback * Add tensorize step for loop_state (apache#31) * Add tensorize step * State python api update (apache#33) * Start to update api * Add compute_dag to state * API update * kernel layout rewrite (apache#28) * kernel layout rewrite * remove some hacks * add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass * set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite * [cache flush] port cache flush to ansor (apache#32) * Improve relay integration (apache#34) * tmp checkpoint * Improve relay integration * Improve relay integration * Fix xgb error & Simplify dispatcher (apache#35) * Rename "MetaTileRewritePolicy" to "SketchPolicy". (apache#36) * Rename "MetaTileRewritePolicy" to "SketchPolicy". 
* Add a new class for auto_unroll_max_step, storage_offset in StageNode * fix tune_op_subgraph.py * rebase * Migrate all node::make to noderef's construct function (apache#37) * Start to move xxxnode::make to noderef() * Update * Update * Finish transform_step * Finish comute dag & auto schedule * Update * Update * Update * Update * Update * Code refine * Code refine * Code refine * Update * Update * Some lint fix & Recover the double constructor of tvm::PrimExpr (apache#39) * lint fix * clang-format-fix * pylint fix * Update * Recover the double constructor of tvm::PrimExpr * Fix pylint * pylint fix * pylint fix * Add MutateComputeLocation and MutateParallel in evolutionary search (apache#40) * Add MutateComputeLocation and MutateParallel in evolutionary search * fix lint * Improve loop state python API (stage_tensors -> stage_ops) (apache#41) * improve loop state python API (stage_tensors -> stage_ops) * fix * ComputeDAG bug fix & Add Custom TensorCore Matmul Example (apache#42) * Bug Fix * Sample example of Custom TensorCore Matmul * Rever Commits, Start to build minimum Ansor system * Code clean for minimum Ansor system * Bug fix & Delete AccessAnalyzer * Delete attachmap & Code clean * Doc update Update statenode::stages from vector to Array * Headfile update & Python doc update * clang-format fix * pylint fix * Update * Doc update * Update * Bug fix after code merge to the new master * clang-format fix * Update * Update * Update std::vector to Array; Update verbosity setting; Some commemts addressed * std::vector->Array & std::string->String * Add init_state to ComputeDAG * Update * Update some unordered_map to Map * clang-format fix * Comments addressed Delete ReplayAndInferBound Delete ReplaySteps & InferBoundCommon * Lint fix * Update * Update * Update * Update * Update * Update * Update * Update * Update * Rename ansor namespace to auto_schedule * Update * Rename ThreadPool to ParallelFor * Add parallel_for * Remove ThreadPool * Update 
python/tvm/auto_schedule/auto_schedule.py * trigger CI Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> Co-authored-by: Minmin Sun (孙敏敏) <minmin.smm@alibaba-inc.com> Co-authored-by: Zhao Wu <zhaowu@apache.org>
zxybazh added a commit to zxybazh/tvm that referenced this pull request on Feb 5, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this pull request on Feb 12, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai> fix some fixes fix test
Hzfengsy
pushed a commit
to Hzfengsy/tvm
that referenced
this pull request
Feb 19, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai> fix some fixes fix test
zxybazh
added a commit
to zxybazh/tvm
that referenced
this pull request
Feb 21, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Import & Cache Mechanism (apache#26) [BugFix] Fix Winograd Test Script (apache#25) Add task extraction & caching (apache#27) A few fixes for task extraction (apache#28) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh
added a commit
to zxybazh/tvm
that referenced
this pull request
Feb 21, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh
added a commit
to zxybazh/tvm
that referenced
this pull request
Feb 21, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh
added a commit
to zxybazh/tvm
that referenced
this pull request
Feb 22, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Import & Cache Mechanism (apache#26) [BugFix] Fix Winograd Test Script (apache#25) Add task extraction & caching (apache#27) A few fixes for task extraction (apache#28) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh
added a commit
to zxybazh/tvm
that referenced
this pull request
Feb 22, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com> Co-authored-by: Hongyi Jin <3231950289@qq.com> Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> Co-authored-by: Junru Shao <junrushao1994@gmail.com> Co-authored-by: Wuwei Lin <wuwei@apache.org> Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com> Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
MasterJH5574
pushed a commit
to MasterJH5574/tvm
that referenced
this pull request
Feb 26, 2022
* Update AST. * ShapeOf. * ShapeOf. * Address comment.
MasterJH5574
pushed a commit
to MasterJH5574/tvm
that referenced
this pull request
Mar 3, 2022
* Update AST. * ShapeOf. * ShapeOf. * Address comment.
MasterJH5574
pushed a commit
to MasterJH5574/tvm
that referenced
this pull request
Mar 7, 2022
[SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> Fix AxisTree (apache#3) * fix axis tree * upd [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` [SparseTIR] Introduce SpIterVar (apache#6) * [SparseTIR] Introduce SpIterVar * Add conversion to PrimExpr [BugFix] Fix binary search & SpIterVar (apache#7) [BugFix] Add field `is_reduction` for SpIterVar (apache#9) * [BugFix] Add field `is_reduction` for SpIterVar * Formatting [SparseTIR] Index Lowering (apache#8) * Add StmtFunctor/ExprFunctor for SparseBufferStore/Load * Add basic index lowering * Finish index lowering (maybe) * Address comments * Convert CRLF to LF Frontend update, demo scripts. (apache#10) * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. 
(apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> * Fix AxisTree (apache#3) * fix axis tree * upd * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * fix axis tree * upd * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> * Fix AxisTree (apache#3) * fix axis tree * upd * [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. 
(apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> * Fix AxisTree (apache#3) * fix axis tree * upd * [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` * [SparseTIR] Introduce SpIterVar (apache#6) * [SparseTIR] Introduce SpIterVar * Add conversion to PrimExpr * [BugFix] Fix binary search & SpIterVar (apache#7) * [BugFix] Add field `is_reduction` for SpIterVar (apache#9) * [BugFix] Add field `is_reduction` for SpIterVar * Formatting * upd * upd Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> [SparseTIR] SparseBlock on C++/Python side (apache#11) * Fix a bug in the last commit * SparseBlock on C++ & Python side [BugFix][SparseTIR] TVMScript Parser for Axis & SpIterVar (apache#12) * Update `cord` and `pos` * Fix `idtype` * Formatting.. 
* Bug fix 1 * Move new special stmts * Parser for Axis and SpIterVar * Fix context_maintainer.py [SparseTIR] Enhance SparseBlock to contain enough PrimFunc information (apache#13) * Enhance SparseBlock to have enough PrimFunc info * Remove `func_sparse_buffer_map_` * Don't print the map uh-huh [SparseTIR] Parser, Printer, Roundtrip (apache#14) * SparseBlock scope handler (part 1) * SparseBlock scope handler (part 2) * SparseBlock scope handler (part 3) * SparseBlock scope handler (fix 1) * Add SparseBufferLoad/Store on Python side * Parser for SparseBufferLoad/Store * Add SparseBlock to Python __init__ * StmtFunctor for SparseBlock * Ensure at least one dimension for SparseBuffer * Make `axis` field of SpIterVar mandatory * SparseBlock scope handler (fix 2) * Update Axis syntax by removing `name` parameter * Move to intrin.py * Add filed `from_sparse` to DenseFixedAxis * SparseTIR script printer * Roundtrip test * `update_symbol` bug fix * Fix attr visit in SparseBuffer * Define then compare in SparseBlock * Fix printer bug for SparseBuffer * Enable graph match for Axis and SparseBuffer * Complete HashReduce and EqualReduce for AxisTree and SparseBuffer * Fix typo * Rename test * Bug fix 1 * Bug fix 2 * Add more tests Move tests (apache#15) [SparseTIR] ReprPrinter for Axis and SpIterVar (apache#16) upd (apache#17) flatten (apache#18) ELL and BSR correctness test scripts (apache#19) [SparseTIR] SparseTIR Lowering (apache#20) * Fix a previous bug of sparse-fixed SpIterVar creation * Fix a previous bug in `GetDenseValue` * Refactor Collector and IndexTransformer * Construct block and loops * Fix a previous bug which rejects DV iters in collector * Update buffer map * Create root block * Fix bug of sparse-fixed SpIterVar creation * Fix bug on SpIterVar conversion (with refactor) * Fix bug when getting dependent SpIterVars * Fix bug on dependency map and index lowering * Full block read/write region * Test version 1 * Fix bug of loop order * Fix bug of batch-mm iterator 
ordering * Update PrimFunc args to use symbolic params * Fix bug of test "csr_element_wise" * Fix bug of index accumulation for sparse-fixed axis * Update correctness test * Test structural equality * Refactor and use Array fix nnz cols Add docstring for sparse tir lowering (apache#21) * add docstring * upd Add more examples part 1 (sddmm) (apache#22) * upd * upd * upd [SparseTIR][Schedule] SparseBlockRV, GetSparseBlock, SparseReorder (apache#23) * Test initialization * Fix a stupid bug of ReprPrinter * Add SparseBlockRV * Schedule: GetSparseBlock * Schedule: Reorder [SparseTIR][Schedule] GetSpIters (apache#24) remove hybrid script for successful compilation Add atomic intrinsic for output nonzero inference. (apache#25) * upd * upd Add "sparse" block attribute. (apache#26) Revert "remove hybrid script for successful compilation" This reverts commit eebd7c1. [SparseTIR] Hack `IsAffineBinding` check (apache#27) * [TensorIR][Schedule] Inherit block anotation upon creating new blocks * Fix SDDMM test * Hack IsAffineBinding for sparse blocks Axis Dependency Tree aware code-gen and bmm example (apache#28) * upd * upd * upd * upd * upd * upd * upd * upd * remove redundancy * fix * upd * upd Re-design Indices lowering (apache#29) * upd * upd * upd * upd * upd * init * format * fix * revise coding-style * format Complete indices lowering (apache#30) * upd * upd * upd * done * upd * passed test * upd Add more docstrings and depress warnings for new lowering algorithm. (apache#31) Refactor derived axis, frontend support of fusion. (apache#32) * upd * upd * fix Fatal bugfix and change the signature of DenseVariableAxis. (apache#33) Syntax simplification (apache#34) Change the order of generated blocks for block isolation. 
(apache#35) * upd * upd * upd Syntax of AttachAxis for BMM (apache#36) * upd * upd * upd [SparseTIR] Add "square sum" lowering test (apache#37) * Add square sum test * Remove pylint comment [BugFix] Fix offset caching in lowering (apache#38) * Hack compact dataflow check in a dirty way * Add two-K square sum test * Mark skipped tests * Fix offset saving in lowering Fusion syntax fix + SDDMM example. (apache#39) Some structure change on update offsets. (apache#40) [Refactor] SparseTIR Lowering (apache#41) * Take out methods in Scope * Refactor * Refactor "match" * Tweak scope contents * Refactor ViewIndexInAxis * Refactor Scope * SDDMM tests under implementation * Refactor block stack * Use Map for var_map * Extract NeedCreateNewBlock * Simplify SpIterVarToIterVar via GetIterExtent * Refactor NeedCreateNewBlock * Add docstring * Use "auto" correctly * Minor refactor and use some move Remove redundant analyzers (apache#42) Support indices lowering for attach and fuse. (apache#43) * upd * upd * upd Fix irregular BMM example. (apache#44) * upd * upd * upd * upd RGCN forward and butterfly pattern example. (apache#45) Fused SDDMM example. (apache#46) * upd * wip * fix Fix sparse reorder after refactor (apache#47) [Refactor] Refactor Unittest (apache#48) * upd * remove redundancy [Unittest] Correctness test for benchmarking scripts (apache#49) Bugfix and more test for axis fusion, new workload (apache#50) * upd * upd upd
prateek9623
pushed a commit
to prateek9623/tvm
that referenced
this pull request
May 1, 2022
Create a version for the tvm library
jinhongyii
pushed a commit
to jinhongyii/tvm
that referenced
this pull request
Jun 20, 2022
* Update AST. * ShapeOf. * ShapeOf. * Address comment.
jinhongyii
pushed a commit
to jinhongyii/tvm
that referenced
this pull request
Jun 20, 2022
* fix * fix
gbonik
pushed a commit
to gbonik/tvm
that referenced
this pull request
Jun 28, 2022
TracedObject wrapper that tracks an ObjectPath
Hzfengsy
pushed a commit
to Hzfengsy/tvm
that referenced
this pull request
Jul 30, 2022
* Update AST. * ShapeOf. * ShapeOf. * Address comment.
areusch
referenced
this pull request
in areusch/tvm
Sep 20, 2022
* Update AST. * ShapeOf. * ShapeOf. * Address comment.
gigiblender
pushed a commit
to gigiblender/tvm
that referenced
this pull request
Nov 3, 2022
* Update AST. * ShapeOf. * ShapeOf. * Address comment.
MasterJH5574
pushed a commit
to MasterJH5574/tvm
that referenced
this pull request
Nov 20, 2022
* Update AST. * ShapeOf. * ShapeOf. * Address comment.
MasterJH5574
pushed a commit
to MasterJH5574/tvm
that referenced
this pull request
Nov 20, 2022
* high-level-op support * format * format * follow relay convention * format * fix
MasterJH5574
added a commit
to MasterJH5574/tvm
that referenced
this pull request
Nov 20, 2022
* Revert "Add High-level Op Support (apache#5)" This reverts commit 67960ee474c8797ca9111527f7b9eb6693377ad6. * [CherryPick] Relax ops: dense/conv2d/max_pool2d/relu/softmax/flatten (tlc-pack#266) * relax.transpose * relax.subtract * relax.nn.batch_norm * relax.reshape * relax.expand_dims * relax.squeeze * relax.concatenate Co-authored-by: sung <sunggg@umich.edu>
gigiblender
pushed a commit
to gigiblender/tvm
that referenced
this pull request
Jan 19, 2023
* [IR] Introduce StructInfo * StructInfoFunctor and Analysis Support * [TVMScript] Parse type/shape annotation with StructInfo * remove runtime type assign * Remove type/shape during parsing (#2) * Normalizer prep: simple checks and legacy function renaming. * Struct info deduction in BlockBuilder. * Two TODOs * StructInfo Normalizer Fixes (#3) * StructInfo AST Fix * Fix Extern Func Deduction and shape mutator. * Update VoidStructInfo & globalvar (#4) * Fix passes and proper sinfo propagation. * Refactor EraseToWellDefined to Enable Remapping * [WIP] First stab at symbolic param tracking * Update EraseToWellDefined to support symbolic shape return (apache#5) * fix R.shape with ndim (apache#6) * Remove update shape/type * Address review comment, AnnotateTypeShape=>AnnotateStructInfo * Update include/tvm/script/ir_builder/relax/frame.h Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu> * Address comments * Update printer to use structinfo (apache#7) * Update Error mechanism to prep for obj loc based reporting * Symbolic shape aware function call return value derivation. The main flow works as follows: - Match and populate shape_var_map and var_map by visit each pair of param and call arguments. - Call EraseToWellDefined to map the ret parameter to new result. * [ANALYSIS] Refactor well-form to only look at struct info. * Update comments according to reviews. * Update include/tvm/relax/struct_info.h Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu> Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Tianqi Chen <tqchen> Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>
gigiblender
pushed a commit
to gigiblender/tvm
that referenced
this pull request
Jan 31, 2023
* Update AST. * ShapeOf. * ShapeOf. * Address comment.
gigiblender
pushed a commit
to gigiblender/tvm
that referenced
this pull request
Jan 31, 2023
* [IR] Introduce StructInfo * StructInfoFunctor and Analysis Support * [TVMScript] Parse type/shape annotation with StructInfo * remove runtime type assign * Remove type/shape during parsing (#2) * Normalizer prep: simple checks and legacy function renaming. * Struct info deduction in BlockBuilder. * Two TODOs * StructInfo Normalizer Fixes (#3) * StructInfo AST Fix * Fix Extern Func Deduction and shape mutator. * Update VoidStructInfo & globalvar (#4) * Fix passes and proper sinfo propagation. * Refactor EraseToWellDefined to Enable Remapping * [WIP] First stab at symbolic param tracking * Update EraseToWellDefined to support symbolic shape return (apache#5) * fix R.shape with ndim (apache#6) * Remove update shape/type * Address review comment, AnnotateTypeShape=>AnnotateStructInfo * Update include/tvm/script/ir_builder/relax/frame.h Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu> * Address comments * Update printer to use structinfo (apache#7) * Update Error mechanism to prep for obj loc based reporting * Symbolic shape aware function call return value derivation. The main flow works as follows: - Match and populate shape_var_map and var_map by visit each pair of param and call arguments. - Call EraseToWellDefined to map the ret parameter to new result. * [ANALYSIS] Refactor well-form to only look at struct info. * Update comments according to reviews. * Update include/tvm/relax/struct_info.h Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu> Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Tianqi Chen <tqchen> Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>
Lunderberg
pushed a commit
to Lunderberg/tvm
that referenced
this pull request
Mar 3, 2023
* Update AST. * ShapeOf. * ShapeOf. * Address comment.
Lunderberg
pushed a commit
to Lunderberg/tvm
that referenced
this pull request
Mar 3, 2023
* [IR] Introduce StructInfo * StructInfoFunctor and Analysis Support * [TVMScript] Parse type/shape annotation with StructInfo * remove runtime type assign * Remove type/shape during parsing (apache#2) * Normalizer prep: simple checks and legacy function renaming. * Struct info deduction in BlockBuilder. * Two TODOs * StructInfo Normalizer Fixes (apache#3) * StructInfo AST Fix * Fix Extern Func Deduction and shape mutator. * Update VoidStructInfo & globalvar (apache#4) * Fix passes and proper sinfo propagation. * Refactor EraseToWellDefined to Enable Remapping * [WIP] First stab at symbolic param tracking * Update EraseToWellDefined to support symbolic shape return (apache#5) * fix R.shape with ndim (apache#6) * Remove update shape/type * Address review comment, AnnotateTypeShape=>AnnotateStructInfo * Update include/tvm/script/ir_builder/relax/frame.h Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu> * Address comments * Update printer to use structinfo (apache#7) * Update Error mechanism to prep for obj loc based reporting * Symbolic shape aware function call return value derivation. The main flow works as follows: - Match and populate shape_var_map and var_map by visit each pair of param and call arguments. - Call EraseToWellDefined to map the ret parameter to new result. * [ANALYSIS] Refactor well-form to only look at struct info. * Update comments according to reviews. * Update include/tvm/relax/struct_info.h Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu> Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn> Co-authored-by: Tianqi Chen <tqchen> Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>
mikeseven
pushed a commit
to mikeseven/tvm
that referenced
this pull request
Sep 27, 2023
SIM-2981: make pre-release instead of post-release Approved-by: Jeffrey Uong
MasterJH5574
added a commit
to MasterJH5574/tvm
that referenced
this pull request
Oct 27, 2024