Convert PyTorch models via NNVM #23

Closed

markrogersjr wants to merge 1 commit into neo-ai:dev from markrogersjr:neo_pytorch

markrogersjr commented Apr 1, 2019

Consists of an operator map from Prim/ATen operators to NNVM operators and a traversal of PyTorch JIT IR graphs. @wweic @yuruofeifei
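As a rough illustration of the description above, the converter can be pictured as a dictionary keyed by operator kind plus a walk over the graph's nodes in order. The sketch below is not the PR's code: the `Node` tuple, the `OP_MAP` entries, and the string-building converters are hypothetical stand-ins, chosen so that it runs without PyTorch or NNVM installed.

```python
from collections import namedtuple

# Hypothetical stand-in for a JIT IR node: an operator kind such as
# "aten::relu", the names of its inputs, and the name of its output.
Node = namedtuple("Node", ["kind", "inputs", "output"])

# Hypothetical operator map: JIT operator kind -> function that builds the
# target expression. A real converter would emit NNVM symbols, not strings.
OP_MAP = {
    "aten::relu": lambda args: "relu({})".format(args[0]),
    "aten::add": lambda args: "elemwise_add({}, {})".format(args[0], args[1]),
}


def convert(nodes, graph_inputs):
    """Walk nodes in order and translate each one through OP_MAP."""
    env = {name: name for name in graph_inputs}  # value name -> expression
    for node in nodes:
        if node.kind not in OP_MAP:
            raise NotImplementedError("unsupported operator: " + node.kind)
        args = [env[name] for name in node.inputs]
        env[node.output] = OP_MAP[node.kind](args)
    return env


nodes = [
    Node("aten::add", ["x", "y"], "t0"),
    Node("aten::relu", ["t0"], "t1"),
]
result = convert(nodes, ["x", "y"])["t1"]
```

Keeping the map separate from the traversal means that supporting a new operator only requires a new dictionary entry, and unsupported operators fail loudly instead of being skipped.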

markrogersjr force-pushed the neo_pytorch branch 4 times, most recently from 22b3d74 to ac2146e Compare

April 2, 2019 01:29

yongwww suggested changes

tests/python/frontend/pytorch/test_forward.py (two outdated, resolved comments)

wweic commented Apr 4, 2019 (edited)

The unit tests are failing (https://neo-ai-ci.amazon-ml.com/blue/organizations/jenkins/tvm/detail/PR-23/5/pipeline):

  File "/workspace/nnvm/python/nnvm/frontend/pytorch/converter.py", line 5, in <module>
    import torch
ImportError: No module named torch

Could you install torch at the beginning of ./tests/scripts/task_python_vta.sh? @KoinFlyp
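Until the CI image has PyTorch installed, one way to keep the rest of the suite from failing at import time is to check for the module before importing the frontend. This is only a sketch of that idea for Python 3; `module_available` and `run_pytorch_tests` are hypothetical helpers, not part of this PR or of the CI scripts.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


HAS_TORCH = module_available("torch")


def run_pytorch_tests():
    """Skip, rather than crash, when torch is missing."""
    if not HAS_TORCH:
        return "skipped: torch is not installed"
    return "ran"
```

This does not replace installing torch in the CI script; it only makes the failure mode explicit when the dependency is absent.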

wweic suggested changes

nnvm/python/nnvm/frontend/pytorch/base.py (one outdated, resolved comment)

nnvm/python/nnvm/frontend/pytorch/base.py Outdated

+    def add_input(self, name, tensor):
+        self.inputs[name] = PyTorchInput(name, tensor)
+        self.inputs[name].graph = self

wweic Apr 5, 2019

PyTorchInput does not have a graph property.

Author

markrogersjr Apr 5, 2019

Yeah, I suppose we can set it to None by default. graph is set by PyTorchGraph in add_input.
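The fix discussed here could look roughly like the following: declare `graph` in the constructor with a default of None so the attribute always exists, and let the graph fill it in. The class bodies are hypothetical reconstructions based only on the quoted diff lines, not the actual contents of base.py.

```python
class PyTorchInput:
    """Hypothetical minimal version of the input wrapper."""

    def __init__(self, name, tensor, graph=None):
        self.name = name
        self.tensor = tensor
        self.graph = graph  # None until a PyTorchGraph adopts this input


class PyTorchGraph:
    """Hypothetical minimal graph that owns its inputs."""

    def __init__(self):
        self.inputs = {}

    def add_input(self, name, tensor):
        # Same two steps as the quoted diff; graph is now a declared attribute.
        self.inputs[name] = PyTorchInput(name, tensor)
        self.inputs[name].graph = self


standalone = PyTorchInput("x", [1.0, 2.0])
g = PyTorchGraph()
g.add_input("data", [0.0])
```

The same change would apply to PyTorchParam, which is flagged in the next comment.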

nnvm/python/nnvm/frontend/pytorch/base.py Outdated

+    def add_param(self, name, tensor):
+        self.params[name] = PyTorchParam(name, tensor.astype('float32'))
+        self.params[name].graph = self

wweic Apr 5, 2019

PyTorchParam does not have a graph property.

Author

markrogersjr Apr 5, 2019

Ditto

nnvm/python/nnvm/frontend/pytorch/aten.py Outdated

+            'stride': self._inputs[4].as_json(),
+        }
+#    @property

wweic Apr 5, 2019

remove dead code.

nnvm/python/nnvm/frontend/pytorch/aten.py Outdated

+        self.input_names = [self._input_name(0)]
+        self.shape = self._inputs[1].as_json()
+#def expand(node, *inputs):

wweic Apr 5, 2019

remove dead code

tests/python/frontend/pytorch/test_forward.py (one outdated, resolved comment)

nnvm/python/nnvm/frontend/pytorch/aten.py (one outdated, resolved comment)

wweic requested a review from zhiics

April 5, 2019 05:15

yuruofeifei reviewed

nnvm/python/nnvm/frontend/pytorch/aten.py

+    def __init__(self, node, graph):
+        super(ATenOp, self).__init__(node, graph)
+        self.dtype = 'float32'

yuruofeifei Apr 5, 2019

Is float32 the only dtype?

Author

markrogersjr Apr 5, 2019

For simplicity, yes, but we can add other datatypes, perhaps in a separate PR. DLR only supports float32.

yuruofeifei reviewed

nnvm/python/nnvm/frontend/pytorch/aten.py Outdated

+    @property
+    def shape(self):
+        print('shape of', self.name)

yuruofeifei Apr 5, 2019

Remove?

Author

markrogersjr Apr 5, 2019

yep thanks

yuruofeifei reviewed

nnvm/python/nnvm/frontend/pytorch/aten.py Outdated

+        print('shape of', self.name)
+        if not hasattr(self, '_shape'):
+            self._shape = list(infer_shape(create(self.as_nnvm()))[1][0])
+        print('got the shape')

yuruofeifei Apr 5, 2019

Remove?
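With the debugging prints removed, the property reduces to a compute-once cache. A sketch of that pattern follows; `_infer_shape` is a hypothetical placeholder for the NNVM call in the diff (`infer_shape(create(self.as_nnvm()))[1][0]`), which is not reproduced here, and the `infer_calls` counter exists only to make the caching visible.

```python
class Op:
    """Hypothetical operator wrapper with a lazily computed shape."""

    def __init__(self, name, raw_shape):
        self.name = name
        self._raw_shape = raw_shape
        self.infer_calls = 0  # only here to show that the cache works

    def _infer_shape(self):
        # Placeholder for the real shape inference call.
        self.infer_calls += 1
        return list(self._raw_shape)

    @property
    def shape(self):
        # Compute on first access, then reuse the stored value.
        if not hasattr(self, "_shape"):
            self._shape = self._infer_shape()
        return self._shape


op = Op("conv0", (1, 3, 224, 224))
first = op.shape
second = op.shape
```

If tracing is still wanted during development, the standard `logging` module at debug level would be a quieter alternative to `print`.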

markrogersjr force-pushed the neo_pytorch branch 5 times, most recently from 58faed8 to cff5503 Compare

April 8, 2019 21:10

Author

markrogersjr commented Apr 8, 2019

@Laurawly @icemelon9 can you please take a look? Particularly at the topi implementation of adaptive pooling. Thanks!

markrogersjr force-pushed the neo_pytorch branch 4 times, most recently from 383d321 to 6c03d0b Compare

April 9, 2019 03:35

Author

markrogersjr commented Apr 9, 2019

@wweic it seems PyTorch is still not installed (see CI results). Can you please investigate?

markrogersjr force-pushed the neo_pytorch branch from 6c03d0b to d3c1895 Compare

April 9, 2019 17:27

Laurawly reviewed

topi/include/topi/nn/adaptive_pooling.h Outdated

+                                 PoolType pool_type,
+                                 const size_t height_axis,
+                                 const size_t width_axis) {
+  CHECK(x->shape.size() >= 2) << "Adaptive Pooling input must >= 2-D (H, W)";

Laurawly Apr 9, 2019

Can the input have a shape other than 2D or 4D?

Author

markrogersjr Apr 9, 2019

It's probably best to assume it's only 4D, but I think more than 4 dimensions is supported.
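For readers unfamiliar with adaptive pooling: the output size is fixed, and the window for each output position is derived from the input size. The sketch below uses the window bounds commonly used for adaptive pooling, as in PyTorch's implementation (start = floor(i * in / out), end = ceil((i + 1) * in / out)); whether the topi implementation in this PR uses exactly these bounds is an assumption. It handles a single 2D (H, W) plane in plain Python, and `adaptive_avg_pool2d` is a hypothetical name for illustration. Higher-rank inputs would apply the same computation over the chosen height and width axes.

```python
def adaptive_avg_pool2d(plane, out_h, out_w):
    """Average-pool a 2D list of numbers to exactly out_h x out_w."""
    in_h, in_w = len(plane), len(plane[0])

    def bounds(i, in_size, out_size):
        start = (i * in_size) // out_size            # floor division
        end = -(-((i + 1) * in_size) // out_size)    # ceiling division
        return start, end

    out = []
    for oh in range(out_h):
        h0, h1 = bounds(oh, in_h, out_h)
        row = []
        for ow in range(out_w):
            w0, w1 = bounds(ow, in_w, out_w)
            window = [plane[h][w] for h in range(h0, h1) for w in range(w0, w1)]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


plane = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
pooled = adaptive_avg_pool2d(plane, 2, 2)
overlapping = adaptive_avg_pool2d([[1, 2, 3]], 1, 2)
```

When the input size is not a multiple of the output size, as in the second call, neighbouring windows overlap instead of leaving elements out.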

markrogersjr force-pushed the neo_pytorch branch 6 times, most recently from 427b7de to 766c33f Compare

April 9, 2019 21:01

markrogersjr force-pushed the neo_pytorch branch 6 times, most recently from 7cdfc43 to e130a00 Compare

April 23, 2019 16:48

markrogersjr force-pushed the neo_pytorch branch 5 times, most recently from 41d4f6f to 07dace7 Compare

April 25, 2019 04:22

hcho3 commented May 3, 2019

@KoinFlyp @wweic @yongwww @zhiics @kevinthesun @Laurawly @icemelon9 @yuruofeifei What's the blocking issue for this PR right now?

markrogersjr force-pushed the neo_pytorch branch 5 times, most recently from 51ee22a to acf5162 Compare

May 14, 2019 23:56

markrogersjr force-pushed the neo_pytorch branch 3 times, most recently from edceb6b to 0d50913 Compare

May 17, 2019 18:25


          pytorch converter

0b3b7f7

markrogersjr force-pushed the neo_pytorch branch from 0d50913 to 0b3b7f7 Compare

May 17, 2019 23:30

yongwww commented Sep 16, 2019

Closing this PR due to inactivity; we would also like to go with Relay. Feel free to reopen it against upstream TVM directly if needed.

yongwww closed this

alexwong mentioned this pull request

Add a PyTorch to Relay parser #63

Merged

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request


[Ansor][AutoTVM v2.0] Part 0: Ansor minimum system for auto schedule generating (apache#5962)

456c58d

* Code migration Start (neo-ai#1)

* Init commit: Code migration Start

* Add loop_state.cc/h

* Add ComputeDAG basic test

* Split transform_step out & Update more UTs (neo-ai#3)

* Split transform_step out

* Update GetProducers & GetConsumers

* Update UTs

* Add UT for CacheReadWrite & Some bug fix

* Add search_task, measure and serialization (neo-ai#4)

* Add FollowSplit & FollowFusedSplit tests

* Update dag.InferBound & its UT

* Add search_task, measure and serialization

* Update Serialization UT

* Add MetaTileRewritePolicy (neo-ai#5)

* Add feature

* Add cost_model, meta_tile_rewrite_policy

* Add MetaTileRewritePolicy basic UT

* Basic Python API for State (neo-ai#6)

* Add Basic Python API for State

* Add UTs for State

* Add Python API: Measure & Task (neo-ai#7)

* Update the return value of state operation

* Add task

* Copy measure.py & utils.py

* Fix LocalBuilder

* Fix LocalRunner

* Add ansor.auto_schedule() API; First AutoSchedule working version(neo-ai#8)

* Add basic Python support for ansor.auto_schedule

* Update AutoSchedule API

* Bug fix for get the attach point of a fused iter

* Update UT after infer bug fix

* Bug fix & Add python serialization API (neo-ai#10)

* Delete C++ UT hack since Python is ready

* Add ndarray.non_empty

* Update Serialization python API

* Improve code style, python wrapper and test cases (neo-ai#11)

* Update c++ code style and unit test

* Update python State wrapper and test cases

* fix unit tests

* Add RPCRunner & OpenCL/CUDA test (neo-ai#12)

* Add RPCRunner & OpenCL search test

* Add CUDA search test

* Add RPCRunner test

* rebase to upstream/master

* Add Ansor basic tutorial (neo-ai#13)

* Add basic tutorial

* migrate feature extraction (neo-ai#14)

* Add XGBModel & RPCRunnerWarpper (neo-ai#15)

* Add XGBModel & RPCRunnerWarpper

* Revert "Add Parallel Granularity Mutation"

* Migrate workload_registry.py (neo-ai#16)

* add workload registry

* update

* update

* add task scheduler (neo-ai#17)

* Add conv2d cuda tutorial with workload registry (neo-ai#18)

* add tune_test.py (the old tune_wkl.py) (neo-ai#19)

* add tune_test.py (the old tune_wkl.py)

* update

* fix measure

* fix for gpu

* Code refine for tune_test.py & Add a pre load callback (neo-ai#20)

* Bug fix for tutorials

* Add PreLoadMeasuredStates

* Add search_callback support for task tuner

* Code refine for tune_test.py

* Update

* Update

* Update

* Update

* Bug fix

* Add python custom sketch rule (neo-ai#21)

* Add custom sketch rule

* Bug fix

* Ansor Relay Integration (without layout rewrite) (neo-ai#22)

* relay integration

* Add tune_op_subgraph.py & Some code clean for tune_network.py (neo-ai#23)

* Add single op tune scripts

* Add tune subgraph support

* Merge all op & all subgraph to one file

* Rename file

* add explicit_unroll_max_extent (neo-ai#25)

* Add Index simplification & API update (neo-ai#26)

* Add vectorized cooperative_fetching test

* Update math simplify for vectorized CF

* File rename

* Update tune_network

* API update

* Update PreLoadMeasuredStates & Some bug fix (neo-ai#27)

* Add a threading wrapper to fix the test bug

* Set default TVM_USE_AUTO_SCHEDULER to false

* Update PreLoadMeasuredStates callback

* Add tensorize step for loop_state (neo-ai#31)

* Add tensorize step

* State python api update (neo-ai#33)

* Start to update api

* Add compute_dag to state

* API update

* kernel layout rewrite (neo-ai#28)

* kernel layout rewrite

* remove some hacks

* add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass

* set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite

* [cache flush] port cache flush to ansor (neo-ai#32)

* Improve relay integration (neo-ai#34)

* tmp checkpoint

* Improve relay integration

* Improve relay integration

* Fix xgb error & Simplify dispatcher (neo-ai#35)

* Rename "MetaTileRewritePolicy" to "SketchPolicy". (neo-ai#36)

* Rename "MetaTileRewritePolicy" to "SketchPolicy".

* Add a new class for auto_unroll_max_step, storage_offset in StageNode

* fix tune_op_subgraph.py

* rebase

* Migrate all node::make to noderef's construct function (neo-ai#37)

* Start to move xxxnode::make to noderef()

* Update

* Update

* Finish transform_step

* Finish comute dag & auto schedule

* Update

* Update

* Update

* Update

* Update

* Code refine

* Code refine

* Code refine

* Update

* Update

* Some lint fix & Recover the double constructor of tvm::PrimExpr (neo-ai#39)

* lint fix

* clang-format-fix

* pylint fix

* Update

* Recover the double constructor of tvm::PrimExpr

* Fix pylint

* pylint fix

* pylint fix

* Add MutateComputeLocation and MutateParallel in evolutionary search (neo-ai#40)

* Add MutateComputeLocation and MutateParallel in evolutionary search

* fix lint

* Improve loop state python API (stage_tensors -> stage_ops) (neo-ai#41)

* improve loop state python API (stage_tensors -> stage_ops)

* fix

* ComputeDAG bug fix & Add Custom TensorCore Matmul Example (neo-ai#42)

* Bug Fix

* Sample example of Custom TensorCore Matmul

* Rever Commits, Start to build minimum Ansor system

* Code clean for minimum Ansor system

* Bug fix & Delete AccessAnalyzer

* Delete attachmap & Code clean

* Doc update

Update statenode::stages from vector to Array

* Headfile update & Python doc update

* clang-format fix

* pylint fix

* Update

* Doc update

* Update

* Bug fix after code merge to the new master

* clang-format fix

* Update

* Update

* Update std::vector to Array; Update verbosity setting; Some commemts
addressed

* std::vector->Array & std::string->String

* Add init_state to ComputeDAG

* Update

* Update some unordered_map to Map

* clang-format fix

* Comments addressed
Delete ReplayAndInferBound
Delete ReplaySteps & InferBoundCommon

* Lint fix

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Rename ansor namespace to auto_schedule

* Update

* Rename ThreadPool to ParallelFor

* Add parallel_for

* Remove ThreadPool

* Update python/tvm/auto_schedule/auto_schedule.py

* trigger CI

Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Minmin Sun (孙敏敏) <minmin.smm@alibaba-inc.com>
Co-authored-by: Zhao Wu <zhaowu@apache.org>

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request


[Ansor][AutoTVM v2.0] Part 0: Ansor minimum system for auto schedule generating (apache#5962)

3d4e1eb

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request


[Ansor][AutoTVM v2.0] Part 0: Ansor minimum system for auto schedule generating (apache#5962)

67514f2

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request


[Ansor][AutoTVM v2.0] Part 0: Ansor minimum system for auto schedule generating (apache#5962)

a11cb25

trevor-m pushed a commit that referenced this pull request

[Ansor][AutoTVM v2.0] Part 0: Ansor minimum system for auto schedule generating (apache#5962)

780f637

* Code migration Start (#1)

* Init commit: Code migration Start

* Add loop_state.cc/h

* Add ComputeDAG basic test

* Split transform_step out & Update more UTs (#3)

* Split transform_step out

* Update GetProducers & GetConsumers

* Update UTs

* Add UT for CacheReadWrite & Some bug fix

* Add search_task, measure and serialization (#4)

* Add FollowSplit & FollowFusedSplit tests

* Update dag.InferBound & its UT

* Add search_task, measure and serialization

* Update Serialization UT

* Add MetaTileRewritePolicy (#5)

* Add feature

* Add cost_model, meta_tile_rewrite_policy

* Add MetaTileRewritePolicy basic UT

* Basic Python API for State (#6)

* Add Basic Python API for State

* Add UTs for State

* Add Python API: Measure & Task (#7)

* Update the return value of state operation

* Add task

* Copy measure.py & utils.py

* Fix LocalBuilder

* Fix LocalRunner

* Add ansor.auto_schedule() API; First AutoSchedule working version (#8)

* Add basic Python support for ansor.auto_schedule

* Update AutoSchedule API

* Bug fix for getting the attach point of a fused iter

* Update UT after infer bug fix

* Bug fix & Add python serialization API (#10)

* Delete C++ UT hack since Python is ready

* Add ndarray.non_empty

* Update Serialization python API

* Improve code style, python wrapper and test cases (#11)

* Update c++ code style and unit test

* Update python State wrapper and test cases

* fix unit tests

* Add RPCRunner & OpenCL/CUDA test (#12)

* Add RPCRunner & OpenCL search test

* Add CUDA search test

* Add RPCRunner test

* rebase to upstream/master

* Add Ansor basic tutorial (#13)

* Add basic tutorial

* migrate feature extraction (#14)

* Add XGBModel & RPCRunnerWarpper (#15)

* Add XGBModel & RPCRunnerWarpper

* Revert "Add Parallel Granularity Mutation"

* Migrate workload_registry.py (#16)

* add workload registry

* update

* update

* add task scheduler (#17)

* Add conv2d cuda tutorial with workload registry (#18)

* add tune_test.py (the old tune_wkl.py) (#19)

* add tune_test.py (the old tune_wkl.py)

* update

* fix measure

* fix for gpu

* Code refine for tune_test.py & Add a pre load callback (#20)

* Bug fix for tutorials

* Add PreLoadMeasuredStates

* Add search_callback support for task tuner

* Code refine for tune_test.py

* Update

* Update

* Update

* Update

* Bug fix

* Add python custom sketch rule (#21)

* Add custom sketch rule

* Bug fix

* Ansor Relay Integration (without layout rewrite) (#22)

* relay integration

* Add tune_op_subgraph.py & Some code clean for tune_network.py (#23)

* Add single op tune scripts

* Add tune subgraph support

* Merge all op & all subgraph to one file

* Rename file

* add explicit_unroll_max_extent (#25)

* Add Index simplification & API update (#26)

* Add vectorized cooperative_fetching test

* Update math simplify for vectorized CF

* File rename

* Update tune_network

* API update

* Update PreLoadMeasuredStates & Some bug fix (#27)

* Add a threading wrapper to fix the test bug

* Set default TVM_USE_AUTO_SCHEDULER to false

* Update PreLoadMeasuredStates callback

* Add tensorize step for loop_state (#31)

* Add tensorize step

* State python api update (#33)

* Start to update api

* Add compute_dag to state

* API update

* kernel layout rewrite (#28)

* kernel layout rewrite

* remove some hacks

* add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass

* set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite

* [cache flush] port cache flush to ansor (#32)

* Improve relay integration (#34)

* tmp checkpoint

* Improve relay integration

* Improve relay integration

* Fix xgb error & Simplify dispatcher (#35)

* Rename "MetaTileRewritePolicy" to "SketchPolicy". (#36)

* Rename "MetaTileRewritePolicy" to "SketchPolicy".

* Add a new class for auto_unroll_max_step, storage_offset in StageNode

* fix tune_op_subgraph.py

* rebase

* Migrate all node::make to noderef's construct function (#37)

* Start to move xxxnode::make to noderef()

* Update

* Update

* Finish transform_step

* Finish compute dag & auto schedule

* Update

* Update

* Update

* Update

* Update

* Code refine

* Code refine

* Code refine

* Update

* Update

* Some lint fix & Recover the double constructor of tvm::PrimExpr (#39)

* lint fix

* clang-format-fix

* pylint fix

* Update

* Recover the double constructor of tvm::PrimExpr

* Fix pylint

* pylint fix

* pylint fix

* Add MutateComputeLocation and MutateParallel in evolutionary search (#40)

* Add MutateComputeLocation and MutateParallel in evolutionary search

* fix lint

* Improve loop state python API (stage_tensors -> stage_ops) (#41)

* improve loop state python API (stage_tensors -> stage_ops)

* fix

* ComputeDAG bug fix & Add Custom TensorCore Matmul Example (#42)

* Bug Fix

* Sample example of Custom TensorCore Matmul

* Revert commits, Start to build minimum Ansor system

* Code clean for minimum Ansor system

* Bug fix & Delete AccessAnalyzer

* Delete attachmap & Code clean

* Doc update

Update statenode::stages from vector to Array

* Header file update & Python doc update

* clang-format fix

* pylint fix

* Update

* Doc update

* Update

* Bug fix after code merge to the new master

* clang-format fix

* Update

* Update

* Update std::vector to Array; Update verbosity setting; Some comments addressed

* std::vector->Array & std::string->String

* Add init_state to ComputeDAG

* Update

* Update some unordered_map to Map

* clang-format fix

* Comments addressed
Delete ReplayAndInferBound
Delete ReplaySteps & InferBoundCommon

* Lint fix

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Rename ansor namespace to auto_schedule

* Update

* Rename ThreadPool to ParallelFor

* Add parallel_for

* Remove ThreadPool

* Update python/tvm/auto_schedule/auto_schedule.py

* trigger CI

Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Minmin Sun (孙敏敏) <minmin.smm@alibaba-inc.com>
Co-authored-by: Zhao Wu <zhaowu@apache.org>


Reviewers

wweic requested changes

icemelon left review comments

Laurawly left review comments

yuruofeifei left review comments

yongwww requested changes

zhiics: awaiting requested review
