
[ANSOR] Auto-scheduler tutorial for GPU and necessary refactor/fix #6512

Merged: 13 commits merged into apache:master on Sep 19, 2020

Conversation

merrymercy (Member) commented Sep 18, 2020

  • Add a tutorial on auto-scheduling a subgraph for GPU (tutorials/auto_scheduler/tune_conv2d_layer_cuda.py)
  • Refactor evolutionary search
  • Fix MutateComputeLocation
    • In the old implementation we reused InitChangeComputeLocation, but this was wrong because it always appends new steps and makes the transform history grow too long. In the new implementation we only mutate the existing steps, so the number of steps stays the same (see the sketch below).
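
A minimal, hypothetical Python sketch of this behavior change (the real implementation lives in the C++ sketch-policy code; the class and function names below are made up purely for illustration):

import copy
import random
from dataclasses import dataclass, field

@dataclass
class State:
    # Hypothetical stand-in for the auto-scheduler's loop state:
    # a list of (step_kind, payload) transform steps.
    transform_steps: list = field(default_factory=list)

def mutate_compute_location_old(state, candidate_loops):
    # Old behavior: always append a brand-new compute-location step,
    # so the transform history grows on every mutation round.
    new_state = copy.deepcopy(state)
    new_state.transform_steps.append(("compute_at", random.choice(candidate_loops)))
    return new_state

def mutate_compute_location_new(state, candidate_loops):
    # New behavior: rewrite an existing compute-location step in place,
    # so len(transform_steps) stays constant.
    new_state = copy.deepcopy(state)
    idxs = [i for i, (kind, _) in enumerate(new_state.transform_steps)
            if kind == "compute_at"]
    if idxs:
        new_state.transform_steps[random.choice(idxs)] = (
            "compute_at", random.choice(candidate_loops))
    return new_state

# The invariant the fix establishes: mutation does not change the step count.
s = State([("split", 8), ("compute_at", "i0"), ("reorder", None)])
assert len(mutate_compute_location_new(s, ["i1", "i2"]).transform_steps) == len(s.transform_steps)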

@merrymercy (Member, Author)

python/tvm/auto_scheduler/measure.py (review comment resolved)
Comment on lines 174 to 190
######################################################################
# .. note::
# We cannot run the line above because of the conflict between
# python's multiprocessing and tvm's thread pool.
# After running a tvm generated binary, the python multiprocessing library
# will hang forever. You have to make sure that you don't run any tvm
# generated binaries before calling the auto-scheduler's search.
# To run the function above, you should comment out all code in
# the "Check correctness and evaluate performance" section.
#
# You should be careful about this problem in your applications.
# There are other workarounds for this problem.
# For example, you can start a new thread/process (with the builtin python library
# threading or multiprocessing) and run the tvm binaries in the new thread/process.
# This provides an isolation and avoids the conflict in the main thread/process.
# You can also use :any:`auto_scheduler.measure.LocalRPCMeasureContext` for auto-scheduler,
# as shown in the GPU tutorial (:ref:`auto-scheduler-conv-gpu`).
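
The process-isolation workaround mentioned in the quoted note can be sketched as follows. This is only a minimal, runnable illustration of the pattern; run_tvm_binary is a placeholder (it just times a dummy workload here) standing in for building and benchmarking a TVM-generated binary:

import multiprocessing
import time

def run_tvm_binary(result_queue):
    # Placeholder: a real script would build the schedule with tvm.build(...)
    # and time it; we time a dummy workload to keep the sketch self-contained.
    start = time.time()
    sum(i * i for i in range(10 ** 6))
    result_queue.put(time.time() - start)

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=run_tvm_binary, args=(queue,))
    proc.start()
    proc.join()
    print("Measured time: %.4f s" % queue.get())
    # The main process never executed a TVM-generated binary itself,
    # so calling the auto-scheduler's search afterwards remains safe.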
Contributor


This may still be confusing. I'm thinking it may be better to just delete the resume part of the CPU tutorials to keep it as simple as possible.

We can add a comment saying that LocalRPCMeasureContext is recommended to be used in all hardware targets in the GPU tutorial.
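
For reference, a sketch of the measurement setup the GPU tutorial adopts, assuming `task` (the conv2d search task) and `log_file` have already been created earlier in tune_conv2d_layer_cuda.py; the parameter values here are only illustrative:

from tvm import auto_scheduler

# `task` and `log_file` are assumed to be defined earlier in the tutorial.
measure_ctx = auto_scheduler.LocalRPCMeasureContext(min_repeat_ms=300)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=10,
    runner=measure_ctx.runner,  # measurements run inside a local RPC server,
                                # isolated from the main process
    measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
)
sch, args = auto_scheduler.auto_schedule(task, tuning_options=tune_option)
del measure_ctx  # shut down the RPC server/tracker started by the context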

Member Author


People are likely to run into this problem in their own applications, so it is worth making them aware of it in the tutorial.

Member Author


Actually, I removed LocalRPCMeasureContext to make the major part of the tutorial simpler.
Because resuming a search is a very advanced usage that most people don't need, it is fine for that part to be more complicated. But all sections above it should be as simple as possible.

tutorials/auto_scheduler/tune_conv2d_layer_cuda.py (review comment resolved)
tutorials/auto_scheduler/tune_matmul_x86.py (review comments resolved)
Comment on lines +192 to +193
# You can also use :any:`auto_scheduler.measure.LocalRPCMeasureContext` for auto-scheduler,
# as shown in the GPU tutorial (:ref:`auto-scheduler-conv-gpu`).
Contributor


Intuitively, if there's no obvious performance impact, we should use the RPC runner on both CPU and GPU, so it'd be better to mention why we didn't use it in this tutorial.

Contributor


I missed the comment from @jcf94 when reviewing this part. I agree that this may be confusing, but I prefer to keep the resume part; otherwise users will still encounter this issue when they are trying to resume the search. For example, one may write a script to do the following:

sch, args = do_search(task, log_file)
perf = evaluate_result(sch, args)
while perf < goal:
    sch, args = resume_search(task, log_file)
    perf = evaluate_result(sch, args)

@merrymercy (Member, Author)

Comments are addressed. @comaniac @jcf94 @tqchen

comaniac merged commit eee04c0 into apache:master on Sep 19, 2020
@comaniac (Contributor)

Thanks @merrymercy @jcf94

merrymercy deleted the ansor-gpu-tutorial branch on September 20, 2020
TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Oct 13, 2020
[ANSOR] Auto-scheduler tutorial for GPU and necessary refactor/fix (apache#6512)

* add gpu tutorial
* refactor mutation in evolutionary search
* update
* update double matmul
* fix lint
* add double matmul test
* fix mutate compute location
* fix sketch search policy
* fix lint
* update
* address comments
* fix PruneInvalidStates