[ANSOR] Auto-scheduler tutorial for GPU and necessary refactor/fix #6512
Conversation
######################################################################
# .. note::
#   We cannot run the line above because of the conflict between
#   python's multiprocessing and tvm's thread pool.
#   After running a tvm generated binary the python's multiprocessing library
#   will hang forever. You have to make sure that you don't run any tvm
#   generated binaries before calling auto-scheduler's search.
#   To run the function above, you should comment out all code in
#   the "Check correctness and evaluate performance" section.
#
#   You should be careful about this problem in your applications.
#   There are other workarounds for this problem.
#   For example, you can start a new thread/process (with the builtin python library
#   threading or multiprocessing) and run the tvm binaries in the new thread/process.
#   This provides an isolation and avoids the conflict in the main thread/process.
#   You can also use :any:`auto_scheduler.measure.LocalRPCMeasureContext` for auto-scheduler,
#   as shown in the GPU tutorial (:ref:`auto-scheduler-conv-gpu`).
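As a rough sketch of the thread/process isolation workaround mentioned in the note (the helper names run_tvm_binary and run_in_child_process are hypothetical, not part of the tutorial), the idea is to run the tvm generated binary only inside a child process, so the main process stays clean and auto-scheduler's search can still use python's multiprocessing:

import multiprocessing

import tvm


def run_tvm_binary(sch, args, target):
    # Hypothetical helper: build and run the tvm generated binary.
    # Everything here happens only inside the child process.
    func = tvm.build(sch, args, target)
    # ... prepare inputs, call func, measure time ...


def run_in_child_process(sch, args, target):
    # Isolate the binary execution from the main process (assumes the
    # default "fork" start method on Linux, so the TVM objects do not
    # need to be pickled).
    p = multiprocessing.Process(target=run_tvm_binary, args=(sch, args, target))
    p.start()
    p.join()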
This may still be confusing. I'm thinking it may be better to just delete the resume part of the CPU tutorials to keep them as simple as possible.
We can add a comment in the GPU tutorial saying that LocalRPCMeasureContext is recommended for all hardware targets.
People are likely to run into this problem in their applications, so it is worth making them aware of it in the tutorial.
Actually, I removed LocalRPCMeasureContext to make the main part of the tutorial simpler.
Resuming a search is a very advanced usage that most people don't need, so it is fine for this part to be more complicated. But all the sections above it should be as simple as possible.
# You can also use :any:`auto_scheduler.measure.LocalRPCMeasureContext` for auto-scheduler,
# as shown in the GPU tutorial (:ref:`auto-scheduler-conv-gpu`).
Intuitively, if there's no obvious performance impact, we should use the RPC runner on both CPU and GPU, so it'd be better to mention why we didn't use it in this tutorial.
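For reference, wiring LocalRPCMeasureContext into the tuning options could look roughly like the sketch below (the log file name and trial count are placeholders, and option names may differ slightly across TVM versions):

from tvm import auto_scheduler

# Measure candidates through a local RPC server, so that tvm generated
# binaries run in separate worker processes instead of the main process.
measure_ctx = auto_scheduler.measure.LocalRPCMeasureContext(min_repeat_ms=300)

tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=1000,  # placeholder trial budget
    runner=measure_ctx.runner,
    measure_callbacks=[auto_scheduler.RecordToFile("conv2d.json")],  # placeholder log file
)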
I missed the comment from @jcf94 when reviewing this part. I agree that this may be confusing, but I prefer to keep the resume part; otherwise users will still encounter this issue when they are trying to resume the search. For example, one may write a script to do the following:
sch, args = do_search(task, log_file)
perf = evaluate_result(sch, args)
while perf < goal:
    sch, args = resume_search(task, log_file)
    perf = evaluate_result(sch, args)
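For reference, a minimal sketch of such a resume_search helper, assuming the auto_scheduler API around the time of this PR (the trial count is a placeholder and exact option names may differ between TVM versions):

from tvm import auto_scheduler

def resume_search(task, log_file):
    # Re-train the cost model from the existing log and preload the
    # measured states, so the search continues instead of starting over.
    cost_model = auto_scheduler.XGBModel()
    cost_model.update_from_file(log_file)
    search_policy = auto_scheduler.SketchPolicy(
        task,
        cost_model,
        init_search_callbacks=[auto_scheduler.PreloadMeasuredStates(log_file)],
    )
    tune_option = auto_scheduler.TuningOptions(
        num_measure_trials=64,  # placeholder budget for the resumed run
        measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
    )
    return auto_scheduler.auto_schedule(task, search_policy, tuning_options=tune_option)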
Thanks @merrymercy @jcf94
…pache#6512)

* add gpu tutorial
* refactor mutation in evolutionary search
* update
* update double matmul
* fix lint
* add double matmul test
* fix mutate compute location
* fix sketch search policy
* fix lint
* update
* address comments
* fix PruneInvalidStates
tutorials/auto_scheduler/tune_conv2d_layer_cuda.py