[Ansor][AutoTVM v2.0] Phase 1: Add cache_read/cache_write steps #6107
Finished a half.
# Add a new stage will change all ops. But we still want to use the old ops to index stages,
# So we keep updating them and do not remove the old ops.
- This description would be better as the docstring of this function.
- The function name is misleading. This function does not insert a new stage. Instead, it updates the stage ID map by taking a new stage ID. A name like _update_stage_id_map might be better.
- The return type of this function is improper, since it does nothing with added_op. Better to not return anything.
- I feel this should not be a standalone function. It is tightly coupled with the CacheRead/CacheWrite implementation, so it is improper to call it explicitly.
- This is an internal function that should not be called from outside, so the docstring seems unnecessary.
- This function actually inserts a new stage into the ID map (and it should shift the IDs of the stages behind the inserted one). I will consider a better name.
- Good point.
- This will be used in Rfactor, too.
Updated to _apply_stage_id_offset, which plays a similar role to AttachMap::ApplyStageIdOffset.
- I know we don't need docstrings for internal functions, but it's not a problem to have one. At least we should put the description at the beginning of the function.
- _apply_stage_id_offset sounds better.
- I didn't mean this function will be used by CacheRead/CacheWrite only. I meant this function is required whenever we implement a step that changes the stages, but there's no check to enforce that. An ideal solution would be a single function that deals with stage changes, which CacheRead/CacheWrite/Rfactor then call.
self.stage_id_map[added_op] = new_stage_id

# Update stage_id_map for new ops
self._update_stage_id_map()
It seems to me that this call will override the mapped stage IDs of all ops from the stages, but I don't fully understand it. Could you elaborate on _insert_new_stage?
After cache_read/cache_write, all the stages behind the newly added stage are updated, so their ops differ from before.
This keeps the original Tensors/ops working and adds the new ops to the ID map.
So we can imagine the stage ID map will hold all ops, both out-of-date and up-to-date. Whether or not an op is up-to-date, this function makes sure it points to the right stage.
If my understanding of the stage ID map above is correct, then I agree with you that this is a must. Meanwhile, we need to add more explanation.
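A minimal Python sketch of the stage ID map behavior described in this thread. The `StageIdMap` class, its `insert_stage` method, and the string op names are hypothetical stand-ins, not the code in this PR; only `stage_id_map` and `_apply_stage_id_offset` are names taken from the discussion. It assumes that inserting a stage shifts every ID at or after the insertion point, and that stale ops stay in the map next to their replacements.

```python
class StageIdMap:
    """Toy op -> stage ID map (hypothetical; ops are plain strings here)."""

    def __init__(self, ops):
        self.stage_id_map = {op: i for i, op in enumerate(ops)}

    def _apply_stage_id_offset(self, start_id, offset=1):
        # Shift every known op (stale or current) that sits at or after start_id.
        for op, stage_id in list(self.stage_id_map.items()):
            if stage_id >= start_id:
                self.stage_id_map[op] = stage_id + offset

    def insert_stage(self, new_stage_id, new_ops):
        """new_ops is the full, up-to-date op list after the insertion."""
        # 1. Keep the old ops usable as keys by shifting their stage IDs.
        self._apply_stage_id_offset(new_stage_id)
        # 2. Register the up-to-date ops; old entries are never removed.
        for i, op in enumerate(new_ops):
            self.stage_id_map[op] = i


# A cache_read-like insertion at index 1: "B" and "C" are rebuilt as new ops.
m = StageIdMap(["A", "B", "C"])
m.insert_stage(1, ["A", "A.shared", "B_v2", "C_v2"])
```

After the call, the stale key "B" and its replacement "B_v2" both resolve to stage 2, which is the property the thread agrees on.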
src/auto_scheduler/loop_state.h
 * \brief Traverse through `stage_to_attach_iter` and `iter_to_attached_stages` map, add offset
 * to stage indexes that are larger than the start_id. Used for steps that inserts net stages to
You mean "and update offset"?
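For illustration, the offset update that the quoted comment describes could look like the Python sketch below. This is not the C++ code under review. The dictionary layout (stage ID mapped to a `(stage_id, iter_id)` pair, and the reverse map keyed by that pair) and the function name `apply_stage_id_offset` are assumptions. The inclusive comparison against `start_id` is also an assumption, since the quoted comment only says "larger than".

```python
def apply_stage_id_offset(stage_to_attach_iter, iter_to_attached_stages,
                          start_id, offset=1):
    """Return shifted copies of both attach maps (hypothetical layout)."""

    def shift(stage_id):
        # Assumption: IDs at or after start_id move; earlier IDs stay put.
        return stage_id + offset if stage_id >= start_id else stage_id

    # stage_id -> (target_stage_id, iter_id): shift the key and the target stage.
    new_stage_to_attach_iter = {
        shift(stage): (shift(target), iter_id)
        for stage, (target, iter_id) in stage_to_attach_iter.items()
    }
    # (target_stage_id, iter_id) -> [attached stage IDs]: shift both sides.
    new_iter_to_attached_stages = {
        (shift(target), iter_id): [shift(s) for s in attached]
        for (target, iter_id), attached in iter_to_attached_stages.items()
    }
    return new_stage_to_attach_iter, new_iter_to_attached_stages


# Stage 2 is attached to iterator 0 of stage 1; a new stage is inserted at ID 1.
s2a, i2s = apply_stage_id_offset({2: (1, 0)}, {(1, 0): [2]}, start_id=1)
```

Returning new dictionaries instead of mutating in place avoids key collisions while the IDs are being shifted.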
This PR is a good example to demonstrate how to add steps :)
I agree with @comaniac on most of the points, and I have some additional comments. Thanks for the contribution!
#
# Seems there's bug with the input/output tensor. Such multi outputs case
# should be unusual, so we make some hack on DoCacheWrite
# To be fixed in the future
I think maybe we should explicitly add a "TODO" mark here?
Same comment as above, we shouldn't use TODOs but we should track these in an Ansor stabilization PR
I think these TODOs are necessary, or others may be confused when reading this code.
We do have a tracking issue for everything that needs to be handled: https://github.com/merrymercy/Ansor/issues/65
This is a bug in TVM and is outside the scope of Ansor. This feature is never used in Ansor. We just found this bug by accident during development, so we can skip it.
My high-level comments are that the documentation needs more clarity before we merge, and that we should move all TODOs into a larger Ansor tracking issue that tracks all bugs/features that need to be resolved before stabilization.
@@ -351,6 +351,72 @@ def compute_root(self, stage):
        self.state_object = _ffi_api.StateComputeRoot(self.state_object,
                                                      self._resolve_stage_id(stage))

    def cache_read(self, stage, scope_name, reader_stages):
        """ Schedule primitive corresponds to te.schedule.cache_read.
Can you explain what this step does?
This step does the same thing as te.schedule.cache_read does. We chose to add a pointer to te.schedule.cache_read instead of copying its docstring.
Maybe we can make the pointer clearer (e.g., say "see also te.schedule.cache_read").
Added a pointer to te.schedule.cache_read; we may also add this to other steps later.
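As a rough illustration of the "see also" pointer discussed above, a docstring along these lines would make the cross-reference explicit. The wording, the parameter descriptions, and the stub `State` class are assumptions for the sketch; the text that was actually merged is not shown in this thread.

```python
class State:
    """Stub standing in for the auto_scheduler State (hypothetical)."""

    def cache_read(self, stage, scope_name, reader_stages):
        """Schedule primitive corresponding to te.schedule.cache_read.

        See also te.schedule.cache_read for a description of what the
        primitive does.

        Parameters
        ----------
        stage
            The stage to be cache-read (assumed description).
        scope_name : str
            The memory scope of the newly added read stage (assumed description).
        reader_stages : list
            The stages that read from the new cache stage (assumed description).
        """
        # The real method forwards to the C++ backend; this stub only
        # demonstrates the docstring layout.
        raise NotImplementedError
```

Pointing at the existing docstring keeps the two descriptions from drifting apart, at the cost of one extra lookup for the reader.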
src/auto_scheduler/loop_state.h
@@ -225,6 +238,13 @@ class StateNode : public Object {
   * operation.
   */
  AttachMap attach_map;
  /*!
   * \brief The up-to-date ComputeDAG of this state, used for some steps that may change the
Can you explain this better? Given the above methods, it seems that current_compute_dag might in fact not be up-to-date, given that some scheduling steps modify the compute DAG.
This dag is always up-to-date.
The comment in the above method says the "initial dag" may not be up-to-date. So we need to store a new up-to-date dag here.
/********** Primitives adding new stages **********/

/*!
 * \brief Common part for steps that add new stages(e.g. CacheReadStep, CacheWriteStep,
Can you explain this more?
const ComputeDAG& current_compute_dag =
    dag.ReplayAndGetDAG(GetStageModifiableSteps(GetRef<Step>(this), (*state)->transform_steps));
int added_ops = current_compute_dag->ops.size() - last_dag_op_size;
// TODO(jcf94): Update this check to equal after fixing the cache write bug in TVM
Can we track all of these in an Ansor tracking issue instead of putting TODOs in the code? My worry is that it is very easy to forget about all the bugs that must be resolved before stabilizing a new subsystem.
Yeah, I can open an issue later.
For this specific line, this is a bug in TVM and the feature is not used in Ansor, so it is okay to skip it.
@merrymercy @junrushao1994 @jroesch @comaniac Comments are addressed, please take another look.
I opened an issue to track the unresolved TODO items during upstreaming: #6133
 * RfactorStep). This will return all steps that can change the number of stages in a ComputeDAG,
 * and stop by the current step.
 */
Array<Step> GetFormerStageModifiableSteps(Step current_step, const Array<Step>& transform_steps) {
This name is not quite clear. How about GetMutableSteps?
Or StageMutableSteps?
It seems that just MutableSteps is still not quite clear.
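To make the behavior behind the name concrete, here is a small Python sketch of what the quoted doc comment describes: collect the steps that can change the number of stages, and stop at the current step. The `Step` dataclass, the string step kinds, and the function name are hypothetical. Whether the current step itself is included in the result is an assumption (it is included here because it is itself a stage-modifying step).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """Toy transform step (hypothetical stand-in for the C++ Step)."""
    kind: str
    stage_id: int


# Step kinds assumed to change the number of stages, per the quoted comment.
STAGE_MODIFYING_KINDS = {"cache_read", "cache_write", "rfactor"}


def former_stage_modifying_steps(current_step, transform_steps):
    """Collect stage-modifying steps up to and including current_step."""
    result = []
    for step in transform_steps:
        if step.kind in STAGE_MODIFYING_KINDS:
            result.append(step)
        # Stop at the current step so later steps are not replayed.
        if step is current_step:
            break
    return result


steps = [
    Step("split", 0),
    Step("cache_read", 0),
    Step("reorder", 1),
    Step("cache_write", 2),
    Step("rfactor", 3),
]
former = former_stage_modifying_steps(steps[3], steps)
```

Here `former` holds the cache_read and cache_write steps only; the later rfactor step is skipped because iteration stops at the current step.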
…he#6107) * Add cache_read/cache_write step * Update * Update * Update * Update state->current_compute_dag to Optional * Update * Update doc * Update * Update * Doc update * Update
For the full upstream plan, see Ansor RFC.
In this PR, we bring cache_read/cache_write steps to the Ansor auto_scheduler.
These steps insert extra stages into the original ComputeDAG, so the class member current_compute_dag in State is used to track the up-to-date ComputeDAG.
cc @merrymercy @comaniac @junrushao1994 @FrozenGene