
[Ansor][AutoTVM v2.0] Phase 1: Add cache_read/cache_write steps #6107

Merged
merged 12 commits into apache:master from the cache_read_write branch
Jul 27, 2020

Conversation

Contributor

@jcf94 jcf94 commented Jul 22, 2020

For the full upstream plan, see the Ansor RFC.

In this PR, we bring the cache_read/cache_write steps to the Ansor auto_scheduler.

These steps insert extra stages into the original ComputeDAG; the class member current_compute_dag in State is used to track the up-to-date ComputeDAG.

cc @merrymercy @comaniac @junrushao1994 @FrozenGene
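
For readers who want a concrete picture, here is a minimal usage sketch (an editorial illustration, not code from this PR; the call signatures follow the loop_state.py definitions quoted later in this thread, while the workload and the "global"/"local" scope names are assumptions):

import tvm
from tvm import te, auto_scheduler

N = 128
A = te.placeholder((N, N), name="A")
B = te.placeholder((N, N), name="B")
k = te.reduce_axis((0, N), name="k")
C = te.compute((N, N), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")

dag = auto_scheduler.ComputeDAG([A, B, C])
state = dag.get_init_state()

# Each call below inserts a new stage; per the PR description, the State
# tracks the resulting up-to-date ComputeDAG in its current_compute_dag
# member instead of mutating the original dag object.
A_cache = state.cache_read(A, "global", [C])  # read cache of A for reader C
C_cache = state.cache_write(C, "local")       # write cache for C
print(state)  # the printed loop nest now contains the two extra cache stages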

Contributor

@comaniac comaniac left a comment


Finished half of the review.

Comment on lines 439 to 440
# Adding a new stage will change all ops. But we still want to use the old ops to index stages,
# so we keep updating them and do not remove the old ops.
Contributor

  • This description would be better placed as the docstring of this function.
  • The function name is misleading. This function does not insert a new stage; it updates the stage ID map with a new stage ID. A name like _update_stage_id_map might be better.
  • The return type of this function is improper, since it does nothing with added_op. It would be better not to return anything.
  • I feel this should not be a standalone function; it is tightly coupled with the CacheRead/CacheWrite implementation. In that case, it is improper to call it explicitly.

Contributor Author

  • This is an internal function that should not be called from outside, so the docstring seems unnecessary.
  • This function actually means inserting a new stage into the id map (it also has to shift the stage ids behind the inserted stage after the insert); I will consider a better name.
  • Good point.
  • This will be used in Rfactor, too.

Contributor Author

@jcf94 jcf94 Jul 22, 2020

Updated to _apply_stage_id_offset; it plays a similar role to AttachMap::ApplyStageIdOffset.

Contributor

  • I know we don't need docstrings for internal functions, but having one is not a problem if we want it. At the least, we should put the description at the beginning of the function.
  • _apply_stage_id_offset sounds better.
  • I didn't mean this function will be used only by CacheRead/Write. I meant this function is required whenever we implement a step that changes the stages, but there is no check to enforce that. An ideal solution would be a single function that handles stage changes and is called by CacheRead/CacheWrite/Rfactor.

self.stage_id_map[added_op] = new_stage_id

# Update stage_id_map for new ops
self._update_stage_id_map()
Contributor

It seems to me that this call will override the mapped stage IDs of all ops from stages, but I don't fully understand. Could you elaborate more on _insert_new_stage?

Contributor Author

@jcf94 jcf94 Jul 22, 2020

After cache_read/cache_write, all the stages behind the newly added stage are actually updated; their ops are different from before.
This call keeps the original Tensors/ops working and adds the new ops to the id map.

Contributor

So we can imagine that the stage ID map holds all ops, including both out-of-date and up-to-date ones. Whether or not an op is up-to-date, this function makes sure it points to the right stage.

If that understanding of the stage ID map is correct, then I agree with you that this is a must. Meanwhile, we need to add more explanation.
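
To make the bookkeeping described above concrete, here is an editorial toy sketch in plain Python (not the TVM implementation): old and new ops live in the same stage ID map, and every stage index at or behind the insertion point is shifted when a new stage is added.

def insert_new_stage(stage_id_map, new_op, new_stage_id):
    # Shift every stage index at or behind the insertion point.
    for op, idx in stage_id_map.items():
        if idx >= new_stage_id:
            stage_id_map[op] = idx + 1
    # Register the newly added stage. Old ops stay in the map on purpose,
    # so tensors created before the insertion still resolve to the right stage.
    stage_id_map[new_op] = new_stage_id

# Stages [A, B, C]; a cache stage for A is inserted between A and B.
stage_id_map = {"A.op": 0, "B.op": 1, "C.op": 2}
insert_new_stage(stage_id_map, "A.global.op", 1)
assert stage_id_map == {"A.op": 0, "B.op": 2, "C.op": 3, "A.global.op": 1}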

Comment on lines 201 to 202
* \brief Traverse through the `stage_to_attach_iter` and `iter_to_attached_stages` maps, and add an offset
* to stage indexes that are larger than the start_id. Used for steps that insert new stages to
Contributor

You mean "and update offset"?

@tqchen
Member

tqchen commented Jul 22, 2020

cc @jroesch @jwfromm @junrushao1994

Contributor

@comaniac comaniac left a comment


This PR is a good example to demonstrate how to add steps :)

Member

@junrushao junrushao left a comment

I agree with @comaniac on most of the points. I also have some additional comments. Thanks for the contribution!

#
# There seems to be a bug with the input/output tensors. Such a multi-output case
# should be unusual, so we apply a small hack in DoCacheWrite.
# To be fixed in the future.
Member

I think maybe we should explicitly add a "TODO" mark here?

Member

Same comment as above: we shouldn't use TODOs, but we should track these in an Ansor stabilization PR.

Contributor Author

I think these TODOs are necessary, otherwise others may be confused when reading this code.
We do have a tracking issue for everything that needs to be handled: https://github.com/merrymercy/Ansor/issues/65

Member

@merrymercy merrymercy Jul 24, 2020

This is a bug in TVM and is outside the scope of Ansor. The feature is never used in Ansor; we just found the bug by accident during development, so we can skip it.

Member

@jroesch jroesch left a comment

My high-level comments are that the documentation needs more clarity before we merge, and that we should move all TODOs out into a larger Ansor tracking issue covering all bugs/features that need to be resolved before stabilization.

@@ -351,6 +351,72 @@ def compute_root(self, stage):
self.state_object = _ffi_api.StateComputeRoot(self.state_object,
self._resolve_stage_id(stage))

def cache_read(self, stage, scope_name, reader_stages):
""" Schedule primitive corresponds to te.schedule.cache_read.

Member

Can you explain what this step does?

Member

@merrymercy merrymercy Jul 24, 2020

This step does the same thing as te.schedule.cache_read. We chose to add a pointer to te.schedule.cache_read instead of copying its docstring.

Member

Maybe we can make the pointer clearer (e.g., say "see also te.schedule.cache_read").

Contributor Author

@jcf94 jcf94 Jul 24, 2020

Added a pointer to te.schedule.cache_read; we may also add this to other steps later.
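
For context, the te-level primitives that the docstring points to can be exercised like this (standard TVM usage shown for illustration only; this snippet is not part of the PR):

import tvm
from tvm import te

n = 64
A = te.placeholder((n, n), name="A")
k = te.reduce_axis((0, n), name="k")
C = te.compute((n, n), lambda i, j: te.sum(A[i, k] * A[k, j], axis=k), name="C")

s = te.create_schedule(C.op)
AA = s.cache_read(A, "shared", [C])  # te.Schedule.cache_read: cache A for reader C in "shared" scope
CC = s.cache_write(C, "local")       # te.Schedule.cache_write: write C through a "local" cache stage
print(tvm.lower(s, [A, C], simple_mode=True))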

@@ -225,6 +238,13 @@ class StateNode : public Object {
* operation.
*/
AttachMap attach_map;
/*!
* \brief The up-to-date ComputeDAG of this state, used for some steps that may change the
Member

Can you explain this better? Given the methods above, it seems that current_compute_dag might in fact not be up-to-date, since some scheduling steps modify the compute DAG.

Member

@merrymercy merrymercy Jul 24, 2020

This DAG is always up-to-date.
The comment in the method above says the "initial DAG" may not be up-to-date, which is why we store a new, up-to-date DAG here.

/********** Primitives adding new stages **********/

/*!
* \brief Common part for steps that add new stages (e.g. CacheReadStep, CacheWriteStep,
Member

Can you explain this more?

const ComputeDAG& current_compute_dag =
dag.ReplayAndGetDAG(GetStageModifiableSteps(GetRef<Step>(this), (*state)->transform_steps));
int added_ops = current_compute_dag->ops.size() - last_dag_op_size;
// TODO(jcf94): Update this check to equal after fixing the cache write bug in TVM
Member

Can we track all of these in an Ansor tracking issue instead of putting TODOs in the code? My worry is that it is very easy to forget about all the bugs that must be resolved before stabilizing a new subsystem.

Member

Yeah, I can open an issue later.
For this specific line, it is a bug in TVM and the feature is not used in Ansor, so it is okay to skip it.

#
# There seems to be a bug with the input/output tensors. Such a multi-output case
# should be unusual, so we apply a small hack in DoCacheWrite.
# To be fixed in the future.
Member

Same comment as above: we shouldn't use TODOs, but we should track these in an Ansor stabilization PR.

@jcf94 jcf94 requested review from merrymercy and jroesch July 24, 2020 02:16
@jcf94
Contributor Author

jcf94 commented Jul 24, 2020

@merrymercy @junrushao1994 @jroesch @comaniac Comments are addressed; please take another look.
I'll submit the PRs for the remaining steps soon after this.

@merrymercy
Member

merrymercy commented Jul 24, 2020

I opened an issue to track the unresolved TODO items during upstreaming: #6133.
However, all the TODOs in this PR are unrelated to Ansor and outside its scope, so they won't be tracked in that issue either.
They are just minor bugs in TVM that we found during development. We don't plan to fix them because we don't use these features.

* RfactorStep). This returns all steps that can change the number of stages in a ComputeDAG,
* stopping at the current step.
*/
Array<Step> GetFormerStageModifiableSteps(Step current_step, const Array<Step>& transform_steps) {
Member

This naming doesn't seem quite clear. How about GetMutableSteps?

Contributor Author

Or StageMutableSteps?
It seems that just MutableSteps would still not be quite clear.
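
Whatever the final name, an editorial toy sketch in plain Python (not the TVM implementation) may help clarify what the helper is for: collect only the steps that can change the stage structure, up to the current step (assuming, as the usage quoted earlier suggests, that the current step itself is included), so that replaying just those steps on the original ComputeDAG reconstructs the up-to-date DAG.

STAGE_MODIFYING_KINDS = {"CacheRead", "CacheWrite", "Rfactor"}

def former_stage_modifiable_steps(current_step, transform_steps):
    # Keep only the steps that can change the number of stages,
    # stopping once the current step has been processed.
    collected = []
    for step in transform_steps:
        if step["kind"] in STAGE_MODIFYING_KINDS:
            collected.append(step)
        if step is current_step:
            break
    return collected

transform_steps = [
    {"kind": "Split"},
    {"kind": "CacheRead"},
    {"kind": "Reorder"},
    {"kind": "CacheWrite"},
]
current = transform_steps[3]
kinds = [s["kind"] for s in former_stage_modifiable_steps(current, transform_steps)]
assert kinds == ["CacheRead", "CacheWrite"]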

@merrymercy
Member

merrymercy commented Jul 25, 2020

@jcf94 Sorry, you have to rebase because I moved the header files to the public folder include/tvm/auto_scheduler in #6103

@jcf94
Contributor Author

jcf94 commented Jul 27, 2020

@jcf94 Sorry, you have to rebase because I moved the header files to the public folder include/tvm/auto_scheduler in #6103

Fine, updated with the new upstream master.

@jcf94 jcf94 requested a review from merrymercy July 27, 2020 01:45
@merrymercy
Member

@jroesch @comaniac please take another look and approve

@merrymercy merrymercy dismissed jroesch’s stale review July 27, 2020 04:32

comments are addressed

@merrymercy
Copy link
Member

merrymercy commented Jul 27, 2020

I dismissed @jroesch's review because all of his comments have been addressed.
Most of his concerns were about some TODO items caused by a minor TVM bug, which is outside the scope of this PR.
Following his suggestion, I also opened an issue (#6133) to track the other TODO items.

@merrymercy merrymercy merged commit b8f8b8d into apache:master Jul 27, 2020
@jcf94 jcf94 deleted the cache_read_write branch July 29, 2020 07:29
trevor-m pushed commits referencing this pull request to trevor-m/tvm (Aug 26 and Sep 2, 2020) and to neo-ai/tvm (Sep 3, 2020).