Switch PlanDevices pass to be w.r.t. SEScopes instead of DLDeviceTypes. #9326

Merged: 4 commits, Nov 12, 2021

@mbs-octoml mbs-octoml commented Oct 19, 2021

CAUTION: Breaking VM executable serialization change. I needed a new 'virtual devices' array in the executable so that instructions can continue to refer to devices by a simple index yet the VM can respect both the device type and id for runtime devices.
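
To illustrate the idea behind the 'virtual devices' array, here is a minimal sketch in plain Python. It is not the TVM executable format: `VirtualDevice`, `VirtualDeviceTable` and the `(opcode, device_index)` instruction tuple are hypothetical names chosen for illustration. The device type integers follow DLPack's DLDeviceType (1 for CPU, 2 for CUDA).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class VirtualDevice:
    """Hypothetical stand-in for one entry of the 'virtual devices' array."""
    device_type: int  # DLPack DLDeviceType value, e.g. 1 = CPU, 2 = CUDA
    device_id: int


class VirtualDeviceTable:
    """Interns (device_type, device_id) pairs and hands out small indexes."""

    def __init__(self) -> None:
        self._devices: List[VirtualDevice] = []

    def index_of(self, device_type: int, device_id: int) -> int:
        # Reuse the existing index if this pair has been seen before.
        dev = VirtualDevice(device_type, device_id)
        if dev not in self._devices:
            self._devices.append(dev)
        return self._devices.index(dev)

    def resolve(self, index: int) -> VirtualDevice:
        # The runtime recovers both the device type and the id from the index.
        return self._devices[index]


table = VirtualDeviceTable()
cpu0 = table.index_of(1, 0)
gpu0 = table.index_of(2, 0)
gpu1 = table.index_of(2, 1)

# An instruction only stores the index (hypothetical tuple encoding).
instruction: Tuple[str, int] = ("AllocStorage", gpu1)
resolved = table.resolve(instruction[1])
print(resolved)
```

The point of the indirection is that the instruction encoding stays a single integer, while the table carries whatever the runtime needs to pick the actual device.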

Continuing from #9313, and as part of apache/tvm-rfcs#38, we switch PlanDevices to plan with respect to SEScopes instead of just DLDeviceTypes. Our ultimate goal is to be able to flow memory scopes between PrimFuncs by re-running PlanDevices after the LowerTE pass. This PR at least gets us to the point where memory scopes can flow, but the actual changes to make PlanDevices look inside PrimFuncs are still two PRs in the future.

However, we get two nice side effects right away:

  • Since SEScopes contain Targets we can isolate all the device-to-target resolution machinery within PlanDevices (with the help of CompilationConfig). After PlanDevices has run we can retrieve the Target for any sub-expression directly from that sub-expression's SEScope. For now we retain the one-Target-per-DLDeviceType constraint since it is baked into the public 'TargetMap' API, but the path to breaking that constraint is clearer.
  • Device ids are now respected all the way from annotation to executor. Previously, although some of the plumbing used Devices, the device_id therein was ignored or defaulted to zero. The Python "on_device" annotation helpers still work w.r.t. devices. Thus, though they now respect device ids, they do not allow the user to specify a Target or memory scope as supported by the underlying SEScope.
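
The two side effects above can be sketched with a simplified model. This is plain Python, not TVM's SEScope class: `Scope`, `scope_from_device` and `resolve_target` are hypothetical names, and targets are modelled as strings rather than Target objects.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Scope:
    """Simplified, hypothetical model of an SEScope (not TVM's class)."""
    device_type: int
    device_id: int = 0
    target: Optional[str] = None  # a real SEScope would hold a Target object
    memory_scope: str = ""


def scope_from_device(device_type: int, device_id: int) -> Scope:
    """What a device-only annotation can express: no target, no memory scope."""
    return Scope(device_type, device_id)


def resolve_target(scope: Scope, targets_by_device_type: Dict[int, str]) -> Scope:
    """Fill in the target under the one-Target-per-device-type constraint."""
    if scope.target is not None:
        return scope
    return Scope(
        scope.device_type,
        scope.device_id,
        targets_by_device_type[scope.device_type],
        scope.memory_scope,
    )


targets = {1: "llvm", 2: "cuda"}
annotated = scope_from_device(2, 1)           # the device id is preserved
planned = resolve_target(annotated, targets)  # the target is now on the scope
print(planned.target, planned.device_id)
```

Once the target lives on the scope, later passes in this model never need to consult the device-type-to-target map again.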

@mbs-octoml

(Nearly ready for review -- using ci to tease out what i've broken)

mbs-octoml added a commit to mbs-octoml/mbs-tvm that referenced this pull request Nov 5, 2021
(This is in preparation for apache#9326, which I'm trying to
make as small as possible, sorry for the scatter gun.)

If no explicit host target is given but the given
TargetMap has targets with hosts, try to use those
to establish the host_target.

Also make sure both the 'legacy' TargetMap representation
and the newer representation agree to pointer equality on
their targets.

That triggered a small change in the Interpreter to
make better use of the CompilationConfig.

Since Targets are used in ObjectPtrEquality maps AND we
tend to call CheckAndUpdateHostConsistency all over the
place (I count 65) I had a tricky time debugging failures. Added
a ToDebugString() to Target which will include the host,
and made sure the pretty printer will use the debug-friendly
form when the show_meta_data_ flag is false.
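
As a rough sketch of the host-target fallback described in this commit message (plain Python, not the CompilationConfig code; `Target` and `infer_host_target` here are hypothetical stand-ins):

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for a target: a kind plus an optional host."""
    kind: str
    host: Optional["Target"] = None


def infer_host_target(
    target_map: Dict[int, Target],
    explicit_host: Optional[Target] = None,
) -> Optional[Target]:
    """Prefer an explicit host, else use the first host found in the map."""
    if explicit_host is not None:
        return explicit_host
    for target in target_map.values():
        if target.host is not None:
            return target.host
    return None


llvm = Target("llvm")
target_map = {2: Target("cuda", host=llvm), 1: Target("llvm")}
host = infer_host_target(target_map)
print(host.kind)
```

What should happen when two entries carry different hosts is not covered by the commit message, so the sketch simply takes the first one it finds.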
@mbs-octoml mbs-octoml force-pushed the mbs-scopes branch 2 times, most recently from e82b00f to dfbb253 Compare November 5, 2021 21:07
mbs-octoml added a commit to mbs-octoml/mbs-tvm that referenced this pull request Nov 6, 2021
(This is a bit of a grab bag in preparation for apache#9326
which I'm trying to minimize)

While switching the device planner to use SEScopes I had a lot
of trouble with Targets not matching up.
- If no explicit host target is given but the given
  TargetMap has targets with hosts, try to use those
  to establish the host_target.
- Make sure both the 'legacy' TargetMap representation
  and the newer representation agree to pointer equality on
  their targets.
- Make sure the Interpreter uses the target from CompilationConfig
  since it's been normalized.

To debug the above:
- When pretty printing with show_meta_data_ false, give as much
  detail on SEScopes, Targets and call attributes as possible.
  That needed some rework in the relay_text_printer.cc.
- Ditto for critical 'target' attribute on PrimFuncs.
- Also added a Target::ToDebugString so I could see the
  host fields along with everything else since a lot of problems
  were caused by a mismatch of 'the same' Target with and without
  a host. (Tried using that for the ReprPrinter but that broke unit
  tests.)

Note that the codebase assumes Targets are compared by ObjectPtrEquality,
yet CheckAndUpdateHostConsistency (I count 65 call sites) changes the targets.
Ultimately CompilationConfig or its eventual replacement should ensure we munge
targets only once at the 'main' entry points.
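
The pointer-equality hazard in that note can be shown with a small plain-Python sketch. The `Target` class and `with_host` helper are hypothetical, and the sketch assumes that attaching a host yields a fresh object, which is what makes identity-keyed lookups miss.

```python
class Target:
    """Hypothetical target compared by identity, in the spirit of ObjectPtrEqual."""

    def __init__(self, kind, host=None):
        self.kind = kind
        self.host = host
    # No __eq__/__hash__ overrides: Python falls back to object identity.


def with_host(target, host):
    """Return a new object carrying the host (assumed behaviour for this sketch)."""
    return Target(target.kind, host)


cuda = Target("cuda")
funcs_by_target = {cuda: ["fused_add"]}

# Rewriting the target after the map was built silently breaks the lookup.
cuda_with_host = with_host(cuda, Target("llvm"))
found_original = cuda in funcs_by_target
found_rewritten = cuda_with_host in funcs_by_target

# Normalizing once, before any map is built, keeps identity lookups stable.
normalized = with_host(Target("cuda"), Target("llvm"))
stable_map = {normalized: ["fused_add"]}
found_normalized = normalized in stable_map
print(found_original, found_rewritten, found_normalized)
```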
areusch pushed a commit that referenced this pull request Nov 9, 2021
@mbs-octoml mbs-octoml changed the title [DRAFT] Switch device planning to be in units of SEScope Switch PlanDevices pass to be w.r.t. SEScopes instead of DLDeviceTypes. Nov 9, 2021
@mbs-octoml mbs-octoml marked this pull request as ready for review November 9, 2021 17:50

@jroesch jroesch left a comment

Okay mostly looks mechanical to me, LGTM.

src/relay/backend/te_compiler.h (resolved)
tests/python/unittest/test_micro_model_library_format.py (outdated, resolved)
@mbs-octoml

I'm trying to repro the tests/python/driver/tvmc/test_compiler.py::test_compile_tflite_module_with_external_codegen_cmsisnn failure.

Note #9480, which changed the assert from 3 to 4 artifacts, while I'm still getting 3. That's suspicious.

Attempting to repro in the docker ci-cpu image terminates my remote desktop session (?!?). I'll try to get the cmsis-nn setup going directly, which is a good chance to experience the BYOC dependency setup firsthand.

Also add back SplitArgs pass in build_module.cc which somehow got lost in the shuffle.

(try again -- flaky test_crt.py test_autotune?)
@mbs-octoml

Ready to merge.

@junrushao junrushao merged commit be03d62 into apache:main Nov 12, 2021
@mbs-octoml mbs-octoml deleted the mbs-scopes branch November 12, 2021 18:22
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Dec 1, 2021
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Dec 1, 2021
…s. (apache#9326)

* Switch PlanDevices pass to be w.r.t. SEScopes instead of DLDeviceTypes.

* [checkpoint] Revert emitter.py, must have run 'black .' by mistake.

* [checkpoint] Address PR comments

Also add back SplitArgs pass in build_module.cc which somehow got lost in the shuffle.

(try again -- flaky test_crt.py test_autotune?)

* [checkpoint] Fix after rebase on CallLowered.
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Dec 1, 2021
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Dec 1, 2021
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 7, 2022
yangulei pushed a commit to yangulei/tvm that referenced this pull request Jan 11, 2022
yangulei pushed a commit to yangulei/tvm that referenced this pull request Jan 11, 2022
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022