[Lowering] Add TargetDevice composite data structure. #8892

Closed
wants to merge 2 commits into from

jroesch
Member

@jroesch jroesch commented Aug 31, 2021

This adds a new data structure that captures both the target and the device as a single composite; see the comments for more motivation. The idea is to let us track devices of the same device type that are compiled with different targets. For example, we could handle an ARM system with different targets for its different CPUs, GPU, and NPU.
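
To make the shape of the proposal concrete, here is a minimal sketch of such a composite. It is not the code from this PR: `DeviceType` stands in for `DLDeviceType`, `Target` is reduced to two strings, and `SameDevice` and `MakeTargetDevice` are hypothetical helpers. The `-mcpu` values are only illustrative.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in for DLDeviceType from DLPack; defined locally for illustration.
enum class DeviceType { kCPU = 1, kGPU = 2 };

// Hypothetical simplification of a compiler target: a kind plus option string.
struct Target {
  std::string kind;     // e.g. "llvm"
  std::string options;  // e.g. "-mcpu=cortex-a53"
};

// Sketch of the composite: a target plus a virtual device (type + virtual id).
struct TargetDevice {
  Target target;
  DeviceType device_type;
  int virtual_device_id;
};

// Two entries name the same virtual device only if type and virtual id agree.
inline bool SameDevice(const TargetDevice& a, const TargetDevice& b) {
  return a.device_type == b.device_type &&
         a.virtual_device_id == b.virtual_device_id;
}

// Hypothetical constructor helper; virtual ids are assumed to be non-negative.
inline TargetDevice MakeTargetDevice(Target target, DeviceType type, int virtual_id) {
  assert(virtual_id >= 0);
  return TargetDevice{std::move(target), type, virtual_id};
}
```

With this shape, two CPU cores share `DeviceType::kCPU` but carry different targets and different virtual ids, so the compiler can tell them apart.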

cc @junrushao1994 @areusch @mbs-octoml @csullivan @adstraw @manupa-arm @Mousius

Contributor

@electriclilies electriclilies left a comment

LGTM, just one nit and a clarifying question about the long-term plan for Targets and Devices:
Is the idea of TargetDevice to replace both Target and Device? And if we do replace Target with TargetDevice, how will we indicate that the Target for an expression has been set but not the Device? Thanks :)

* \brief A compile time representation of a target device.
*
* This data structure consists of both the compiler target and a virtual device,
* a tvm::Device where the identifier is a virtual identifier and a concrete
Contributor

Can you expand on what you mean by a virtual identifier?

@junrushao
Member

Hey Jared, thanks for the PR!

I understand the design of the virtual device id and it looks good to me if there are some use cases of it (usually we just assume it’s 0).

On the other hand, there is a direct correspondence between TargetKind and DLDeviceType, e.g. https://github.com/apache/tvm/blob/main/src/target/target_kind.cc#L278. So shall we just infer it from TargetNode::kind?
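
The inference suggested here could look roughly like the following. This is a sketch under assumptions, not the table in target_kind.cc: `DeviceType` is a local stand-in for `DLDeviceType`, `InferDeviceType` is a hypothetical helper, and the kind names and enum values are only examples.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Local stand-in for DLDeviceType; the numeric values are illustrative.
enum class DeviceType { kCPU = 1, kCUDA = 2, kOpenCL = 4 };

// Hypothetical lookup from a target kind name to a device type.
// Returns false (and leaves *out untouched) when the kind is unknown.
inline bool InferDeviceType(const std::string& target_kind, DeviceType* out) {
  assert(out != nullptr);
  static const std::unordered_map<std::string, DeviceType> kTable = {
      {"llvm", DeviceType::kCPU},
      {"c", DeviceType::kCPU},
      {"cuda", DeviceType::kCUDA},
      {"opencl", DeviceType::kOpenCL}};
  auto it = kTable.find(target_kind);
  if (it == kTable.end()) return false;
  *out = it->second;
  return true;
}
```

Note that the mapping is many-to-one: several kinds land on the same device type, so the device type alone cannot recover which target was meant.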

* a tvm::Device where the identifier is a virtual identifier and a concrete
* device type.
*
* Executors are required to handle how to map virtual device identifiers to physical
Contributor

clean up this sentence a bit:

Suggested change
* Executors are required to handle how to map virtual device identifiers to physical
* Before inference, executors should map virtual device identifiers (included in the executor config) to physical
device identifiers (e.g. DLDevice)
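
As an illustration of the mapping described in this suggestion, an executor could keep a small table from virtual ids to physical devices and fill it from its config before inference. This is only a sketch: `PhysicalDevice` stands in for `DLDevice`, and `VirtualDeviceTable`, `Bind`, and `Resolve` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>

// Local stand-in for DLDevice (device_type + device_id).
struct PhysicalDevice {
  int device_type;
  int device_id;
};

// Hypothetical executor-side table, filled before inference starts.
class VirtualDeviceTable {
 public:
  // Associates a virtual id chosen at compile time with a physical device.
  void Bind(int virtual_id, PhysicalDevice dev) {
    assert(virtual_id >= 0);
    table_[virtual_id] = dev;
  }

  // Looks up the physical device; throws if the virtual id was never bound.
  PhysicalDevice Resolve(int virtual_id) const {
    auto it = table_.find(virtual_id);
    if (it == table_.end()) {
      throw std::out_of_range("unbound virtual device id");
    }
    return it->second;
  }

 private:
  std::map<int, PhysicalDevice> table_;
};
```

Failing loudly on an unbound id is a design choice of the sketch; an executor could equally fall back to a default device.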

* \file tvm/target/target_device.h
* \brief A compile time representation of a target device.
*
* This data structure consists of both the compiler target and a virtual device,
Contributor

this isn't a file-level comment and belongs on the docstring for TargetDevice

* Executors are required to handle how to map virtual device identifiers to physical
* device identifiers.
*
* The reason to introduce this data structure is that for much of compilation we
Contributor

I kind of agree with this, but it requires a lot of other contextual information to understand things like the reach and purpose of this data structure. Specifically, it would be great to assume the reader does not understand the distinction between "concrete device" and "target" here (or explain it).

I think you could say something like:

This data structure defines a compile-time identifier for one of the devices specified in the Target. For instance, a Target string `cuda --target-host=llvm -mcpu=i386` requires the compiler to track two such devices: the GPU used by CUDA and the CPU targeted by LLVM. Before this data structure, the two different devices were configured by the runtime by finding a user-specified `DLDevice` whose `device_type` field matches the Target. More complex deployment scenarios can't be modeled coherently across compiler and runtime in this system without aliasing in the target string:
 * two of any `device_type` with differing capabilities
 * reconfigurable devices (e.g. FPGA) for which TVM requires two distinct configurations
 * devices with memory constraints (which need to be expressed in terms of a specific, but unresolved-at-compile-time DLDevice)
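
The aliasing problem in the first bullet can be shown with a small sketch. It assumes the runtime picks a device by scanning a user-supplied list for a matching `device_type`; `Device` stands in for `DLDevice`, and `CountMatches` and `IsAmbiguous` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in for DLDevice.
struct Device {
  int device_type;
  int device_id;
};

// Counts the user-supplied devices whose device_type matches the request.
inline std::size_t CountMatches(const std::vector<Device>& devices, int device_type) {
  assert(device_type >= 0);
  std::size_t n = 0;
  for (const Device& d : devices) {
    if (d.device_type == device_type) ++n;
  }
  return n;
}

// A type-only lookup cannot pick a single device when several share the type.
inline bool IsAmbiguous(const std::vector<Device>& devices, int device_type) {
  return CountMatches(devices, device_type) > 1;
}
```

With one CPU and one GPU the lookup is unique; add a second CPU with different capabilities and the same query matches two devices, which is the case a compile-time device identifier is meant to resolve.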

Contributor

BTW I owe you all an RFC to both justify this and bring together the threads on target maps, default targets, and unified target/device/memory scope planning.

* device API actions.
*
* The idea is that we will carry around TargetDevice structures until device and
* target planning at which time we can inject explicit virtual devices in the
Contributor

Explain what "inject explicit virtual devices" means: are you placing this structure as an attribute in the program, or just the id, or something else? How should this structure be identified, in general?

* different targets or compilation options, and eventually resolve to a physical
* set of devices with code specialized using the correct target.
*
* For example consider mobile SoCs which may contain two CPU types, a mobile GPU,
Contributor

I think just two CPU types are sufficient here.


namespace tvm {

class TargetDevice;
Contributor

TargetDevice makes sense in context of your previous explanation but not out of context. Can we pick a different name? I suggest we consider names that reflect the identifier used e.g. DeviceName or VirtualDevice or CompileTimeDevice.

I also prefer a string identifier so it can be given by the user or SDK authors. We can register these strings at runtime or produce enums if efficiency is a concern, but specifying memory layouts with respect to a virtual device id already seems obnoxious AF. I vastly prefer "dsp-cpu" over 1.
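
One way to get both readable names and cheap integer ids, as floated above, is a small registry that interns names. This is a sketch of the idea only; `DeviceNameRegistry`, `GetOrAssign`, and `NameOf` are hypothetical and do not exist in TVM.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical registry mapping user-facing device names to dense integer ids.
class DeviceNameRegistry {
 public:
  // Returns the id for `name`, assigning the next free id on first use.
  int GetOrAssign(const std::string& name) {
    assert(!name.empty());
    auto it = ids_.find(name);
    if (it != ids_.end()) return it->second;
    int id = static_cast<int>(names_.size());
    ids_.emplace(name, id);
    names_.push_back(name);
    return id;
  }

  // Reverse lookup for diagnostics; throws std::out_of_range on a bad id.
  const std::string& NameOf(int id) const {
    return names_.at(static_cast<std::size_t>(id));
  }

 private:
  std::unordered_map<std::string, int> ids_;
  std::vector<std::string> names_;
};
```

Users and SDK authors would write "dsp-cpu" in configs and error messages, while internal tables could still index by the assigned integer.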

Member

Potentially it's worth adding a default attribute to Target that acts as an alias, rather than adding it to this structure?

v->Visit("target", &target);
v->Visit("virtual_device_id", &virtual_device_id);
DLDeviceType* ptr = &device_type;
v->Visit("device_type", reinterpret_cast<int*>(ptr));
Contributor

why int*?
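
For context on the question: the `reinterpret_cast<int*>` relies on the enum having the same representation as `int`, which C++ does not guarantee for an enum without a fixed underlying type. A possible alternative is to round-trip through a temporary `int`. The sketch below assumes a visitor that only accepts `int*`; `IntVisitor`, `SetTo`, and `Node` are hypothetical stand-ins, not TVM classes.

```cpp
#include <cassert>

// Local stand-in for DLDeviceType (an unscoped enum without a fixed type).
enum DeviceType { kCPU = 1, kCUDA = 2 };

// Hypothetical visitor interface that can read or overwrite an int field.
struct IntVisitor {
  virtual ~IntVisitor() = default;
  virtual void Visit(const char* key, int* value) = 0;
};

// Example visitor that overwrites the visited field, as a deserializer might.
struct SetTo : IntVisitor {
  explicit SetTo(int new_value) : new_value_(new_value) {}
  void Visit(const char* key, int* value) override {
    assert(key != nullptr && value != nullptr);
    *value = new_value_;
  }
  int new_value_;
};

struct Node {
  DeviceType device_type = kCPU;

  // Copy the enum into an int, visit the int, then copy any update back.
  void VisitAttrs(IntVisitor* v) {
    int tmp = static_cast<int>(device_type);
    v->Visit("device_type", &tmp);
    device_type = static_cast<DeviceType>(tmp);
  }
};
```

The write-back after the visit matters because a visitor may modify the value, not just read it.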

@areusch
Contributor

areusch commented Sep 1, 2021

On the other hand, there is a direct correspondence between TargetKind and DLDeviceType, e.g. https://github.com/apache/tvm/blob/main/src/target/target_kind.cc#L278. So shall we just infer it from TargetNode::kind?

@junrushao1994 how do you propose to infer the two-different-CPUs or two-different-FPGAs case? Also see my comments on the commentary at the top of target_device.cc. My thoughts are that being explicit at this level will provide a better user experience. I don't think it removes our ability to infer devices based on Target when there is no ambiguity, and it provides us a language in which we can warn the user when ambiguity exists.

@@ -17,7 +17,7 @@
* under the License.
*/
/*!
* Compile executable modules.
* \brief Implementation of methods, and FFI interfaces for the compilation target object.
Member

Really minor nit, can you do \file then \brief ?

* under the License.
*/
/*!
* \brief The implementation of the TargetDevice object for representing compilation target + virtual device.
Member

Similarly here, re-order to \file then \brief

"""Annotate ops in an experession with a provied compiler/target and then
use it for codegen.
"""
The annotate the operations in an expression with the provided compiler target
Member

Suggested change
The annotate the operations in an expression with the provided compiler target
Annotate the operations in an expression with the provided compiler target,

@@ -204,6 +204,10 @@ using FForwardRewrite = TypedPackedFunc<Expr(const Call& ref_call, const Array<E
//----------------------------------------------
class ForwardPrep : private MixedModeVisitor {
public:
// This is needed to silence a clang-warning about not correclty detecting
Member

Suggested change
// This is needed to silence a clang-warning about not correclty detecting
// This is needed to silence a clang-warning about not correctly detecting

@Mousius
Member

Mousius commented Sep 1, 2021

Is the idea of TargetDevice to replace both Target and Device? And if we do replace Target with TargetDevice, how will we indicate that the Target for an expression has been set but not the Device? Thanks :)

On the other hand, there is a direct correspondence between TargetKind and DLDeviceType, e.g. https://github.com/apache/tvm/blob/main/src/target/target_kind.cc#L278. So shall we just infer it from TargetNode::kind?

These two comments together make me wonder if the correct mapping is directly from Target -> Device and from TargetKind to DeviceType rather than introducing a new structure?

@manupak
Contributor

manupak commented Oct 1, 2021

Hi @jroesch ,

I finally managed to get some time to read this :). I think the code looks generally good, modulo the comments.

I have a broad design question.

At which phase would we partition work across devices of the same kind?
Confusingly, at the moment we have two annotation passes: one in the BYOC pipeline and one internal to the Relay lowering pipeline.
[Related PRs and discussion : https://github.com//pull/7428 and https://discuss.tvm.apache.org/t/rfc-composite-target/7744/10? -- cc: @mbs-octoml ]

The strategy in the annotation target pass of the BYOC pipeline (it does not really need to be just a BYOC thing) is to greedily partition the Relay ops, given the knowledge that some target kinds are better at processing them.
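
A greedy assignment of this kind can be sketched as follows. This illustrates the general idea only and is not the algorithm of the annotation target pass: `TargetSupport` and `GreedyAnnotate` are hypothetical, ops are plain strings, and targets are tried in a fixed priority order.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical description of which ops a target kind handles well.
struct TargetSupport {
  std::string name;
  std::set<std::string> supported_ops;
};

// Assigns each op to the first target, in priority order, that supports it,
// falling back to default_target when none does.
inline std::vector<std::string> GreedyAnnotate(
    const std::vector<std::string>& ops,
    const std::vector<TargetSupport>& targets,
    const std::string& default_target) {
  assert(!default_target.empty());
  std::vector<std::string> assignment;
  assignment.reserve(ops.size());
  for (const std::string& op : ops) {
    std::string chosen = default_target;
    for (const TargetSupport& t : targets) {
      if (t.supported_ops.count(op) > 0) {
        chosen = t.name;
        break;
      }
    }
    assignment.push_back(chosen);
  }
  return assignment;
}
```

The sketch decides per op by target kind only. With two devices of the same kind, both would support the same ops, so a first-match rule like this one would send everything to the first device.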

This is a step beyond that: we have to deal with multiple devices of the same kind, which brings the algorithmic problem of load balancing between devices. I appreciate that it can be a complex problem; however, my immediate question is: what initial strategy is planned for using this from an annotation point of view?

@mbs-octoml
Contributor

The RFC I promised in #8892 (comment) is now up at apache/tvm-rfcs#38

@mbs-octoml
Contributor

I've merged this PR into my exploration and suggest we withdraw it from here.

@areusch
Contributor

areusch commented Oct 12, 2021

@mbs-octoml it seems you are taking this on, so I will close this PR. @jroesch can feel free to reopen if he has cycles to push it through.

@areusch areusch closed this Oct 12, 2021
7 participants