-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Create LLVM scope class for use with LLVM libraries #83
Conversation
rfcs/0080-llvm-target.md
Outdated
// Let's see the LLVM IR and MIR after each transformation, i.e. use | ||
// -print-after-all in codegen. | ||
my_target = Target("llvm -mtriple myarch-unknown-elf -llvm-options=print-after-all"); | ||
LLVMTarget llvm_target(my_target); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @kparzysz-quic . I wonder if there could be potential mis-use if a user created multiple LLVMTarget, since target is more like an config, rather than ensuring a scoping info.
A different name might help(e.g. something that implies scope), alternatively, we can also aligns to our current With convention, which clearly indicate scoping
With<LLVMTarget> llvm_scope(llvm_target);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think renaming LLVMTarget
to LLVMScope
is probably technically a better approach at the moment, although personally I like With
a bit better (since it resembles the Python idiom, both visually and functionally).
The problem is when we want to initialize the LLVM scope by deserializing llvm::Module
(either from file, or from a string). Since the target string is stored in the module, the llvm::Module
needs to be created first (which means that it would be created before the scope object itself). To hide the gap between creating llvm::Module
, and creating the scope object, the LLVMTarget
(using old naming) implements both steps in a single function call. This call then returns both, the llvm::Module
and the scope object.
The issue with With
in the above situation is that it doesn't apply very nicely to this case. The return type (currently named ModuleData
(for a lack of a better idea), is
std::pair<std::unique_ptr<llvm::Module>, std::unique_ptr<LLVMTarget>>
With With
, it would become
std::pair<std::unique_ptr<llvm::Module>, std::unique_ptr<With<LLVMTarget>>>
which separates the "with" from its scope. This is not formally wrong, but loses the visual impact of "with".
Clarify that this RFC proposes a scope object that can be used later to save/restore LLVM's options. The saving and restoring itself is not a part of the RFC, since the `llvm` target in TVM does not yet allow passing LLVM command line flags.
e71d114
to
5e7398e
Compare
There was another passage about it that was ommitted from previous commits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
broadly LGTM, but let's let TQ/Junru and any others comment on this/approve as well. unifying the interface between TVM Target and LLVM options seems like a net win for code complexity. my one question, which i think is mainly open-ended now, is how we handle LLVM arch that have unique flags that, right now, would show up in TVM as target-specific llvm flags.
I'd have to see a practical example of that to get an idea of what's expected. For Hexagon we'd want to automatically append certain flags (that enable auto-vectorization in LLVM, for example). I had one idea for that, but it may need a revision. |
@tqchen @junrushao1994 Do you have any additional comments? |
Thanks @kparzysz-quic . Sorry for the delayed reply since we are taking break here. Overall I like the direction we are going. Just to figure out the spectrum of possible APIs My main question is how are we going to interact with multiple ParseIR calls. Some example would be helpful. For example, is it OK for us to have nested LLVMScope void Example() {
LLVMScope scope1(target);
{
// what is the effect here, seems mod_data1 is immediate in scope
auto mod_data1 = LLVMScope::ParseIR(name);
}
} I also wonder if there is a way to "defer" the scope initialization. e.g. can LLVMModule be created, stored in the LLVMScope, but the target does not take in-effect until we enter the scope. We call InitializeLLVM in the constructor of LLVMTarget, but do other things like option setting in the enter stage.Something like void Example() {
LLVMTarget target1(target);
auto mod_data = LLVMTarget::ParseIR(name);
// enter target1 scope
With<LLVMTarget> scope1(target1);
{
// entering target in mod_data
With<LLVMTarget> scope2(mod_data);
}
} I think the main question is how coupled the operations related to LLVM are. |
Both You bring up a good point about when the scope "takes effect", or binds to LLVM. I propose that the scope becomes active when a new Side note: Ideally, prior to the activation of the scope, none of LLVM code or data types would be exposed to the user, but this RFC doesn't try to go that far. The main concern here is a platform on which saving/restoring LLVM flags could be done. I also suggest to create two classes: Something like class LLVMTarget {
public:
LLVMTarget(const LLVMTarget&); // copy ctor, etc.
private:
friend class LLVMScope;
LLVMTarget(ctx, values extracted from tvm::Target);
std::shared_ptr<llvm::LLVMContext> ctx_; // set via the private ctor, passed from LLVMContext
llvm::TargetMachine *tm;
...
};
class LLVMScope {
public:
LLVMScope(const Target& target) {
// Parse the target into individual attributes, e.g. mtriple, etc. This would allow users
// to create LLVMScope from Target and modify it before creating LLVMTarget.
//
// If the target has non-empty list of LLVM flags, it would be considered "modifying"
// the global state, otherwise it would be "read-only". This would allow creating
// multiple "read-only" concurrent scopes, but only one "modifying" one.
}
void activate() {
ctx_ = std::shared_ptr<llvm::LLVMContext>(CreateNewContext(is_read_only));
}
static std::pair<llvm::Module, LLVMScope> ParseIR(std::string ir) { // same for LoadIR
auto ctx = std::shared_ptr<llvm::LLVMContext>(CreateNewContext(false /*assume modifying*/));
llvm::Module m = llvm::parse_string_etc(ctx, ...)
LLVMScope scp(ctx, get_target_string(m)); // create an already active scope
return {m, scp};
}
LLVMTarget getTargetInfo() {
ICHECK(ctx_ != nullptr) << "must activate first";
...
}
private:
static llvm::LLVMContext* CreateNewContext(bool is_read_only) {
// Create context with appropriate locking
}
LLVMScope(std::shared_ptr<llvm::LLVMContext> ctx, const Target& target);
std::shared_ptr<llvm::LLVMContext> ctx_;
}; |
To address the second question: we shouldn't allow nesting of LLVM scopes. The reason is that if the target doesn't specify any command line flags, the assumption is that it will rely on LLVM's defaults. If we were to nest scopes, and the outer scope did modify flags, then the inner scope (without flags) would not have a way of knowing that the global state has been altered. |
Just to followup, why does ParseIR require activation of a scope, or is it possible that ParseIR returns a tuple of Target and Module, where activation is done separately. I am asking this mainly to see if we can get an explicit With style API |
The reason is that Edit: I like the explicit |
I get this part, my main question is whether we can initializeLLVM in the Target constructor but do option resetting in enter/exit, i could miss something here. |
I see what you mean. My concern is that since command line options can be used almost everywhere in LLVM (not just optimizations or code generation), we should try to do the saving as early as possible (and restoring as late as possible). Ideally all the LLVM calls in the program would be enclosed in the save/restore functions (even the bitcode reader registers a couple of command line flags). At the same time we could make a deliberate decision to only apply the save/restore to a part of LLVM functionality, i.e. allow some kinds of LLVM function calls to happen outside of the save/restore scope. Let me know what you think. |
Logically, there are two kinds of operations involved:
We are certainly in agreement that all K1 should be enclosed with option save/recovery scope. Parse/Save seems to belong to K0. From a logic reasoning pov, operations in K0 should not be impacted by cli opt as a result my comment. But I think @kparzysz-quic what you mean is that cl opt might breach into K0 as well. If that is really the case, it would indeed be helpful to take the setting/recover at the load/store level. Would be great to reason and confirm a bit. Since the ability to de-couple K0 and K1 is nice from the overall code structuring pov |
It does, yes: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L91-L99 The flags above come from the latest development branch, so in order to handle all LLVM versions we'd need to check all of them since 4.0 (which I'm ok doing), and we will need to keep this code up to date with the latest developments in LLVM. On the other hand, these aren't options that are very likely to be used by TVM users, and the issue with command line flags only really applies to deserialization (since serialization would happen after the flags had been saved). I don't think it's unreasonable to state that we only handle (K1), but we'd need to ensure that (K1) also included creation of any data structures that assist in modifying or in analysis of llvm::Module. |
In that case, I think it would be nice to state we handle K1, and leave ONLY serialization/deserialization out. This way we can have With style API, which indicate that we are explicit entering the scope for any processing. auto data = LLVMTarget::ParseIR(file_name);
With<LLVMTarget> scope(mod_data);
`` |
rfcs/0080-llvm-target.md
Outdated
LLVMScope(const Target& target); | ||
~LLVMScope(); | ||
|
||
std::pair<llvm::Module, LLVMScope> LoadIR(const std::string& file_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the semantics of this function? Does it create a new scope(besides the current one?) A code example that demonstrates LoadIR would be helpful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments with descriptions.
I updated the text of the RFC, the rendered link above, and the prototype draft PR. |
Should we still wait for review from Junru? |
I think we can go ahead and merge |
This implements RFC 80. See apache/tvm-rfcs#83. Summary of changes: - Created an `LLVMInstance` class. Uses of LLVM functions and data struc- tures should be contained within the lifetime of an object of this class. LLVMInstance object contains LLVMContext, and implements member functions to deserialize an llvm::Module. - Created an `LLVMTarget` class. Once an LLVMInstance object has been created, an object of LLVMTarget class can be created from TVM target string, or Target object for "llvm" target. Once LLVM command line flags are added to the "llvm" target, one of the goals of this object will be to save/restore relevant LLVM global state. Another objective for the LLVMTarget object is to be a single location for all LLVM-related compilation structures and options (such as TargetMachine, FastMathFlags, etc.)
…2140) This implements RFC 80. See apache/tvm-rfcs#83. Summary of changes: - Created an `LLVMInstance` class. Uses of LLVM functions and data struc- tures should be contained within the lifetime of an object of this class. LLVMInstance object contains LLVMContext, and implements member functions to deserialize an llvm::Module. - Created an `LLVMTarget` class. Once an LLVMInstance object has been created, an object of LLVMTarget class can be created from TVM target string, or Target object for "llvm" target. Once LLVM command line flags are added to the "llvm" target, one of the goals of this object will be to save/restore relevant LLVM global state. Another objective for the LLVMTarget object is to be a single location for all LLVM-related compilation structures and options (such as TargetMachine, FastMathFlags, etc.)
Follow-up to https://discuss.tvm.apache.org/t/modularizing-llvm-codegen-jit/12764