Provide an API to dump the machine code generated by LLVM for a given function. #31
It would be nice to see the final machine code produced by LLVM after JITting a given function.

Comments
LLVM also provides options to dump LLVM IR before/after all or selected passes, and to generate debug output while a particular pass runs. It would be useful if this logging/debugging functionality could be triggered via the API; it would allow a better understanding of the optimizations and transformations LLVM performs.
I would very much like to provide a facility to dump machine code - I just haven't figured out how to do it. I haven't found any documentation on it; if you know of any docs, please point me to them. Dumping IR between passes is also possible, but right now I am using the standard PassManagerBuilder, which means I get the standard Clang -O1/-O2/-O3 passes etc.
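For reference, a minimal sketch of the standard PassManagerBuilder setup mentioned here (LLVM 3.5/3.6-era legacy API, as used elsewhere in this thread; not Ravi's actual code):

```cpp
#include <llvm/PassManager.h>
#include <llvm/Transforms/IPO/PassManagerBuilder.h>

// Populate the stock pipelines corresponding to clang -O2.
// `module` is assumed to be an existing llvm::Module.
static void setupStandardPasses(llvm::Module *module) {
  llvm::PassManagerBuilder pmb;
  pmb.OptLevel = 2;   // 0..3, mirroring -O0..-O3
  pmb.SizeLevel = 0;

  llvm::FunctionPassManager fpm(module);  // per-function passes
  pmb.populateFunctionPassManager(fpm);
  fpm.doInitialization();

  llvm::PassManager mpm;                  // module-level passes
  pmb.populateModulePassManager(mpm);
  // Then: fpm.run(F) for each function, followed by mpm.run(*module).
}
```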
I think you can use LLVMTargetMachineEmitToMemoryBuffer: just specify LLVMAssemblyFile as the file type you want.
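A minimal sketch of that call via the LLVM-C API (an already-created target machine and module are assumed; error handling abbreviated):

```cpp
#include <llvm-c/TargetMachine.h>
#include <cstdio>

// Emit textual assembly for `mod` into a memory buffer and print it.
static void dumpAsm(LLVMTargetMachineRef tm, LLVMModuleRef mod) {
  char *err = nullptr;
  LLVMMemoryBufferRef buf = nullptr;
  if (LLVMTargetMachineEmitToMemoryBuffer(tm, mod, LLVMAssemblyFile,
                                          &err, &buf)) {
    std::fprintf(stderr, "emit failed: %s\n", err);
    LLVMDisposeMessage(err);
    return;
  }
  std::fwrite(LLVMGetBufferStart(buf), 1, LLVMGetBufferSize(buf), stdout);
  LLVMDisposeMemoryBuffer(buf);
}
```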
It can still be useful to see what each pass does. If possible, provide an option for dumping between/during LLVM passes.
An example of how LLVMTargetMachineEmitToMemoryBuffer is used: This could also be related:
Cool - thanks! Will look into implementing this.
Hi - I tried the above approach; unfortunately it does not disassemble, but rather generates machine code from scratch. I haven't checked it in because of that. It seems that to disassemble I need to use a different approach.
You mean it does not take the machine code you already generated, but generates the same machine code again and emits it in disassembled form?
That's right - but it is not the same machine code either; e.g. it doesn't take the optimization options into account.
Well, I understand that this is not ideal, but it is supposed to be used only for debugging, where performance is not so important, right? So even though it is pretty inefficient, it still produces the same code, and in that sense the disassembly is correct, no?
Well, I want to see the actual machine code rather than a regenerated version, as I can't be sure the regenerated version reflects what will be executed - in my view that is not so useful. There is a way to disassemble the actual code, so that would be better.
I can check this in for now, but will probably rewrite it.
Sure. I'm not saying that it should be the final solution - it is only for the time being, until a proper solution for disassembly is found.
BTW, I think doing a real disassembly may turn out to be pretty hard, because it would eventually lose most of the symbolic information. It is probably easier to force LLVM to emit both the machine code and the assembly at the same time from the same input, i.e. the pipeline should contain both native code generation and assembly generation.
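In LLVM terms, that amounts to asking the TargetMachine to append assembly-printing passes to the same pipeline that performs code generation. A hedged sketch, assuming the legacy API of that era and pre-existing `TM`, `PM`, and `module` objects (this is the approach the patch later in the thread implements):

```cpp
#include <llvm/PassManager.h>
#include <llvm/Support/FormattedStream.h>
#include <llvm/Support/raw_ostream.h>
#include <llvm/Target/TargetMachine.h>
#include <string>

// Machine code and assembly are produced from the same input in one run.
static std::string emitAsmAlongsideCodegen(llvm::TargetMachine *TM,
                                           llvm::PassManager &PM,
                                           llvm::Module &module) {
  std::string asmText;
  llvm::raw_string_ostream os(asmText);
  llvm::formatted_raw_ostream fos(os);
  if (TM->addPassesToEmitFile(PM, fos,
                              llvm::TargetMachine::CGFT_AssemblyFile)) {
    llvm::errs() << "unable to add assembly emission passes\n";
    return asmText;
  }
  PM.run(module);  // optimization + codegen + assembly printing in one run
  fos.flush();
  os.flush();
  return asmText;
}
```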
Yes, I need to find out how to do that - I think the other link you posted might work: http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-October/054033.html
I have added two types of dump:
- ravi.compile(f [, b]) - b is an optional boolean which, if set to true, causes LLVM to dump the code generation (very verbose output).
- ravi.dumpllvmasm(f) - dumps the assembly output, but also emits a warning that the generated assembly is not a disassembly of the JIT code.
Thanks a lot! I checked on OS X and it works just fine!
Actually, if possible I'd like to have an option to produce even more debug information. Right now LLVM dumps the IR after each pass, but it does not show the debug output from each pass as it tries to transform the code. I think that can be very useful, because it may provide a hint as to why certain optimizations are not applied (e.g. it could not prove that two pointers do not alias, or it could not hoist a load because something was not allowing it to do so).
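For what it's worth, one way for an embedding application to get at that output is to feed LLVM the same flags that `opt` accepts. A hedged sketch; note that -debug-only only has an effect in debug builds of LLVM, and ParseCommandLineOptions should be called once, early at startup:

```cpp
#include <llvm/Support/CommandLine.h>

// Turn on LLVM's own pass debugging from inside a host application.
static void enablePassDebugOutput() {
  const char *args[] = {
      "host",               // dummy argv[0]
      "-print-after-all",   // dump IR after every pass
      "-debug-only=licm",   // debug trace from one pass (debug builds only)
  };
  llvm::cl::ParseCommandLineOptions(3, args);
}
```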
I have one more question: have you thought about building a CFG from the Lua VM bytecode? I don't mean the LLVM CFG - I mean a high-level Lua VM CFG (control-flow graph). If you had it, you could do certain kinds of analysis that LLVM cannot do, since it operates at a lower level. E.g. you could try to detect dead stores, detect whether a given variable is used in a given basic block or a given Lua VM instruction, perform escape analysis, etc.
Re your question about a CFG from Lua code: maybe at some point. There are several conflicting goals for Ravi: some optimizations will only be possible if I implement a full AST for Lua and change the compiler, but then upstream merges will be very difficult.
I absolutely understand your intention, really. Therefore, my idea was that the CFG is built not from the AST or directly by the parser, but from the bytecode. This way it is completely decoupled from the Lua VM sources and lives in your Ravi-specific source files. Of course, doing so would increase complexity. I ask these questions because I've got the impression that you'll soon run into a wall with the current "context-free" approach, where each Lua opcode is translated into LLVM IR one by one; this makes many optimizations impossible. A "context-sensitive" translation into LLVM IR - i.e. doing some analysis and transformation at the Lua bytecode level - may result in much better performance, at the expense of making the mapping from Lua VM opcodes to LLVM IR more complex.
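To make the CFG suggestion concrete, a purely hypothetical sketch of what a bytecode-level CFG might look like (none of these names exist in Ravi):

```cpp
#include <vector>

// Hypothetical bytecode-level CFG: each block covers a contiguous range of
// Lua VM instructions and records edges to its branch targets.
struct BytecodeBlock {
  int firstPc = 0;                    // index of first instruction
  int lastPc = 0;                     // index of last instruction
  std::vector<BytecodeBlock *> succs; // successor blocks
};

struct BytecodeCFG {
  std::vector<BytecodeBlock> blocks;  // blocks[0] is the entry block
};
```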
I know this feeling, as I also tried to decipher the LuaJIT code ;-) IMHO, LuaJIT's code looks the way it does intentionally - if you ask me, the author wanted it to be not easily understandable by others.
I will implement optimizations that are possible starting from the Lua bytecode. One of the first is eliminating the overhead of updating the fornum "external" index - this can be done by checking whether the variable is written to or captured as an upvalue. Another area is expression evaluation: right now each node sets the type, but the type could actually be set after the entire expression has been evaluated. I think LuaJIT 1.1.8 had some bytecode optimizations (I could be mistaken), so I could lift those.
Sounds like a good plan.
BTW, FWIW, you could try to reuse the LLVM classes for basic blocks, instruction lists, etc. You just subclass them and define your own blocks, instructions, etc. in a form that you like. This would give you things like iteration over all basic blocks and all instructions, computation of dominance information, etc. for free. You could even reuse their pass manager for your own passes working on your own high-level internal representation. The price is that you become more dependent on LLVM - and since I've seen you're also playing with the idea of using gccjit, I don't know how much you want to depend on LLVM.
I looked at your code for assembly dumping. One thing I don't understand: why do you create a new PassManager there instead of reusing the one created in RaviJITFunctionImpl::compile? Reusing it would ensure that the same optimizations are applied to the code before the assembly is produced, no?
To be honest, I don't understand how the assembly generation passes are hooked together and how this fits into the pass managers in compile().
Maybe it is worth asking on the LLVM mailing lists? I think they would be able to explain how to do it pretty quickly.
This is my attempt to make sure that the same pipeline is used to produce the machine code and the disassembly:

```diff
diff --git a/include/ravi_llvmcodegen.h b/include/ravi_llvmcodegen.h
index 0fefbd5..8d31d2f 100644
--- a/include/ravi_llvmcodegen.h
+++ b/include/ravi_llvmcodegen.h
@@ -303,6 +303,12 @@ class RAVI_API RaviJITFunctionImpl : public RaviJITFunction {
// The llvm Function definition
llvm::Function *function_;
+
+ // LLVM Module Pass Manager
+ std::unique_ptr<llvm::PassManager> MPM;
+
+ // LLVM Function Pass Manager
+ std::unique_ptr<llvm::FunctionPassManager> FPM;
// Pointer to compiled function - this is only set when
// the function
diff --git a/src/ravijit.cpp b/src/ravijit.cpp
index 8cd17cb..ecd88ef 100644
--- a/src/ravijit.cpp
+++ b/src/ravijit.cpp
@@ -149,8 +149,6 @@ RaviJITFunctionImpl::RaviJITFunctionImpl(
fprintf(stderr, "Could not create ExecutionEngine: %s\n", errStr.c_str());
return;
}
-}
-
RaviJITFunctionImpl::~RaviJITFunctionImpl() {
// Remove this function from parent
owner_->deleteFunction(name_);
@@ -189,8 +187,6 @@ static void addMemorySanitizerPass(const llvm::PassManagerBuilder &Builder,
}
#endif
-void *RaviJITFunctionImpl::compile(bool doDump) {
-
// We use the PassManagerBuilder to setup optimization
// passes - the PassManagerBuilder allows easy configuration of
// typical C/C++ passes corresponding to O0, O1, O2, and O3 compiler options
@@ -214,8 +210,8 @@ void *RaviJITFunctionImpl::compile(bool doDump) {
#endif
{
// Create a function pass manager for this engine
- std::unique_ptr<llvm::FunctionPassManager> FPM(
- new llvm::FunctionPassManager(module_));
+ FPM = std::unique_ptr<llvm::FunctionPassManager>(
+ new llvm::FunctionPassManager(module_));
// Set up the optimizer pipeline. Start with registering info about how the
// target lays out data structures.
@@ -231,20 +227,22 @@ void *RaviJITFunctionImpl::compile(bool doDump) {
#endif
pmb.populateFunctionPassManager(*FPM);
FPM->doInitialization();
- FPM->run(*function_);
}
{
- std::unique_ptr<llvm::PassManager> MPM(new llvm::PassManager());
+ MPM = std::unique_ptr<llvm::PassManager>(new llvm::PassManager());
#if LLVM_VERSION_MINOR > 5
MPM->add(new llvm::DataLayoutPass());
#else
MPM->add(new llvm::DataLayoutPass(*engine_->getDataLayout()));
#endif
pmb.populateModulePassManager(*MPM);
- MPM->run(*module_);
}
+}
+
+void *RaviJITFunctionImpl::compile(bool doDump) {
+
if (ptr_)
return ptr_;
if (!function_ || !engine_)
@@ -257,6 +255,10 @@ void *RaviJITFunctionImpl::compile(bool doDump) {
TM->Options.PrintMachineCode = 1;
}
+ // Run required passes.
+ FPM->run(*function_);
+ MPM->run(*module_);
+
// Upon creation, MCJIT holds a pointer to the Module object
// that it received from EngineBuilder but it does not immediately
// generate code for this module. Code generation is deferred
@@ -300,13 +302,16 @@ void RaviJITFunctionImpl::dumpAssembly() {
}
if (!ptr_)
module_->setDataLayout(engine_->getDataLayout());
- llvm::legacy::PassManager pass;
- if (TM->addPassesToEmitFile(pass, formatted_stream,
+ //llvm::legacy::PassManager pass;
+ if (TM->addPassesToEmitFile(*MPM.get(), formatted_stream,
llvm::TargetMachine::CGFT_AssemblyFile)) {
llvm::errs() << "unable to add passes for generating assemblyfile\n";
return;
}
- pass.run(*module_);
+ // Run the same passes as during the usual compilation.
+ FPM->run(*function_);
+ MPM->run(*module_);
+ engine_->finalizeObject();
formatted_stream.flush();
llvm::errs() << codestr << "\n";
  llvm::errs()
```
Thanks. I don't think holding the pass managers in the function impl is a good idea, though - I would rather refactor the pass manager calls into a common function that both can call.
Well, in LLVM you usually create them once and then reuse them. Since you generate a new pass manager per module/function, you don't do that yet - but maybe you should. Pass managers are pretty heavyweight, and creating them every time is not such a good idea.
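The reuse pattern being suggested, sketched with the same legacy classes the patch uses (the class and its names are illustrative, not Ravi code):

```cpp
#include <llvm/PassManager.h>
#include <llvm/Transforms/IPO/PassManagerBuilder.h>
#include <memory>

// Build the pipelines once per module, then reuse them for every function
// compiled into that module instead of reconstructing them each time.
class PassPipeline {
  std::unique_ptr<llvm::FunctionPassManager> fpm_;
  std::unique_ptr<llvm::PassManager> mpm_;

public:
  explicit PassPipeline(llvm::Module *module) {
    llvm::PassManagerBuilder pmb;
    pmb.OptLevel = 2;
    fpm_.reset(new llvm::FunctionPassManager(module));
    pmb.populateFunctionPassManager(*fpm_);
    fpm_->doInitialization();
    mpm_.reset(new llvm::PassManager());
    pmb.populateModulePassManager(*mpm_);
  }

  // Reused for every compile request.
  void run(llvm::Function &f, llvm::Module &m) {
    fpm_->run(f);  // per-function optimizations
    mpm_->run(m);  // module-level passes
  }
};
```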
There is not much guidance in the LLVM docs, and the examples I have seen do not do as you suggest. For example: http://cs.swan.ac.uk/~csdavec/FOSDEM12/compiler.cc.html Do you have a doc reference or example you could point me to? That would be very helpful. I do not yet understand how to hook up pass managers properly - I am just going by examples.
I don't have docs at hand. http://llvm.org/docs/WritingAnLLVMPass.html seems to imply that PassManagers are rather expensive to create. The LLVM compiler uses only one global pass manager, AFAIK. And, BTW, you are actually using the wrong PassManager. You include
I have checked in an implementation. |
Resolved |