-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Building assembler code in reverse #48
Comments
Hey there. That definitely sounds like an interesting use case. I've thought about it for a bit and with some changes it should be possible to support this mode of assembling as well. One major limitation would be that an assembler could easily support either forwards or backwards assembly, but not both. The problem can be split in two parts I think. First is the reversal of the emitted instruction bytestream. This should be fairly easy as internally the plugin compiles single dynasm invocations to a list of calls to be made to the runtime. By reversing this list of calls at the end the instructions in a single call will be emitted tail to head. A minor complication exists here around emitted relocations, as they might now be emitted before the actual bytes that they modify are emittted. This should be avoided, and either the plugin or runtime needs to correct for them being emitted at a different offset in the instruction. Generalizing the handling of these will be beneficial. The second part would involve an assembler that just works in reverse, emitting starting at the end of a buffer and working backwards. That should be pretty simple to implement. A bit of care will be necessary as all offsets should now be respective to the end of the buffer. I think this will definitely be a useful feature to support. I'll have to look at the code myself for a bit to draft up a good design for the generalisation of relocations that will be necessary for the fixup to happen at the last stage of assembly, but outside that I don't see any immediate blockers. |
If I understand you correctly, then you are suggesting to reverse both the order of the assembler instructions as well as the bytecode they emit. For example, if I have dynasm!{
mov rax, 0x1
mov dl, 0x2
} this currently (in normal mode) becomes ops.extend(b"H\xb8");
ops.push_i64(0x1);
ops.extend(b"1\xd2\xb2");
ops.push_i8(0x2); But in reverse mode this would instead turn into ops.push_i8(0x2);
ops.extend(b"\xb2\xd21");
ops.push_i64(0x1);
ops.extend(b"\xb8H"); So not only is the order of the pushes/extends to This is actually a neat idea, and not something I had thought of. But it does look quite involved (from an implementation perspective). What I initially had in mind was to just store the result of each dynasm!{
mov rax, 0x1
}
dynasm!{
mov dl, 0x2
} This would become something like // First dynasm!
ops.new_vec(); // Create a new subvector
ops.extend(b"H\xb8"); // extend the new subvector
ops.push_i64(0x1);
// Second dynasm!
ops.new_vec();
ops.extend(b"1\xd2\xb2");
ops.push_i8(0x2); So at the end you have a |
I don't think it would be necessary to reverse the order of the bytecode, just reversing the order of the calls would be enough. In the end the bytes should still arrive in the executable buffer in the same order, its just filled in chunks from the end to the start. However, that is not the main issue. If it was just that the solution would be simple. The main thing throwing a wrench of the works is the handling of relocations. Jumps, rip-relative offsets etcetera. These need to be recorded and are often only resolved and fixed up at a later point, the point at which the actual label location is defined. Furthermore, to prevent unnecessary fixups later on it is not a good idea to emit invocations to separate buffers and reorder. Ideally you want to emit to a single buffer and grow it or the direction of emission. If you do not need relocation support you can currently already just emit to a VecAssembler per call and reorder it yourself but that will basically mean no control flow between invocations. |
Without going too deep into the details: we are compiling traces (straight-line paths through our own IR). However, our code still has control flow, as we need to jump to cleanup code when certain assumptions don't hold true (when a guard fails). This cleanup code is going to need to be tailored to each trace, and thus cannot be compiled ahead of time. I think that means that we will need to compile the cleanup code at runtime and have labels for the guard failure code. I'm guessing that means we can't use the |
I'm not sure, sorry. |
I noticed that there is a |
Hi again, I've been reviewing this thread because we will soon need to write a register allocator for our tracing JIT and this would be simplified if we could visit the IR instructions of an IR trace in reverse order. To be clear about our requirements, we would still want to emit the individual asm instructions for any given IR instruction in forwards order, but we'd want to process the list of IR instructions to codegen in reverse order. A potentially simpler solution could be to make the backing vector for an Would that work? |
For our experimental JIT compiler, which @vext01 mentioned in #47, we want to compile the instructions of a trace backwards as that seems to be what other JITs do. One of the reasons being that it simplifies register allocation (see [1]).
Is that something that would be possible using dynasm-rs? I've seen that there's a function
alter
which allows to patch the buffer, but I don't think that allows us to reverse the compiled instructions. One way I can imagine doing this would be to cache the result of eachdynasm!
macro call. Then, when we are done, we simply need to reverse the order of the cached results, before making them executable. I assume that this is not currently possible, but was wondering if there another way that dynasm-rs could be used to achieve this? If not, would you be happy to accept a change that makes this possible?[1] https://dl.acm.org/doi/pdf/10.1145/3132190.3132209
The text was updated successfully, but these errors were encountered: