-
Notifications
You must be signed in to change notification settings - Fork 201
Conversation
4bc31cf
to
6e15d79
Compare
Using |
6e15d79
to
022bc40
Compare
022bc40
to
2024f3b
Compare
It'd be useful to generate the |
2024f3b
to
17251bd
Compare
f1d8669
to
3fb342b
Compare
I've implemented write support for .eh_frame in gimli (gimli-rs/gimli#419). I expect that it should require no changes to this PR, and only a minor change to the wasmtime PR, but I haven't tested. |
3fb342b
to
1c185ff
Compare
c61a15b
to
49e5793
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few high-level questions below, to clarify the purpose of this PR. Thanks!
cranelift-codegen/src/isa/x86/abi.rs
Outdated
let push_fp_inst = pos.ins().x86_push(fp); | ||
let word_size = word_size as isize; | ||
cfa_state.position -= word_size; | ||
cfa_state.offset += word_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure no compiler pass won't add more instructions in between? (in particular thinking about register allocation which could insert copies/fills, or constant pools being generated, whenever we have them?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the prologue_epilogue
runs after regalloc
but before shrink_instructions
, relax_branches
-- I hope the latter does not insert anything to change stack layout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. There should be @julian-seward1 's late stage optimizations happening after that (and they can remove instructions), so we should be careful to put this somewhere the data flow graph can't change (i.e. at the very end).
cranelift-codegen/src/isa/x86/abi.rs
Outdated
pos.func.frame_layout.instructions.insert( | ||
reg_push_inst, | ||
vec![FrameLayoutChange::RegAt { | ||
reg: reg, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we interested in the position of every register? (I guess so, if I understand correctly this is for tracking where local variables are.) If so, should every instruction in the function's body also have its own frame layout changes (in case regalloc moves a variable from one register to another)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this PR (and .debug_frame) we are interested in basics: proper frame unwinding (i.e. BP, SP, IP) and maybe registers used for args. (The tracking of original variables values is happening somewhere else).
49e5793
to
10101b6
Compare
Deferring to @sunfishcode, since my understanding is that this is mostly for wasmtime. |
This is useful for cg_clif for debuginfo and unwinding too. |
I think this patch looks good overall, though @iximeow's comment looks worth addressing. |
Just a note: the above merged PR adds frame changes to track the last few instructions as we exit functions on x86. I'd forgotten when commenting but we can have multiple returns, so it's a little more complex than I'd initially thought. I'm not sure how to get multi-return sys-v functions out of Cranelift to test, but to my understanding of the spec (and a comparison against gcc) it looks correct. So much as my word matters here: these changes look ✔️ |
.and_modify(|insts| { | ||
*insts = insts | ||
.into_iter() | ||
.map(|x| *x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.cloned()
@iximeow I am not a member of |
oh whoops, I'd thought you were. Sorry about that! |
Hi! Just as a preamble to the review, can you make sure this commit doesn't introduce merge commits please? (Rebasing should work just fine.) |
@yurydelendik Hey, can you help with this rebase, please? (Once that's done, I'm happy to review this.) |
e543dc3
to
80977a6
Compare
rebased all commits |
80977a6
to
f6582f0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions below, and a few questions about logic too. It should be pretty much ready to land after this. (I think we'll eventually squash commits on merge in this PR, since each commit tends to modify the behavior of its predecessors.)
#[cfg(not(feature = "std"))] | ||
use crate::HashMap; | ||
#[cfg(feature = "std")] | ||
use std::collections::HashMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we don't need to guard against the std feature here and we can just use crate::HashMap;
, right?
/// Change in the frame layout information. | ||
#[derive(Clone, Copy, Debug, Eq, PartialEq, Hash)] | ||
pub enum FrameLayoutChange { | ||
/// Base CFA pointer moved to different register/offset. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if it's obvious from context here, would be nice to explicit CFA in this file at least once.
/// The entire frame layout must be preserved somewhere to be restored at a corresponding | ||
/// `Restore` change. | ||
/// | ||
/// This likely maps to the DWARF call frame instruction `.cfa_remember_state` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you end this sentence with a dot please? (ditto below for restore)
@@ -371,6 +372,29 @@ fn baldrdash_prologue_epilogue(func: &mut ir::Function, isa: &dyn TargetIsa) -> | |||
Ok(()) | |||
} | |||
|
|||
/// CFAState is `cranelift`'s model of the call frame layout at any particular point in a function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to explicit the CFA acronym once in this file too.
cf_ptr_offset: isize, | ||
/// The offset between the start of the call frame and the current stack pointer. This is | ||
/// primarily useful to point to where on the stack preserved registers are, but is maintained | ||
/// through the whole function for consistency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is slightly lying :) It is only preserved in the prologue and epilogue, as far as I can tell, not in the function's body itself. It would be nice to reword it a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We dont have alloca support, so both esp and ebp arent modified between prologue and epilogue, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about regalloc's spills, but maybe regalloc just mandates a specific amount of spill slots, and something else allocates them in advance once and for all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spill slots are a kind of stack slots. I believe they get allocated together with the rest of the stack slots in prologue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And i imagine the emergency spill slot belongs to that group? In this case, current comment is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is a kind of stack slot: https://docs.rs/cranelift-codegen/0.46.1/cranelift_codegen/ir/stackslot/enum.StackSlotKind.html#variant.EmergencySlot
} | ||
} | ||
} | ||
} | ||
|
||
/// Insert an epilogue given a specific `return` instruction. | ||
/// This is used by common calling conventions such as System V. | ||
/// TODO implement and handle _cfa_state more than one epilogue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: outdated comment, can be removed I think?
if stack_size > 0 { | ||
pos.ins().adjust_sp_up_imm(Imm64::new(stack_size)); | ||
} | ||
|
||
// Pop all the callee-saved registers, stepping backward each time to | ||
// preserve the correct order. | ||
let fp_ret = pos.ins().x86_pop(reg_type); | ||
// account for CFA state in the reverse of `insert_common_prologue`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for all the comments here, please capitalize the first letter.
.into_iter() | ||
.map(|x| *x) | ||
.chain(std::iter::once(new_cfa)) | ||
.collect::<Box<[_]>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason we don't use a vec! for the changes? We could then use push
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was based on the fact that FrameLayoutChanges
is a Box<[_]>
and not a Vec<_>
. Using Vec<_>
for FrameLayoutChanges
and then using pop
would be faster. I didn't actually look why there was a Box<[_]>
in the first place when writing that comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we could use a Vec here, or keep the Box. Not sure it's very performance-sensitive since it's tied to debug collection mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure it's very performance-sensitive since it's tied to debug collection mode.
It would be perf sensitive for cg_clif once I add unwinding support based on this.
cfa_state.current_depth += word_size; | ||
cfa_state.cf_ptr_offset -= word_size; | ||
// and now that we're going to overwrite `rbp`, `rsp` is the only way to get to the call frame. | ||
cfa_state.cf_ptr_reg = RU::rsp as RegUnit; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to insert the layout changes after this instruction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurydelendik will you have time to address the comments here or should I do so and PR into your fork again? |
@iximeow at this moment, I lost track on what is going on with this PR. It will take me some time to address the review comments. If you can address the comments, which will be awesome, I recommend to just submit your fork's branch as a new/different PR -- and I'll close this one. |
I'll open a new PR and address the remaining comments there. Thanks! |
Thanks both! I'll close this one since work has been continued in #1204. |
Adds basic frame layout (or unwinding) capabilities to the x86 abi.