Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x64 new backend: port ABI implementation to shared infrastructure with AArch64. #2142

Merged
merged 1 commit into from
Sep 9, 2020

Conversation

cfallin
Copy link
Member

@cfallin cfallin commented Aug 18, 2020

Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

@cfallin cfallin requested a review from bnjbvr August 18, 2020 23:01
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen labels Aug 18, 2020
@github-actions
Copy link

Subscribe to Label Action

cc @bnjbvr

This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64"

Thus the following users have been cc'd because of the following labels:

  • bnjbvr: cranelift

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@cfallin cfallin force-pushed the machinst-abi-x64 branch 2 times, most recently from 3c6f7a5 to 347b781 Compare August 20, 2020 00:53
Copy link
Member

@bnjbvr bnjbvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really hard to review from a single look without reviewing the rest of the refactoring, so @julian-seward1 would probably be a better reviewer here. Most of my remarks are mostly about the traits that happened in the previous refactoring, so feel free to defer addressing those in a later refactoring. As long as you've tested it with Spidermonkey and it passes all the tests, I'm fine with it.

cranelift/codegen/src/isa/x64/abi.rs Outdated Show resolved Hide resolved
Inst::EpiloguePlaceholder
}

fn gen_add_imm(into_reg: Writable<Reg>, from_reg: Reg, imm: u64) -> SmallVec<[Self::I; 4]> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't there when the trait was designed, but: would it make sense to pass a context, or the consumer of these insts directly here? (I remember some cases where it wasn't possible to do so, because it was used in both the lowering and code emission contexts, so maybe that's irrelevant.)
This would avoid the smallvec allocation entirely here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this, but (as you suspected) this is called in contexts where we have no LowerCtx or other direct consumer of callbacks. We could statically monomorphize on a closure argument that receives instructions, but that seems unnecessarily complex and would duplicate code in the binary; I made sure to size the SmallVecs so they should fit x64 and aarch64 cases in their inline storage, so the cost is (just) copying a few Insts instead. Happy to go the other way though if you'd prefer!

cranelift/codegen/src/isa/x64/abi.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/x64/abi.rs Outdated Show resolved Hide resolved
fn get_fixed_tmp_reg() -> Reg {
// Use a caller-save register for this. Note that we need not exclude it
// from regalloc on x64 because `gen_add_imm()` above never clobbers a
// scratch register. Thus the sequence ends up being: gen stack limit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just reading this doesn't help understanding why it's fine to do so, which makes me think the trait functions should be redesigned so the temp reg is also returned from gen_add_imm, or something like this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out; yeah, it is a bit convoluted. I've updated the documentation on the trait definition to clearly describe the requirements placed on the machine backend's implementation and register choices, and I've renamed get_fixed_tmp_reg() to get_stacklimit_reg() to make its purpose more explicit.

I thought for a bit if there might be a better way to do this, but I haven't managed to find one, unless we push the whole "compute this GlobalValue using only caller-saves" logic into each machine backend. gen_add_imm() could return info about what it clobbers, but that still doesn't help the machine-independent implementation choose another register on its own. Really the machine-backend author has to choose fixed scratch registers and a corresponding add-with-large-immediate lowering (on RISCs at least) and just provide them to the machine-independent part.

On x64, anyway, we can always have a full 32-bit immediate on an add, so the more tricky requirements are moot; all we need is some arbitrary caller-save register here. Hopefully simple enough but I'm open to other ideas :-)

cranelift/codegen/src/isa/x64/abi.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/x64/abi.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/x64/abi.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/x64/abi.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/x64/abi.rs Outdated Show resolved Hide resolved
@cfallin cfallin force-pushed the machinst-abi-x64 branch 2 times, most recently from d94f180 to 5f71a03 Compare August 20, 2020 22:35
@cfallin
Copy link
Member Author

cfallin commented Aug 20, 2020

Updated -- thanks! This has been verified to work with SpiderMonkey as well.

@bnjbvr bnjbvr self-requested a review August 21, 2020 14:47
@bnjbvr
Copy link
Member

bnjbvr commented Aug 21, 2020

(I'm happy to give it a look on another day, unless Julian beats me to it; I really need to read the rest of the refactoring anyways :-))

Copy link
Member

@bnjbvr bnjbvr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a great cleanup, thanks a lot! (And sorry about the very long review time.)

It's likely we could even share more of the compute_args_loc logic by having some kind of architecture agnostic CallingConvention ABI, and then each architecture could have its own SystemVCallingConvention / BaldrdashCallingConvention impl, but let's think about this later :-)

cranelift/codegen/src/machinst/abi_impl.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/abi.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/abi.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/x64/abi.rs Show resolved Hide resolved
cranelift/codegen/src/isa/x64/abi.rs Outdated Show resolved Hide resolved
…h AArch64.

Previously, in bytecodealliance#2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

This also changes some register allocations in the aarch64 code because
call support in the shared ABI infra now passes a temp vreg in, rather
than requiring use of a fixed, non-allocable temp; tests have been
updated, and the runtime behavior is unchanged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants