cranelift: Implement scalar fma on x86 #4460
Subscribe to Label Action
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "cranelift:module", "isle"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
48c77b7 to ad5e01d
I keep trying to review this but finding too many things I don't understand about Cranelift yet. So I'll ask some questions instead. I think there are three main things happening in this PR:
Does that cover everything you did in this PR? I would find this easier to review without change (1), or if you pass the I'm curious if the target triple is reachable through one of the other fields that's already in the Step (3) seems like it's simple enough that it can't be wrong, but I haven't read up on the details of ISLE yet. The other thing I'd like to know is whether anybody is using the I think @cfallin or @fitzgen will need to review this but I hope my notes can at least help guide that review. |
Thanks for reviewing either way!
I think that comment still applies, we are still creating a new signature every time we lower Losing the I think using
Yeah, I couldn't find it, but would appreciate a double check on that.
Yeah pretty much.
I don't think so, since it is not implemented right now, but at least Thanks! |
ad5e01d to 5f132e2
@cfallin Would it be possible to review this? The Fma op isn't that important (we have a better implementation on #4539), but the libcall mechanism is something that we need to lower other i128 ops and this is sort of blocking that. Changes since @jameysharp 's review:
This generally looks good to me, thanks! I think it will need a rebase after #4571 goes in but it should be minor.
I don't see an actual libcall implementation here -- I guess it will result in a panic if used today in e.g. the Wasmtime embedding (but that's fine since the cranelift-wasm frontend won't use the opcode)? Or is it already implemented in cranelift-jit?
cranelift/codegen/src/ir/libcall.rs
Outdated
    LibCall::ElfTlsGetAddr => unimplemented!(),
}

if call_conv.extends_baldrdash() {
This will become a compile error once my remove-Baldrdash PR merges (it's in-queue now), sorry!
I'll wait for it to be merged and rebase (it looks like it has a CI error).
Sorry, I don't quite understand. We implement a libcall lowering for So this is already reachable from wasmtime (from the And #4453 allows us to test them with runtests. This does cause an Illegal Instruction if no relocations are performed, which is the state of the current runtest suite (and why we don't enable the
I guess I meant specifically for |
We don't implement them. In I don't know how this works in Wasmtime, if we always need to provide those functions or if we also get them from the system. |
OK, that seems fine to me; just want to confirm that I wasn't missing something! It is at least a loud failure if someone tries to use the lowering in their own embedding and finds that the libcall isn't provided/implemented, so this shouldn't be a problem. |
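The "loud failure" described above can be illustrated with a small standalone sketch. It assumes a hypothetical embedder-side symbol table: `DemoLibCall`, `DemoResolver` and `demo_fmaf` are made up for illustration and are not Cranelift or Wasmtime APIs. The point is only that a libcall the embedding does not provide is reported as an error at lookup time, instead of leaving an unpatched call in the generated code.

```rust
use std::collections::HashMap;

// Hypothetical libcall identifiers for this sketch; not Cranelift's `LibCall` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum DemoLibCall {
    FmaF32,
    FmaF64,
}

// Stand-in implementation an embedder might provide for the f32 libcall.
extern "C" fn demo_fmaf(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, c)
}

// Hypothetical symbol table mapping libcalls to function addresses.
struct DemoResolver {
    symbols: HashMap<DemoLibCall, *const u8>,
}

impl DemoResolver {
    fn new() -> Self {
        Self { symbols: HashMap::new() }
    }

    // Record the address the embedder supplies for a libcall.
    fn register(&mut self, call: DemoLibCall, addr: *const u8) {
        self.symbols.insert(call, addr);
    }

    // Look up a libcall; a missing entry is an explicit error, not a silent default.
    fn resolve(&self, call: DemoLibCall) -> Result<*const u8, String> {
        self.symbols
            .get(&call)
            .copied()
            .ok_or_else(|| format!("libcall {:?} is not provided by this embedding", call))
    }
}

fn main() {
    let mut resolver = DemoResolver::new();
    // Only the f32 variant is registered in this sketch.
    resolver.register(DemoLibCall::FmaF32, demo_fmaf as *const u8);

    for call in [DemoLibCall::FmaF32, DemoLibCall::FmaF64] {
        match resolver.resolve(call) {
            Ok(addr) => println!("{:?} resolved to {:p}", call, addr),
            Err(e) => eprintln!("error: {e}"),
        }
    }
}
```

How Cranelift's own backends and Wasmtime actually resolve libcalls is not shown here; the sketch only mirrors the failure mode discussed in the comment.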
5f132e2 to 317844e
x86 does not have dedicated instructions for scalar FMA; lower to a libcall, which seems to be what LLVM does.
317844e to 00264b1
Thanks!
👋 Hey,
This PR implements fma for scalar values. x86 does not have dedicated instructions for scalar fma, so we lower to a libcall, which matches LLVM's behaviour. I tried to implement it for SIMD as well but got a bit lost; I'll try again later.
We can't enable fma.clif runtests until we merge #4453.
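For readers unfamiliar with why a fused operation cannot simply be lowered to a multiply followed by an add, here is a minimal Rust sketch of the semantic difference. It uses only the standard library's `f32::mul_add`, not Cranelift; the helper names `fused` and `unfused` are made up for illustration, and it assumes a target where `f32` arithmetic is not carried out in extended precision (for example x86-64 with SSE).

```rust
/// Fused multiply-add: computes a * b + c with a single rounding step.
fn fused(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, c)
}

/// Unfused: the product a * b is rounded to f32 before c is added.
fn unfused(a: f32, b: f32, c: f32) -> f32 {
    a * b + c
}

fn main() {
    let eps = f32::EPSILON; // 2^-23
    // (1 + eps) * (1 - eps) = 1 - eps^2, which is not representable in f32.
    let (a, b, c) = (1.0 + eps, 1.0 - eps, -1.0);
    println!("fused   = {:e}", fused(a, b, c));
    println!("unfused = {:e}", unfused(a, b, c));
}
```

With these inputs the unfused version rounds the product to exactly 1.0 and returns 0, while the fused version keeps the -eps^2 term. A lowering that splits the operation would therefore change results, which is one reason to call a routine that implements the fused semantics instead.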