Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add {u,s}{add,sub,mul}_overflow instructions #5784

Merged
merged 9 commits into from
Apr 11, 2023

Conversation

T0b1-iOS
Copy link
Contributor

Currently, there is not really a way to efficiently detect arithmetic over/underflow which some ISAs (in particular x64) allow and can be important. For example, I'm currently trialing Cranelift as a backend in a database system which needs efficient overflow handling.
There also seems to be more general interest in overflow detection (see #1044).

This PR adds instructions for unsigned and signed add/sub/mul which return a second output indicating overflow.
add/sub support 8 through 128 bit integers while the mul variants only support up to 64 bit integers. (That's because the most efficient implementation for umul_overflow with 128bit integers is along the lines of 50 instructions on x64 at which point it is probably better to emit a function call).
I left iadd_cout and friends untouched as the semantics are a bit confusing right now, e.g. iadd_cout seems to be indicating signed overflow while the pseudocode in the description describes unsigned overflow).

In detail, this PR adds:

  • {u,s}{add,sub,mul}_overflow with an implementation in the interpreter (I hope the naming convention is ok)
  • runtests for these instructions
  • support for 8/16 bit operands for AluRmiR/AluRM instructions in the x64 backend
  • support for the usigned mul instruction on x64
  • support for UMAddL/SMAddL instructions in the aarch64 backend
  • lowerings of {u,s}{add,sub,mul}_overflow for x64 and aarch64

Currently, there are no emit tests since they were not important for the initial testing and I would like to know whether there is interest in these instructions being included first.
Additionally, the 128bit lowerings currently forego the flag helpers since they are, as far as I can tell, not really designed to produce three outputs so getting both the low/high part of an addition and the dst reg for a setcc seemed a bit complicated.
I also do not have a lot of knowledge about RV64/S390X so I don't think I can provide lowerings there.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. labels Feb 15, 2023
@cfallin
Copy link
Member

cfallin commented Feb 16, 2023

FWIW, I think that these instructions are a reasonable addition -- thanks for proposing and prototyping this!

I think, as you suggest as well, it's worth clarifying how this interacts with the existing operators; in particular

I left iadd_cout and friends untouched as the semantics are a bit confusing right now, e.g. iadd_cout seems to be indicating signed overflow while the pseudocode in the description describes unsigned overflow).

does indeed seem to be true (e.g. here for x86-64), but is pretty surprising to me, and IMHO is incorrect. It doesn't appear that iadd_cout is the target of any rewrite in cranelift-codegen proper, and is not used by cranelift-wasm; if at least cg_clif can adapt to these operators instead (cc @bjorn3 ?) then I like the explicitness of these opcode names much better and we should remove iadd_cout.

If others agree, I'm happy to do a full review here!

@bjorn3
Copy link
Contributor

bjorn3 commented Feb 16, 2023

cg_clif currently has all overflow checks manually due to iadd_cout not working in combinations of target archs and integer sizes. I did be more than happy to switch to the _overflow instructions added by this PR.

@T0b1-iOS T0b1-iOS force-pushed the arith_of branch 4 times, most recently from 3e4c8bb to 23f75de Compare March 25, 2023 21:12
@T0b1-iOS
Copy link
Contributor Author

Okay, sorry for the long wait but university got a bit busy the past month. I did the emit tests now (and fixed a few bugs along the way) and I think this is now in a somewhat cleaned up stage and could be reviewed.
The CI currently fails because of the s390x tests failing which is because QEMU incorrectly emulates the overflow flag (see https://gitlab.com/qemu-project/qemu/-/issues/618). This was fixed in QEMU 7.0.0 so the version in the CI would need to be updated but I don't know if I can do that.

@T0b1-iOS T0b1-iOS marked this pull request as ready for review March 25, 2023 22:32
@bjorn3
Copy link
Contributor

bjorn3 commented Mar 27, 2023

Tried this with cg_clif. Didn't find any failing tests. https://github.com/bjorn3/rustc_codegen_cranelift/tree/clif_overflow_insts

Copy link
Contributor

@afonso360 afonso360 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interpreter implementation LGTM! Thanks!

I'm running the fuzzer now, I'll report back if it finds anything.

One small note, I've had to disable the unimplemented ops in the fuzzer can you include that commit with these changes? Otherwise it crashes because it tries to generate those instructions when the architectures don't have lowerings for them.

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made an initial pass over this -- thank you very much for a thorough implementation job on both x64 and aarch64!

I left a bunch of comments that are "nits", but also a few issues I think we should resolve before merging -- the with_flags fix is the most important one I think. I also notice you basically added 8/16-bit support to ALU ops generally in x64. It might be useful to split that out as a separate PR so we can review it (and so others can take a look more easily -- @abrown maybe in particular). Overall though, the shape of this looks right and it's close to mergeable -- thanks again!

cranelift/codegen/src/isa/aarch64/inst/mod.rs Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/inst.isle Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/lower.isle Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/lower.isle Outdated Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/lower.isle Show resolved Hide resolved
cranelift/codegen/src/isa/aarch64/lower.isle Show resolved Hide resolved
cranelift/codegen/src/isa/x64/inst.isle Show resolved Hide resolved
cranelift/codegen/src/isa/x64/lower.isle Outdated Show resolved Hide resolved
cranelift/filetests/filetests/runtests/sadd_overflow.clif Outdated Show resolved Hide resolved
@T0b1-iOS T0b1-iOS requested a review from a team as a code owner April 3, 2023 20:06
@T0b1-iOS T0b1-iOS requested review from elliottt and removed request for a team April 3, 2023 20:06
@github-actions github-actions bot added the cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. label Apr 3, 2023
@elliottt elliottt removed their request for review April 3, 2023 22:18
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good now -- thanks so much for your patience as we iterated on the implementation!

I think once the merge conflict in the interpreter is resolved, this should be good to merge.

@T0b1-iOS
Copy link
Contributor Author

This looks good now -- thanks so much for your patience as we iterated on the implementation!

No problem, I was the slow one, after all ^^

I think once the merge conflict in the interpreter is resolved, this should be good to merge.

I'll rebase the branch if that's okay.

@cfallin cfallin enabled auto-merge April 11, 2023 20:15
@cfallin cfallin added this pull request to the merge queue Apr 11, 2023
Merged via the queue into bytecodealliance:main with commit 569089e Apr 11, 2023
brendandburns pushed a commit to brendandburns/wasmtime that referenced this pull request Apr 13, 2023
* add `{u,s}{add,sub,mul}_overflow` with interpreter

* add `{u,s}{add,sub,mul}_overflow` for x64

* add `{u,s}{add,sub,mul}_overflow` for aarch64

* 128bit filetests for `{u,s}{add,sub,mul}_overflow`

* `{u,s}{add,sub,mul}_overflow` emit tests for x64

* `{u,s}{add,sub,mul}_overflow` emit tests for aarch64

* Initial review changes

* add `with_flags_extended` helper

* add `with_flags_chained` helper
eduardomourar pushed a commit to eduardomourar/wasmtime that referenced this pull request Apr 16, 2023
* add `{u,s}{add,sub,mul}_overflow` with interpreter

* add `{u,s}{add,sub,mul}_overflow` for x64

* add `{u,s}{add,sub,mul}_overflow` for aarch64

* 128bit filetests for `{u,s}{add,sub,mul}_overflow`

* `{u,s}{add,sub,mul}_overflow` emit tests for x64

* `{u,s}{add,sub,mul}_overflow` emit tests for aarch64

* Initial review changes

* add `with_flags_extended` helper

* add `with_flags_chained` helper
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants