Implement fcvt_to_uint_sat (f32x4 -> i32x4) for x86 #1990

abrown · 2020-07-07T17:40:34Z

This replaces #1822. It provides the same functionality but removes the AVX512 instruction lowering for the time being, for two reasons:

- The default MXCSR rounding mode is round-to-nearest-even, which does not match the truncating (round-toward-zero) semantics required by `i32x4.trunc_sat_f32x4_u`. We could use embedded rounding control instead, but we would then lose the ability to specify the vector length, so the instruction would operate on 512 bits. That needs discussion (@sunfishcode has reported issues with 512-bit vectors in SpiderMonkey).
- The output of `VCVTPS2UDQ` for negative lanes is `0xFFFFFFFF` (I had thought it would be `0x00000000`). This can be resolved with the following sequence: `v0 = pxor ...; v2 = fcmp gte v1, v0` (`gte` ensures they are ordered); `v3 = vcvtps2udq v1; v4 = band v2, v3`. However, I would like to look at this a little more before submitting a separate PR for it (this is the reason for keeping the legalization in `enc_tables.rs` and under `narrow_avx`, BTW).
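To make the second point concrete, here is a minimal scalar sketch in Rust of the mask-and-convert sequence described above. It is not the Cranelift lowering and not a bit-exact emulation of the hardware. `model_vcvtps2udq_lane` is a hypothetical, simplified model that assumes a truncating conversion returning `0xFFFFFFFF` for any lane that cannot be represented (negative beyond -1, too large, or NaN), in line with the observation above. The function names are illustrative only. Rust's saturating `as` cast serves as the reference for the Wasm semantics.

```rust
/// Reference semantics for one lane of Wasm's `i32x4.trunc_sat_f32x4_u`:
/// truncate toward zero, clamp to [0, u32::MAX], and map NaN to 0.
/// Rust's `as` cast from f32 to u32 has exactly these saturating semantics.
fn trunc_sat_u_lane(x: f32) -> u32 {
    x as u32
}

/// ASSUMPTION: simplified model of one lane of a truncating VCVTPS2UDQ.
/// Lanes that cannot be represented as u32 yield 0xFFFFFFFF.
fn model_vcvtps2udq_lane(x: f32) -> u32 {
    if x.is_nan() || x <= -1.0 || x >= 4_294_967_296.0 {
        0xFFFF_FFFF
    } else {
        x.trunc() as u32
    }
}

/// Scalar rendering of the sequence from the description:
/// v0 = pxor ...; v2 = fcmp gte v1, v0; v3 = vcvtps2udq v1; v4 = band v2, v3
fn masked_convert(v1: [f32; 4]) -> [u32; 4] {
    let v0 = [0.0f32; 4]; // all-zero vector, as produced by pxor
    let mut v4 = [0u32; 4];
    for i in 0..4 {
        // Ordered >= comparison: false for negative lanes and for NaN,
        // so those lanes get an all-zero mask.
        let v2 = if v1[i] >= v0[i] { u32::MAX } else { 0 };
        let v3 = model_vcvtps2udq_lane(v1[i]);
        v4[i] = v2 & v3; // band clears the lanes that must saturate to 0
    }
    v4
}
```

Under this model, `masked_convert([2.7, -3.0, f32::NAN, 5.0e9])` agrees lane by lane with `trunc_sat_u_lane`: the negative and NaN lanes are masked to 0, while the too-large lane keeps the all-ones value, which is also `u32::MAX`.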

github-actions · 2020-07-07T18:01:14Z

Subscribe to Label Action

cc @bnjbvr

This issue or pull request has been labeled: "cranelift", "cranelift:meta", "cranelift:wasm"

Thus the following users have been cc'd because of the following labels:

bnjbvr: cranelift

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

julian-seward1

Ok to land, but please remove the redundant mention of AVX512 in the commit message:
"This converts an f32x4 into an i32x4 (unsigned) with some rounding either by using an AVX512VL/F instruction--VCVTPS2UDQ--or a long sequence of SSE4.1 compatible instructions."

Thanks for your patience with this!

abrown mentioned this pull request Jul 7, 2020

Implement fcvt_to_uint_sat (f32x4 -> i32x4) for x86 #1822

Closed

abrown requested a review from julian-seward1 July 7, 2020 17:43

github-actions bot added the cranelift, cranelift:meta, and cranelift:wasm labels Jul 7, 2020

abrown force-pushed the trunc-sat-unsigned-again branch from 5196d47 to 00a7db1 July 8, 2020 16:02

abrown mentioned this pull request Jul 8, 2020

Implement SIMD widening instructions for x86 #1994

Merged

julian-seward1 approved these changes Jul 8, 2020

abrown added 3 commits July 8, 2020 09:18

Add x86 legalization for fcvt_to_uint_sat.i32x4

0450611

This converts an `f32x4` into an `i32x4` (unsigned) with rounding by using a long sequence of SSE4.1 compatible instructions.

Translate Wasm's i32x4.trunc_sat_f32x4_u to Cranelift's fcvt_to_uint_sat.i32x4

9287e00

Enable more SIMD spec tests

ec04966

abrown force-pushed the trunc-sat-unsigned-again branch from 00a7db1 to ec04966 July 8, 2020 16:19

abrown merged commit 5c35a96 into bytecodealliance:main Jul 8, 2020

abrown deleted the trunc-sat-unsigned-again branch July 8, 2020 17:20
