Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Move the various helper intrinsics to be implemented on the S.R.Intrinsics.Vector types #20147

Merged
merged 4 commits into from
Nov 8, 2018
Merged

Move the various helper intrinsics to be implemented on the S.R.Intrinsics.Vector types #20147

merged 4 commits into from
Nov 8, 2018

Conversation

tannergooding
Copy link
Member

No description provided.

@tannergooding
Copy link
Member Author

FYI. @CarolEidt, @fiigii, @eerhardt, @terrajobst

Still working on the other help intrinsics, but this is an early view (at least for StaticCast).

@fiigii
Copy link

fiigii commented Sep 26, 2018

We probably need generic versions as well.

public Vector128<U> As<U>()

@tannergooding
Copy link
Member Author

We probably need generic versions as well.

I asked about this last Tuesday (25th) and the design review group decided against it. Users wanting to use generic algorithms can write their own extension methods or use Unsafe.As<TFrom, TTo>.

@fiigii
Copy link

fiigii commented Sep 26, 2018

I encourage to have a generic version As<U>, which

  1. makes the intrinsic module more self-contained
  2. users can find their solution easier
  3. easier to guarantee the optimal codegen (I remember Unsafe.As had round-trip codegen on some platforms)

@tannergooding
Copy link
Member Author

I encourage to have a generic version As<T, U>

Feel free to open another issue requesting this. However, given the original design review (which decided we should expose AsInt32, etc) and my confirmation on the 25th (that they do not want a generic overload), I don't think they are likely to reverse the decision right now.

easier to guarantee the optimal codegen (I remember Unsafe.As had round-trip codegen on some platforms)

If there is a code-gen problem with Unsafe.As, it is probably worth logging and getting fixed

@tannergooding
Copy link
Member Author

Right. You have to do:

var zero = Sse.SetZeroVector128();
var x = Unsafe.As<Vector128<float>, Vector128<int>>(ref zero);

and trust that the JIT will optimize it appropriately.

-- I have sent an e-mail and included this on the list of things we should get confirmation on.

@fiigii
Copy link

fiigii commented Sep 26, 2018

@tannergooding Thanks, but the code seems to have really bad codegen (release publish, debug jit, release runtime/corelib)

static void Main(string[] args)
        {
            Vector128<float> v1 = Sse.SetZeroVector128();
            Vector128<int> v = Unsafe.As<Vector128<float>, Vector128<int>>(ref v1);
            Console.WriteLine(Sse2.Add(v, v));
        }
IN0010: 000000 push     rbp
IN0011: 000001 sub      rsp, 64
IN0012: 000005 lea      rbp, [rsp+40H]
IN0013: 00000A xor      rax, rax
IN0014: 00000C mov      qword ptr [V01 rbp-20H], rax
IN0015: 000010 mov      qword ptr [V01+0x8 rbp-18H], rax
IN0016: 000014 mov      gword ptr [V00 rbp-08H], rdi

G_M45811_IG02:        ; offs=000018H, size=004DH, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref

IN0001: 000018 vxorps   xmm0, xmm0, xmm0
IN0002: 00001D vmovapd  xmmword ptr [V01 rbp-20H], xmm0
IN0003: 000023 lea      rdi, bword ptr [V01 rbp-20H]
IN0004: 000027 call     Unsafe:As(byref):byref
IN0005: 00002C vmovupd  xmm0, xmmword ptr [rax]
IN0006: 000031 vmovapd  xmmword ptr [V03 rbp-30H], xmm0
IN0007: 000037 vmovapd  xmm0, xmmword ptr [V03 rbp-30H]
IN0008: 00003D vpaddd   xmm0, xmm0, xmmword ptr [V03 rbp-30H]
IN0009: 000043 vmovapd  xmmword ptr [V04 rbp-40H], xmm0
IN000a: 000049 lea      rsi, bword ptr [V04 rbp-40H]
IN000b: 00004D mov      rdi, 0x11D464F18
IN000c: 000057 call     CORINFO_HELP_BOX
IN000d: 00005C mov      rdi, rax
IN000e: 00005F call     Console:WriteLine(ref)
IN000f: 000064 nop      

G_M45811_IG03:        ; offs=000065H, size=0006H, epilog, nogc, emitadd

IN0017: 000065 lea      rsp, [rbp]
IN0018: 000069 pop      rbp
IN0019: 00006A ret    

@CarolEidt
Copy link

@fiigii - Now that tiered compilation is enabled, if you want to see the optimized code you need to set COMPlus_TieredCompilation to 0. (It is very easy to forget that; I do it all the time). Here is the code I get for your test case:

IN0009: 000000 sub      rsp, 56
IN000a: 000004 vmovaps  qword ptr [rsp+20H], xmm6

G_M17448_IG02:        ; offs=00000BH, size=0028H, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref

IN0001: 00000B vxorps   xmm6, xmm6, xmm6
IN0002: 000010 mov      rcx, 0x7FFD8F58F800
IN0003: 00001A call     CORINFO_HELP_NEWSFAST
IN0004: 00001F mov      rcx, rax
IN0005: 000022 vpaddd   xmm0, xmm6, xmm6
IN0006: 000027 vmovupd  xmmword ptr [rcx+8], xmm0
IN0007: 00002D call     System.Console:WriteLine(ref)
IN0008: 000032 nop      

G_M17448_IG03:        ; offs=000033H, size=000CH, epilog, nogc, emitadd

IN000b: 000033 vmovaps  xmm6, qword ptr [rsp+20H]
IN000c: 00003A add      rsp, 56
IN000d: 00003E ret      

Note that the boxing is done for the call to System.Console:WriteLine

@fiigii
Copy link

fiigii commented Oct 2, 2018

@CarolEidt Thank you!

@tannergooding
Copy link
Member Author

Updated to also cover the SetZeroVector intrinsics.

@tannergooding
Copy link
Member Author

Still TODO:

  • Expose constructors that take "upper/lower", such as: Vector256<T> Create(Vector128<T> lower, Vector128<T> upper)
  • Update the "Create" methods to have the accelerated paths
  • Potentially expose the "Insert/Extract Element" helpers (these are currently listed as helpers on Avx, and are already partially implemented to support the debugger)

@fiigii
Copy link

fiigii commented Oct 17, 2018

Potentially expose the "Insert/Extract Element" helpers (these are currently listed as helpers on Avx, and are already partially implemented to support the debugger)

Currently, we have AVX "Insert/Extract Element" helpers to match SSE4.1 instructions. It would be nice to have a bunch of platform-agnostic insert/extract for all the vector types (but we still need the SSE4.1 intrinsic).

@tannergooding tannergooding changed the title [WIP] Move the various helper intrinsics to be implemented on the S.R.Intrinsics.Vector types Move the various helper intrinsics to be implemented on the S.R.Intrinsics.Vector types Oct 19, 2018
@tannergooding
Copy link
Member Author

I think, from the API shape perspective. This matches what we would expect and what I would want to confirm at the follow up API review we'll be having.

If everyone else is also happy with this, I think we should be good to merge and we can do any minor/residual cleanup after the API review (we already reviewed and approved moving the helper methods to the base types, the follow-up review is confirming the moved APIs shapes and names are what we want).

After the CoreFX side is updated, I can finish the remaining JIT hookup for the various Create methods and we can add the "CreateScalarUnsafe" methods (or whatever name we decide upon) after the follow-up API review.

@fiigii
Copy link

fiigii commented Oct 19, 2018

@tannergooding Thanks for the work. The new APIs look good to me overall.
You have updated some test cases with the new APIs (that will execute the software fallback before CoreFX side updated and JIT implementation done) and commit partial JIT tweaks. Actually, I recommend that we just focus on adding APIs in this PR and leave tests/JIT update in the next PR (after CoreFX update). That would be good to review and checking JIT PMI diffs.

@tannergooding
Copy link
Member Author

@fiigii, updated to only add the new APIs and their software implementations.

@fiigii
Copy link

fiigii commented Oct 22, 2018

We need to move ExtendToVector256<T> to Vector128<T> class as well and add a Vector64<T> counterpart.

@tannergooding
Copy link
Member Author

This has been updated as per the API review that happened today:

  • CreateScalarUnsafe now exists and leaves the upper bits uninitialized
  • We have a generic As<U>() instance method
  • We have a Vector64<T>.ToVector128() and Vector64<T>.ToVector128Unsafe() instance method, the latter which leaves the upper bits uninitialized
    • Same applies to Vector128<T>.ToVector256() and Vector128<T>.ToVector256Unsafe() methods

@tannergooding
Copy link
Member Author

CC. @GrabYourPitchforks who was in the review.

@fiigii
Copy link

fiigii commented Nov 7, 2018

Could you please add XML comments for these helper intrinsics, especially *Unsafe ones.

@tannergooding
Copy link
Member Author

Could you please add XML comments for these helper intrinsics, especially *Unsafe ones.

Done.

@tannergooding
Copy link
Member Author

CoreFX side is here: dotnet/corefx#33302

Copy link

@fiigii fiigii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the work, especially the great documentation 😄

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I would have liked to see tests added for these at the same time as the IL implementation. The tests should presumably be in a shared directory as these are not target-specific.
That said, I'm fine with this going in now, and adding the tests with the JIT support.

@tannergooding
Copy link
Member Author

I would have liked to see tests added for these at the same time as the IL implementation. The tests should presumably be in a shared directory as these are not target-specific.

Right, the problem is that we can't add tests until after the new APIs appear in the reference assembly :(

@CarolEidt
Copy link

the problem is that we can't add tests until after the new APIs appear in the reference assembly

Ah yes, of course. Thanks for clarifying.

@tannergooding
Copy link
Member Author

Changed SetElement, SetLower, and SetUpper to WithElement, WithLower, and WithUpper; as per the feedback on dotnet/corefx#33302

@tannergooding tannergooding merged commit ce586ae into dotnet:master Nov 8, 2018
A-And pushed a commit to A-And/coreclr that referenced this pull request Nov 20, 2018
…nsics.Vector types (dotnet#20147)

* Renaming Vector64.cs, Vector128.cs, and Vector256.cs to be Vector64_1.cs, etc

* Adding some core helper methods to the Vector64, Vector128, and Vecto256 types.

* Adding some documentation comments to the System.Runtime.Intrinsics.Vector types

* Changing `Set` to `With`
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…nsics.Vector types (dotnet/coreclr#20147)

* Renaming Vector64.cs, Vector128.cs, and Vector256.cs to be Vector64_1.cs, etc

* Adding some core helper methods to the Vector64, Vector128, and Vecto256 types.

* Adding some documentation comments to the System.Runtime.Intrinsics.Vector types

* Changing `Set` to `With`


Commit migrated from dotnet/coreclr@ce586ae
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants