Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use crossplat vectors in GetPointerToFirstInvalidChar #90373

Merged
merged 1 commit into from
Aug 11, 2023

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Aug 11, 2023

This whole path is currently not used on ARM64 (it uses Vector<> instead).

Sse4.1 requirement was lifted to Sse2

@ghost
Copy link

ghost commented Aug 11, 2023

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

Issue Details

This whole path is currently not used on ARM64 (it uses Vector<> instead).

Sse4.1 requirement was lifted to Sse2 because of Vector128.Min which is not bad for Sse2, enough for R2R:

Vector128<ushort> Min(Vector128<ushort> a, Vector128<ushort> b)
    => Vector128.Min(a, b);
; Method mytest:Min
       movups   xmm0, xmmword ptr [rdx]
       movups   xmm1, xmmword ptr [reloc @RWD00]
       paddw    xmm0, xmm1
       movups   xmm2, xmmword ptr [r8]
       paddw    xmm2, xmm1
       pminsw   xmm0, xmm2
       psubw    xmm0, xmm1
       movups   xmmword ptr [rcx], xmm0
       mov      rax, rcx
       ret      
RWD00  	dq	8000800080008000h, 8000800080008000h
; Total bytes of code: 37
Author: EgorBo
Assignees: -
Labels:

area-System.Runtime.Intrinsics

Milestone: -

[MethodImpl(MethodImplOptions.AggressiveInlining)]
[CompExactlyDependsOn(typeof(AdvSimd.Arm64))]
[CompExactlyDependsOn(typeof(Sse2))]
internal static Vector128<ushort> AddSaturate(Vector128<ushort> left, Vector128<ushort> right)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be exposed as a public API with #82559

@EgorBo EgorBo force-pushed the fix-GetPointerToFirstInvalidChar branch from 3e0a6df to 226929f Compare August 11, 2023 08:46
{
charIsNonAscii = AdvSimd.Min(utf16Data, vector0080);
// Use Sse41.Min with opportunistic ISA check on R2R instead of
// slower SSE2-baked Vector128.Min
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is Vector128.Min slower than Sse41.Min?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vector128.Min uses fallback without SSE4.1:

Vector128<ushort> Min(Vector128<ushort> a, Vector128<ushort> b)
    => Vector128.Min(a, b);
; Method mytest:Min
       movups   xmm0, xmmword ptr [rdx]
       movups   xmm1, xmmword ptr [reloc @RWD00]
       paddw    xmm0, xmm1
       movups   xmm2, xmmword ptr [r8]
       paddw    xmm2, xmm1
       pminsw   xmm0, xmm2
       psubw    xmm0, xmm1
       movups   xmmword ptr [rcx], xmm0
       mov      rax, rcx
       ret      
RWD00  	dq	8000800080008000h, 8000800080008000h
; Total bytes of code: 37

it probably can be improved but Sse4.1 Min is a single instruction under a quick opportunistic check branch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure R2R codegen that important though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a bug in Vector128.Min then?

Copy link
Member Author

@EgorBo EgorBo Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern is only for R2R/NAOT which is compiled as SSE2, for JIT everything is great as is.
the idea to emit:

if (runtimeIsaCheck == 1)
{
    // use single-instruction for Min
}
else
{
    // slow fallback because there is no SSE4.1
}

but I decided to drop it because it looks like we have this problem already in many places where we only check Vector128.IsHardwareAccelerated and don't care what exactly Vector128.* APIs need under the hood.

@stephentoub
Copy link
Member

stephentoub commented Aug 11, 2023

Shouldn't we be changing the fallback Vector<> path to use Vector128<> instead, too?

EDIT: #90391

@EgorBo
Copy link
Member Author

EgorBo commented Aug 11, 2023

Shouldn't we be changing the fallback Vector<> path to use Vector128<> instead, too?

EDIT: #90391

Ah, I didn't notice you already ported it, will revert mine

@EgorBo EgorBo force-pushed the fix-GetPointerToFirstInvalidChar branch from c68f5c3 to 33ad71e Compare August 11, 2023 13:23
@EgorBo
Copy link
Member Author

EgorBo commented Aug 11, 2023

wasm job is failing with dotnet/dnceng#450

@EgorBo EgorBo merged commit 0dfe81a into dotnet:main Aug 11, 2023
166 of 168 checks passed
@EgorBo EgorBo deleted the fix-GetPointerToFirstInvalidChar branch August 11, 2023 16:29
@ghost ghost locked as resolved and limited conversation to collaborators Sep 10, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants