Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GetIndexOfFirstNonAsciiByte_Vector not taken on ARM64 #90527

Closed
wants to merge 4 commits into from

Conversation

neon-sunset
Copy link
Contributor

@neon-sunset neon-sunset commented Aug 14, 2023

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Aug 14, 2023
@ghost
Copy link

ghost commented Aug 14, 2023

Tagging subscribers to this area: @dotnet/area-system-text-encoding
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #89924

Author: neon-sunset
Assignees: -
Labels:

area-System.Text.Encoding

Milestone: -

if (!Vector512.IsHardwareAccelerated &&
!Vector256.IsHardwareAccelerated &&
(Sse2.IsSupported || AdvSimd.IsSupported))
if (!Vector512.IsHardwareAccelerated && !Vector256.IsHardwareAccelerated && Sse2.IsSupported)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do this, then doesn't this change leave a bunch of dead code in GetIndexOfFirstNonAsciiByte_Intrinsified that should be cleaned up? It has multiple blocks dedicated to AdvSimd which would never be reached. At which point the method should also be renamed to be SSE-specific.

But I'd also explicitly asked about this:
#88532 (comment)
and was told the preferring of the intrinsics path was by design. Maybe it was only measured on x64 and not on Arm?

cc: @tannergooding, @anthonycanino

Copy link
Contributor Author

@neon-sunset neon-sunset Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do this, then doesn't this change leave a bunch of dead code in GetIndexOfFirstNonAsciiByte_Intrinsified that should be cleaned up? It has multiple blocks dedicated to AdvSimd which would never be reached. At which point the method should also be renamed to be SSE-specific.

Let me fix that.

#88532 (comment)

The _Vector path is estimated to be ~40% faster on osx-arm64, It is likely the switch was not measured on ARM64.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _Vector path is estimated to be ~40% faster on osx-arm64, It is likely the switch was not measured on ARM64.

@neon-sunset can you clarify please? .NET 7 was using the _Intrinsified path for Arm64: https://github.com/dotnet/runtime/blob/v7.0.10/src/libraries/System.Private.CoreLib/src/System/Text/ASCIIUtility.cs#L81-L91

The PR that introduced the _Vector path (#88532) didn't change _Intrinsified at all (outside using the centralized Vector128.Size constant) and didn't change which Arm64 was calling.

This then wouldn't seem like a regression, but rather simply a case where the new _Vector code is faster for Arm64. If that's the case, could you share your own numbers showing the .NET 7 vs .NET 8 difference (without this PR) and the difference for _Vector vs _Intrinsified for .NET 8 (this PR vs without this PR)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neon-sunset can you clarify please? .NET 7 was using the _Intrinsified path for Arm64: https://github.com/dotnet/runtime/blob/v7.0.10/src/libraries/System.Private.CoreLib/src/System/Text/ASCIIUtility.cs#L81-L91

You are correct, my mistake.

This then wouldn't seem like a regression, but rather simply a case where the new _Vector code is faster for Arm64. If that's the case, could you share your own numbers showing the .NET 7 vs .NET 8 difference (without this PR) and the difference for _Vector vs _Intrinsified for .NET 8 (this PR vs without this PR)?

This is the plan, I'm just waiting for the Release build to finish to post detailed numbers 😄

@ghost
Copy link

ghost commented Dec 7, 2023

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@github-actions github-actions bot locked and limited conversation to collaborators Jan 7, 2024
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Text.Encoding community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GetIndexOfFirstNonAsciiByte_Vector path in Ascii.Utility.cs is never exercised for AdvSimd/SSE4.1
3 participants