-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use multi-reg load/store for EncodeToUtf8 #95513
Conversation
Hi @kunalspathak , this is an initial version of encode to UTF8 using multi-register load/stores. This currently fails some asserts in LSRA phase while doing the crossgen for SPC. It fails while emitting |
src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Base64Encoder.cs
Outdated
Show resolved
Hide resolved
Does this need an entry in their part notices file? |
Change-Id: Ie56b1786cdf8ac8d2067c0ba1fdfd3924dd9ca13
Sorry @danmoseley , I didn't understand which part notices file you're referring to. |
Initial benchmarking on N1 system show some good performance results.
|
if (src == srcEnd) | ||
goto DoneExit; | ||
} | ||
|
||
end = srcMax - 16; | ||
if ((Ssse3.IsSupported || AdvSimd.Arm64.IsSupported) && BitConverter.IsLittleEndian && (end >= src)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we should remove the AdvSimd check here.
With this PR, it will use the vector128 version if the buffer length is <48 && >16
.
That's probably the best option for speed, but results in a bigger library.
@SwapnilGaikwad - do you mind sharing the disassembly? |
A separately compiled Full assembly
|
Thanks @SwapnilGaikwad for sharing the disassembly. I wanted to see the code quality when consecutive registers are involved. |
Vector128<byte> res4; | ||
Vector128<byte> tbl_enc1 = Vector128.Create("ABCDEFGHIJKLMNOP"u8).AsByte(); | ||
Vector128<byte> tbl_enc2 = Vector128.Create("QRSTUVWXYZabcdef"u8).AsByte(); | ||
Vector128<byte> tbl_enc3 = Vector128.Create("ghijklmnopqrstuv"u8).AsByte(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can load this encoding table from EncodingMap
from Line #774. This could help to reduce the code size but loading from memory/cache would be slightly slower than the regs. Benchmarks didn't show signficant difference. from-mem-artifacts
load data from EncodingMap
.
| Method | Toolchain | NumberOfBytes | Mean | Error | StdDev | Median | Min | Max | Ratio | MannWhitney(2%) |
|-------------------------------- |-------------------------------------------------------------------------------------- |-------------- |---------:|--------:|--------:|---------:|---------:|---------:|------:|---------------- |
| Base64Encode | /main/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 363.8 ns | 0.14 ns | 0.12 ns | 363.8 ns | 363.5 ns | 363.9 ns | 1.00 | Base |
| Base64Encode | /runtime/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 196.8 ns | 0.02 ns | 0.02 ns | 196.8 ns | 196.8 ns | 196.9 ns | 0.54 | Faster |
| Base64Encode | /runtime/from-mem-artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 195.3 ns | 0.08 ns | 0.07 ns | 195.3 ns | 195.2 ns | 195.5 ns | 0.54 | Faster |
| | | | | | | | | | | |
| Base64EncodeDestinationTooSmall | /main/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 368.2 ns | 0.30 ns | 0.28 ns | 368.2 ns | 367.8 ns | 368.7 ns | 1.00 | Base |
| Base64EncodeDestinationTooSmall | /runtime/artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 191.5 ns | 0.04 ns | 0.04 ns | 191.5 ns | 191.5 ns | 191.6 ns | 0.52 | Faster |
| Base64EncodeDestinationTooSmall | /runtime/from-mem-artifacts/tests/coreclr/linux.arm64.Release/Tests/Core_Root/corerun | 1000 | 194.0 ns | 0.22 ns | 0.20 ns | 193.9 ns | 193.7 ns | 194.4 ns | 0.53 | Faster |
Would you suggest to read encoding table from EncodingMap
?
/azp run runtime-coreclr libraries-jitstress |
/azp run runtime-coreclr libraries-jitstress2-jitstressregs |
Azure Pipelines successfully started running 1 pipeline(s). |
1 similar comment
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr libraries-jitstressregs |
Azure Pipelines successfully started running 1 pipeline(s). |
I guess he is talking about the reference made in https://github.com/dotnet/runtime/pull/95513/files#diff-b3b9edcf4c0d62e78954d826c44005cffb306b6ccf155f1a9228669229b7e765R496, but not sure where exactly to add this. @danmoseley - can you please confirm? |
Tagging subscribers to this area: @dotnet/area-system-buffers Issue DetailsThis implements the encode to UTF8 algorithm here.
|
ping @danmoseley |
Oops, yes, that was what caught my eye. Generally if we use significant ideas/code from elsewhere we add a credit in THIRD-PARTY-NOTICES.TXT at the root. Up to you. |
Looks like there is already an entry in there, but it's not immediately obvious. runtime/THIRD-PARTY-NOTICES.TXT Line 345 in 57bcccd
Contains an extra copy of the license from https://github.com/aklomp/base64/blob/master/LICENSE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[CompExactlyDependsOn(typeof(AdvSimd.Arm64))] | ||
private static unsafe void AdvSimdEncode(ref byte* srcBytes, ref byte* destBytes, byte* srcEnd, int sourceLength, int destLength, byte* srcStart, byte* destStart) | ||
{ | ||
// C# implementatino of https://github.com/aklomp/base64/blob/3a5add8652076612a8407627a42c768736a4263f/lib/arch/neon64/enc_loop.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
// C# implementatino of https://github.com/aklomp/base64/blob/3a5add8652076612a8407627a42c768736a4263f/lib/arch/neon64/enc_loop.c | |
// C# implementation of https://github.com/aklomp/base64/blob/3a5add8652076612a8407627a42c768736a4263f/lib/arch/neon64/enc_loop.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can do this in a follow-up PR
This reverts commit fdb03ca.
* Use multi-reg load/store for EncodeToUtf8 * Use the fixed version of multi-reg store * Update variable naming
…#96944) * Revert "[libs] Skip AdvSimdEncode on Mono (dotnet#96829)" This reverts commit 1a76e37. * Revert "Use multi-reg load/store for EncodeToUtf8 (dotnet#95513)" This reverts commit fdb03ca. * Wrap load/store vector APIs in '#if false' * Disable load/store vector tests * remove the trailing space
I assume this is the API being discussed. If so, it would be good to put in the initial comment so that it is easy for folks to fine. https://learn.microsoft.com/dotnet/api/system.buffers.text.base64.encodetoutf8 |
This implements the encode to UTF8 algorithm here.