[release/8.0] Vectorize TensorPrimitives APIs #93746
Commits on Oct 20, 2023
Commit bd57689
Simplify TensorPrimitive's AbsoluteOperator (dotnet#92577)
Vector{128/256/512} all provide Abs; no need to do this manually.
Commit 4afeb64
Reduce some boilerplate in TensorPrimitive's IBinaryOperator (dotnet#92576)
Change a few of the static abstract interface methods to be virtual, as most implementations throw from these methods; we can consolidate that throwing to the base.
Commit 3f246e3
Minor code cleanup in TensorPrimitives tests (dotnet#92575)
* Normalize some test naming
* Alphabetize tests
* Improve mismatched length tests with all positions of the shorter tensor
* Alphabetize methods in TensorPrimitives.cs
Commit a0706c9
Vectorize TensorPrimitives.Min/Max{Magnitude} (dotnet#92618)
* Vectorize TensorPrimitives.Min/Max{Magnitude}
* Use AdvSimd.Max/Min
* Rename some parameters/locals for consistency
* Improve HorizontalAggregate
* Move a few helpers
* Avoid scalar path for returning found NaN
Commit b55a315
Update TensorPrimitives aggregations to vectorize handling of remaining elements (dotnet#92672)
* Update TensorPrimitives.CosineSimilarity to vectorize handling of remaining elements
* Vectorize remainder handling for Aggregate helpers
Commit dabae03
Flesh out TensorPrimitives XML docs (dotnet#92749)
* Flesh out TensorPrimitives XML docs
* Address PR feedback
  - Remove use of FusedMultiplyAdd from all but CosineSimilarity
  - Remove comments about platform/OS-specific behavior from Add/AddMultiply/Subtract/Multiply/MultiplyAdd/Divide/Negate
  - Loosen comments about NaN and which exact one is returned
* Address PR feedback
Commit 50e3948
Commit 4088f05
Enable TensorPrimitives to perform in-place operations (dotnet#92820)
Some operations would produce incorrect results if the same span was passed as both an input and an output. When vectorization was employed but the span's length wasn't a perfect multiple of a vector, we'd do the standard trick of performing one last operation on the last vector's worth of data; however, that relies on the operation being idempotent, and if a previous operation has overwritten input with a new value due to the same memory being used for input and output, some operations won't be idempotent. This fixes that by masking off the already processed elements. It adds tests to validate in-place use works, and it updates the docs to carve out this valid overlapping.
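The failure mode described above can be reproduced with a small scalar simulation. This is only a sketch in Python, not the TensorPrimitives code: `WIDTH`, `apply_overlapping_tail`, and `apply_masked_tail` are hypothetical names, and list slices stand in for SIMD vectors.

```python
WIDTH = 4  # hypothetical vector width in elements (an assumption for illustration)


def apply_overlapping_tail(op, src, dst):
    """Process full blocks, then redo the last WIDTH elements to cover the remainder."""
    n = len(src)
    assert len(dst) == n and n >= WIDTH
    i = 0
    while i + WIDTH <= n:
        dst[i:i + WIDTH] = [op(x) for x in src[i:i + WIDTH]]
        i += WIDTH
    if i < n:
        # The final block overlaps elements that were already written.
        start = n - WIDTH
        dst[start:n] = [op(x) for x in src[start:n]]
    return dst


def apply_masked_tail(op, src, dst):
    """Same loop, but the final block only stores lanes that were not processed yet."""
    n = len(src)
    assert len(dst) == n and n >= WIDTH
    i = 0
    while i + WIDTH <= n:
        dst[i:i + WIDTH] = [op(x) for x in src[i:i + WIDTH]]
        i += WIDTH
    if i < n:
        start = n - WIDTH
        block = [op(x) for x in src[start:n]]
        for lane in range(WIDTH):
            if start + lane >= i:  # mask off lanes already handled by the main loop
                dst[start + lane] = block[lane]
    return dst


def increment(x):
    return x + 1


# Separate input and output: the overlapping tail is harmless.
print(apply_overlapping_tail(increment, [1, 2, 3, 4, 5, 6], [0] * 6))  # [2, 3, 4, 5, 6, 7]

# Same list as input and output: elements 2 and 3 are incremented twice.
data = [1, 2, 3, 4, 5, 6]
print(apply_overlapping_tail(increment, data, data))  # [2, 3, 5, 6, 6, 7]

# Masked tail: in-place use gives the expected result.
data = [1, 2, 3, 4, 5, 6]
print(apply_masked_tail(increment, data, data))  # [2, 3, 4, 5, 6, 7]
```

In the masked version the overlapping lanes are still computed from already-overwritten input, but their results are discarded, which is why the operation no longer needs to be idempotent.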
Commit 02416c2
Vectorize TensorPrimitives.ConvertToSingle (dotnet#92779)
* Vectorize TensorPrimitives.ConvertToSingle
* Address PR feedback
Commit fdff01f
Commit bbd26a2
This vectorizes TensorPrimitives.Log2 (dotnet#92897)
* Add a way to support operations that can't be vectorized on netstandard
* Updating TensorPrimitives.Log2 to be vectorized on .NET Core
* Update src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/TensorPrimitives.netstandard.cs
  Co-authored-by: Stephen Toub <stoub@microsoft.com>
* Ensure we do an arithmetic right shift in the Log2 vectorization
* Ensure the code can compile on .NET 7
* Ensure that edge cases are properly handled and don't resolve to `x`
* Ensure that Log2 special results are explicitly handled.

Co-authored-by: Stephen Toub <stoub@microsoft.com>
Commit b92402b
Commit 13ee491
[wasm] Disable TensorPrimitivesTests.ConvertToHalf_SpecialValues (dotnet#92953)
Failing test: `System.Numerics.Tensors.Tests.TensorPrimitivesTests.ConvertToHalf_SpecialValues`
Issue: dotnet#92885
Commit 2091662
Commit ec9762c
Commit 74c7e7a
Vectorize TensorPrimitives.Exp (dotnet#93018)
* Vectorize TensorPrimitives.Exp
* Update src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/TensorPrimitives.netstandard.cs
Commit f48d8b0
Vectorize TensorPrimitives.Sigmoid and TensorPrimitives.SoftMax (dotnet#93029)
* Vectorize TensorPrimitives.Sigmoid and TensorPrimitives.SoftMax
  - Adds a SigmoidOperator that just wraps the ExpOperator
  - Vectorizes both passes of SoftMax, on top of ExpOperator. Simplest way to do this was to augment the existing InvokeSpanScalarIntoSpan to take a transform operator.
  - In doing so, found some naming inconsistencies I'd previously introduced, so I did some automatic renaming to make things more consistent.
  - Added XML comments to all the internal/private surface area.
  - Fleshes out some tests (and test values).
* Disable tests on mono
* Address PR feedback
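For readers unfamiliar with why Sigmoid and SoftMax can both sit on top of an exp routine, here is a minimal scalar sketch in Python using the textbook definitions. It is not the library's code; the two-pass structure shown for `softmax` (sum the exponentials, then divide each exponential by that sum) is an assumption about what "both passes" refers to.

```python
import math


def sigmoid(x):
    # sigmoid(x) = 1 / (1 + e^(-x)), i.e. a thin wrapper around exp.
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    # Pass 1: accumulate the sum of exp(x) over the input.
    total = sum(math.exp(x) for x in xs)
    # Pass 2: scale each exp(x) by the reciprocal of that sum.
    return [math.exp(x) / total for x in xs]
```

Both functions reduce to elementwise `exp` plus simple arithmetic, so a vectorized exp gives vectorized versions of both.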
Commit 6c63ae7
Vectorize TensorPrimitives.Tanh/Cosh/Sinh (dotnet#93093)
* Vectorize TensorPrimitives.Tanh/Cosh/Sinh
  Tanh and Cosh are based on AOCL-LibM. AOCL-LibM doesn't appear to have a sinh implementation, so this Sinh is just based on the sinh formula based on exp(x). I also augmented the tests further, including:
  - Added more tests for sinh/cosh/tanh
  - Add an equality routine that supports comparing larger values with a tolerance
  - Tightened the tolerance for most functions
  - Changed some tests to be theories to be consistent with style elsewhere in the tests
  - Fixed some use of Math to be MathF
* Remove unnecessary special-handling path from cosh
* Remove unnecessary special-handling path from tanh
* Redo sinh based on cosh
* Address PR feedback
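The "sinh formula based on exp(x)" mentioned in the first item of this commit is presumably the standard identity sinh(x) = (e^x - e^(-x)) / 2. A minimal Python sketch of that identity follows; the helper name `sinh_from_exp` is hypothetical, and this is not the AOCL-LibM-derived or cosh-based code the later items refer to.

```python
import math


def sinh_from_exp(x):
    # sinh(x) = (e^x - e^(-x)) / 2
    return (math.exp(x) - math.exp(-x)) / 2.0
```

This direct form loses relative accuracy for inputs very close to zero because the two exponentials nearly cancel, which is one reason production implementations tend to use a more careful formulation.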
Commit bc4d0cd
Commit e9b29c0
Commit 8db0a9b
Commit cd02aa5
Fix TensorPrimitives.IndexOfXx corner-case when first element is seed value (dotnet#93169)
* Fix TensorPrimitives.IndexOfXx corner-case when first element is seed value
  Found as part of adding more tests for Min/Max{Magnitude} to validate they match their IndexOfXx variants.
* Address PR feedback
Commit fffa1d4
Improve a vector implementation to support alignment and non-temporal stores (dotnet#93296)
* Improve a vector implementation to support alignment and non-temporal stores
* Fix a build error and mark a couple methods as AggressiveInlining
* Fix the remaining block count computation
* Ensure overlapping for small data on the V256/512 is handled
* Ensure we only go down the vectorized path when supported for netstandard
Commit 3a970a8
Commit b0dd6ca
Use the improved vectorization algorithm for binary and ternary TensorPrimitives operations (dotnet#93409)
* Update InvokeSpanSpanIntoSpan<TBinaryOperator> for TensorPrimitives to use the better SIMD algorithm
* Update InvokeSpanScalarIntoSpan<TTransformOperator, TBinaryOperator> for TensorPrimitives to use the better SIMD algorithm
* Update InvokeSpanSpanSpanIntoSpan<TTernaryOperator> for TensorPrimitives to use the better SIMD algorithm
* Update InvokeSpanSpanScalarIntoSpan<TTernaryOperator> for TensorPrimitives to use the better SIMD algorithm
* Update InvokeSpanScalarSpanIntoSpan<TTernaryOperator> for TensorPrimitives to use the better SIMD algorithm
* Improve codegen slightly by using case 0, rather than default
* Adjust the canAlign check to be later, to reduce branch count for data under the threshold
* Add a comment explaining the NonTemporalByteThreshold
* Make sure xTransformOp.CanVectorize is checked on .NET Standard
Commit 8a7a6bb
Use the improved vectorization algorithm for aggregate TensorPrimitives operations (dotnet#93695)
* Improve the handling of the IAggregationOperator implementations
* Update Aggregate<TTransformOperator, TAggregationOperator> for TensorPrimitives to use the better SIMD algorithm
* Update Aggregate<TBinaryOperator, TAggregationOperator> for TensorPrimitives to use the better SIMD algorithm
* Respond to PR feedback
Commit 1c2126e
Commit 13b47f4
Commit f86414a
Vectorizes IndexOfMin/Max/Magnitude (dotnet#93469)
* resolved merge conflicts
* net core full done
* minor code cleanup
* NetStandard and PR fixes.
* minor pr changes
* Fix IndexOfMaxMagnitudeOperator
* Fix IndexOfMaxMagnitudeOperator on netcore
* updates from PR comments
* netcore fixed
* net standard updated
* add reference assembly exclusions
* made naive approach better
* resolved PR comments
* minor comment changes
* minor formatting fixes
* added inlining
* fixes from PR comments
* comments from pr
* fixed spacing

Co-authored-by: Eric StJohn <ericstj@microsoft.com>
Commit cb48e75