Add reductions #83

Merged: 17 commits into master, Apr 22, 2021

Conversation

@calebzulawski (Member) commented Mar 8, 2021

Fixes #71.

@calebzulawski calebzulawski changed the title Reductions Add reductions Mar 8, 2021
@workingjubilee (Member)

These will be complicated further by rust-lang/rust#83984

@calebzulawski calebzulawski marked this pull request as ready for review April 9, 2021 20:33
@calebzulawski (Member, Author)

All ready to go with #80 merged.

@Lokathor (Contributor) commented Apr 9, 2021

Are these reducing ops horizontal ops? This needs more docs on the methods to explain what's happening.

@calebzulawski (Member, Author)

Yes they are, though I thought my documentation was pretty decent. Do you have any specific examples of what you mean?

@calebzulawski calebzulawski mentioned this pull request Apr 9, 2021
@Lokathor (Contributor) commented Apr 9, 2021

I would include the word "horizontal" so that the difference is more noticeable at a glance, and so that search results in rustdoc let you see it sooner (since they show the item path and the first line of docs, but not the type).

@thomcc (Member) commented Apr 9, 2021

I'd also suggest that horizontal should be in the name, so that it can be more easily grouped when describing in e.g. whatever guide we end up writing on how to write good/efficient SIMD code.

(E.g. One of the guidelines here would be to, where possible, try to avoid these in batch processing by deferring them until the end, because they're comparatively inefficient on most hardware — often O(log lanes) or O(lanes))

Also, it makes it easy to search for if you know what you want.
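The guideline above can be sketched with plain arrays standing in for SIMD registers (a hypothetical illustration, not this crate's API): keep a lanewise accumulator inside the loop, and perform the single horizontal reduction only once, at the end.

```rust
// Sketch of deferring horizontal reductions. `sum_deferred` is a made-up name
// for this illustration; arrays stand in for 4-lane SIMD registers.
fn sum_deferred(chunks: &[[f32; 4]]) -> f32 {
    let mut acc = [0.0f32; 4];
    for chunk in chunks {
        for i in 0..4 {
            acc[i] += chunk[i]; // vertical (lanewise) adds: cheap, per-lane
        }
    }
    acc.iter().sum() // one horizontal reduction at the very end
}

fn main() {
    let data = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]];
    assert_eq!(sum_deferred(&data), 36.0);
}
```

With real vectors the inner loop would be a single lanewise add per chunk, so the O(log lanes) horizontal cost is paid once rather than per iteration.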

@calebzulawski (Member, Author)

I kind of didn't want to use horizontal because to me it's lingo (intel specific maybe?) and not really descriptive, and I think any documentation using it would need to define what that means anyway. I just looked at the documentation for arm vaddv and it doesn't use the word horizontal, it just defines it as "add across vector".

@calebzulawski (Member, Author)

I think reduce is potentially a more universal term (but sum might still be more obvious than something like reduce_add?)

@thomcc (Member) commented Apr 10, 2021

The term horizontal predates Intel's stuff certainly. It is lingo/jargon, but so is stuff like saturating.

Reduce can mean a lot of different things, and seems way more overloaded to me. I could imagine someone expecting reducing a vector to mean reducing the number of lanes, as just an example.

Horizontal, even if it is lingo/jargon, is pretty specific to SIMD, is defined in our guide (briefly) and the guide that was written for packed_simd (in depth), and is something of a critical concept to understand when writing SIMD — I suspect we have to come up with a name for it and provide a definition somewhere (and the docs can say "for more information on horizontal operations, see [the simd guide's entry on the topic](good link)), so I feel like we should use the common terminology for that, even if it's not perfect.

@calebzulawski (Member, Author)

Do we maybe want all of the reductions to be named horizontal_sum, horizontal_xor, horizontal_min, etc and leave the docs mostly as-is?

@Lokathor (Contributor)

while that's certainly possible, consider that:

  1. there's really no value in under-documenting a function. In core, even the most obvious math functions have a full explanation of what they do along with a rustdoc example. Having fewer lines of rustdoc doesn't make your program run faster.
  2. you yourself pointed out that "horizontal" is a jargon-y term just a few posts ago. If we take that as true (and honestly I think we can) then simply putting a jargony term at the front of the method names does not do a service to our beginner users.

That said, I got up from a nap like, right after you posted your most recent reply 14 minutes ago, so I could be kinda half-asleep right now.

@calebzulawski (Member, Author)

Updated the function names and docs; I think the new names are more consistent and the docs are a little clearer.

@workingjubilee (Member) commented Apr 12, 2021

PPC errors are spurious.
Blocked, functionally, on a decision regarding wasm testing (issue #92).

@workingjubilee (Member)

Computes -> Returns?

Applicable Patch
diff --git a/crates/core_simd/src/reduction.rs b/crates/core_simd/src/reduction.rs
index e728f8a..118d1d8 100644
--- a/crates/core_simd/src/reduction.rs
+++ b/crates/core_simd/src/reduction.rs
@@ -4,46 +4,46 @@ macro_rules! impl_integer_reductions {
         where
             Self: crate::LanesAtMost32
         {
-            /// Horizontal wrapping add.  Computes the sum of the lanes of the vector, with wrapping addition.
+            /// Horizontal wrapping add.  Returns the sum of the lanes of the vector, with wrapping addition.
             #[inline]
             pub fn wrapping_sum(self) -> $scalar {
                 unsafe { crate::intrinsics::simd_reduce_add_ordered(self, 0) }
             }
 
-            /// Horizontal wrapping multiply.  Computes the product of the lanes of the vector, with wrapping multiplication.
+            /// Horizontal wrapping multiply.  Returns the product of the lanes of the vector, with wrapping multiplication.
             #[inline]
             pub fn wrapping_product(self) -> $scalar {
                 unsafe { crate::intrinsics::simd_reduce_mul_ordered(self, 1) }
             }
 
-            /// Horizontal bitwise "and".  Computes the cumulative bitwise "and" across the lanes of
+            /// Horizontal bitwise "and".  Returns the cumulative bitwise "and" across the lanes of
             /// the vector.
             #[inline]
             pub fn horizontal_and(self) -> $scalar {
                 unsafe { crate::intrinsics::simd_reduce_and(self) }
             }
 
-            /// Horizontal bitwise "or".  Computes the cumulative bitwise "or" across the lanes of
+            /// Horizontal bitwise "or".  Returns the cumulative bitwise "or" across the lanes of
             /// the vector.
             #[inline]
             pub fn horizontal_or(self) -> $scalar {
                 unsafe { crate::intrinsics::simd_reduce_or(self) }
             }
 
-            /// Horizontal bitwise "xor".  Computes the cumulative bitwise "xor" across the lanes of
+            /// Horizontal bitwise "xor".  Returns the cumulative bitwise "xor" across the lanes of
             /// the vector.
             #[inline]
             pub fn horizontal_xor(self) -> $scalar {
                 unsafe { crate::intrinsics::simd_reduce_xor(self) }
             }
 
-            /// Horizontal maximum.  Computes the maximum lane in the vector.
+            /// Horizontal maximum.  Returns the maximum lane in the vector.
             #[inline]
             pub fn horizontal_max(self) -> $scalar {
                 unsafe { crate::intrinsics::simd_reduce_max(self) }
             }
 
-            /// Horizontal minimum.  Computes the minimum lane in the vector.
+            /// Horizontal minimum.  Returns the minimum lane in the vector.
             #[inline]
             pub fn horizontal_min(self) -> $scalar {
                 unsafe { crate::intrinsics::simd_reduce_min(self) }
@@ -59,7 +59,7 @@ macro_rules! impl_float_reductions {
             Self: crate::LanesAtMost32
         {
 
-            /// Horizontal add.  Computes the sum of the lanes of the vector.
+            /// Horizontal add.  Returns the sum of the lanes of the vector.
             #[inline]
             pub fn sum(self) -> $scalar {
                 // LLVM sum is inaccurate on i586
@@ -70,7 +70,7 @@ macro_rules! impl_float_reductions {
                 }
             }
 
-            /// Horizontal multiply.  Computes the sum of the lanes of the vector.
+            /// Horizontal multiply.  Returns the product of the lanes of the vector.
             #[inline]
             pub fn product(self) -> $scalar {
                 // LLVM product is inaccurate on i586
@@ -81,7 +81,7 @@ macro_rules! impl_float_reductions {
                 }
             }
 
-            /// Horizontal maximum.  Computes the maximum lane in the vector.
+            /// Horizontal maximum.  Returns the maximum lane in the vector.
             ///
             /// Returns values based on equality, so a vector containing both `0.` and `-0.` may
             /// return either.  This function will not return `NaN` unless all lanes are `NaN`.
@@ -90,7 +90,7 @@ macro_rules! impl_float_reductions {
                 unsafe { crate::intrinsics::simd_reduce_max(self) }
             }
 
-            /// Horizontal minimum.  Computes the minimum lane in the vector.
+            /// Horizontal minimum.  Returns the minimum lane in the vector.
             ///
             /// Returns values based on equality, so a vector containing both `0.` and `-0.` may
             /// return either.  This function will not return `NaN` unless all lanes are `NaN`.

@workingjubilee (Member) commented Apr 15, 2021

Re: nomenclature, vertical:lanewise::horizontal:vectorwise?
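A rough illustration of that analogy, with plain arrays standing in for vectors and made-up function names: a vertical (lanewise) op combines corresponding lanes of two vectors, while a horizontal (vectorwise) op collapses one vector's lanes into a scalar.

```rust
// "Vertical"/lanewise: one output lane per pair of input lanes.
fn lanewise_add(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut out = [0; 4];
    for i in 0..4 {
        out[i] = a[i] + b[i];
    }
    out
}

// "Horizontal"/vectorwise: all lanes collapse into a single scalar.
fn horizontal_add(v: [i32; 4]) -> i32 {
    v.iter().sum()
}

fn main() {
    assert_eq!(lanewise_add([1, 2, 3, 4], [10, 20, 30, 40]), [11, 22, 33, 44]);
    assert_eq!(horizontal_add([1, 2, 3, 4]), 10);
}
```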

@calebzulawski (Member, Author)

Do we think this is good to go?

@Lokathor (Contributor) left a comment

Other than the possible name changes, looks solid.

{
    /// Horizontal wrapping add.  Returns the sum of the lanes of the vector, with wrapping addition.
    #[inline]
    pub fn wrapping_sum(self) -> $scalar {
@Lokathor (Contributor)

should this say horizontal_ at the front of the name like the others?

@calebzulawski (Member, Author) commented Apr 19, 2021

I left it out of sum and product because I think the names are more self explanatory, since there's no "vertical" sum (and mirror the ones in std::iter a bit, too).

Kind of like how it wouldn't really make sense to make it Mask{N}::horizontal_all.

@Lokathor (Contributor)

Well, these types should implement the Sum and Product traits: https://doc.rust-lang.org/std/iter/trait.Sum.html. But if you think it won't be confusing, we can skip it for now.

@calebzulawski (Member, Author)

Unfortunately we can't implement those traits since they only work on iterators. They get implemented on the iterator Item, not the iterator itself.

More specifically you can already do vector.to_array().sum() but it won't be SIMD.
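For reference, std::iter::Sum is defined on the element type and consumes any iterator over that type, which is why the scalar path already works today; it just folds lane by lane instead of issuing a SIMD reduction. A minimal sketch:

```rust
// `Sum` is implemented for the element type (e.g. f32), not for any particular
// iterator, so any iterator of f32 can be summed -- scalar-wise.
fn main() {
    let lanes = [1.0f32, 2.0, 3.0, 4.0]; // stand-in for vector.to_array()
    let total: f32 = lanes.iter().sum(); // a plain scalar fold, not SIMD
    assert_eq!(total, 10.0);
}
```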

@workingjubilee (Member) commented Apr 21, 2021

Mmm, I think there's ways to get around that actually, where we can use our ability to interact with the underlying values of the iterator and transform calls into horizontal reductions.

@workingjubilee (Member)

To elaborate on how that would look:

An iterator over a 16 byte vector (yon typical SSE/Neon register) that iterated one step (doing whatever) and then had product() called on it could have the first lane set to 1 and then issue a horizontal mul instruction. Likewise, sum, but the additive identity is 0. This works because we can extract an ExactSizeIterator from a SIMD vector.
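A rough sketch of that idea, with a plain array standing in for the register and a made-up function name: lanes the iterator has already consumed are overwritten with the operation's identity, so a single horizontal reduction over the whole register still gives the right answer.

```rust
// Hypothetical sketch: after the iterator has consumed `consumed` leading
// lanes, overwrite them with the identity and reduce the whole register,
// instead of folding the remaining elements one at a time.
fn product_after_skip(mut lanes: [i32; 4], consumed: usize) -> i32 {
    for lane in &mut lanes[..consumed] {
        *lane = 1; // multiplicative identity; a sum would use 0 here
    }
    // Stand-in for one horizontal multiply over the whole register.
    lanes.iter().product()
}

fn main() {
    let v = [2, 3, 4, 5];
    // One step of iteration consumed lane 0; the reduction ignores it.
    assert_eq!(product_after_skip(v, 1), 60);
    assert_eq!(product_after_skip(v, 0), 120);
}
```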

@calebzulawski (Member, Author)

Well, first of all that would require different iterators from Sum and Product, which doesn't really make sense. But I think that's also just not a very good iterator for normal iteration, compared to an array.

@workingjubilee (Member)

how about having the iterator be something like:

// Sketch only: `AsElements`, `get_element_unchecked`, and `fill_outside_range`
// are hypothetical, and the `default fn` impls below rely on specialization.
struct Iter<SimdType: AsElements> {
    value: SimdType,
    range: Range<usize>,
}

impl<SimdType: AsElements> Iterator for Iter<SimdType> {
    type Item = SimdType::Element;
    fn next(&mut self) -> Option<Self::Item> {
        let index = self.range.next()?;
        unsafe { Some(self.value.get_element_unchecked(index)) }
    }
}

impl<SimdType: AsElements> DoubleEndedIterator for Iter<SimdType> {
    fn next_back(&mut self) -> Option<Self::Item> {
        let index = self.range.next_back()?;
        unsafe { Some(self.value.get_element_unchecked(index)) }
    }
}

trait SumProductHelper<I: Iterator<Item=Self>> {
    fn sum(iter: I) -> Self;
    fn product(iter: I) -> Self;
}

impl<I: Iterator<Item=Self>> SumProductHelper<I> for i32 {
    default fn sum(iter: I) -> Self {
        iter.fold(0i32, Add::add)
    }
    default fn product(iter: I) -> Self {
        iter.fold(1i32, Mul::mul)
    }
}

impl<SimdType: AsElements<Element = Self>> SumProductHelper<Iter<SimdType>> for i32 {
    fn sum(iter: Iter<SimdType>) -> Self {
        iter.value.fill_outside_range(iter.range, 0i32).sum()
    }
    fn product(iter: Iter<SimdType>) -> Self {
        iter.value.fill_outside_range(iter.range, 1i32).product()
    }
}

impl Sum for i32 {
    fn sum<I: Iterator<Item=Self>>(iter: I) -> Self {
        SumProductHelper::sum(iter)
    }
}

@calebzulawski (Member, Author) commented Apr 22, 2021

I suppose this would work, but it makes the iterator pretty big and wordy, and I'm not convinced it's any better than just doing as_array or as_slice. Especially since accessing a slice is just getelementptr, which can optimize to insertelement/extractelement.

@calebzulawski (Member, Author)

Updated all of those fn names

@calebzulawski (Member, Author)

@workingjubilee okay with naming etc?

@workingjubilee (Member)

Looks good!

@workingjubilee workingjubilee merged commit 24ebae8 into master Apr 22, 2021
@calebzulawski calebzulawski deleted the feature/reductions branch August 7, 2021 18:19

Successfully merging this pull request may close these issues.

any() and all() methods for mask types
5 participants