Implement 16-bit SSE2 & AVX2 vector division #94

adbancroft · 2022-02-12T01:57:59Z

The current implementation is a workaround. This PR uses the appropriate vector operations to implement 16-bit division for __m256i and __m128i.
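For readers unfamiliar with the approach, here is a minimal sketch of how unsigned 16-bit lanes can be divided by a fixed divisor using a multiply-high and two shifts. It follows the round-up method from Granlund and Montgomery and is not taken from this PR's diff. The names `U16Magic`, `make_u16_magic`, `div_u16_scalar` and `div_u16x8` are hypothetical and are not part of libdivide's API. The SSE2 path uses only standard `<emmintrin.h>` intrinsics, and a scalar fallback is assumed for non-x86 builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

// Hypothetical helper (not libdivide's type): precomputed constants for
// unsigned 16-bit division by a fixed divisor d >= 1.
struct U16Magic {
    uint16_t m;  // multiplier m' = floor(2^16 * (2^l - d) / d) + 1
    int sh1;     // min(l, 1)
    int sh2;     // max(l - 1, 0)
};

inline U16Magic make_u16_magic(uint16_t d) {
    assert(d != 0);
    int l = 0;
    while ((1u << l) < d) ++l;  // l = ceil(log2(d))
    // (2^l - d) < 2^15, so the product below fits in 32 bits.
    uint32_t m = ((uint32_t(1) << 16) * ((uint32_t(1) << l) - d)) / d + 1;
    return {uint16_t(m), l < 1 ? l : 1, l > 1 ? l - 1 : 0};
}

// Scalar reference: the same steps the vector path performs per lane.
inline uint16_t div_u16_scalar(uint16_t n, const U16Magic& k) {
    uint16_t t1 = uint16_t((uint32_t(n) * k.m) >> 16);  // high half of n * m'
    uint16_t t2 = uint16_t(uint16_t(n - t1) >> k.sh1);
    return uint16_t(uint16_t(t1 + t2) >> k.sh2);
}

// Divide eight unsigned 16-bit values at once.
inline void div_u16x8(const uint16_t* in, uint16_t* out, const U16Magic& k) {
#ifdef SKETCH_HAVE_SSE2
    __m128i n  = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    // Per-lane high 16 bits of the unsigned 16x16 product.
    __m128i t1 = _mm_mulhi_epu16(n, _mm_set1_epi16(static_cast<short>(k.m)));
    __m128i t2 = _mm_srl_epi16(_mm_sub_epi16(n, t1), _mm_cvtsi32_si128(k.sh1));
    __m128i q  = _mm_srl_epi16(_mm_add_epi16(t1, t2), _mm_cvtsi32_si128(k.sh2));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), q);
#else
    for (std::size_t i = 0; i < 8; ++i) out[i] = div_u16_scalar(in[i], k);
#endif
}
```

The point of the sketch is that every step maps onto a native 16-bit lane operation, so no widening to 32-bit lanes is needed. An AVX2 version would presumably use the 256-bit counterparts of the same intrinsics on sixteen lanes.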


ridiculousfish

Nice, LGTM, thank you!

ridiculousfish · 2022-03-25T21:03:43Z

Huh?


ridiculousfish added a commit that referenced this pull request Feb 12, 2022

Revert "clang-format the main files"

eee14df

This reverts commit 5731cec. This allows #94 to be mergeable again

adbancroft force-pushed the 16bit_avx256_avx128 branch from c34a98b to 9dbd861 Compare February 12, 2022 02:03

ridiculousfish self-requested a review February 12, 2022 04:11

ridiculousfish approved these changes Feb 12, 2022


adbancroft added 2 commits February 11, 2022 23:28

Full implementation of 16-bit vec256

d290f5b

Full implementation of 16-bit vec128

a6090fa

adbancroft force-pushed the 16bit_avx256_avx128 branch from 9dbd861 to a6090fa Compare February 12, 2022 05:31

adbancroft merged commit 1978f27 into ridiculousfish:master Feb 12, 2022

adbancroft deleted the 16bit_avx256_avx128 branch February 12, 2022 14:26

azat mentioned this pull request Dec 11, 2022

Bump libdivide (to gain some new optimizations) ClickHouse/ClickHouse#44132

Merged

azat added a commit to azat/ClickHouse that referenced this pull request Dec 13, 2022

Update libdivide

cdfe62f

This will include at least [1].

[1]: ridiculousfish/libdivide#94

v2: Define LIBDIVIDE_* macros mutually exclusive

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

chenrui333 mentioned this pull request Aug 1, 2024

libdivide 5.1 Homebrew/homebrew-core#179234

Merged
