Releases · DLTcollab/sse2neon · GitHub

25 Dec 20:41

jserv

v1.7.0 Latest

What's Changed

refactor: Add missing ARM64 implementation by @howjmay in #576
test: Build/run with crypto and/or crc by @howjmay in #574
doc: Describe the right coverage of SSE2NEON_PRECISE_MINMAX by @howjmay in #578
refactor: Reimplement _mm_movelh_ps for Arm64 by @howjmay in #579
tests: Cover all immediate numbers by @howjmay in #584
test: Use macro for validating results by @howjmay in #585
Improve precision of _mm_{rsqrt,sqrt,rcp,div}_{ps,ss} conversions by @Cuda-Chen in #580
Fix MSVC compile issues by @toxieainc in #588
Tweak MSVC ifdef guard for _BitScanForward64 by @aqrit in #592
Add notice that NEON handles certain IEEE single-precision values by @Cuda-Chen in #593
Add infinity test in test_mm_{max,min}_{pd,sd} by @Cuda-Chen in #594
Remove Kahan algorithm in _mm_dp_ps by @Cuda-Chen in #597
MSVC support by @anthony-linaro in #596
test: Cover all the valid imm range in tests by @howjmay in #586
Add test running for MSVC to CI by @anthony-linaro in #598
Align result to SSE when input is 0.0f/-0.0f in _mm_rsqrt_{ps,ss} by @Cuda-Chen in #599
fix: Fix exceeding width of type warning by @howjmay in #601
docs: Fix the typos by @howjmay in #603
docs: Fix the typos by @spacemiqote in #605
Fix build for gcc-13 and 32-bit Arm systems by @balister in #609
Fix unused parameters warning by @anakinxc in #610
Fixed gcc strict prototype and other build errors by @mnjdhl in #611
Fix _mm_cmplt_sd and _mm_cmpnlt_sd test cases by @Cuda-Chen in #612
disambiguate vector type to avoid errors depending on lax conversion … by @JoachimSchurig in #614
docs: fix typo failback by @howjmay in #616
Introduce fast and deterministic RNG by @Cuda-Chen in #615
fix: Fix typo nand by @howjmay in #617
fix: Fix MSVC warnings by @howjmay in #604
Add A32 support in CI by @Cuda-Chen in #620
Fix _mm_test_mix_ones_zeros and _mm_testnzc_si128 by @aqrit in #621
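
For readers unfamiliar with the intrinsic reworked in #579, here is a minimal scalar sketch of what _mm_movelh_ps computes according to the Intel intrinsics documentation: the two low lanes of the result come from the first operand, and the two high lanes come from the low half of the second operand. The names vec4f and movelh_ps_model are hypothetical helpers for illustration. This is not the sse2neon implementation, which operates on NEON vector types.

```c
/* Scalar reference model of _mm_movelh_ps.
 * vec4f and movelh_ps_model are hypothetical names used only for
 * illustration; they are not part of sse2neon. */
typedef struct {
    float lane[4]; /* lane[0] is the lowest 32-bit element */
} vec4f;

static vec4f movelh_ps_model(vec4f a, vec4f b)
{
    vec4f r;
    r.lane[0] = a.lane[0]; /* low half of a is kept as-is */
    r.lane[1] = a.lane[1];
    r.lane[2] = b.lane[0]; /* low half of b moves to the high half */
    r.lane[3] = b.lane[1];
    return r;
}
```

A model like this can serve as an oracle when checking a vectorized implementation lane by lane.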

New Contributors

@anthony-linaro made their first contribution in #596
@spacemiqote made their first contribution in #605
@anakinxc made their first contribution in #610
@mnjdhl made their first contribution in #611
@JoachimSchurig made their first contribution in #614

Full Changelog: v1.6.0...v1.7.0

Contributors

balister, aqrit, and 8 other contributors

26 Dec 08:02

jserv

v1.6.0

What's Changed

100% intrinsics coverage for SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2 and AES extension.
Implement _rdtsc by @Cuda-Chen in #532
Improve _mm_srai_epi32 to handle complex arguments by @Developer-Ecosystem-Engineering in #533
Implement _mm_cmpestri and _mm_cmpestrm by @Cuda-Chen in #534
Implement five _mm_cmpestr by @Cuda-Chen in #552
Implement _mm_cmpistri and _mm_cmpistrm by @Cuda-Chen in #553
Implement five _mm_cmpistr by @Cuda-Chen in #555
tests: Fix warnings raised by clang++ by @Cuda-Chen in #540
Exclude _mm_malloc/free definitions on Windows by @invertego in #541
Remove designated initialization of an array by @invertego in #542
Reintroduce ext-based implementations for shift intrinsics by @AymenQ in #543
Improve performance of float-to-integer intrinsics by @AymenQ in #546
Support __builtin_shuffle as an alternative to __builtin_shufflevector by @AymenQ in #545
Improve performance of various intrinsics by @AymenQ in #549
Vectorize _mm_minpos_epu16 by @AymenQ in #551
Align _mm_prefetch behavior to document by @howjmay in #550
Add clang/Windows build by @invertego in #556
Test all valid immediates in _mm_dp_pd by @Cuda-Chen in #557
Optimize _mm_aesenclast_si128 for Arm64 by @howjmay in #561
Implement _mm_aesdec_si128 by @howjmay in #559
Implement _mm_aesdeclast_si128 by @howjmay in #565
Implement _mm_aesimc_si128 by @howjmay in #567
Optimize aeskeygenassist_si128 for Arm64 by @howjmay in #569
Update Intel intrinsics document links by @howjmay in #570
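
As background for the _mm_minpos_epu16 entry (#551), the sketch below is a scalar model of the intrinsic's documented result: the minimum unsigned 16-bit value goes in lane 0, the index of its first occurrence goes in lane 1, and the remaining lanes are zero. The names vec8u16 and minpos_epu16_model are hypothetical. The vectorized code in sse2neon is not shown here and is not assumed to look like this.

```c
#include <stdint.h>

/* Scalar reference model of _mm_minpos_epu16.
 * vec8u16 and minpos_epu16_model are hypothetical names used only for
 * illustration; they are not part of sse2neon. */
typedef struct {
    uint16_t lane[8]; /* lane[0] is the lowest 16-bit element */
} vec8u16;

static vec8u16 minpos_epu16_model(vec8u16 a)
{
    vec8u16 r = {{0}}; /* lanes 2..7 of the result stay zero */
    uint16_t min = a.lane[0];
    uint16_t idx = 0;

    /* Strict '<' keeps the lowest index when the minimum repeats. */
    for (uint16_t i = 1; i < 8; i++) {
        if (a.lane[i] < min) {
            min = a.lane[i];
            idx = i;
        }
    }

    r.lane[0] = min; /* minimum value */
    r.lane[1] = idx; /* position of the minimum (0..7) */
    return r;
}
```

Comparing a vectorized version against a scalar loop like this over random inputs is one straightforward way to validate it.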

New Contributors

@Cuda-Chen made their first contribution in #532
@Developer-Ecosystem-Engineering made their first contribution in #533
@balister made their first contribution in #535
@invertego made their first contribution in #541
@AymenQ made their first contribution in #543

Full Changelog: v1.5.1...v1.6.0

Contributors

balister, AymenQ, and 4 other contributors

02 May 21:56

marktwtn

v1.5.1

What's Changed

fix: Fix division-by-zero error in validateFloatError by @howjmay in #515
Fix compilation with standardized C compilers by @jserv in #516
Fix _mm_storel_epi64 by @andrewevstyukhin in #517
Add support for 32-bit targets on ARMv8 architectures by @jonathanhue in #520
Use CRC and directed rounding intrinsics on A32 by @jonathanhue in #522
fix: Fix alignment in tests by @howjmay in #523

New Contributors

@sleepybishop made their first contribution in #508
@luzpaz made their first contribution in #509
@andrewevstyukhin made their first contribution in #517
@jonathanhue made their first contribution in #520

Full Changelog: v1.5.0...v1.5.1

Contributors

jserv, luzpaz, and 4 other contributors

27 Nov 08:42

marktwtn

v1.5.0

Around 94% of the SSE intrinsics are implemented in this release.
The remaining unimplemented intrinsics are:

Exception related macros
_mm_clflush()
Memory barrier intrinsics
String comparison intrinsics
