v1.6.0
What's Changed
- 100% intrinsics coverage for SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2 and AES extension.
- Implement `_rdtsc` by @Cuda-Chen in #532
- Improve `_mm_srai_epi32` to handle complex arguments by @Developer-Ecosystem-Engineering in #533
- Implement `_mm_cmpestri` and `_mm_cmpestrm` by @Cuda-Chen in #534
- Implement five `_mm_cmpestr` by @Cuda-Chen in #552
- Implement `_mm_cmpistri` and `_mm_cmpistrm` by @Cuda-Chen in #553
- Implement five `_mm_cmpistr` by @Cuda-Chen in #555
- tests: Fix warnings raised by clang++ by @Cuda-Chen in #540
- Exclude `_mm_malloc`/`free` definitions on Windows by @invertego in #541
- Remove designated initialization of an array by @invertego in #542
- Reintroduce `ext`-based implementations for shift intrinsics by @AymenQ in #543
- Improve performance of float-to-integer intrinsics by @AymenQ in #546
- Support `__builtin_shuffle` as an alternative to `__builtin_shufflevector` by @AymenQ in #545
- Improve performance of various intrinsics by @AymenQ in #549
- Vectorize `_mm_minpos_epu16` by @AymenQ in #551
- Align `_mm_prefetch` behavior with the documentation by @howjmay in #550
- Add clang/Windows build by @invertego in #556
- Test all valid immediates in `_mm_dp_pd` by @Cuda-Chen in #557
- Optimize `_mm_aesenclast_si128` for Arm64 by @howjmay in #561
- Implement `_mm_aesdec_si128` by @howjmay in #559
- Implement `_mm_aesdeclast_si128` by @howjmay in #565
- Implement `_mm_aesimc_si128` by @howjmay in #567
- Optimize `aeskeygenassist_si128` for Arm64 by @howjmay in #569
- Update Intel intrinsics document links by @howjmay in #570
New Contributors
- @Cuda-Chen made their first contribution in #532
- @Developer-Ecosystem-Engineering made their first contribution in #533
- @balister made their first contribution in #535
- @invertego made their first contribution in #541
- @AymenQ made their first contribution in #543
Full Changelog: v1.5.1...v1.6.0