v1.3.0

@zbjornson zbjornson released this 16 Nov 02:20
· 31 commits to master since this release

Improved

  • cbb3c9e Replace the cpuid.exe util used to detect CPU features on Windows with init-time CPU dispatch.
  • cbb3c9e Add compiler flags required for AVX2 support on macOS.
  • cbb3c9e Refactor vector code again. (Cleaner, and fixes a performance issue on Linux with 16-bit types.)
  • cbb3c9e Silence harmless warnings from GCC 8.
  • 6b33db7 Modernize JS syntax. (Officially: Node.js v6 is required.)
  • 6cadeea Condense readme.
  • 084dd30 Improve perf in several cases with GCC by aligning loops to 32-byte boundaries.
  • d030c66 Improve perf across the board by unrolling loops 8x.
  • b2ed0fb Improve perf for unaligned arrays.
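
For readers unfamiliar with the term, the init-time CPU dispatch mentioned in the first item can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the names `swap16`, `Dispatch` and `detect` are hypothetical, the "AVX2" kernel is a scalar stand-in, and detection uses the GCC/Clang `__builtin_cpu_supports` builtin on x86 (MSVC would need a different mechanism, such as the `__cpuid` intrinsic).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical kernel signature: swap the two bytes of each 16-bit element in place.
using Swap16Fn = void (*)(std::uint16_t*, std::size_t);

// Portable fallback kernel.
static void swap16_scalar(std::uint16_t* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = static_cast<std::uint16_t>((data[i] << 8) | (data[i] >> 8));
    }
}

// Stand-in for a vectorized kernel. A real one would use AVX2 intrinsics;
// this placeholder only forwards to the scalar version.
static void swap16_avx2_placeholder(std::uint16_t* data, std::size_t n) {
    swap16_scalar(data, n);
}

// The selected kernel plus a label for the instruction set it targets.
struct Dispatch {
    Swap16Fn fn;
    std::string ise;
};

// Query the CPU once and pick the best available kernel.
static Dispatch detect() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        return {swap16_avx2_placeholder, "AVX2"};
    }
#endif
    return {swap16_scalar, "scalar"};
}

// The function-local static is initialized on first use, so detection
// runs exactly once per process rather than on every call.
static const Dispatch& dispatch() {
    static const Dispatch d = detect();
    return d;
}

// Public entry point: forwards to whichever kernel was selected.
void swap16(std::uint16_t* data, std::size_t n) {
    dispatch().fn(data, n);
}
```

Compared with shelling out to a separate detection utility, this approach keeps the check inside the process and pays for it only once.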

Added

  • cbb3c9e Export the instruction set extension (ISE) used ("SSSE3", "AVX2", "AVX512" or "NEON").
  • 6b4a168 AVX512 implementation. This is disabled by default because it is slower than the AVX2 version.
  • 4c00148 ARM NEON implementation.