`ensure_simd`: compile-time check whether NEON or AVX2 is available

Many of my tools use AVX2 or NEON instructions for performance reasons. While NEON is automatically enabled on aarch64 architectures, AVX2 is not enabled by default on x64, even though your system is very likely to support it.

This library does a compile-time check that the target architecture indeed supports either NEON or AVX2 instructions. This is especially relevant for CI builds, where it is easy to mis-configure the build flags and accidentally build a binary without AVX2 support.

In case the AVX2 feature is not enabled on x64, crates like wide and the portable-simd feature will automatically fall back to scalar or 128-bit SIMD instructions, which are less efficient than the intended 256-bit AVX2 instructions. Thus, this crate ensures that compiled binaries actually use the intended fast-path.

If you intentionally target x86 machines without AVX2 support, the check can be manually disabled by enabling the scalar feature. Then, non-AVX2 fallbacks will be used.
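To illustrate what such a compile-time check can look at, here is a minimal sketch that reads the `target_feature` flags the compiler was given. It is not this crate's implementation: the names `CompiledSimd` and `compiled_simd` are hypothetical, and the sketch only reports the level rather than failing the build.

```rust
/// Instruction-set level the compiler was allowed to assume for this build.
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompiledSimd {
    Avx2,
    Neon,
    None,
}

/// Resolved entirely at compile time from the `target_feature` cfg flags,
/// which are set by e.g. `-C target-cpu=native` or `-C target-cpu=x86-64-v3`.
pub const fn compiled_simd() -> CompiledSimd {
    if cfg!(all(target_arch = "x86_64", target_feature = "avx2")) {
        CompiledSimd::Avx2
    } else if cfg!(all(target_arch = "aarch64", target_feature = "neon")) {
        CompiledSimd::Neon
    } else {
        CompiledSimd::None
    }
}

// A crate that wants a hard build failure instead of a value could use a
// pattern like the following (shown commented out so the sketch compiles
// with any flags):
//
// #[cfg(all(target_arch = "x86_64", not(target_feature = "avx2")))]
// compile_error!("AVX2 is not enabled; set -C target-cpu=x86-64-v3 or similar");
```

On a default x64 build without extra `RUSTFLAGS`, `compiled_simd()` would be expected to return `CompiledSimd::None`, which is exactly the silent-fallback situation described above.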

The ensure_simd function can be used at the start of main() to do a run-time check that the CPU that is running the binary actually supports AVX2 instructions.
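As a rough idea of what a run-time check involves, the sketch below uses the standard library's feature-detection macros. It is an assumption-laden illustration, not the code behind `ensure_simd`: the helper names `cpu_supports_simd` and `require_simd` are hypothetical, and the crate's own function may behave differently (for example in how it reports failure).

```rust
/// True if the CPU running this binary supports AVX2 (x86_64).
#[cfg(target_arch = "x86_64")]
pub fn cpu_supports_simd() -> bool {
    std::arch::is_x86_feature_detected!("avx2")
}

/// True if the CPU running this binary supports NEON (aarch64).
#[cfg(target_arch = "aarch64")]
pub fn cpu_supports_simd() -> bool {
    std::arch::is_aarch64_feature_detected!("neon")
}

/// Other architectures: this sketch simply reports no SIMD support.
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
pub fn cpu_supports_simd() -> bool {
    false
}

/// Hypothetical helper: returns an error message instead of letting the
/// program hit an illegal-instruction fault later on.
pub fn require_simd() -> Result<(), String> {
    if cpu_supports_simd() {
        Ok(())
    } else {
        Err("this CPU lacks the SIMD instructions the binary was built for".to_string())
    }
}
```

A binary could call `require_simd()` as the first statement of `main()` and exit with the returned message on error, so that users of an unsupported CPU see a readable explanation rather than a crash.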

This blog post contains some more background.

Installing a binary using SIMD instructions

On aarch64, NEON is always available, and you should not run into any issues -- cargo install <tool> should just work.

On x64, you will need to manually instruct cargo to use the instruction sets available on your architecture:

RUSTFLAGS="-C target-cpu=native" cargo install <tool>

Alternatively, if you prefer a more portable binary (e.g. in case the build machine supports AVX512 but you plan to copy the binary to less fancy machines), do:

RUSTFLAGS="-C target-cpu=x86-64-v3" cargo install <tool>

If your machine is very old (>10 years) and does not support x86-64-v3 (wikipedia), and thus has no AVX2, you can also explicitly ask to build without AVX2 instructions, but this will give reduced performance:

cargo install <tool> -F scalar

Distributing binaries using SIMD instructions

For maximal performance, we recommend using target-cpu=native in the repository-local configuration:

# .cargo/config.toml
[build]
# By default, we want maximum performance rather than portability.
rustflags = ["-C", "target-cpu=native"]

But for CI builds that produce distributed binaries (for GitHub releases, bioconda, pypi, ...), we instead recommend more conservative defaults:

# .cargo/config-portable.toml
[target.'cfg(target_arch="x86_64")']
# x86-64-v2 does not have AVX2, but we need that.
# x86-64-v4 has AVX512 which we explicitly do not include for portability.
rustflags = ["-C", "target-cpu=x86-64-v3"]

[target.'cfg(all(target_arch="aarch64", target_os="macos"))']
# For aarch64 macos builds, specifically target M1 rather than generic aarch64.
rustflags = ["-C", "target-cpu=apple-a14"]

Then, in your workflow configuration, run mv .cargo/config-portable.toml .cargo/config.toml before invoking cargo build.

The blog post links some examples for distributing to GitHub releases, bioconda, and pypi.
