Configuring Compilation for Maximum Performance

James Bradbury edited this page Jan 25, 2022 · 1 revision

Our binary releases all target 64-bit Intel architectures with SSE extensions enabled. This strikes a workable balance between supporting a wide range of machines and enjoying some of the performance benefits of newer CPU features. If you have a newer machine that supports additional instruction sets like AVX, AVX2, FMA and so forth, you may wish to take advantage of these. Likewise, you may wish to build for an architecture we don't directly support, like 32-bit Intel or ARM.
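Before choosing flags, it can help to see which of these instruction sets your CPU actually reports. The following is a minimal sketch for Linux only, assuming /proc/cpuinfo exposes a "flags" line (as it does on x86 Linux); the helper name has_cpu_flag and the list of features checked are illustrative choices, not part of our build system.

```shell
#!/bin/sh
# Hypothetical helper: succeed if feature $1 appears as a whole word
# in the space-separated flag list $2.
has_cpu_flag() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Linux-only assumption: /proc/cpuinfo has a "flags" line listing CPU features.
if [ -r /proc/cpuinfo ]; then
    flags="$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
    for feature in sse4_2 avx avx2 fma; do
        if has_cpu_flag "$feature" "$flags"; then
            echo "$feature: yes"
        else
            echo "$feature: no"
        fi
    done
fi
```

On machines without a "flags" line (for example ARM Linux), every feature will be reported as "no", so treat the result as a hint rather than proof.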

Both GCC and Clang accept a native setting that enables the optimal settings for the CPU you are building on, giving possible performance benefits at the cost of portability. You can pass this to CMake via the CMAKE_CXX_FLAGS variable or, equivalently, set the CXXFLAGS environment variable in your shell before running CMake. On x64 this could look like:

cmake -DCMAKE_CXX_FLAGS=-march=native <your other stuff> ..

On ARM, you might need to use -mcpu rather than -march:

cmake -DCMAKE_CXX_FLAGS=-mcpu=native <your other stuff> ..
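If you build on both kinds of machine, the choice between the two flags can be scripted. This is only a sketch: native_flag_for_arch is a hypothetical helper, and it assumes that any machine name reported by uname -m starting with "arm", or equal to "aarch64", should get -mcpu=native. It prints the commands instead of running them, so you can inspect them and add your other options.

```shell
#!/bin/sh
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the "native" tuning flag described above.
native_flag_for_arch() {
    case "$1" in
        arm*|aarch64) echo "-mcpu=native" ;;
        *)            echo "-march=native" ;;
    esac
}

flag="$(native_flag_for_arch "$(uname -m)")"

# Form 1: pass the flag to CMake directly.
echo "cmake -DCMAKE_CXX_FLAGS=$flag .."

# Form 2: the equivalent using the CXXFLAGS environment variable.
echo "CXXFLAGS=$flag cmake .."
```

Note that CMake picks up CXXFLAGS when it first configures a build directory, so the second form is best used with a fresh build folder.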

MSVC doesn't have an equivalent to this handy feature, but on 64-bit systems you can enable specific extensions via the /arch flag (reference).
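As a rough illustration of the MSVC route, the sketch below builds the /arch option and prints the resulting CMake command. The helper msvc_arch_flag is hypothetical, and it only accepts AVX, AVX2 and AVX512, which are values we are confident exist for 64-bit targets; consult the MSVC reference for the full list supported by your compiler version.

```shell
#!/bin/sh
# Hypothetical helper: build the MSVC /arch option for an extension name.
# Only a conservative subset of 64-bit values is accepted here (an assumption
# of this sketch, not a complete list).
msvc_arch_flag() {
    case "$1" in
        AVX|AVX2|AVX512) echo "/arch:$1" ;;
        *) echo "unsupported extension: $1" >&2; return 1 ;;
    esac
}

# Print the command for inspection rather than running it.
flag="$(msvc_arch_flag AVX2)" && echo "cmake -DCMAKE_CXX_FLAGS=$flag .."
```

Unlike the native setting, a binary built this way will fail on any CPU lacking the chosen extension, so pick the lowest level your target machines share.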

To build natively on newer Apple Silicon machines, you may need to explicitly pass -DCMAKE_OSX_ARCHITECTURES="x86_64;arm64" (which should produce a universal (fat) binary).
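To confirm that a build really came out universal, you can inspect the result with lipo, which ships with Apple's developer tools. The sketch below assumes you pass the path to one of your built binaries as the first argument; is_universal is a hypothetical helper that checks an architecture list such as the one printed by lipo -archs.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the arch list $1 (e.g. the output of
# `lipo -archs <binary>`) names both x86_64 and arm64.
is_universal() {
    case " $1 " in *" x86_64 "*) ;; *) return 1 ;; esac
    case " $1 " in *" arm64 "*)  ;; *) return 1 ;; esac
    return 0
}

# Assumption: the first argument is the path to a binary your build produced.
binary="${1:-}"
if [ -n "$binary" ] && command -v lipo >/dev/null 2>&1; then
    if is_universal "$(lipo -archs "$binary")"; then
        echo "$binary: universal (x86_64 + arm64)"
    else
        echo "$binary: not universal"
    fi
fi
```

Run without an argument, or on a system without lipo, the script does nothing.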

Similarly, to build for 32-bit macOS, you might need to pass -DCMAKE_OSX_ARCHITECTURES="x86;x86_64" (Apple's tools usually name the 32-bit Intel architecture i386, so try that if x86 is rejected), although we can't make any promises that 32-bit builds will always function (and you'll need an Xcode version < 10).