Rust issue 3499.
A vector type is defined by the element type and the number of elements.
Rust recognizes 10 machine types: i8, i16, i32, i64, u8, u16, u32, u64, f32, f64.
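As a rough illustration of "element type plus number of elements", the sketch below models a vector type with a const-generic wrapper around an array. The names `Vector`, `F32x4`, and `I16x8` are hypothetical and chosen only for this example; they are not an existing Rust API, and nothing here guarantees that the compiler emits SIMD instructions.

```rust
/// Hypothetical vector type: fully described by its element type `T`
/// and its number of elements `N`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vector<T, const N: usize>(pub [T; N]);

impl<T, const N: usize> Vector<T, N> {
    /// Number of elements (lanes) in the vector.
    pub const fn lanes(&self) -> usize {
        N
    }

    /// Total size of the vector in bytes (element size times lane count).
    pub const fn size_in_bytes() -> usize {
        std::mem::size_of::<[T; N]>()
    }
}

// Two illustrative aliases; both happen to be 128 bits wide.
pub type F32x4 = Vector<f32, 4>;
pub type I16x8 = Vector<i16, 8>;
```

With these definitions, `F32x4::size_in_bytes()` and `I16x8::size_in_bytes()` both evaluate to 16, which matches the 128-bit vector width discussed for SSE and NEON.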
OpenCL defines Built-in Vector Data Types intended to be compiled to SIMD.
Possible element types are char, short, int, long, uchar, ushort, uint, ulong, float, double, which map one to one to Rust machine types.
The possible numbers of elements are 2, 3, 4, 8, and 16.
3-element vector types were introduced in OpenCL 1.1.
All possible combinations of element types and number of elements are
defined. While OpenCL defines bool and half scalar types,
they are not allowed as element types of vector types at the moment,
although they are reserved for future extensions.
LLVM has a vector type intended to be compiled to SIMD.
All integer and floating-point types can be element types. Apart from
Rust's 10 machine types, these include i1 and half.
There is no restriction on the number of elements.
The Intel Software Developer's Manual (Order Number 253665) provides information about SSE SIMD.
According to the Software Developer's Manual section Packed SIMD Data Types, there are 64-bit and 128-bit vector types. (The section does not mention it, but AVX also defines 256-bit vector types.) 64-bit vectors can hold 8 packed byte integers, 4 packed word integers, or 2 packed doubleword integers. (No floating-point values are available for 64-bit vectors.) 128-bit vectors can hold 16 packed byte integers, 8 packed word integers, 4 packed doubleword integers, 2 packed quadword integers, 4 packed single-precision floating-point values, or 2 packed double-precision floating-point values. Some integer operations are available in both signed and unsigned versions, but some are not: for example, there is no unsigned integer comparison.
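To make the missing unsigned comparison concrete, here is a minimal sketch of a common workaround: flip the sign bit of every byte, then use the signed comparison. The function name `gt_u8x16` is hypothetical. On x86_64 the sketch uses the SSE2 intrinsics from `std::arch`; on other targets it falls back to a scalar loop that applies the same trick, which is an assumption made so the example compiles everywhere.

```rust
/// Lane-wise unsigned `a > b` on 16 bytes. Each output byte is 0xFF where
/// the comparison holds and 0x00 otherwise.
#[cfg(target_arch = "x86_64")]
pub fn gt_u8x16(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    use std::arch::x86_64::*;
    let mut out = [0u8; 16];
    // SSE2 is part of the x86_64 baseline, so no runtime check is needed.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        // 0x80 in every byte: XOR with it maps unsigned order onto signed order.
        let bias = _mm_set1_epi8(i8::MIN);
        let mask = _mm_cmpgt_epi8(_mm_xor_si128(va, bias), _mm_xor_si128(vb, bias));
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, mask);
    }
    out
}

/// Scalar fallback for other architectures, using the same sign-bit flip.
#[cfg(not(target_arch = "x86_64"))]
pub fn gt_u8x16(a: [u8; 16], b: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        let x = (a[i] ^ 0x80) as i8;
        let y = (b[i] ^ 0x80) as i8;
        out[i] = if x > y { 0xFF } else { 0x00 };
    }
    out
}
```

For example, 200 > 100 holds for unsigned bytes, but a plain signed comparison would see 200 as -56 and report the opposite; after the XOR the operands become 72 and -28, so the signed comparison gives the intended answer.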
The ARM publications RVCT Assembler Guide (ARM DUI 0204) and RVCT Compiler Reference Guide (ARM DUI 0348) provide information about NEON SIMD.
According to the Assembler Guide section NEON and VFP data types,
NEON data types are S8, S16, S32, S64, U8, U16, U32, U64, F16, F32, P8, P16.
F64 is not available and F16 support is optional.
P8 and P16 are polynomials over GF(2).
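For readers unfamiliar with polynomial types: each bit is a coefficient in GF(2), so addition is XOR and multiplication is carry-less. The sketch below shows this on 8-bit polynomials in plain scalar Rust. The function names are hypothetical, and the code illustrates the arithmetic only; it does not use NEON.

```rust
/// Add two 8-bit polynomials over GF(2): coefficients add modulo 2,
/// which is a bitwise XOR.
pub fn poly8_add(a: u8, b: u8) -> u8 {
    a ^ b
}

/// Multiply two 8-bit polynomials over GF(2) without reduction.
/// The product of two polynomials of degree < 8 has degree < 15,
/// so it fits in 16 bits.
pub fn poly8_mul_wide(a: u8, b: u8) -> u16 {
    let mut acc = 0u16;
    for i in 0..8 {
        // If the coefficient of x^i in `b` is 1, add (XOR) `a * x^i`.
        if (b >> i) & 1 == 1 {
            acc ^= (a as u16) << i;
        }
    }
    acc
}
```

As a check, (x + 1)(x + 1) = x^2 + 1 over GF(2), so `poly8_mul_wide(0b11, 0b11)` yields `0b101`, whereas ordinary integer multiplication of 3 by 3 would give 9.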
Vector types need to be either 64 bits or 128 bits long.
According to the Compiler Reference Guide section Vector data types,
these types are named like int8x8_t, int8x16_t, uint8x8_t, uint8x16_t, float32x2_t, float32x4_t in C.
The number of elements can be 1, as in int64x1_t.
XXX Research AltiVec.
SIMD architectures usually define a C API (compiler intrinsics) in
addition to instructions. The C API is provided by headers such as
emmintrin.h for SSE2, arm_neon.h for NEON, and altivec.h for
AltiVec. These APIs are architecture-specific, but common to all
compilers.
Compilers can also provide language extensions common to all architectures. GCC provides the vector_size attribute, and Clang provides the ext_vector_type attribute. XXX Research MSVC.
There is the issue of what to do when SIMD is not available. GCC provides indirect functions, declared with the ifunc attribute, which can select an optimized version at load time. GCC 4.8 also introduced function multiversioning for C++, which uses the same mechanism.
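The same idea can be sketched in Rust with runtime feature detection. This is not GCC's ifunc mechanism: selection happens on the first call rather than at load time, and the function names (`sum`, `select_sum`, `sum_avx`) are hypothetical. The sketch assumes a Rust toolchain that provides `std::sync::OnceLock` and uses the `is_x86_feature_detected!` macro from the standard library.

```rust
use std::sync::OnceLock;

type SumFn = fn(&[f32]) -> f32;

/// Portable fallback used when no suitable SIMD extension is detected.
pub fn sum_scalar(xs: &[f32]) -> f32 {
    xs.iter().sum()
}

/// AVX version: accumulates 8 floats at a time, then adds the leftovers.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn sum_avx(xs: &[f32]) -> f32 {
    use std::arch::x86_64::*;
    let chunks = xs.chunks_exact(8);
    let tail: f32 = chunks.remainder().iter().sum();
    let mut lanes = [0.0f32; 8];
    unsafe {
        let mut acc = _mm256_setzero_ps();
        for c in chunks {
            acc = _mm256_add_ps(acc, _mm256_loadu_ps(c.as_ptr()));
        }
        _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    }
    lanes.iter().sum::<f32>() + tail
}

/// Safe wrapper with the common signature; only selected after detection.
#[cfg(target_arch = "x86_64")]
fn sum_avx_dispatch(xs: &[f32]) -> f32 {
    unsafe { sum_avx(xs) }
}

/// Pick an implementation based on what the running CPU supports.
pub fn select_sum() -> SumFn {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            return sum_avx_dispatch;
        }
    }
    sum_scalar
}

/// Public entry point: resolves the implementation once, then reuses it.
pub fn sum(xs: &[f32]) -> f32 {
    static IMPL: OnceLock<SumFn> = OnceLock::new();
    (IMPL.get_or_init(select_sum))(xs)
}
```

Note that the SIMD and scalar versions add the elements in a different order, so results may differ slightly for inputs where floating-point rounding matters.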
AMD developed SSEPlus, which emulates newer SSE instructions on older architectures. The interface provided is the same as compiler intrinsics.
Johann Wolfgang Goethe-Universität Frankfurt am Main's
Department of High Performance Computer Architecture develops
software to help analyze experimental data from CERN.
One such package is Vc, a C++ library to help the use of SIMD.
It defines C++ classes int_v, uint_v, float_v, double_v,
which provide overloaded C++ operators.
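A comparable interface could be expressed in Rust through the `std::ops` traits. The sketch below is only an analogy: the type `FloatV` is hypothetical, it is fixed at 4 lanes for simplicity, and it is not Vc's API. The loops are plain scalar code that a compiler may or may not vectorize.

```rust
use std::ops::{Add, Mul};

/// Hypothetical 4-lane float vector with lane-wise operators.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct FloatV(pub [f32; 4]);

impl Add for FloatV {
    type Output = FloatV;

    /// Lane-wise addition.
    fn add(self, rhs: FloatV) -> FloatV {
        let mut out = [0.0f32; 4];
        for i in 0..4 {
            out[i] = self.0[i] + rhs.0[i];
        }
        FloatV(out)
    }
}

impl Mul for FloatV {
    type Output = FloatV;

    /// Lane-wise multiplication.
    fn mul(self, rhs: FloatV) -> FloatV {
        let mut out = [0.0f32; 4];
        for i in 0..4 {
            out[i] = self.0[i] * rhs.0[i];
        }
        FloatV(out)
    }
}
```

With these impls, user code can write `a + b` or `a * b` on whole vectors instead of calling architecture-specific intrinsics directly.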
Microsoft published XNAMath as part of the DirectX SDK. XNAMath was later renamed to DirectXMath and published as part of the Windows SDK. Since Windows 8 runs on both Intel and ARM architectures and Xbox 360 uses PowerPC, DirectXMath abstracts over SSE, NEON, and AltiVec.
D has core.simd.
Mono provides Mono.Simd.
GHC SIMD project.
Haskell Beats C Using Generalized Stream Fusion paper.