translate-c: Add support for vector expressions #8389

ehaas · 2021-03-29T17:11:04Z

Includes vector types, __builtin_shufflevector, and __builtin_convertvector

ehaas · 2021-03-29T17:15:55Z

Open questions:

Some vector types require an alignment e.g. from xmmintrin.h we have: typedef float __m128 __attribute__((__vector_size__(16), __aligned__(16))); - is there a way to require a vector to have a certain alignment?
Most real-world vector code will use intrinsics e.g. __builtin_ia32_packsswb - is there any way to use these from Zig? Currently any code that uses these will get demoted, so this PR shouldn't result in any regressions there, but it may limit the usefulness of translated vector expressions somewhat.

Includes vector types, __builtin_shufflevector, and __builtin_convertvector

andrewrk · 2021-04-06T18:16:00Z

is there a way to require a vector to have a certain alignment?

There is not. However, all Zig vector types have alignment 16, so I think we can safely ignore that value if it is <= 16. And I suspect that we will not encounter a higher alignment in practice (please correct me if I'm wrong).

Most real-world vector code will use intrinsics e.g. __builtin_ia32_packsswb - is there any way to use these from Zig?

I think we should be able to translate these function calls into zig SIMD expressions on vectors, right? I'm not sure exactly what that function does but if it was for example vector addition then we could translate to e.g. a + b.

I see that you are already doing that in this PR with @shuffle though, so I think I did not quite understand your question fully.

andrewrk

This is pretty slick. Nice work!

ehaas · 2021-04-06T20:52:31Z

I found an issue talking about intrinsics: #7702

__builtin_ia32_packsswb was just a random example I picked - if you look in mmintrin.h you'll see a function _mm_packs_pi16 that calls it. It corresponds to an MMX instruction PACKSSWB that turns 8 signed 16-bit integers into 8 signed 8-bit integers with saturation (anything greater than 127 becomes 127, less than -128 becomes -128 - see https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_m_packsswb)

There are a lot of them - just for x86 we have: https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/x86-Built-in-Functions.html#x86-Built-in-Functions

In theory all of them could be emulated in pure Zig but that would be a lot of work and pretty error prone and possibly miss out on the ability to compile them down to a single instruction depending on how smart the backend is. The proposal I linked talked about possibly having a way to forward them to the backend but I have no idea how difficult that would be to implement.

I think in general user code would not directly use these builtins; using our example they would call _mm_packs_pi16 which is implemented in some Zig stdlib location. That implementation would perhaps look at the backend (llvm? pure Zig?) and the target (mmx enabled?) and either forward it to the backend, emulate it in Zig, or be a @compileError()

andrewrk · 2021-04-07T02:49:09Z

The plan for these things is to have builtin functions for all of them. We have an entire @ namespace for compiler-provided functions, so there's no concern about cluttering up the global namespace. Pretty much anything that has a dedicated hardware instruction on some backend, will have either syntax support or support via builtin functions. Then when a given target ISA has the instruction, the backend lowers it to machine code, and when the target ISA does not have the instruction, the backend lowers it to shim code (perhaps via a call to compiler_rt).

ehaas · 2021-04-07T03:30:27Z

Cool, so if I'm understanding correctly, for the purposes of translate-c we might have something like this in std/c/builtins.zig:

pub fn __builtin_ia32_packsswb(a: Vector(4, i16), b: Vector(4, i16)) callconv(.Inline) Vector(8, i8) {
    return @"intrinsic.mmx.packsswb"(a, b);
}

and then translate-c would not have to do any special work, it would just notice that the function is defined and emit a call to it. Alternatively translate-c could know how to map certain __builtin_ functions to the right intrinsic and then just directly emit a call to the @ function.

On the topic of alignment - some AVX stuff has higher than 16 byte alignment. avx512fintrin.h and avx512bf16intrin.h define some vector types with 32 or 64 byte alignment, and avxintrin.h has some 32-byte aligned types. I could probably check if the vector has alignment greater than 16 and demote it in that case.

daurnimator · 2021-04-07T03:55:55Z

@ehaas I think I'd rather see it somewhere in the zig standard library as conditional comptime assembly, e.g.

fn saturatingTruncate(comptime destType: T, src: anytype) T {
  if (!@isComptime() // see https://github.com/ziglang/zig/issues/868
    and destType == Vector(i8,8)
    and @TypeOf(src) == Vector(i16,8)
    and std.builtin.cpu.arch == .x86
    and std.builtin.cpu.features.has(.mmx)
  ) {
    return asm("packsswb ...." : src); // or alternatively, this could be the LLVM intrinsic? see https://github.com/ziglang/zig/issues/4466 or https://github.com/ziglang/zig/issues/2291
  }
  
  // implementation using primitive addition/etc.
}

andrewrk · 2021-04-07T04:03:01Z

Cool, so if I'm understanding correctly...

Yep, looks like we're on the same page. Although I'd probably want to name it something less fancy, such as @mmx_packsswb, and as you mentioned, have the translate-c code directly emit calls to it, rather than a wrapper function.

On the topic of alignment

Interesting- I'm glad you brought this up. I think we might need to consider adjusting zig's type system to take into account vector alignment. Would you mind opening a new issue for this topic?

translate-c: Add support for vector expressions

5937a33

Includes vector types, __builtin_shufflevector, and __builtin_convertvector

ehaas force-pushed the translate-c-vectors branch from 7cc1157 to 5937a33 Compare April 6, 2021 03:57

ehaas marked this pull request as ready for review April 6, 2021 15:22

andrewrk approved these changes Apr 6, 2021

View reviewed changes

andrewrk merged commit 8de14a9 into ziglang:master Apr 6, 2021

ehaas mentioned this pull request Apr 7, 2021

Some vector types require alignment greater than 16 bytes #8452

Open

ehaas deleted the translate-c-vectors branch July 29, 2022 04:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

translate-c: Add support for vector expressions #8389

translate-c: Add support for vector expressions #8389

ehaas commented Mar 29, 2021

ehaas commented Mar 29, 2021

andrewrk commented Apr 6, 2021

andrewrk left a comment

ehaas commented Apr 6, 2021

andrewrk commented Apr 7, 2021

ehaas commented Apr 7, 2021

daurnimator commented Apr 7, 2021

andrewrk commented Apr 7, 2021

translate-c: Add support for vector expressions #8389

translate-c: Add support for vector expressions #8389

Conversation

ehaas commented Mar 29, 2021

ehaas commented Mar 29, 2021

andrewrk commented Apr 6, 2021

andrewrk left a comment

Choose a reason for hiding this comment

ehaas commented Apr 6, 2021

andrewrk commented Apr 7, 2021

ehaas commented Apr 7, 2021

daurnimator commented Apr 7, 2021

andrewrk commented Apr 7, 2021