
Proposal add @compilerInternal and move most intrinsics to libraries #4466

Closed
momumi opened this issue Feb 15, 2020 · 8 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
momumi (Contributor) commented Feb 15, 2020

The idea is to add a @compilerInternal builtin to implement compiler specific features in a standardized way. Here's an example of what it might look like:

@compilerInternal(compiler_name, <compiler specific extensions>);

// a @sin() intrinsic might look like this:
@compilerInternal("zig", "llvm.sin.f32", .{val});

// The compiler will compare `compiler_name` with its own name, and if it doesn't
// match it will produce an error.
//
// So to check which compiler is used, do something like:
if (@compilerInternal("zig")) {
   // compiler specific implementation
   @compilerInternal("zig", .....);
} else {
   // fall back for other compilers
}

The behavior of @compilerInternal is compiler-dependent; the only requirements are:

  • the first argument is the compiler name;
  • the arguments must be valid Zig syntax.

Here's why I think this feature will be useful:

Allow code to take advantage of LLVM intrinsics when needed

Provides a way to use LLVM intrinsics #2291 without making LLVM a core part of the language

Allows intrinsics to be implemented in a separate library

With @compilerInternal it should be possible to move most builtin intrinsics to a separate intrinsic library. Then instead of using something like @sin, use std.math.sin. Here's why I think this is a good idea:

  1. I don't want to have the option to choose between @sin() and std.math.sin(); I just want to use std.math.sin and get the best implementation.

  2. I think having builtins in the default namespace makes them a core part of the language, but a lot of them are highly ISA specific. For example, most embedded systems are lucky to have hardware floating point, so stuff like @sin seems out of place as a core part of the language.

  3. One design goal of Zig is to make function calls obvious in code, but @sin is magic. Does it use a hardware instruction? Does it call a software implementation? Which one (std.math.sin, libc sin)? Where is the source code? Using std.math.sin is less magical.

  4. Another reason to move intrinsics to libraries is it's easier to switch out the implementation. For example, if I use const sin = std.math.sin; it's much easier to later change the implementation to const sin = mySin;. I can't do this as easily with @sin.

  5. The intrinsics library would always resolve to the hardware implementation and produce a compiler error for unsupported architectures. This makes the programmer's intent clearer.

  6. Makes alternative implementations of Zig easier to write since there are fewer builtins in the language.
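To make point 1 concrete, here is a sketch of how std.math.sin might be written under this proposal. Both @compilerInternal and softwareSin are hypothetical: the builtin does not exist today, and softwareSin stands in for a portable software implementation.

```zig
// Hypothetical: std.math.sin under this proposal.
pub fn sin(x: f32) f32 {
    if (@compilerInternal("zig")) {
        // Fast path for the reference compiler: lower to the LLVM intrinsic.
        return @compilerInternal("zig", "llvm.sin.f32", .{x});
    } else {
        // Portable fallback for other Zig compilers.
        return softwareSin(x);
    }
}
```

User code would simply call std.math.sin and get whichever path the compiler supports.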

ISA specific extensions

Gives alternative compiler implementations/backends an official way to extend the language without breaking standard Zig code. The Keil 8051 C compiler is an example of how ISA-specific extensions can be done wrong. It added new keywords to support the 8051, but chose words like code and data. If you used those as variable names, the official "workaround" was to rename all your variables.

So for example with @compilerInternal this might look something like:

fn Flash(comptime T: type) type {
   return @compilerInternal("keil", "code", T);
}

fn Sfr(comptime addr: u8) type {
   return @compilerInternal("keil", "sfr", addr);
}

// Serial number stored in device Flash
const serial_number: Flash(u32) = 0x76543210;

// TCON is a special function register
var TCON: Sfr(0x88) = undefined;

// etc...

(This is just an example; in reality Zig would probably handle this use case with address spaces and/or link sections.)

momumi (Contributor, Author) commented Feb 15, 2020

Here's a list of builtins that could potentially be moved to libraries with this change:

  • @atomicLoad
  • @atomicRmw
  • @atomicStore
  • @byteSwap
  • @bitReverse
  • @cmpxchgStrong
  • @cmpxchgWeak
  • @clz
  • @ctz
  • @divExact
  • @divFloor
  • @divTrunc
  • @fence
  • @memcpy
  • @memset
  • @mod
  • @popCount
  • @rem
  • @shlExact
  • @shrExact
  • @sqrt
  • @sin
  • @cos
  • @exp
  • @exp2
  • @log
  • @log2
  • @log10
  • @fabs
  • @floor
  • @ceil
  • @round
  • @Vector
  • @shuffle
  • @splat
  • @mulAdd

I chose these based on one or both of these criteria:

  • Their behavior is strongly ISA dependent.
  • It is possible to implement them with other primitives.

It might be a good idea to handle @Vector operations in std.simd since they have quite different characteristics from primitive types like ints and floats.
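As a sketch of what a std.simd home for these could look like (assuming recent Zig syntax, where @splat infers the vector type from the result location), a library shim might simply wrap today's builtins so user code never names them directly:

```zig
// Hypothetical std.simd-style shim: user code calls a library function
// instead of the @splat builtin.
pub fn splat(comptime len: comptime_int, value: anytype) @Vector(len, @TypeOf(value)) {
    return @splat(value);
}

// usage: const v = splat(4, @as(f32, 1.5)); // a @Vector(4, f32) of 1.5s
```

If the builtins ever moved behind @compilerInternal, only this shim would need to change.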

BarabasGitHub (Contributor) commented Feb 15, 2020

> It might be a good idea to handle @Vector operations in std.simd since they have quite different characteristics from primitive types like ints and floats.

Funny that you mention this. I was just thinking about this last night. My question was: "Can't we just 'cImport' (or something similar) all the intrinsics for each platform and make them available in a specific std.intrinsics library?"

Your proposal is very much in the same direction and obviously more thought out, so I'm supportive of this idea. I'm not sure whether it should be compiler-internal or something else, as what is available depends more on the target than on the compiler, I guess. I imagine this could also make the compiler simpler, because it would have far fewer builtins; instead most, if not all, could move to library code.

In any case it would be good to have access to all intrinsics one way or another, instead of having fancy built-in constructs in the language for every specific thing, which will inevitably leave useful things out.

SpexGuy (Contributor) commented Feb 15, 2020

The separation between standards and implementation in C++ has led to great difficulty when porting codebases between compilers. Something I really like about Zig is that intrinsics are mandated as part of the language spec and not compiler extensions. This puts the implementation burden on the compiler implementer, instead of every programmer using the compiler. I think there's room in Zig for something like a @compilerInternal intrinsic, but we shouldn't necessarily use it as a replacement for existing intrinsics.

@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Feb 18, 2020
@andrewrk andrewrk added this to the 0.7.0 milestone Feb 18, 2020
shawnl (Contributor) commented Mar 13, 2020

The proposal as written is not sufficient to replace all the intrinsics it claims to. Specifically, the intrinsics that accept integers accept any LLVM integer, from 1 bit to 65535 bits.

shawnl (Contributor) commented May 9, 2020

> The intrinsics library would always resolve to the hardware implementation and produce a compiler error for unsupported architectures. This makes the programmer's intent clearer.

Hardware details are totally irrelevant to a language. What should be guaranteed is IEEE-754 (modulo subnormals, and <fenv.h>).

momumi (Contributor, Author) commented Apr 15, 2021

> The separation between standards and implementation in C++ has led to great difficulty when porting codebases between compilers. Something I really like about Zig is that intrinsics are mandated as part of the language spec and not compiler extensions. This puts the implementation burden on the compiler implementer, instead of every programmer using the compiler. I think there's room in Zig for something like a @compilerInternal intrinsic, but we shouldn't necessarily use it as a replacement for existing intrinsics.

Yes, I agree compiler-specific behavior is bad and should be avoided as much as possible. I've dealt with crazy macro hackery to work around compiler differences before, and it's no fun. The intent of this proposal is to help with this.

This feature would primarily be used by compiler authors inside the standard library and should ideally never appear in 99.9% of user code (I hoped the name compilerInternal would reflect this). Most users would only use this feature indirectly, through std.intrinsics, std.simd, etc.

However, new compiler implementations likely need to implement new ISA-specific behavior (besides performance, what's the point of writing a new compiler otherwise?). @compilerInternal gives them a way to implement this without worrying about clobbering other compiler implementations and without having to wait for the Zig standard to add support for their new architecture. Then, ideally, some time in the future the ISA feature becomes widespread, gets adopted into std.intrinsics, and the use of @compilerInternal is deprecated in favor of the new standard-defined way.

tecanec (Contributor) commented Apr 23, 2021

I'm interested in having a built-in that provides intrinsics. There are some great x86 instructions that I want to use but which aren't supported by Zig. I'm currently experimenting with fast ways to decode variable-length integers, and most of my solutions benefit from certain unsupported instructions like BZHI, BEXTR and PEXT. My BZHI-based, assembly-written solution is slightly but noticeably faster than the Zig version (the speed ratio between the two is approximately 7/6), and I think the latter's lack of BZHI is to blame. Of course, #7702 serves as another example of this.

However, I'm not the greatest fan of tying this to the compiler. I'd be more interested in tying it to things like instruction sets and OS-specific interrupts. If a compiler has some special features, they could be tied to this as well, but in general, if a @compilerInternal-based API can be standardized across compilers, it should be. The APIs obviously can't be universally compatible, as that'd rob @compilerInternal of its purpose, but letting compiler-specific details roam wild would be catastrophic.

Instead of tying @compilerInternal to the compiler, I'd suggest tying it to "intrinsic-sets". These would be tied to platform-specific things like Unix, LLVM, WebAssembly and x86. These intrinsic-sets would be standardized, although not necessarily always by the Zig Software Foundation. Third-party intrinsic-sets may exist as well, but they should be viewed more favorably if they're recognized by either the ZSF or whoever is responsible for the platform that the intrinsic-set seeks to support. (Perhaps all of this means that @compilerInternal isn't such a befitting name.)

Here's an example of what I want to be able to do (some safety checks omitted):

pub fn truncateUintToVariableLength(comptime IntType: type, x: IntType, len: std.math.Log2Int(IntType)) IntType {
    if (@supportsIntrinsicSet("immintrin") and @typeInfo(IntType).Int.bits <= 64) { // @supportsIntrinsicSet returns a comptime bool.
        if (@typeInfo(IntType).Int.bits > 32) {
            return @executeIntrinsic("immintrin", "_bzhi_u64", .{x, len});
        } else {
            return @executeIntrinsic("immintrin", "_bzhi_u32", .{x, len});
        }
    } else {
        // Cast before shifting: a comptime_int can't be shifted by a runtime amount.
        return x & ((@as(IntType, 1) << len) - 1);
    }
}

@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 May 19, 2021
andrewrk (Member) commented Nov 23, 2021

Thanks for the proposal. I'm rejecting it as I don't see it as beneficial.

> Provides a way to use LLVM intrinsics #2291 without making LLVM a core part of the language

I consider this to be a downside. We don't want to expose LLVM intrinsics directly.

> • I don't want to have the option to choose between @sin() and std.math.sin(); I just want to use std.math.sin and get the best implementation.

Always use @sin. After #7265 is done, the std.math.sin implementation will be moved over to compiler-rt and std.math.sin will no longer be available.

> • I think having builtins in the default namespace makes them a core part of the language, but a lot of them are highly ISA specific. For example, most embedded systems are lucky to have hardware floating point, so stuff like @sin seems out of place as a core part of the language.

@sin is well-defined in the Zig language to work on all ISAs. It is not implementation-defined. It does not matter if there is no sin instruction, or no floating point instructions. This is the purpose of compiler-rt. Applications have access to comptime checks to do something different if it is determined that invoking @sin at runtime would be ill-advised.
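Such a comptime check might look like the following sketch. The architecture list is illustrative only, and softSin is a hypothetical software fallback; @import("builtin").cpu.arch is real, comptime-known target information.

```zig
const builtin = @import("builtin");

fn sinSmart(x: f32) f32 {
    // @sin is well-defined everywhere; this only picks a cheaper path
    // where lowering to a compiler-rt soft-float call would be too slow.
    return switch (comptime builtin.cpu.arch) {
        .x86_64, .aarch64 => @sin(x), // hardware float expected
        else => softSin(x), // hypothetical table-based approximation
    };
}
```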

> • One design goal of Zig is to make function calls obvious in code, but @sin is magic. Does it use a hardware instruction? Does it call a software implementation? Which one (std.math.sin, libc sin)? Where is the source code? Using std.math.sin is less magical.

It's not magic; it's the same thing as +. Does it use a hardware instruction? Does it call a software implementation? Which one (compiler_rt, libgcc)? Where is the source code? (The answer: instruction selection decides whether to emit machine code inline or to make a call to compiler_rt.)

Using std.math.sin is not less magical. Anyway, as noted above std.math.sin is deprecated in favor of @sin.

> • Another reason to move intrinsics to libraries is it's easier to switch out the implementation. For example, if I use const sin = std.math.sin; it's much easier to later change the implementation to const sin = mySin;. I can't do this as easily with @sin.

You wouldn't want to do this for the same reason that you wouldn't want to swap out the implementation of +.

> • The intrinsics library would always resolve to the hardware implementation and produce a compiler error for unsupported architectures. This makes the programmer's intent clearer.

It is more common for the intent to be "perform the mathematical sin function" than "use a hardware sin instruction or emit a compile error". In the rarer latter case the programmer is free to use inline assembly.

> • Makes alternative implementations of Zig easier to write since there are fewer builtins in the language.

This just moves the burden to the programmer who now has to add a dependency on a specific compiler implementation in order to use a builtin.

> ISA specific extensions

Anyway, none of this really matters because this proposal is just arguing for namespacing differently, and having one namespace in the language and one namespace outside the language.

Namespacing with a prefix such as @wasmFoo @wasmBar is equivalent to having "wasm" be a string argument to @compilerInternal. So the only meaningful thing left in this proposal is having one namespace be in the language spec and one namespace be outside of it. Like @SpexGuy said, I agree that there is room for such a proposal, but as it stands I do not see a sufficiently motivating use case for it.
