diff --git a/design/27539-internal-abi.md b/design/27539-internal-abi.md
new file mode 100644
index 00000000..dc2dcb57
--- /dev/null
+++ b/design/27539-internal-abi.md
@@ -0,0 +1,463 @@
+# Proposal: Create an undefined internal calling convention
+
+Author(s): Austin Clements
+
+Last updated: 2019-01-14
+
+Discussion at https://golang.org/issue/27539.
+
+## Abstract
+
+Go's current calling convention interferes with several significant
+optimizations, such as [register
+passing](https://golang.org/issue/18597) (a potential 5% win).
+Despite the obvious appeal of these optimizations, we've encountered
+significant roadblocks to their implementation.
+While Go's calling convention isn't covered by the [Go 1 compatibility
+promise](https://golang.org/doc/go1compat), it's impossible to write
+Go assembly code without depending on it, and there are many important
+packages that use Go assembly.
+As a result, much of Go's calling convention is effectively public and
+must be maintained in a backwards-compatible way.
+
+We propose a way forward based on having multiple calling conventions.
+We propose maintaining the existing calling convention and introducing
+a new, private calling convention that is explicitly not
+backwards-compatible and not accessible to assembly code, with a
+mechanism to keep different calling conventions transparently
+inter-operable.
+This same mechanism can be used to introduce other public, stable
+calling conventions in the future, but the details of that are outside
+the scope of this proposal.
+
+This proposal is *not* about any specific new calling convention.
+It's about *enabling* new calling conventions to work in the existing
+Go ecosystem.
+This is one step in a longer-term plan.
+
+
+## Background
+
+Language environments depend on *application binary interfaces* (ABIs)
+to define the machine-level conventions for operating within that
+environment.
+One key aspect of an ABI is the *calling convention*, which defines
+how function calls in the language operate at a machine-code level.
+
+Go's calling convention specifies how functions pass argument values
+and results (on the stack), which registers have fixed functions
+(e.g., R10 on ARM is the "g" register) or may be clobbered by a call
+(all non-fixed function registers), and how to interact with stack
+growth, the scheduler, and the garbage collector.
+
+Go's calling convention as of Go 1.11 is simple and nearly universal
+across platforms, but also inefficient and inflexible.
+It is rife with opportunities for improving performance.
+For example, experiments with [passing arguments and results in
+registers](https://golang.org/issue/18597) suggest a 5% performance
+win.
+Propagating register clobbers up the call graph could avoid
+unnecessary stack spills.
+Keeping the stack bound in a fixed register could eliminate two
+dependent memory loads on every function entry on x86.
+Passing dynamic allocation scopes could reduce heap allocations.
+
+And yet, even though the calling convention is invisible to Go
+programs, almost every substantive change we've attempted has been
+stymied because changes break existing Go *assembly* code.
+While there's relatively little Go assembly (roughly 170 kLOC in
+public GitHub repositories*), it tends to lie at the heart
+of important packages like crypto and numerical libraries.
+
+This proposal operates within two key constraints:
+
+1. We can't break existing assembly code, even though it isn't
+   technically covered by Go 1 compatibility.
+   There's too much of it and it's too important.
+   Hence, we can't change the calling convention used by existing
+   assembly code.
+
+2. We can't depend on a transition period after which existing
+   assembly would break.
+   Too much code simply doesn't get updated, or if it does, it doesn't
+   get re-vendored.
+   Hence, it's not enough to give people a transition path to a new
+   calling convention and some time.
+   Existing code must continue to work.
+
+This proposal resolves this tension by introducing multiple calling
+conventions.
+Initially, we propose two: one is stable, documented, and codifies the
+rules of the current calling convention; the other is unstable,
+internal, and may change from release to release.
+
+\* This counts non-comment, non-whitespace lines of code in
+unique files. It excludes vendored source and source with a "Go
+Authors" copyright notice.
+
+
+## Proposal
+
+We propose introducing a second calling convention.
+
+* `ABI0` is the current calling convention, which passes arguments and
+  results on the stack, clobbers all registers on calls, and has a few
+  platform-dependent fixed registers.
+
+* `ABIInternal` is unstable and may change from release to release.
+  Initially, it will be identical to `ABI0`, but `ABIInternal` opens
+  the door for changes.
+
+Once we're happy with `ABIInternal`, we may "snapshot" it as a new
+stable `ABI1`, allowing assembly code to be written against the
+presumably faster, new calling convention.
+This would not eliminate `ABIInternal`, as `ABIInternal` could later
+diverge from `ABI1`, though `ABI1` and `ABIInternal` may be identical
+for some time.
+
+A text symbol can provide different definitions for different ABIs.
+One of these will be the "native" implementation (`ABIInternal` for
+functions defined in Go and `ABI0` for functions defined in
+assembly), while the others will be "ABI wrappers" that simply translate
+to the ABI of the native implementation and call it.
+In the linker, each symbol is already identified with a (name,
+version) pair.
+The implementation will simply map ABIs to linker symbol versions.
+
+All functions defined in Go will be natively `ABIInternal`, and the Go
+compiler will assume all functions provide an `ABIInternal`
+implementation.
+Hence, all cross-package calls and all indirect calls (closure calls
+and interface method calls) will use `ABIInternal`.
+If the native implementation of the called function is `ABI0`, this
+will call a wrapper, which will call the `ABI0` implementation.
+For direct calls, if the compiler knows the target is a native `ABI0`
+function, it can optimize that call to use `ABI0` directly, but this
+is strictly an optimization.
+
+All functions defined in assembly will be natively `ABI0`, and all
+references to text symbols from assembly will use the `ABI0`
+definition.
+To introduce another stable ABI in the future, we would extend the
+assembly symbol syntax with a way to specify the ABI, but `ABI0` must
+be assumed for all unqualified symbols for backwards compatibility.
+
+In order to transparently bridge the two (or more) ABIs, we will
+extend the assembler with a mode to scan for all text symbol
+definitions and references in assembly code, and report these to the
+compiler.
+When these symbols are referenced or defined, respectively, from Go
+code in the same package, the compiler will use the type information
+available in Go declarations and function stubs to produce the
+necessary ABI wrapper definitions.
+
+The linker will check that all symbol references use the correct ABI
+and ultimately keep everything honest.
+
+
+## Rationale
+
+The above approach allows us to introduce an internal calling
+convention without any modifications to any safe Go code, or the vast
+majority of assembly-using packages.
+This is largely afforded by the extra build step that scans for
+assembly symbol definitions and references.
+
+There are two major trade-off axes that lead to different designs.
+
+### Implicit vs explicit
+
+Rather than implicitly scanning assembly code for symbol definitions
+and references, we could instead introduce pragma comments that users
+could use to explicitly inform the compiler of symbol ABIs.
+This would make these ABI boundaries evident in code, but would likely
+break many more existing packages.
+
+In order to keep any assembly-using packages working as-is, this
+approach would need default rules.
+For example, body-less function stubs would likely need to default to
+`ABI0`.
+Any Go functions called from assembly would still need explicit
+annotations, though such calls are rare.
+This would cover most assembly-using packages, but function stubs are
+also used for Go symbols pushed across package boundaries using
+`//go:linkname`.
+For link-named symbols, a pragma would be necessary to undo the
+default `ABI0` behavior, and would depend on how the target function
+was implemented.
+
+Ultimately, there's no set of default rules that keeps all existing
+code working.
+Hence, this design proposes extracting symbols from assembly source to
+derive the correct ABIs in the vast majority of cases.
+
+### Wrappers vs single implementation
+
+In this proposal, a single function can provide multiple entry-points
+for different calling conventions.
+One of these is the "native" implementation and the others are
+intended to translate the calling convention and then invoke the
+native implementation.
+
+An alternative would be for each function to provide a single calling
+convention and require all calls to that function to follow that
+calling convention.
+Other languages use this approach, such as C (e.g.,
+`fastcall`/`stdcall`/`cdecl`) and Rust (`extern "C"`, etc.).
+This works well for direct calls, but for direct calls it's also
+possible to compile away this proposal's ABI wrapper.
+However, it dramatically complicates indirect calls since it requires
+the calling convention to become *part of the type*.
+Hence, in Go, we would either have to extend the type system, or
+declare that only `ABIInternal` functions can be used in closures and
+interface satisfaction, both of which are less than ideal.
+
+Using ABI wrappers has the added advantage that calls to a Go function
+from Go can use the fastest available ABI, while still allowing calls
+via the stable ABI from assembly.
+
+### When to generate wrappers
+
+Finally, there's flexibility in this design around when exactly to
+generate ABI wrappers.
+In the current proposal, ABI wrappers are always generated in the
+package where both the definition and the reference to a symbol
+appear.
+However, ABI wrappers can be generated anywhere Go type information is
+available.
+
+For example, the compiler could generate an `ABIInternal`→`ABI0`
+wrapper when an `ABI0` function is stored in a closure or method
+table, regardless of which package that happens in.
+And the compiler could generate an `ABI0`→`ABIInternal` wrapper when
+it encounters an `ABI0` reference from assembly by finding the
+function's type either in the current package or via export info from
+another package.
+
+
+## Compatibility
+
+This proposed change does not affect the functioning of any safe Go
+code.
+It can affect code that goes outside the [compatibility
+guidelines](https://golang.org/doc/go1compat), but is designed to
+minimize this impact.
+Specifically:
+
+1. Unsafe Go code can observe the calling convention, though doing so
+   requires violating even the [allowed uses of
+   unsafe.Pointer](https://golang.org/pkg/unsafe/#Pointer).
+   This does arise in the internal implementation of the runtime and
+   in cgo, both of which will have to be adjusted when we actually
+   change the calling convention.
+
+2. Cross-package references where the definition and the reference are
+   different ABIs may no longer link.
+
+There are various ways to form cross-package references in Go, though
+all depend on `//go:linkname` (which is explicitly unsafe) or
+complicated assembly symbol naming.
+Specifically, the following types of cross-package references may no
+longer link:
+
+| ref ↓ / def → | Go | Go+push | asm | asm+push |
+|---|---|---|---|---|
+| Go | ✓ | ✓ | ✓ | ✗¹ |
+| Go+pull | ✓ | ✓ | ✗¹ | ✗¹ |
+| asm | ✓ | ✗² | ✓ | ✓ |
+| asm+xref | ✗² | ✗² | ✓ | ✓ |