
GcToolchainTricks

nathany edited this page Dec 11, 2014 · 13 revisions

Update: the Plan 9 C compilers are going away in future releases of Go (probably in Go 1.5), so some of the tricks below that use them are deprecated.

Introduction

This page documents some less well-known (perhaps advanced) tricks for the gc toolchain (and the Go tool).

C code without cgo

Use the bundled Plan 9 C Compiler 6c

Dave Cheney has written an excellent blog post about this: http://dave.cheney.net/2013/09/07/how-to-include-c-code-in-your-go-package

Use a syso file to embed arbitrary self-contained C code

Basically, you write your assembly in GNU as(1) format, but make sure that all the interface functions use Go's ABI (everything on the stack, etc.; please read the Go 1.2 Assembler Introduction for more details).

The most important step is compiling that file to file.syso (gcc -c -O3 -o file.syso file.S) and putting the resulting syso in the package source directory. Then, supposing your assembly function is named Func, you need one stub Plan 9 assembly file to call it:

TEXT ·Func(SB),$0-8 // please set the correct parameter size (8) here
	JMP Func(SB)

Then you just declare Func in your package and use it; go build will pick up the syso and link it into the package.

Notes:

  • The binary produced won't use cgo, and the overhead is just an unconditional JMP that can be perfectly branch-predicted. But please be aware that, because it doesn't use cgo, your assembly function runs on the Go stack, so it shouldn't use too much stack (a safe value is less than ~100 bytes) or terrible things will happen. For compute kernels, this requirement isn't too restrictive.
  • Please make sure you've included all library dependencies in your C code. libc is not available and, most notably, libgcc is not available either (this matters especially when you use gcc __builtin_funcs; please use nm(1) to double-check that your file doesn't contain any undefined symbols).
  • It's also possible to call back Go functions from C code, but this is left as an exercise for the reader.
  • This trick is supported on all Go 1.x releases.
  • The Go linker is capable enough that you only need to prepare one .syso file per architecture, not one per OS/Arch combination (assuming you don't use OS-specific constructs, obviously); it can link, for example, Mach-O object files into ELF binaries. So be sure to give your syso files names like file_amd64.syso and file_386.syso.

Bundle data into Go binary

There are a lot of ways to bundle data in a Go binary, for example:

  • Zip the data files and append the zip file to the end of the Go binary, then use zip -A prog to adjust the bundled zip header. You can then use archive/zip to open the program as a zip file and access its contents easily. There are existing packages that help with this, for example https://godoc.org/bitbucket.org/tebeka/nrsc. This requires post-processing the program binary, which makes it unsuitable for non-main packages that require static data. Also, you must collect all data files into one zip file, which means that it's impossible to use multiple packages that utilize this method.
  • Embed the binary file as a string or []byte in the Go program. This method is not recommended, not only because the generated Go source file is much larger than the binary files themselves, but also because a large static []byte slows down compilation of the package and the gc compiler uses a lot of memory to compile it (this is a known bug in gc). For example, see the tools/godoc/static package.
  • Use a similar syso technique to bundle the data. Precompile the data file as a syso file using GNU as(1)'s .incbin pseudo-instruction.

The key trick for the 3rd alternative is that the linker for the gc toolchain can link COFF object files of a different architecture into the binary without problems, so you don't need to provide syso files for all supported architectures. As long as the syso file doesn't contain instructions, you can use a single one to embed the data.

The assembly template to generate the COFF .syso file:

/* data.S, as -o data.syso */
.section .rdata,"dr" /* put in COFF section .rdata */
.globl _bindataA /* no longer need to prepend package name here */
.globl _ebindataA
_bindataA:
.incbin "dataA"
_ebindataA:

.globl _bindataB /* no longer need to prepend package name here */
.globl _ebindataB
_bindataB:
.incbin "dataB"
_ebindataB:

Two more files are needed. First, a Plan 9 C source file that assembles the slices for Go:

/* slice.c */
#include "runtime.h"
extern byte _bindataA[], _bindataB[], _ebindataA, _ebindataB;

void ·getDataSlices(Slice a, Slice b) {
  a.array = _bindataA;
  a.len = a.cap = &_ebindataA - _bindataA;
  b.array = _bindataB;
  b.len = b.cap = &_ebindataB - _bindataB;
  FLUSH(&a);
  FLUSH(&b);
}

And finally, the Go file that uses the embedded slices:

/* data.go */
package bindata

func getDataSlices() ([]byte, []byte) // defined in slice.c

var A, B = getDataSlices()

Note: you will need an as(1) capable of generating the COFF syso file; you can build one easily on Unix:

wget http://ftp.gnu.org/gnu/binutils/binutils-2.22.tar.bz2   # any newer version also works
tar xf binutils-2.22.tar.bz2
cd binutils-2.22
mkdir build; cd build
../configure --target=i386-foo-pe --enable-ld=no --enable-gold=no
make
# use gas/as-new to assemble your data.S
# all the other files can be discarded.

The drawback of this trick is that it seems to be incompatible with cgo, so only use it when you don't use cgo, at least for now. I (minux) am working on figuring out why they're incompatible.
