Skip to content

unsafe: unsafe.Slice escape to heap #72732

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
awalterschulze opened this issue Mar 7, 2025 · 11 comments
Closed

unsafe: unsafe.Slice escape to heap #72732

awalterschulze opened this issue Mar 7, 2025 · 11 comments
Labels
BugReport Issues describing a possible bug in the Go implementation. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@awalterschulze
Copy link

Go version

go version go1.24.0 darwin/arm64

Output of go env in your module/workspace:

AR='ar'
CC='clang'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='clang++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN='/Users/yyy/github/bin'
GOCACHE='/Users/yyy/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/yyy/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/th/45dyv7rs03g5t8v9clfswcp80000gn/T/go-build3229043422=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/yyy/github/src/github.com/katydid/parser-go-json/go.mod'
GOMODCACHE='/Users/yyy/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/yyy/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/yyy/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.24.0'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

I see reflect.SliceHeader is deprecated and I am trying to upgrade to unsafe.Slice.

I want to use it to cast from an int64 to a slice of bytes, without allocating any memory on the heap.

I tried upgrading to unsafe.Slice, but it seems that the pointer is leaked to the heap, which is usually not the case with reflect.SliceHeader.

Here is a small reproduction:

$ go build -gcflags "-m -l" main.go
# command-line-arguments
./main.go:10:13: ... argument does not escape
./main.go:10:14: "hello" escapes to heap
./main.go:16:35: &reflect.SliceHeader{...} does not escape
./main.go:24:26: moved to heap: i
./main.go:29:13: make([]byte, 8) escapes to heap
$ cat -n main.go
     1  package main
     2
     3  import (
     4          "fmt"
     5          "reflect"
     6          "unsafe"
     7  )
     8
     9  func main() {
    10          fmt.Println("hello")
    11
    12  }
    13
    14  // Doesn't allocate, but is deprecated
    15  func castFromInt64(i int64) []byte {
    16          return *(*[]byte)(unsafe.Pointer(&reflect.SliceHeader{
    17                  Len:  8,
    18                  Cap:  8,
    19                  Data: uintptr(unsafe.Pointer(&i)),
    20          }))
    21  }
    22
    23  // Allocates, so need to keep using deprecated method
    24  func unsafeCastFromInt64(i int64) []byte {
    25          return unsafe.Slice((*byte)(unsafe.Pointer(&i)), 8)
    26  }
    27
    28  func takeint(i int64) []byte {
    29          return make([]byte, 8)
    30  }

I can't seem to record an allocation in isolation, so maybe I did something else wrong here:

func TestAllocsUnsafeCast(t *testing.T) {
	defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1))
	f := func() {
		want := int64(math.MaxInt64)
		bs := unsafeCastFromInt64(want)
		got := castToInt64(bs)
		if got != want {
			t.Fatalf("want %d got %d", want, got)
		}
	}
	allocs := testing.AllocsPerRun(1000, f)
	if allocs > 0 {
		t.Fatalf("UnsafeCast Allocs = %f", allocs)
	}
}

But I can reproduce it, if I use it in a larger program and make a single change:

katydid/parser-go-json@16defb4

$ cd json/parse 
$ go test -run=TestNoAllocsOnAverage -v
=== RUN   TestNoAllocsOnAverage
    alloc_test.go:27: seed = 1741363591861790000, got 3 allocs, want 0 allocs
--- FAIL: TestNoAllocsOnAverage (0.04s)
FAIL
exit status 1

The assembler analysis is also interesting and shows that it is significantly more instructions to use unsafe.Slice that to use reflect.SliceHeader for doing this type of cast: https://godbolt.org/z/azPebxG79

So I am just wondering if there is better way to cast from int64 to a slice of bytes without allocating any memoory?

Or is there a concern about the performance implications of moving from reflect.SliceHeader to unsafe.Slice?

I am also wondering if there is a new way to cast from a byte slice to string without doing a copy?

This is the old way, that I used to do it:

func castToString(buf []byte) string {
	header := (*reflect.SliceHeader)(unsafe.Pointer(&buf))
	strHeader := reflect.StringHeader{Data: header.Data, Len: header.Len}
	return *(*string)(unsafe.Pointer(&strHeader))
}

What did you see happen?

This is also answered in previous question.

What did you expect to see?

This is also answered in previous question.

@cuonglm
Copy link
Member

cuonglm commented Mar 7, 2025

Note that both castFromInt64 and castToString violate the 6th rule of unsafe.Pointer.

@randall77
Copy link
Contributor

 Data: uintptr(unsafe.Pointer(&i)),

This violates unsafe.Pointer rules. You might end up with a pointer to an unallocated part of the stack. Your int64 will be overwritten by random other operations. (You probably don't get bitten by this because castFromInt64 inlines, but turn inlining off and bad things will probably happen.) Rule 6 of the unsafe.Pointer rules doesn't allow what you are doing there. You need to cast a *[]byte to a *reflect.SliceHeader and write its Data field, not build a whole new reflect.SliceHeader object and reinterpret that as a slice.

I would use the new-ish unsafe.Slice for this now: unsafe.Slice((*byte)(unsafe.Pointer(&i)), 8)

But that will likely allocate. I'm not sure how you think you can get away without allocation. i has to live somewhere after your cast function returns?

@seankhliao seankhliao added Performance NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Mar 7, 2025
@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label Mar 7, 2025
@thepudds
Copy link
Contributor

thepudds commented Mar 7, 2025

Hi @awalterschulze, quick drive-by comment (and I might not have understood what you are trying to do here), but have you tried using encoding/binary? If the goal is to avoid an allocation, I think you should be able to convert from uint64 to []byte without allocating and without using unsafe or reflect. There's probably a better description somewhere, but see for example #42958, or you could just try it and benchmark (and/or look at the assembly, etc.).

@thepudds thepudds added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Mar 7, 2025
@ianlancetaylor
Copy link
Member

So I am just wondering if there is better way to cast from int64 to a slice of bytes without allocating any memoory?

What does this mean in practice? What are you really trying to do?

@awalterschulze
Copy link
Author

Thank you for all the help, I'll try to answer these comment by comment.

So first thank you @gabyhelp for finding Unsafely converting []byte to string, safely?

It seems the answer is:

func castToString(buf []byte) string {
	return *(*string)(unsafe.Pointer(&buf))
}

@awalterschulze
Copy link
Author

@ianlancetaylor

What does this mean in practice? What are you really trying to do?

tl;dr I am building a validator, which doesn't need a copy of any of the data after parsing is complete.

I am building a json parser for a serialization format agnostic validation language.
It needs to be very very fast. I have found that the bottleneck in validation is the parser (even when validating protocol buffers), since my validator is already quite optimized.
So it is important to do as many optimizations as possible on the parser itself:

  1. Do not do any parsing work that is not required by the validator (check correctness of format, but do not parse numbers or strings that are not necessary for validation).
  2. Do not allocate memory to the heap, this is possible, since I do not need to copy any data, since I do not need it after validating. The validator only needs to apply functions, like lessThan, equal to a specific field and after Next or Skip is called on the parser, that value is forgotten.
  3. Learn how to use AVX512 SIMD instructions to parse json faster, as has been done in other languages.

Here is the full pull request that updates the parser to use unsafe casting to int64 and float64.
katydid/parser-go-json#21
Everything in the top level (json folder) is deprecated, but a supported conversion to the old parser interface, until I can convert all parsers to the new interface. Everything in deeper folder levels (json/parser, json/token, json/scan) is the new parser implementation.

@awalterschulze
Copy link
Author

@thepudds if binary binary.LittleEndian.Uint64 is inlined, then that is perfectly acceptable.

// Uint64 returns the uint64 representation of b[0:8].
func (littleEndian) Uint64(b []byte) uint64 {
	_ = b[7] // bounds check hint to compiler; see golang.org/issue/14808
	return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 | uint64(b[3])<<24 |
		uint64(b[4])<<32 | uint64(b[5])<<40 | uint64(b[6])<<48 | uint64(b[7])<<56
}

But PutUint64 does require allocating a buffer, that needs to be passed in:

// PutUint64 stores v into b[0:8].
func (littleEndian) PutUint64(b []byte, v uint64) {
	_ = b[7] // early bounds check to guarantee safety of writes below
	b[0] = byte(v)
	b[1] = byte(v >> 8)
	b[2] = byte(v >> 16)
	b[3] = byte(v >> 24)
	b[4] = byte(v >> 32)
	b[5] = byte(v >> 40)
	b[6] = byte(v >> 48)
	b[7] = byte(v >> 56)
}

Maybe I misunderstood, what you were trying to say?

@awalterschulze
Copy link
Author

Note that both castFromInt64 and castToString violate the 6th rule of unsafe.Pointer.

@cuonglm thanks, I didn't know about these rules. I fixed the violation of castToString:

func castToString(buf []byte) string {
	return *(*string)(unsafe.Pointer(&buf))
}

, but unsure how to fix the violation in castFromInt64?

@awalterschulze
Copy link
Author

This violates unsafe.Pointer rules.

@randall77 Thank you I didn't know about the rules and thank you for also explaining it a bit more.

I'm not sure how you think you can get away without allocation. i has to live somewhere after your cast function returns?

I do keep holding onto i for as long as I need it. The validator only needs to until it moves onto the Next json token to parse and then it can be overwritten.

I do not expect you to check my work, but if you want then this is what tokenInt is for in the tokenizer.go:
https://github.com/katydid/parser-go-json/pull/21/files#diff-ea8bd0ff9f8a65c67d32e1590f164465afbb6bb5dfe3f53ba6155bf983c97718R112

@awalterschulze
Copy link
Author

Actually thanks again @gabyhelp for showing me a safer way to convert bytes to string

func castToString(buf []byte) string {
	return unsafe.String(unsafe.SliceData(buf), len(buf))
}

@seankhliao seankhliao closed this as not planned Won't fix, can't repro, duplicate, stale Mar 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BugReport Issues describing a possible bug in the Go implementation. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

7 participants