Byte slice converted with unsafe from string changes its address #47247

Closed
leviska opened this issue Jul 16, 2021 · 2 comments

leviska commented Jul 16, 2021

This is a copy of this question https://stackoverflow.com/q/68401381/5516391, because I am convinced that this looks like a compiler bug.

I have this function to convert a string to a slice of bytes without copying:

func StringToByteUnsafe(s string) []byte {
	strh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	var sh reflect.SliceHeader
	sh.Data = strh.Data
	sh.Len = strh.Len
	sh.Cap = strh.Len
	return *(*[]byte)(unsafe.Pointer(&sh))
}

That works fine, but with a very specific setup it gives very strange behavior:

The setup is here: https://github.com/leviska/go-unsafe-gc/blob/main/pkg/pkg_test.go

What happens:

  1. Create a byte slice
  2. Convert it into a temporary (rvalue) string, then convert that string back into a byte slice with unsafe
  3. Then, copy this slice (by reference)
  4. Then, do something with the second slice inside a goroutine
  5. Print the pointers before and after

And I get this output on my Linux Mint laptop with Go 1.16:

go test ./pkg -v -count=1
=== RUN   TestSomething
0xc000046720 123 0xc000046720 123
0xc000076f20 123 0xc000046721 z
--- PASS: TestSomething (0.84s)
PASS
ok      github.com/leviska/go-unsafe-gc/pkg     0.847s

So the first slice magically changes its address, while the second doesn't.

If we replace the goroutine with runtime.GC() (and maybe play with the code a little bit), we can get both pointers to change their value (to the same one).

If we change the unsafe cast to just []byte(), everything works without the addresses changing. Also, if we change it to the unsafe cast from here https://stackoverflow.com/a/66218124/5516391, everything works as expected.

func StringToByteUnsafe(str string) []byte { // this works fine
	var buf = *(*[]byte)(unsafe.Pointer(&str))
	(*reflect.SliceHeader)(unsafe.Pointer(&buf)).Cap = len(str)
	return buf
}

I ran it with GOGC=off and got the same result. I ran it with -race and got no errors.

If you run this as a main package with a main function, it seems to work correctly. The same is true if you remove the Convert function. My guess is that the compiler optimizes things in these cases.

After playing with this code a little bit more, I think that this is either a compiler bug or some strange UB. Can you help me understand what's happening here? If it's not a bug, then:

  1. Why and how does the Go runtime magically change the address of the variable?
  2. Why can it change both addresses in the non-concurrent case, but not in the concurrent one?
  3. What's the difference between this unsafe cast and the cast from the Stack Overflow answer? Why does that one work?

What version of Go are you using (go version)?

$ go version
go version go1.16.4 linux/amd64

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/leviska/.cache/go-build"
GOENV="/home/leviska/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/leviska/go/pkg/mod"
GONOPROXY="gitlab.ozon.ru/*"
GONOSUMDB="gitlab.ozon.ru/*"
GOOS="linux"
GOPATH="/home/leviska/go"
GOPRIVATE="gitlab.ozon.ru/*"
GOPROXY="https://athens.s.o3.ru"
GOROOT="/usr/local/go"
GOSUMDB="off"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.4"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/leviska/projects/seq-db/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2605705126=/tmp/go-build -gno-record-gcc-switches"
@cuonglm
Member

cuonglm commented Jul 16, 2021

Your first version:

func StringToByteUnsafe(s string) []byte {
	strh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	var sh reflect.SliceHeader
	sh.Data = strh.Data
	sh.Len = strh.Len
	sh.Cap = strh.Len
	return *(*[]byte)(unsafe.Pointer(&sh))
}

is invalid, as it uses reflect.SliceHeader as a plain struct. You can run go vet on it, and go vet will warn you.
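For context, the unsafe package documentation says a program should not declare or allocate variables of type reflect.SliceHeader or reflect.StringHeader: their Data field is a uintptr, so the runtime does not treat it as a pointer. A minimal sketch of a copy-free conversion that avoids the header structs entirely is below. It assumes Go 1.20 or newer (for unsafe.StringData), which is newer than the Go 1.16 used in this report, and the helper name stringToBytes is hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytes is a hypothetical helper, not code from the issue.
// It returns a byte slice that shares the string's backing memory.
// Assumption: Go 1.20+ (unsafe.StringData does not exist before that).
// The returned slice must never be written to, because strings are immutable.
func stringToBytes(s string) []byte {
	if len(s) == 0 {
		// unsafe.StringData is unspecified for empty strings, so return nil.
		return nil
	}
	// Both arguments are real pointers and lengths; no uintptr is stored.
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	b := stringToBytes("123")
	fmt.Println(string(b), len(b), cap(b)) // prints "123 3 3"
}
```

Because no pointer is ever held in a uintptr here, the runtime can keep the slice's data pointer up to date if the underlying memory is relocated.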

@cherrymui
Member

The backing store of a is allocated on the stack, because it does not escape. Goroutine stacks can move dynamically. b, on the other hand, escapes to the heap, because it is passed to another goroutine. In general, we don't assume that the address of an object doesn't change.

This works as intended. Thanks.

If you have further questions, you will get better answers if you ask on a forum rather than on the issue tracker. See https://golang.org/wiki/Questions. Thanks.
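The stack movement described above can be observed in isolation. The following is a minimal sketch, not the reporter's test: the names grow and observeStackMove and the recursion depth are hypothetical choices, and it assumes the standard gc toolchain, where growing a goroutine stack copies it to a new location. It records the address of a stack-allocated variable as a uintptr (which the runtime does not adjust), forces the stack to grow, and then takes the address again.

```go
package main

import (
	"fmt"
	"unsafe"
)

// grow recurses with a roughly 1 KiB frame per call so that the calling
// goroutine's stack has to be enlarged.
func grow(depth int) byte {
	var pad [1024]byte
	pad[depth%len(pad)] = byte(depth)
	if depth == 0 {
		return pad[0]
	}
	return grow(depth-1) + pad[depth%len(pad)]
}

// observation holds the two recorded addresses and the variable's value.
type observation struct {
	before, after uintptr
	value         int
}

// observeStackMove runs on a fresh goroutine, which starts with a small stack.
func observeStackMove() observation {
	done := make(chan observation)
	go func() {
		x := 42 // does not escape, so it lives on this goroutine's stack
		before := uintptr(unsafe.Pointer(&x))
		grow(8192) // needs several MiB of stack, forcing a stack copy
		after := uintptr(unsafe.Pointer(&x))
		done <- observation{before, after, x}
	}()
	return <-done
}

func main() {
	o := observeStackMove()
	fmt.Printf("before=%#x after=%#x moved=%v value=%d\n",
		o.before, o.after, o.before != o.after, o.value)
}
```

The value of x is preserved while its address changes, which matches the behavior reported for a. A real pointer to x would have been updated by the runtime during the copy; only the address stored as a uintptr goes stale, which is the same hazard as the Data field of a standalone reflect.SliceHeader.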

@golang golang locked and limited conversation to collaborators Jul 16, 2022