Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto/tls: deadlock/hang in handshake #71269

Closed
porridge opened this issue Jan 14, 2025 · 7 comments
Closed

crypto/tls: deadlock/hang in handshake #71269

porridge opened this issue Jan 14, 2025 · 7 comments
Labels
BugReport Issues describing a possible bug in the Go implementation. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@porridge
Copy link

Go version

go version go1.23.4 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/mowsiany/.cache/go-build'
GOENV='/home/mowsiany/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/mowsiany/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/mowsiany/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.4'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/mowsiany/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/mowsiany/tmp/20250114-repro-dNr/stackrox/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/nix-shell.qUzvHg/nix-shell.IbCSwR/go-build182911575=/tmp/go-build -gno-record-gcc-switches'

What did you do?

We have a test which sets up a TLS connection using custom CA, certificates and VerifyPeerCertificate function.
In Go 1.23.x some interaction between these settings causes both the client and the server to hang in crypto/tls.(*Conn).flush() indefinitely.

I'm still working on getting a minimal repro case running (the TLS config setup is quite involved in our codebase).

But since apparently this is already fixed in 1.24rc1, I'm reporting this early in the hope that maintainers will be able to say "oh right commit suchandsuch hapened to fix this, let's just backport".

Fully reproducible with:

git clone https://github.com/stackrox/stackrox
cd stackrox
git checkout e08de5092de957a17042abd5636831c5a4b436fb # to pin to a point in time in case a future commit is broken
go test -v -timeout 10s github.com/stackrox/rox/central/tlsconfig -run TestManager/TestNoExtraCertIssuedInStackRoxNamespace
[...]
ok  	github.com/stackrox/rox/central/tlsconfig	0.070s

Then bump 1.22.5 to 1.23.4 in go.mod and try again 👉🏻 failure, see stack traces below.
Then try the same with GOTOOLCHAIN=go1.24rc1 👉🏻 success.

FTR, the "Error watching TLS certificate directory" message is irrelevant.

What did you see happen?

$ go test -v -timeout 10s github.com/stackrox/rox/central/tlsconfig -run TestManager/TestNoExtraCertIssuedInStackRoxNamespace
=== RUN   TestManager
=== RUN   TestManager/TestNoExtraCertIssuedInStackRoxNamespace
pkg/mtls/certwatch: 2025/01/14 14:11:47.757254 certwatch.go:86: Error: Error watching TLS certificate directory "/run/secrets/stackrox.io/default-tls-cert": reading contents of directory /run/secrets/stackrox.io/default-tls-cert: open /run/secrets/stackrox.io/default-tls-cert: no such file or directory. Not updating TLS certificates!
panic: test timed out after 10s
	running tests:
		TestManager (10s)
		TestManager/TestNoExtraCertIssuedInStackRoxNamespace (10s)

goroutine 32 [running]:
testing.(*M).startAlarm.func1()
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:2373 +0x385
created by time.goFunc
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/time/sleep.go:215 +0x2d

goroutine 1 [chan receive]:
testing.(*T).Run(0xc0005fa340, {0x284272a?, 0x0?}, 0x2a30a80)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:1751 +0x3ab
testing.runTests.func1(0xc0005fa340)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:2168 +0x37
testing.tRunner(0xc0005fa340, 0xc000093c70)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:1690 +0xf4
testing.runTests(0xc00069cd50, {0x42a3ee0, 0x2, 0x2}, {0x4750ed?, 0xc0003f82a0?, 0x4340640?})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:2166 +0x43d
testing.(*M).Run(0xc000448280)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:2034 +0x64a
main.main()
	_testmain.go:47 +0x9b

goroutine 53 [select]:
github.com/golang/glog.(*fileSink).flushDaemon(0x434a758)
	/home/mowsiany/go/pkg/mod/github.com/golang/glog@v1.2.2/glog_file.go:349 +0xb4
created by github.com/golang/glog.init.1 in goroutine 1
	/home/mowsiany/go/pkg/mod/github.com/golang/glog@v1.2.2/glog_file.go:164 +0x126

goroutine 57 [chan receive]:
testing.(*T).Run(0xc0005fa4e0, {0x22fef43?, 0x22297ef?}, 0xc00078a3f0)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:1751 +0x3ab
github.com/stretchr/testify/suite.runTests({0x2c91150, 0xc0005fa4e0}, {0xc00067e000?, 0x2?, 0x2?})
	/home/mowsiany/go/pkg/mod/github.com/stretchr/testify@v1.10.0/suite/suite.go:247 +0xef
github.com/stretchr/testify/suite.Run(0xc0005fa4e0, {0x2c70d28, 0xc0006513c0})
	/home/mowsiany/go/pkg/mod/github.com/stretchr/testify@v1.10.0/suite/suite.go:220 +0x486
github.com/stackrox/rox/central/tlsconfig.TestManager(0xc0005fa4e0)
	/home/mowsiany/tmp/20250114-repro-dNr/stackrox/central/tlsconfig/manager_impl_test.go:26 +0x33
testing.tRunner(0xc0005fa4e0, 0x2a30a80)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:1690 +0xf4
created by testing.(*T).Run in goroutine 1
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:1743 +0x390

goroutine 58 [select]:
net.(*pipe).write(0xc00033a680, {0xc000bc3b80, 0x1e, 0x20})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/net/pipe.go:194 +0x2f2
net.(*pipe).Write(0x7c7853?, {0xc000bc3b80?, 0x2c81f48?, 0xc0003fe5a0?})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/net/pipe.go:174 +0x1c
crypto/tls.(*Conn).flush(0xc0001f6a88)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:958 +0x42
crypto/tls.(*Conn).handshakeContext(0xc0001f6a88, {0x2c81f48, 0xc00056d720})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:1574 +0x3e7
crypto/tls.(*Conn).HandshakeContext(...)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:1508
github.com/stackrox/rox/central/tlsconfig.(*managerTestSuite).testConnectionWithManager(0xc0006513c0, 0x234f920?, {0xc000809730, 0x3, 0x0?}, {0xc000baf710, 0x2, 0x0?})
	/home/mowsiany/tmp/20250114-repro-dNr/stackrox/central/tlsconfig/manager_impl_test.go:129 +0x7dd
github.com/stackrox/rox/central/tlsconfig.(*managerTestSuite).TestNoExtraCertIssuedInStackRoxNamespace(0xc0006513c0)
	/home/mowsiany/tmp/20250114-repro-dNr/stackrox/central/tlsconfig/manager_impl_test.go:63 +0x209
reflect.Value.call({0xc00035a640?, 0xc0005ad850?, 0x13?}, {0x282e4ca, 0x4}, {0xc0000eff28, 0x1, 0x1?})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/reflect/value.go:581 +0xca6
reflect.Value.Call({0xc00035a640?, 0xc0005ad850?, 0x3a4de60?}, {0xc0000d6728?, 0xf?, 0x0?})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/reflect/value.go:365 +0xb9
github.com/stretchr/testify/suite.Run.func1(0xc0005fa820)
	/home/mowsiany/go/pkg/mod/github.com/stretchr/testify@v1.10.0/suite/suite.go:202 +0x485
testing.tRunner(0xc0005fa820, 0xc00078a3f0)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:1690 +0xf4
created by testing.(*T).Run in goroutine 57
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/testing/testing.go:1743 +0x390

goroutine 59 [select]:
github.com/stackrox/rox/pkg/k8scfgwatch.(*pollWatcher).Run(0xc00035be80, {0x7f6a6c7a4998, 0x436d400}, 0xc00078a3f0?)
	/home/mowsiany/tmp/20250114-repro-dNr/stackrox/pkg/k8scfgwatch/poll_watcher.go:38 +0xbe
created by github.com/stackrox/rox/pkg/k8scfgwatch.WatchConfigMountDir in goroutine 58
	/home/mowsiany/tmp/20250114-repro-dNr/stackrox/pkg/k8scfgwatch/watch.go:42 +0x13e

goroutine 60 [select]:
net.(*pipe).write(0xc00033a600, {0xc000bc8a80, 0x735, 0xa80})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/net/pipe.go:194 +0x2f2
net.(*pipe).Write(0xc000005188?, {0xc000bc8a80?, 0xc000b63f80?, 0xc0008ded88?})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/net/pipe.go:174 +0x1c
crypto/tls.(*Conn).flush(0xc000005188)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:958 +0x42
crypto/tls.(*serverHandshakeStateTLS13).handshake(0xc000b9fc28)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/handshake_server_tls13.go:77 +0xa5
crypto/tls.(*Conn).serverHandshake(0xc000005188, {0x2c81f48, 0xc0002e8e10})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/handshake_server.go:54 +0x16a
crypto/tls.(*Conn).handshakeContext(0xc000005188, {0x2c81f48, 0xc00056d720})
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:1568 +0x3a6
crypto/tls.(*Conn).HandshakeContext(...)
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:1508
github.com/stackrox/rox/central/tlsconfig.(*managerTestSuite).testConnectionWithManager.func1()
	/home/mowsiany/tmp/20250114-repro-dNr/stackrox/central/tlsconfig/manager_impl_test.go:95 +0xf1
created by github.com/stackrox/rox/central/tlsconfig.(*managerTestSuite).testConnectionWithManager in goroutine 58
	/home/mowsiany/tmp/20250114-repro-dNr/stackrox/central/tlsconfig/manager_impl_test.go:91 +0x31d

goroutine 82 [select]:
crypto/tls.(*Conn).handshakeContext.func2()
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:1544 +0x86
created by crypto/tls.(*Conn).handshakeContext in goroutine 60
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:1543 +0x285

goroutine 29 [select]:
crypto/tls.(*Conn).handshakeContext.func2()
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:1544 +0x86
created by crypto/tls.(*Conn).handshakeContext in goroutine 58
	/home/mowsiany/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.23.4.linux-amd64/src/crypto/tls/conn.go:1543 +0x285
FAIL	github.com/stackrox/rox/central/tlsconfig	10.051s
FAIL

What did you expect to see?

Success, like in the 1.22 and 1.24rc1 cases.

@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label Jan 14, 2025
@porridge
Copy link
Author

FTR I managed to simplify this down to this code: https://github.com/porridge/go-tls-hang-repro

Unfortunately building in playground times out.

@seankhliao
Copy link
Member

are you sure it's not a problem of your pipe? switching it out for net.Listen("tcp", ":0") seems to make it work

@mknyszek
Copy link
Contributor

CC @golang/security

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 14, 2025
@mknyszek mknyszek added this to the Backlog milestone Jan 14, 2025
@littlerest1

This comment has been minimized.

@porridge
Copy link
Author

@seankhliao I think the pipe works fine, but crypto/tls might not support such unbuffered connection, as @ianlancetaylor said in #24198 (comment)
Please feel free to close this issue if this is still your stance, and in that case sorry for the noise 😅

@porridge
Copy link
Author

(BTW I was not able to reproduce this when using TCP on Linux even when reducing the socket buffer sizes to minimum allowed.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BugReport Issues describing a possible bug in the Go implementation. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants