Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api/client.go#clone() re-claims a non-re-entrant lock #22393

Closed
mju opened this issue Aug 17, 2023 · 3 comments · Fixed by #22410
Closed

api/client.go#clone() re-claims a non-re-entrant lock #22393

mju opened this issue Aug 17, 2023 · 3 comments · Fixed by #22410
Labels
bug Used to indicate a potential bug core/api

Comments

@mju
Copy link

mju commented Aug 17, 2023

Describe the bug
In api/client.go, clone() claims the same read lock twice. The second time is via Headers().

The doc on RWMutex says in here and here that re-entrance is not allowed.

We ran into this problem when a goroutine called CloneWithHeaders() and another called SetCloneToken().
After CloneWithHeaders() acquired the first read lock, SetCloneToken() tried to claim a write lock. As a result,
no new read lock could be claimed and goroutine one couldn't release the existing read lock. As a result,
goroutine two couldn't acquire the write lock. Below are the stacktraces.

goroutine 12055 [semacquire, 147 minutes]:
sync.runtime_SemacquireMutex(0x151c280?, 0x10?, 0x1708100?)
GOROOT/src/runtime/sema.go:77 +0x25
sync.(*RWMutex).RLock(...)
GOROOT/src/sync/rwmutex.go:71
github.com/hashicorp/vault/api.(*Client).Headers(0xc000d98a00)
external/com_github_hashicorp_vault_api/client.go:1005 +0x6e
github.com/hashicorp/vault/api.(*Client).clone(0xc000d98a00, 0x1)
external/com_github_hashicorp_vault_api/client.go:1177 +0x2b6
github.com/hashicorp/vault/api.(*Client).CloneWithHeaders(...)
external/com_github_hashicorp_vault_api/client.go:1141
goroutine 11984 [semacquire, 147 minutes]:
sync.runtime_SemacquireMutex(0x203000?, 0x0?, 0xc0016e0030?)
	GOROOT/src/runtime/sema.go:77 +0x25
sync.(*RWMutex).Lock(0x4852de?)
	GOROOT/src/sync/rwmutex.go:152 +0x71
github.com/hashicorp/vault/api.(*Client).SetCloneToken(0xc000d98a00, 0x1)
	external/com_github_hashicorp_vault_api/client.go:1079 +0x33

To Reproduce
Steps to reproduce the behavior:

To reproduce this issue, you need to have a small piece of code that has the following operations.
We had them back-to-back.

		client.SetCloneToken(true)
		c, err := client.CloneWithHeaders()

Then you need to execute this small piece of code with multiple threads non-stop.
You will see the problem if you are (un)lucky enough.

Note that regardless of the reproduction, please consider Go's documentation on RWMutex.

We use pprof to spy on the goroutines.

Expected behavior
No deadlocking.

Environment:

  • Vault Server Version (retrieve with vault status): 1.12.0
  • Vault CLI Version (retrieve with vault version): v1.8.3
  • Server Operating System/Architecture: Unbuntu 18.04

Vault server configuration file(s):

n/a

Additional context
A problem I can see that makes client.go risky is that top-level functions can invoke each other.
I think if you structure the code so that only top-level functions can claim locks and that top-level
functions cannot invoke each other, then it's unlikely for lock-related issues to take place.

See this post on why golang doesn't offer re-entrant locks in the first place.

@VioletHynes
Copy link
Contributor

Thanks for the report! Very good bug report and a real, identified bug. I'm going to look into a fix for this once I free up.

@heatherezell heatherezell added bug Used to indicate a potential bug core/api labels Aug 17, 2023
@mju
Copy link
Author

mju commented Aug 17, 2023

Thanks for the report! Very good bug report and a real, identified bug. I'm going to look into a fix for this once I free up.

Hello there, I have created a PR the both reproduced the problem and fixed the problem.
PR: #22409

@VioletHynes
Copy link
Contributor

It seems we almost had the same idea! I have a PR too, here: #22410

I fixed up a few other areas and moved a little more locking out of the top level functions.

If it's okay with you, I think it might be better to close your PR and merge mine, as I have a little extra clean-up and a closer similarity to naming we have around other similar methods (e.g. headersInternal for the version that holds the lock).

Thanks a tonne for the report! Great description and easy to reproduce. Apologies that we both worked on the PR at the same time, I should have been clearer!

mju added a commit to mju/vault that referenced this issue Aug 17, 2023
Remove recursive locking from CloneWithHeaders() in client.go
golang doesn't support reentrant locks in general.
hashicorp#22393 has all the details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Used to indicate a potential bug core/api
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants