-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gRPC crashes with concurrent delegations queries #8545
Comments
The problem seem to be the following line: k.validatorCache[strValue] = newCachedValidator(validator, strValue) This is used in order to cache the queried validator and avoid duplicate operations later on. There seem to be two ways of fix this:
|
I took a quick look and that caching seems quite odd. First, it refers to caching the amino decoding for performance reasons, which don't seem to apply so much with protobuf (that no one updated the comment, means no one really thought about whether it still made sense). Secondly, the cache is only updated on I would highly suggest removing this caching. And if it does turn out with benchmarking that this is a bottleneck, then reimplement it much more robustly. |
@ethanfrey Thanks for confirming that I thought as well. I've just run the following benchmark test: func BenchmarkGetValidator(b *testing.B) {
// 900 is the max number we are allowed to use in order to avoid simapp.CreateTestPubKeys
// panic: encoding/hex: odd length hex string
var powersNumber = 900
var totalPower int64 = 0
var powers = make([]int64, powersNumber)
for i := range powers {
powers[i] = int64(i)
totalPower += int64(i)
}
app, ctx, _, valAddrs, vals := initValidators(b, totalPower, len(powers), powers)
for _, validator := range vals {
app.StakingKeeper.SetValidator(ctx, validator)
}
b.ResetTimer()
for _, addr := range valAddrs {
_, _ = app.StakingKeeper.GetValidator(ctx, addr)
}
} Here are the results:
|
Nice benchmark. But those numbers are way too low (< 1ns??) You need to add a loop to do this b.ResetTimer()
for n := 0; n < b.N; n++ {
for _, addr := range valAddrs {
_, _ = app.StakingKeeper.GetValidator(ctx, addr)
}
} Can you update the benchmark and give the new results? |
@ethanfrey Sure thing, here you go: Running it one time
Running it three times, output the third time:
|
So, it not only crashes the node, but it makes it slower??? Time to go! |
Dropping the backport label, as it actually should be added to the PR that fixes this :) Thanks for digging into this @RiccardoM ! |
Summary of Bug
Querying the gRPC concurrently for validator delegations makes it crash with error
fatal error: concurrent map writes
.Version
v0.40.1
Steps to Reproduce
Try querying the redelegations for multiple validators concurrently.
Log output
Code example
For Admin Use
The text was updated successfully, but these errors were encountered: