
Optimize any repeated calls to proto unmarshal and get signers in ante handlers #16738

Closed
Tracked by #15677
tac0turtle opened this issue Jun 28, 2023 · 15 comments
Labels
T: Performance Performance improvements


@tac0turtle
Member

No description provided.

@tac0turtle tac0turtle mentioned this issue Jun 28, 2023
@github-actions github-actions bot added the needs-triage Issue that needs to be triaged label Jun 28, 2023
@alexanderbez
Contributor

First, I would use a quick grep to aggregate the places where we have Unmarshal and GetSigners calls. Then I would probably use a generic cache to cache those values s.t. the cache can be passed between ante decorators.

@yihuang
Collaborator

yihuang commented Jun 30, 2023

There are other similar cases; for example, in ethermint we have duplicated GetParams calls, whose results are not shared because they happen in different decorators.

@alexanderbez
Contributor

alexanderbez commented Jun 30, 2023

A generalized way would be to have an ante-handler "manager" that simply contains the entire chain and a cache. Then any call could look like:

var params Params
v, ok := cache.Get(getParamsKey)
if ok {
  params, ok = v.(Params)
  if !ok {
    // cached value has an unexpected type: handle the error
  }
} else {
  params = GetParams()
  cache.Set(getParamsKey, params)
}

Not pretty, but it's generalized.
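The get/type-assert/load pattern above can be wrapped once in a generic helper so individual decorators stay clean. This is only a minimal sketch of that idea: the `Cache`, `GetOrLoad`, and key names here are hypothetical stand-ins, not SDK APIs.

```go
package main

import "fmt"

// Cache is a minimal map-backed cache that could be shared across ante
// decorators (a hypothetical stand-in for the manager's cache).
type Cache struct {
	store map[string]any
}

func NewCache() *Cache { return &Cache{store: make(map[string]any)} }

func (c *Cache) Get(key string) (any, bool) { v, ok := c.store[key]; return v, ok }
func (c *Cache) Set(key string, v any)      { c.store[key] = v }

// GetOrLoad wraps the get/type-assert/load pattern in one generic helper,
// so each decorator avoids repeating the boilerplate.
func GetOrLoad[T any](c *Cache, key string, load func() T) (T, error) {
	if v, ok := c.Get(key); ok {
		t, ok := v.(T)
		if !ok {
			var zero T
			return zero, fmt.Errorf("cache entry %q has unexpected type %T", key, v)
		}
		return t, nil
	}
	t := load()
	c.Set(key, t)
	return t, nil
}

func main() {
	cache := NewCache()
	calls := 0
	load := func() int { calls++; return 42 }

	v1, _ := GetOrLoad(cache, "params", load)
	v2, _ := GetOrLoad(cache, "params", load)
	fmt.Println(v1, v2, calls) // the loader runs only once for repeated keys
}
```

A decorator would then call `GetOrLoad(cache, getParamsKey, GetParams)` instead of open-coding the lookup.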

@tac0turtle
Member Author

Let's focus on the changes from GetSigners, as this may be a performance regression.

@elias-orijtech any chance you can prioritise this?

@elias-orijtech
Contributor

Sure. I think I need more detail, so I'll bring it up in today's call.

@alexanderbez
Contributor

Thanks @elias-orijtech! We can keep it simple for now.

@elias-orijtech
Contributor

Is this urgent, or can it wait for my return from vacation start of August?

@tac0turtle
Member Author

It's kind of urgent to benchmark this; maybe @odeke-em can jump in? Don't want to hold you back from vacation.

@elias-orijtech
Contributor

elias-orijtech commented Jul 11, 2023

I took a look at this today, but got stuck on the deep dependency on pgregory.net/rapid. Details below.

At @kocubinski's suggestion, I'm aiming to convert the aminojson equivalence tests into benchmarks. For example,

func TestAminoJSON_Equivalence(t *testing.T) {

However, the rapidproto.MessageGenerator depends on rapid for generating values, and rapid is designed for tests, not for benchmarks. rapid.Generator[T] does export an Example method that fits our needs, but unfortunately calling it results in a nil pointer panic when our rapidproto generator calls T.Helper. One such trace is

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x50 pc=0x10526e4c8]

goroutine 12 [running]:
pgregory.net/rapid.(*T).Helper(0x0?)
	<autogenerated>:1 +0x28
gotest.tools/v3/assert.NilError({0x1079ae0c0, 0x140002f0300}, {0x0?, 0x0}, {0x0, 0x0, 0x0})
	/Users/a/go/pkg/mod/gotest.tools/v3@v3.5.0/assert/assert.go:174 +0x64
github.com/cosmos/cosmos-proto/rapidproto.GeneratorOptions.genAny({{0x140006d0d00, 0x1, 0x1}, 0x14000e10ae0, {0x107998310, 0x1400011e3f0}, 0x0, 0x1, {0x109019750, 0x1, ...}}, ...)
	/Users/a/go/pkg/mod/github.com/cosmos/cosmos-proto@v1.0.0-beta.3/rapidproto/rapidproto.go:267 +0x1b8
github.com/cosmos/cosmos-proto/rapidproto.GeneratorOptions.setFieldValue({{0x140006d0d00, 0x1, 0x1}, 0x14000e10ae0, {0x107998310, 0x1400011e3f0}, 0x0, 0x1, {0x109019750, 0x1, ...}}, ...)
	/Users/a/go/pkg/mod/github.com/cosmos/cosmos-proto@v1.0.0-beta.3/rapidproto/rapidproto.go:159 +0x4cc
github.com/cosmos/cosmos-proto/rapidproto.GeneratorOptions.setFields({{0x140006d0d00, 0x1, 0x1}, 0x14000e10ae0, {0x107998310, 0x1400011e3f0}, 0x0, 0x1, {0x109019750, 0x1, ...}}, ...)
	/Users/a/go/pkg/mod/github.com/cosmos/cosmos-proto@v1.0.0-beta.3/rapidproto/rapidproto.go:105 +0x3b4
github.com/cosmos/cosmos-proto/rapidproto.GeneratorOptions.setFieldValue({{0x140006d0d00, 0x1, 0x1}, 0x14000e10ae0, {0x107998310, 0x1400011e3f0}, 0x0, 0x1, {0x109019750, 0x1, ...}}, ...)
	/Users/a/go/pkg/mod/github.com/cosmos/cosmos-proto@v1.0.0-beta.3/rapidproto/rapidproto.go:162 +0x424
github.com/cosmos/cosmos-proto/rapidproto.GeneratorOptions.setFields({{0x140006d0d00, 0x1, 0x1}, 0x14000e10ae0, {0x107998310, 0x1400011e3f0}, 0x0, 0x1, {0x109019750, 0x1, ...}}, ...)
	/Users/a/go/pkg/mod/github.com/cosmos/cosmos-proto@v1.0.0-beta.3/rapidproto/rapidproto.go:105 +0x3b4
github.com/cosmos/cosmos-proto/rapidproto.MessageGenerator[...].func1()
	/Users/a/go/pkg/mod/github.com/cosmos/cosmos-proto@v1.0.0-beta.3/rapidproto/rapidproto.go:20 +0xc0
pgregory.net/rapid.(*customGen[...]).maybeValue(0x104f6928c, 0x1090af3c0?)
	/Users/a/proj/orijtech/rapid/combinators.go:50 +0x64
pgregory.net/rapid.find[...](0x140010337c0?, 0x140002f0280, 0x5)
	/Users/a/proj/orijtech/rapid/combinators.go:111 +0x80
pgregory.net/rapid.(*customGen[...]).value(0x104f5c1cc?, 0x140000b6888?)
	/Users/a/proj/orijtech/rapid/combinators.go:36 +0x48
pgregory.net/rapid.(*Generator[...]).value(0x0, 0x140002f0280?)
	/Users/a/proj/orijtech/rapid/generator.go:74 +0x64
pgregory.net/rapid.recoverValue[...](...)
	/Users/a/proj/orijtech/rapid/generator.go:118
pgregory.net/rapid.example[...](0x0?, 0x1079ae120?)
	/Users/a/proj/orijtech/rapid/generator.go:105 +0x2c
pgregory.net/rapid.(*Generator[...]).Example(0x1079cfc60, {0x140000b6b30, 0x1?, 0x140000b6bf8?})
	/Users/a/proj/orijtech/rapid/generator.go:87 +0xf0
github.com/cosmos/cosmos-sdk/tests/integration/tx/aminojson.BenchmarkGetSignBytes.func1(0x14000447b80?)
	/Users/a/proj/orijtech/cosmos-sdk/tests/integration/tx/aminojson/benchmark_test.go:66 +0x128
testing.(*B).runN(0x14000447b80, 0x1)
	/nix/store/bvihxp3hnnvpjilgqw1srfqm4dyx6xa5-go-1.20.4/share/go/src/testing/benchmark.go:193 +0x12c
testing.(*B).run1.func1()
	/nix/store/bvihxp3hnnvpjilgqw1srfqm4dyx6xa5-go-1.20.4/share/go/src/testing/benchmark.go:233 +0x50
created by testing.(*B).run1
	/nix/store/bvihxp3hnnvpjilgqw1srfqm4dyx6xa5-go-1.20.4/share/go/src/testing/benchmark.go:226 +0x90

This doesn't seem immediately fixable, because rapid's T anonymously embeds testing.TB, presumably because testing.T.Helper is sensitive to the call stack.

If anyone has any ideas on how to proceed, I have some time tomorrow as well.

@alexanderbez
Contributor

Why do we need to use rapid at all for benchmarks?

@elias-orijtech
Contributor

We don't, but rapidproto is what conveniently generates messages that I assume are close to the production messages. Is there another, more suitable, package for generating such messages?

@alexanderbez
Contributor

I would just generate a message or two by hand. What we really want to see here is the time saved by repeated calls on the same method(s) on a given tx, so I don't imagine fuzzing a message type will show us a great deal of insight.

@tac0turtle
Member Author

To avoid duplication, could we not do something at compile time where we find all the messages and their signers, so that when messages come in we already know which field is the signer? This may help avoid any performance regression. cc @kocubinski
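One rough sketch of that idea: scan registered message types once up front and record where the signer lives, so per-message lookup is a map hit instead of repeated reflection. Everything here is hypothetical, a struct tag stands in for the protobuf-level signer annotation, and the message types are made up for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// Messages mark their signer field with a `signer:"true"` tag in this sketch;
// a real implementation would derive this from the proto message definitions.
type MsgSend struct {
	FromAddress string `signer:"true"`
	ToAddress   string
}

type MsgVote struct {
	Voter    string `signer:"true"`
	Proposal uint64
}

// signerField maps message type -> index of its signer field.
var signerField map[reflect.Type]int

// buildSignerIndex scans the registered message types once (e.g. at init),
// so per-message signer lookup never has to scan fields again.
func buildSignerIndex(msgs ...any) {
	signerField = make(map[reflect.Type]int)
	for _, m := range msgs {
		t := reflect.TypeOf(m)
		for i := 0; i < t.NumField(); i++ {
			if t.Field(i).Tag.Get("signer") == "true" {
				signerField[t] = i
				break
			}
		}
	}
}

// getSigner is now a map hit plus one field read.
func getSigner(msg any) (string, bool) {
	idx, ok := signerField[reflect.TypeOf(msg)]
	if !ok {
		return "", false
	}
	return reflect.ValueOf(msg).Field(idx).String(), true
}

func main() {
	buildSignerIndex(MsgSend{}, MsgVote{})
	s, _ := getSigner(MsgSend{FromAddress: "cosmos1abc", ToAddress: "cosmos1def"})
	fmt.Println(s)
}
```

Doing the scan at code-generation time instead of init would remove even the remaining reflection, at the cost of more generated code.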

@tac0turtle
Member Author

@odeke-em, can you post your findings here so we can close this issue?

@tac0turtle tac0turtle added T: Performance Performance improvements and removed needs-triage Issue that needs to be triaged labels Aug 11, 2023
@odeke-em
Collaborator

Thanks @tac0turtle!

Synopsis

So some good news: from examining a bunch of profiles, we don't have regressions from repeated calls to proto.Unmarshal, and we actually see some improvements in the AnteHandler calls.

Deep dive

From examining x/auth/ante.TestAnteHandlerMultiSigner in v0.47.0, there was a convoluted series of calls back and forth to ChainAnteDecorators from a bunch of other AnteHandlers, due to every AnteHandler invoking .Next. Notice the arrows to and from types.ChainAnteDecorators showing repeated calls back and forth?
[profile graph for v0.47.0]
Whereas in the current main, we've got only 2 repeated calls back and forth, with ante.SigGasConsumeDecorator and ante.SigVerificationDecorator.AnteHandle as the only invoked AnteHandlers:
[profile graph for current main]

That change, I believe, comes from this improvement:

         .          .    367:	// increment sequence of all signers
         .          .    368:	signers, err := sigTx.GetSigners()
         .          .    369:	if err != nil {
         .          .    370:		return sdk.Context{}, err
         .          .    371:	}

82659a7#diff-b94baf116534b68c0bc48cb6f, in which, if the Tx's .GetSigners method returns a non-nil error, no further processing happens; indeed, the profile picture shows no other processing except calling the next ante handler. So that's actually a great improvement in the latest release.

The only places I could see successive calls to proto.Unmarshal were in the obvious place: types.Any.Unpack*. Those signatures have been like that forever: we first have to invoke proto.Unmarshal(bz, anySave) to get the TypeURL, which is then passed into the interfaceRegistry to match which value should be unpacked, hence the second proto.Unmarshal invocation. In fact, as I showed previously, the new code doesn't unnecessarily call every AnteHandler.
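For readers unfamiliar with why the two-step decode is unavoidable, here is a self-contained sketch of the pattern. It is not SDK code: the envelope struct, the constructor registry, and the use of JSON in place of proto wire encoding are all simplifications for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// anyEnvelope mirrors the shape of types.Any: a type URL plus opaque payload.
type anyEnvelope struct {
	TypeURL string          `json:"type_url"`
	Value   json.RawMessage `json:"value"`
}

type msgSend struct {
	FromAddress string `json:"from_address"`
}

// registry maps type URLs to constructors, playing the role of the SDK's
// interfaceRegistry in this sketch.
var registry = map[string]func() any{
	"/cosmos.bank.v1beta1.MsgSend": func() any { return &msgSend{} },
}

// unpack shows the unavoidable two-step decode: one pass for the envelope
// (to learn the TypeURL), then one pass for the concrete message.
func unpack(bz []byte) (any, error) {
	var env anyEnvelope
	if err := json.Unmarshal(bz, &env); err != nil { // first decode
		return nil, err
	}
	newMsg, ok := registry[env.TypeURL]
	if !ok {
		return nil, fmt.Errorf("unregistered type URL %q", env.TypeURL)
	}
	msg := newMsg()
	if err := json.Unmarshal(env.Value, msg); err != nil { // second decode
		return nil, err
	}
	return msg, nil
}

func main() {
	bz := []byte(`{"type_url":"/cosmos.bank.v1beta1.MsgSend","value":{"from_address":"cosmos1abc"}}`)
	msg, err := unpack(bz)
	if err != nil {
		panic(err)
	}
	fmt.Println(msg.(*msgSend).FromAddress)
}
```

The concrete type cannot be allocated until the first decode has revealed the TypeURL, which is why collapsing the two Unmarshal calls into one is not possible without changing the envelope format itself.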

Whole-transaction lifetime profiling through simapp

I profiled simapp runs of simapp.TestFullAppSimulation and examined plenty of profiles between v0.47.0 and v0.50.X. There were 3 extra proto.Unmarshal calls in v0.50.X (139 calls) vs v0.47.0 (136 calls), but I didn't notice any repeated calls to proto.Unmarshal out of the ordinary.

Verdict

I believe that we can go forth with the current release! However, @elias-orijtech and I should and shall be vigilant about collecting and examining continuous profiles, plus running continuous audits for performance improvements. The change shaves the number of AnteHandlers called from 12 down to 2 just in that call for DeliverMsgs.
