Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GRPC queries aren't running in parallel despite the note in the 0.44.3 release #10859

Closed
4 tasks
reuvenpo opened this issue Jan 3, 2022 · 5 comments
Closed
4 tasks
Labels
T:Bug T: Client UX T: Dev UX UX for SDK developers (i.e. how to call our code) T:tech debt Tech debt that should be cleaned up

Comments

@reuvenpo
Copy link

reuvenpo commented Jan 3, 2022

Summary of Bug

The release notes for v0.44.3 mention that:

The main performance improvement concerns gRPC queries, which are now able to run concurrently on the node (#10045).

But my testing shows that this is not the case, and queries actually continue to run sequentially, rather than utilize all the available resources and run in parallel, even though I'm using a test chain with the latest version of the cosmos SDK (v0.44.5).

An example of how this looks like can be seen in Secret Network's codebase, where I've been experimenting with this. I can see that gRPC queries run in parallel (in my code) up to this point and come back to my code sequentially here. (The code in the linked file is compiled from this .proto file.).

After advising the Terra codebase, I found that they have noticed this as well, and have worked around this by forking tendermint and in their own branch committing this change. They then replaced the stock tendermint implementation with their own.

This is odd, as the v0.44.3 release notes state that tendermint is being skipped in the gRPC process, but nonetheless, I tried adding that same replace directive in my own codebase which has resolved the issue:

replace github.com/tendermint/tendermint => github.com/terra-money/tendermint v0.34.14-public.2

Is the change in the Terra repository stable? Should it be upstreamed and a new version of cosmos-sdk released to resolve this? Perhaps another solution should be considered?

Thanks for looking into this and helping resolve this issue!

Version

releases v0.44.3 and above.

Steps to Reproduce

Set up a test chain based on the latest SDK, add a debug print before and after the concrete implementation of one of the queries (or create your own), maybe add a long sleep time in there too, and then (using some concurrent client like with JS) create 10-100 queries and wait for their responses concurrently (e.g. with await Promise.all(promises); in JS), Now look at the logs. The queries will output their debug logs sequentially.


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@alexanderbez
Copy link
Contributor

@marbar3778 do you have any insight here? I thought we added a RW mutex to Tendermint in the ABCI layer recently, no?

@alexanderbez alexanderbez self-assigned this Jan 3, 2022
@tac0turtle
Copy link
Member

tac0turtle commented Jan 3, 2022

Talked in slack but to reiterate, we reverted RWMutex in abc due to p2p issues. On top of this gRPC shouldn't be routed through Tendermint anymore but it seems it is or there is a mutex in ABI holding things back.

Based off the issue, by changing the Tendermint ABCI mutex to RW it makes me assume its still being routed through Tendermint still

@tac0turtle
Copy link
Member

@reuvenpo would you want to test this pr #10997 to see if it solves your issue?

@alexanderbez alexanderbez removed their assignment Mar 21, 2022
@tac0turtle tac0turtle added T:Bug T: Dev UX UX for SDK developers (i.e. how to call our code) T: Client UX T:tech debt Tech debt that should be cleaned up labels Oct 21, 2022
@TobiaszCudnik
Copy link

TobiaszCudnik commented Feb 9, 2023

Can not reproduce it as of ad847c4.

Ive created a draft PR with a regression test. Im new to the codebase so I may be missing something obvious here.

@tac0turtle
Copy link
Member

closing this based off the work of tobias

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T:Bug T: Client UX T: Dev UX UX for SDK developers (i.e. how to call our code) T:tech debt Tech debt that should be cleaned up
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants