-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Optimistic Execution #16581
feat: Optimistic Execution #16581
Conversation
I think we need to be especially careful when OE is aborted, because
Maybe to keep it simple we can execute OE only if height > initialHeight. That way we can just clear up the state 100% of the times |
baseapp/abci.go
Outdated
// processed the first block, as we want to avoid overwriting the finalizeState | ||
// after state changes during InitChain. | ||
if req.Height > app.initialHeight { | ||
// abort any running OE | ||
if app.oeEnabled && app.oeInfo != nil && app.oeInfo.Running() { | ||
app.oeInfo.Abort() | ||
_, _ = app.oeInfo.WaitResult() // ignore the result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change potentially affects state.
Call sequence:
(*github.com/cosmos/cosmos-sdk/baseapp.BaseApp).ProcessProposal (baseapp/abci.go:467)
oe.logger.Error("OE aborted due to hash mismatch", "oe_hash", hex.EncodeToString(oe.request.Hash), "req_hash", hex.EncodeToString(reqHash), "oe_height", oe.request.Height, "req_height", oe.request.Height) | ||
oe.cancelFunc() | ||
return true | ||
} else if oe.abortRate > 0 && rand.Intn(100) < oe.abortRate { |
Check failure
Code scanning / gosec
Use of weak random number generator (math/rand instead of crypto/rand)
I tried cleaning up the concurrency model with little success, now I'm passing a context to internalFinalizeBlock which makes it unaware of the whole OE thing, it uses this context just to know if it needs to return early. Using channels is a cleaner way to do this, but the problem is that the function we are passing as our FinalizeBlockFn has access to baseapp and to its properties, so we have partial control over what's going on with the state. This ended up in getting wrong app hashes from time to time. A great improvement would be to have FinalizeBlock to use only whatever is passed to it and with little to no access to baseapp. This way we could run the entire block completely concurrently by passing only cached copies of the contexts and values; and by the time we abort we just discard all of these. |
go func() { | ||
start := time.Now() | ||
resp, err := oe.finalizeBlockFunc(ctx, oe.request) | ||
oe.mtx.Lock() | ||
executionTime := time.Since(start) | ||
oe.logger.Debug("OE finished", "duration", executionTime.String(), "height", req.Height, "hash", hex.EncodeToString(req.Hash)) | ||
oe.response, oe.err = resp, err | ||
close(oe.stopCh) | ||
oe.mtx.Unlock() | ||
}() |
Check notice
Code scanning / CodeQL
Spawning a Go routine
oe.initialized = true | ||
|
||
go func() { | ||
start := time.Now() |
Check warning
Code scanning / CodeQL
Calling the system time
Co-authored-by: Aleksandr Bezobchuk <alexanderbez@users.noreply.github.com>
I've added a check so we don't start optimistic execution when ProcessProposal returns a "reject" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly doc nits.
@@ -241,7 +242,7 @@ func NewSimApp( | |||
voteExtHandler := NewVoteExtensionHandler() | |||
voteExtHandler.SetHandlers(bApp) | |||
} | |||
baseAppOptions = append(baseAppOptions, voteExtOp) | |||
baseAppOptions = append(baseAppOptions, voteExtOp, baseapp.SetOptimisticExecution()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it purposely done so that simapp only uses optimistic execution?
Or should we add it here:
Line 529 in e2637b8
baseapp.SetQueryGasLimit(cast.ToUint64(appOpts.Get(FlagQueryGasLimit))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I wouldn't like to add a flag for this, because that could give node operators the chance of not running optimistic execution if the developers chose to do so (they can still modify the code but that takes some extra effort).
Could we add a flag on simapp only? wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, a flag wouldn't make sense.
I was taking about adding it in our default baseapp options.
But I think I remember conversation about having it disabled by default first 🤔, so it makes sense to add it only in simapp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK!
@Mergifyio backport release/v0.50.x |
✅ Backports have been created
|
Co-authored-by: Aleksandr Bezobchuk <alexanderbez@users.noreply.github.com> (cherry picked from commit 8df065b) # Conflicts: # CHANGELOG.md # baseapp/abci_test.go
This comment was marked as spam.
This comment was marked as spam.
Great work @facundomedica 👏 |
Description
RFC: #16499
Closes: #XXXX
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
!
to the type prefix if API or client breaking changeCHANGELOG.md
Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...
!
in the type prefix if API or client breaking change