Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Immediate replication task hydration after successful transaction #4980

Merged
merged 13 commits into from
Sep 2, 2022

Conversation

vytautas-karpavicius
Copy link
Contributor

@vytautas-karpavicius vytautas-karpavicius commented Aug 30, 2022

What changed?

  • Added NewImmediateTaskHydrator with corresponding inner bits for providing readily available data for replication message hydration.
  • After successful transactions notify replication tasks with related data to immediately hydrate them and put them into cache.

Why?
After successful transaction we already have all necessary bits available for replication message hydration. We can take advantage of that - preemptively prepare replication messages and put them into cache, which will later save additional database calls (readhistorybranch and getworkflowexecution).

This should reduce overall load on database and also reduce replication latency.

How did you test it?

  • Unit tests.
  • Staging2

Potential risks

Release notes

Documentation Changes

@coveralls
Copy link

coveralls commented Aug 30, 2022

Pull Request Test Coverage Report for Build 0182fd26-638b-410e-9c20-7b9583f6bdc4

  • 110 of 166 (66.27%) changed or added relevant lines in 8 files are covered.
  • 64 unchanged lines in 13 files lost coverage.
  • Overall coverage increased (+0.02%) to 57.263%

Changes Missing Coverage Covered Lines Changed/Added Lines %
common/persistence/workflowExecutionInfo.go 10 14 71.43%
service/history/execution/context.go 52 68 76.47%
service/history/historyEngine.go 4 40 10.0%
Files with Coverage Reduction New Missed Lines %
common/task/weightedRoundRobinTaskScheduler.go 1 88.6%
service/history/reset/resetter.go 1 82.0%
common/cache/lru.go 2 92.2%
common/persistence/historyManager.go 2 66.67%
service/history/execution/mutable_state_builder.go 2 69.03%
service/history/task/transfer_active_task_executor.go 2 71.58%
service/matching/matcher.go 2 91.46%
service/matching/taskListManager.go 2 76.62%
service/history/task/task.go 3 77.06%
service/history/execution/cache.go 6 74.61%
Totals Coverage Status
Change from base Build 0182fc21-2e39-4846-b79d-75b8a9292382: 0.02%
Covered Lines: 85041
Relevant Lines: 148510

💛 - Coveralls

@vytautas-karpavicius vytautas-karpavicius requested a review from a team August 31, 2022 10:40
@vytautas-karpavicius vytautas-karpavicius marked this pull request as ready for review August 31, 2022 10:40

func (ms immediateMutableState) IsWorkflowExecutionRunning() bool {
// Immediate mutable state is always running
return true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the impact if the workflow get closed immediately before replication?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some consideration I've update this.

This is used as early return when hydrating sync activity tasks. I think this check is somewhat redundant, since there will be no pending activity left after completion, and replication task will not be hydrated either way.

But for now let's keep it here for consistency with regular hydration from loaded mutable state. I may remove this entirely from both immediate and regular hydration later.

Copy link
Contributor

@mantas-sidlauskas mantas-sidlauskas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me, couple of nit level suggestions

@@ -368,12 +369,12 @@ func (c *contextImpl) CreateWorkflowExecution(
resp, err := c.createWorkflowExecutionWithRetry(ctx, createRequest)
if err != nil {
if c.isPersistenceTimeoutError(err) {
c.notifyTasksFromWorkflowSnapshot(newWorkflow, true)
c.notifyTasksFromWorkflowSnapshot(newWorkflow, events.PersistedBlobs{persistedHistory}, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, optional: to put more stress on the fact that notifyTasksFromWorkflowSnapshot must be called always. This also makes true/false self explanatory

Suggested change
c.notifyTasksFromWorkflowSnapshot(newWorkflow, events.PersistedBlobs{persistedHistory}, true)
resp, err := c.createWorkflowExecutionWithRetry(ctx, createRequest)
c.notifyTasksFromWorkflowSnapshot(newWorkflow, events.PersistedBlobs{persistedHistory}, c.isPersistenceTimeoutError(err))
if err != nil {
return err
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notification are not call when there is an error but it is not isPersistenceTimeoutError.
This can probably be simplified, but I will leave as is for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, missed that return. thanks

service/history/replication/task_hydrator.go Outdated Show resolved Hide resolved
service/history/replication/task_hydrator.go Outdated Show resolved Hide resolved
versionHistories,
activities,
history.Find(info.BranchToken, info.FirstEventID),
history.Find(info.NewRunBranchToken, 1),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this always be 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed magic number 1 to a constant common.FirstEventID used elsewhere as well.
This is always 1 at least in the current implementation: https://github.com/uber/cadence/blob/master/service/history/replication/task_hydrator.go#L227

@vytautas-karpavicius vytautas-karpavicius merged commit 3362f85 into master Sep 2, 2022
@vytautas-karpavicius vytautas-karpavicius deleted the preemptive-hydration branch September 2, 2022 08:32
Groxx pushed a commit that referenced this pull request Sep 6, 2022
)

* Immediate replication task hydration after successful transaction

* Regenerate mocks

* Fixing tests

* More test fixes

* Test fixes

* More comments and unit tests

* Minor

* Clean commit

* Pass isRunning field to immediateMutableState

* Addressing review comments
Groxx pushed a commit that referenced this pull request Oct 6, 2022
)

* Immediate replication task hydration after successful transaction

* Regenerate mocks

* Fixing tests

* More test fixes

* Test fixes

* More comments and unit tests

* Minor

* Clean commit

* Pass isRunning field to immediateMutableState

* Addressing review comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants