Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use common ParentExecutionInfo in proto IDLs #3997

Conversation

vytautas-karpavicius
Copy link
Contributor

What changed?
Use common api.v1.ParentExecutionInfo to group fields related to parent execution.

Why?
So, the primary reason that have led to this proposal, is the fact that all parent related fields are optional. To properly represent such fields we would need to wrap them with wrapper types. However since they always come together, it makes sense to group them into separate type which then can then carry nullable semantics for all of them.

We already have such type which was used within internal API (shared.v1.ParentExecutionInfo). There are two additional places in publicly exposed API api.v1.WorkflowExecutionStartedEventAttributes and api.v1.WorkflowExecutionInfo that also have parent execution information. However not all of the fields are exposed there. One solution would to unify exposed fields and just single type to represent that. This is nice because it:

  • Unifies exposed parent execution information
  • Automatically provides not set semantics and groups fields that can be set all or none.
    This does mean, that additional missing fields needs to be filled in (this PR attempts to do that). We can leave thrift IDLs as is, but they will be added for proto.

Some alternatives, that would work as well, but not as nice IMHO.

  • Simply use wrapper types to achieve optional semantics and leave fields ungrouped.
  • Use different grouping message(s) instead of reusing one from internal API.

How did you test it?

Potential risks

@vytautas-karpavicius vytautas-karpavicius requested a review from a team February 18, 2021 13:59
@coveralls
Copy link

coveralls commented Feb 18, 2021

Coverage Status

Coverage decreased (-0.07%) to 64.568% when pulling 76de2cf on vytautas-karpavicius:proto-parent-execution-info into 80b7bfc on uber:master.

@vytautas-karpavicius
Copy link
Contributor Author

There is also a similar situation for external_workflow_execution + external_initiated_event_id within history.RequestCancelWorkflowExecutionRequest. They are both either set or not. Depending on the decision made with this PR, we should be consistent and do the same thing there. So, for instance we could have:

message ExternalExecutionInfo {
  api.v1.WorkflowExecution workflow_execution = 1;
  int64 initiated_id = 2;
}

It also begs a question why external execution does not come with domain fields, as does parent execution.

if attributes.ParentWorkflowDomain != nil {
if attributes.ParentWorkflowDomainID != nil {
parentDomainID = attributes.ParentWorkflowDomainID
} else if attributes.ParentWorkflowDomain != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only domain or domain ID will be set in the attribute?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both would be set. But for older history ParentWorkflowDomainID is not available, thus it would fallback to previous logic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth a code comment for that, as it's a bit unclear from the code alone. it's pretty clear it's a fallback, but not about when it may no longer be necessary (if ever).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment.

@@ -1501,6 +1501,10 @@ func (e *historyEngineImpl) DescribeWorkflowExecution(
RunID: executionInfo.ParentRunID,
}
result.WorkflowExecutionInfo.ParentDomainID = common.StringPtr(executionInfo.ParentDomainID)
result.WorkflowExecutionInfo.ParentInitiatedID = common.Int64Ptr(executionInfo.InitiatedID)
if entry, err := e.getActiveDomainEntry(executionInfo.ParentDomainID); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

er == nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, great catch! Fixed it.

@Groxx
Copy link
Member

Groxx commented Feb 25, 2021

You've committed the cadence-bench binary btw

Copy link
Member

@Groxx Groxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also begs a question why external execution does not come with domain fields, as does parent execution.

I was wondering that while looking at that message, yeah... that said, I think all of these endpoints require you to pass in a domain? So in principle none of them need domain info AFAICT (though I don't see much of a downside to including them). Though I would be interested if there are counter-examples.

That said: tbh I don't really follow what the "external" workflow implies. If anything though, given the name, it seems like it'd be more appropriate for it to have domain info than for WorkflowExecutionInfo / etc.


Either way: consistency++, and these yeah, these fields all seem strongly inter-related, a separate message container makes a lot of sense IMO. I'm not really seeing a downside, so 👍 from me. Check comments of course, as you always do, and it may be worth getting another pair of eyes / waiting for @yux0? I'll leave that up to you tho.

if attributes.ParentWorkflowDomain != nil {
if attributes.ParentWorkflowDomainID != nil {
parentDomainID = attributes.ParentWorkflowDomainID
} else if attributes.ParentWorkflowDomain != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth a code comment for that, as it's a bit unclear from the code alone. it's pretty clear it's a fallback, but not about when it may no longer be necessary (if ever).

@@ -90,31 +90,29 @@ message HistoryEvent {

message WorkflowExecutionStartedEventAttributes {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fairly-large similarity between this and WorkflowExecutionInfo had me comparing things out of curiosity for a while... they may not all make sense to exist in WorkflowExecutionInfo (e.g. identity is per-event, not per-workflow), but I wonder if some of the missing fields would be useful to bring over.

e.g. when listing or describing a workflow:

  • input/result seem potentially useful (though possibly large data on list endpoints - defaulting to null makes sense, could be filled in when explicitly requested)
  • original / continue run_id seems like something that should exist
  • cron schedule and retry policy are almost definitely relevant to people looking at a Describe
  • *_start_to_close_timeouts are workflow-wide, possibly relevant. (... though now that I think about it, it's a little strange that those are not modify-able as a workflow runs. the use would be fairly rare, probably?)

@@ -576,6 +576,7 @@ func (b *HistoryBuilder) newWorkflowExecutionStartedEvent(

parentInfo := startRequest.ParentExecutionInfo
if parentInfo != nil {
attributes.ParentWorkflowDomainID = parentInfo.DomainUUID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR except by proximity, but:
This kind of code:

	attributes := &types.WorkflowExecutionStartedEventAttributes{}
	attributes.WorkflowType = request.WorkflowType
	attributes.TaskList = request.TaskList
	...

Is rather hard to lint, e.g. that fields aren't forgotten. Or to be confident of it while coding.
It currently looks complete, but may be worth rewriting to a:

	attributes := &types.WorkflowExecutionStartedEventAttributes{
		WorkflowType:                        request.WorkflowType,
		TaskList:                            request.TaskList,
		Header:                              request.Header,
		Input:                               request.Input,
		ExecutionStartToCloseTimeoutSeconds: common.Int32Ptr(*request.ExecutionStartToCloseTimeoutSeconds),

Initializer (with an if for parentinfo stuff, since there's not much choice there).

As a concrete example, I can ctrl-space in that initializer and immediately see that 3 fields are missing - the parentinfo ones, handled below. I have yet to see tooling that helps even that much with the field-assignment pattern.


I've got a 90%-complete linter that checks for "complete" struct initializations, and it wouldn't be too hard to include "this func contains an assignment for every field" info in that check too. Maybe it'd be worth finishing and bringing in? Incomplete-initialization has been a fairly large source of sometimes-minor, often-long-lasting data losses in my experience.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I do prefer direct initialization as well. Updated it.

If we could lint on that, it would be amazing. Just wondering how it is handled when some fields are intentionally left out. Do we add annotation in the code?

@vytautas-karpavicius
Copy link
Contributor Author

You've committed the cadence-bench binary btw

Thanks for noticing. Removed.

@vytautas-karpavicius
Copy link
Contributor Author

It also begs a question why external execution does not come with domain fields, as does parent execution.

I was wondering that while looking at that message, yeah... that said, I think all of these endpoints require you to pass in a domain? So in principle none of them need domain info AFAICT (though I don't see much of a downside to including them). Though I would be interested if there are counter-examples.

That said: tbh I don't really follow what the "external" workflow implies. If anything though, given the name, it seems like it'd be more appropriate for it to have domain info than for WorkflowExecutionInfo / etc.

Either way: consistency++, and these yeah, these fields all seem strongly inter-related, a separate message container makes a lot of sense IMO. I'm not really seeing a downside, so 👍 from me. Check comments of course, as you always do, and it may be worth getting another pair of eyes / waiting for @yux0? I'll leave that up to you tho.

Thanks for all your insights.

Yep, will wait for second pair of eyes on this one. Kind of important changes happening here.

@vytautas-karpavicius vytautas-karpavicius merged commit c5d67e3 into cadence-workflow:master Feb 26, 2021
yux0 pushed a commit to yux0/cadence that referenced this pull request May 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants