-
Notifications
You must be signed in to change notification settings - Fork 215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support annotations for versioning behavior #1692
Support annotations for versioning behavior #1692
Conversation
@@ -1611,6 +1611,15 @@ func SetCurrentDetails(ctx Context, details string) { | |||
getWorkflowEnvOptions(ctx).currentDetails = details | |||
} | |||
|
|||
// SetVersioningBehavior sets the strategy to upgrade this workflow when the default Build ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we intentionally omit a getter here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, versioning behavior can be dependent on worker options so if app code branches based on it they will get an NDE. And in general, we wanted to be able to change annotations without needing patching, so no-getter helps there...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the versioning behaviour written into history?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to be able to get something you may have set. I think this should either be in workflow register options and/or this should be called SetVersioningBehaviorOverride
(and GetVersioningBehaviorOverride
will just return what you have called with the "set", not the resolved value). I assume there are good use cases for setting this do different values throughout the workflow logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Quinn-With-Two-Ns My understanding is that it is not part of the history, at least I do nothing special during replay, @ShahabT can you confirm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the proto changes that @ShahabT did, the versioning behavior has been added to WorkflowTaskCompletedEventAttributes in history. I don't do anything with it (maybe I should?) but that would give the server a placeholder to record the versioning behavior that I send after each task...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, so it is in history, which makes sense if workflow author is expected to update it mid-workflow. I do wonder whether the getter should take that into account. I'll defer to y'all on desired behavior of the getter, but I do believe we should not have a setter without a getter (even if we have to add caveats to the behavior of the getter).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm misunderstanding, versioning behavior is more of a workflow worker level thing not workflow run/execution thing.
@cretz In general, I think of versioning behavior as primarily a workflow type level thing, with a workflow worker level default in case it's not specified at the type level (but from a product perspective, we want to encourage explicitly setting it for each workflow type, so that any developer reading the workflow definition can immediately see what the expected behavior is, and keep that in mind when they make changes to the code that could change it from short-running <-> long-running).
Still, there are some limited use-cases for overriding the versioning behavior at the workflow execution level.
-
Overriding the versioning behavior of an already-running workflow execution (
UpdateWorkflowExecutionProperties
):- User wants to pin a specific workflow execution that had a weird bug and is still being investigated, but they want to keep the workflow type as a whole unpinned because it is still long-running in general. I think our cloud release validation workflow would be an example of this, or other long-running test pipelines.
- A lot of other versioning behavior change scenarios are likely to be applied for all workflow executions of the same type, in which case the behavior change could be accomplished by changing the annotation in the workflow code, deploying a new build, then using the batch api to manually change the build id of running workflows so that they use the new workflow code and pick up the new annotation.
- I still think being able to pin a specific workflow execution without pinning the whole type is useful for debugging. Max liked this idea in my DevX presentation on
UpdateWorkflowExecutionProperties
scenarios. - I think overriding a specific wf execution from pinned --> unpinned is much less important, because the use case is less clear. If you want a specific wf execution to run on the current default build id, you can just change it's build id explicitly. Most use cases for moving to unpinned apply across the wf type.
-
Overriding the versioning behavior of a workflow execution on workflow start:
- Customers have requested that we provide a way to start a one-off workflow on a specific build id that is not yet the default build. This is the basic functionality they need to set up deployment canaries easily.
- It seems like this would involve some parameter such as
pinned_build_id_override
inStartWorkflowOptions
. I think it would be nice if this is symmetric to a concept ofpinned_build_id_override
inUpdateWorkflowExecutionProperties
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The server stores the current build id and versioning behavior in Mutable State and in visibility. The previously-used build ids are also stored in visibility -- I'm not sure if previously-used build ids are also stores in Mutable State.
"getting" the current versioning behavior of a given workflow execution could be done by reading the mutable state or visibility entry.
Querying a bunch of workflow executions for a specific versioning behavior and/or build id combination would be a visibility query (we will use this for reachability).
It seems to me that to support a workflow-execution-level versioning behavior override, the server would need to store not just the (versioning behavior, build id) tuple, but also an is_override
(or some other name) bit, that basically means, if is_override, ignore whatever the sdk says about my workflow type and do task routing based on what is currently written in my mutable state as an override
. To me it seems like it could be handled on the server side, since that's where the task routing decision is made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think of versioning behavior as primarily a workflow type level thing, with a workflow worker level default in case it's not specified at the type level (but from a product perspective, we want to encourage explicitly setting it for each workflow type
If this is the case, it also needs to be added to workflow.RegisterOptions
.
"getting" the current versioning behavior of a given workflow execution could be done by reading the mutable state or visibility entry.
We don't want to get from server what's not already in history. We just need to make sure any SetWhatever
inside a workflow has a corresponding GetWhatever
. If this just returns the value that was "set" before in the same workflow code and not the actual server value, that's fine (though it does seem like task complete may have some form of the value). The docs of the getter just need to clarify.
I think the latest commit addresses most of the issues raised, let me know if it is ready to merge... |
@antlai-temporal - based on thread above, I wonder if this needs to be on |
We could have a type-level default in RegisterOptions that can be overridden by the programmatic API. That would be more consistent with the Java annotation+programmatic override I was thinking on. The Getter will probably return just the programmatic to be consistent that we can change worker defaults without patching. |
// SetVersioningBehavior sets the strategy to upgrade this workflow when the default Build ID | ||
// has changed, and Worker Versioning-3 has been enabled. | ||
// | ||
// SetVersioningBehavior should be called during the first workflow task, and the context `ctx` should |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if it is called, but not in the first workflow task?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there was no worker level default, the server will fail the first workflow task, so there is no progress until a worker sets a default.
If tthere was a worker level default, it will change it ok.
In the comment I'm encouraging to set it in the first task to avoid the first case...
I think it's probably ok (will defer to @Quinn-With-Two-Ns) but probably want to create an issue to track the work. I do think we should be encouraging that register-options way to set worker-type level value and this should be a lesser-used in-workflow override.
This is why we should be encouraging the workflow register options approach over the inline approach (can even change the name of the inline one to something less wieldy, e.g. |
Yeah OK merging into the branch, +1 to everything Chad said. Once we are further along I would also like to see documentation as to how this workflow API is supposed to interact with versioning info set externally described here #1692 (comment). |
@cretz @Quinn-With-Two-Ns |
36167ae
into
temporalio:versioning-3
What was changed
This PR supports providing a default versioning behavior in Worker Options, and a programmatic API to override these defaults:
For example:
called during the first workflow task sets the versioning mode for that workflow to
Pinned
, eliminating the need of patching.And
sets the default versioning behavior to
AutoUpgrade
. Note thatDeploymentName
,UseBuildIDForVersioning
andBuildID
need to be set to enable versioning-3.If versioning-3 is on, but no default or programmatic versioning behavior has been set for a workflow, the server will fail its first workflow task.
Why?
A hint that a workflow is short-running, and therefore, does not need to be moved to the new Build ID on the next task, eliminates the need of patching for many workflow types.
Checklist
Add workflow annotations for versioning #1663
Integration test that uses versioning-2 and a grpc interceptor to show that versioning behaviors are propagated with responses.
The dependency on versioning-2 is just to set a default Build ID.