How To: Author a ForgeTree
This page provides a detailed guide on how to author a ForgeTree and use the various features of Forge.
- ForgeTree.Tree and ForgeTree.RootTreeNodeKey
- TreeNodeType.Selection and TreeNode.ChildSelector
- TreeNodeType.Action and TreeNode.Actions
- TreeAction.Action and TreeAction.Input
- TreeNodeType.Leaf
- Native LeafNodeSummaryAction
- TreeNode.Properties
- TreeAction.RetryPolicy and TreeAction.ContinuationOnRetryExhaustion
- TreeNode.Timeout, TreeAction.Timeout and TreeAction.ContinuationOnTimeout
- TreeAction.Properties
- Overview of ForgeTree Properties
- How Forge TreeWalker Walks the ForgeTree
- ForgeTree.cs data contract
In this section we'll introduce ForgeTree properties one-by-one while going through a sample ForgeTree schema. By the end of this section, you should be able to comprehend and author basic ForgeTree schemas.
Be sure to read the other sections to learn more about Subroutines, Roslyn, and other tips, tricks, and best practices.
{
"RootTreeNodeKey": "Root",
"Tree": {
"Root": {
"Type": "Selection",
"ChildSelector": [
{
"Label": "Container",
"ShouldSelect": "C#|UserContext.ResourceType == \"Container\"",
"Child": "Container"
},
{
"Label": "Node",
"ShouldSelect": "C#|UserContext.ResourceType == \"Node\"",
"Child": "Node"
}
]
},
"Container": {
"Type": "Action",
"Actions": {
"CollectDiagnosticsAction_Container": {
"Action": "CollectDiagnosticsAction"
}
},
"ChildSelector": [
{
"Label": "Tardigrade",
"ShouldSelect": "C#|Session.GetLastActionResponse().Status == \"Success\"",
"Child": "Tardigrade"
}
]
},
"Tardigrade": {
"Type": "Action",
"Actions": {
"TardigradeAction_Tardigrade": {
"Action": "TardigradeAction",
"Input": {
"Context": "ContainerFault",
"EnableV2": true,
"DiagnosticData": "C#|Session.GetLastActionResponse().Output"
}
}
},
"ChildSelector": [
{
"Label": "Tardigrade_Success",
"ShouldSelect": "C#|(await Session.GetOutputAsync(\"TardigradeAction_Tardigrade\")).Status == \"Success\"",
"Child": "Tardigrade_Success"
},
{
"Label": "Tardigrade_Failure",
"Child": "Tardigrade_Failure"
}
]
},
"Tardigrade_Failure": {
"Type": "Leaf"
},
"Tardigrade_Success": {
"Type": "Leaf",
"Actions": {
"LeafNodeSummaryAction_Tardigrade_Success": {
"Action": "LeafNodeSummaryAction",
"Input": {
"Status": "C#|string.Format(\"{0}_{1}\", \"ContainerFaultScenario\", (await Session.GetLastActionResponseAsync()).Status)",
"StatusCode": 0,
"Output": {
"ActionOutput": "C#|(await Session.GetLastActionResponseAsync()).Output",
"DiagnosticsOutput": "C#|(await Session.GetOutputAsync(\"CollectDiagnosticsAction_Container\")).Output"
}
}
}
}
},
"Node": {
"Type": "Selection",
"Properties": {
"Notes": "Decision to Reboot or Evacuate is decided in UserContext.ShouldReboot()."
},
"ChildSelector": [
{
"Label": "Reboot",
"ShouldSelect": "C#|UserContext.ShouldReboot()",
"Child": "Reboot"
},
{
"Label": "Evacuate",
"Child": "Evacuate"
}
]
},
"Reboot": {
"Type": "Action",
"Actions": {
"RebootAction_Reboot": {
"Action": "RebootAction",
"RetryPolicy": {
"Type": "FixedCount",
"MaxRetryCount": 3
},
"ContinuationOnRetryExhaustion": true
}
}
},
"Evacuate": {
"Type": "Action",
"Actions": {
"EvacuateAction_Evacuate": {
"Action": "EvacuateAction",
"Timeout": 60000,
"RetryPolicy": {
"Type": "ExponentialBackoff",
"MinBackoffMs": 1000,
"MaxBackoffMs": 30000
},
"ContinuationOnTimeout": true
},
"NotifyCustomerAction_Evacuate": {
"Action": "NotifyCustomerAction",
"Properties": "Notify customer in parallel with the impact."
}
},
"Timeout": "C#|UserContext.GetTimeoutForEvacuatingAndNotifyingCustomer()"
}
}
}
{
"RootTreeNodeKey": "Root",
"Tree": {
"Root": {
Tree is a dictionary that maps unique TreeNodeKey strings to TreeNodes. In the example we see several TreeNodeKeys, such as Root, Container, and Node.
RootTreeNodeKey defines the suggested "Root" TreeNodeKey that should be visited first when walking the tree. Without this property, the callers of WalkTree would not know which TreeNodeKey to visit first since the TreeNodes are organized in a dictionary.
The default value of RootTreeNodeKey is "Root".
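As a minimal sketch of these two properties working together, the fragment below declares a one-node tree whose entry point is not named Root. The key Start is hypothetical and chosen only for illustration; whether it is actually visited first depends on the application passing RootTreeNodeKey to WalkTree.

```json
{
  "RootTreeNodeKey": "Start",
  "Tree": {
    "Start": {
      "Type": "Leaf",
      "Properties": {
        "Notes": "Hypothetical key for illustration: RootTreeNodeKey points at Start instead of the default Root."
      }
    }
  }
}
```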
"Root": {
"Type": "Selection",
"ChildSelector": [
{
"Label": "Container",
"ShouldSelect": "C#|UserContext.ResourceType == \"Container\"",
"Child": "Container"
},
{
"Label": "Node",
"ShouldSelect": "C#|UserContext.ResourceType == \"Node\"",
"Child": "Node"
}
]
},
Let's look at our first TreeNode, which is a Selection node. There are 4 TreeNodeTypes: Selection, Action, Leaf, and Subroutine. Selection type nodes have the following behavior:
- ChildSelector property is defined. Attempts to select a child node to visit next.
- Does not execute Actions.
- Does not consume the TreeNode Timeout.
- Shows up as a diamond in ForgeEditor.
ChildSelectors define the connections to child TreeNodes, as well as the conditions required to visit each child. Several TreeNodeTypes can use ChildSelectors, including: Selection, Action, and Subroutine. Let's break down each property:
- Label - Used only in ForgeEditor for visualization purposes. It is the text that hovers above each child TreeNode. We recommend using it to describe the ShouldSelect statement, which gives context for why the child is being visited.
- ShouldSelect - A string code snippet that is parsed and evaluated to a boolean value. If the expression is true, the attached Child TreeNode is visited. If the expression is empty or omitted, it evaluates to true by default. We'll dive more into "C#|" and Roslyn further down. For now, you can read the first ShouldSelect statement as follows: If the ResourceType equals Container, then visit the Container TreeNode.
- Child - The string TreeNodeKey that will be visited if the attached ShouldSelect expression evaluates to true.
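To illustrate the empty-ShouldSelect default, here is a sketch of the Root selector extended with a third, catch-all child. The Unknown label and the UnknownResource node are hypothetical additions. The sketch also assumes the tree walker evaluates ChildSelectors in the order listed and visits the first match, which is why the catch-all is placed last.

```json
{
  "Root": {
    "Type": "Selection",
    "Properties": {
      "Notes": "Illustration only: the Unknown selector and UnknownResource node are hypothetical."
    },
    "ChildSelector": [
      {
        "Label": "Container",
        "ShouldSelect": "C#|UserContext.ResourceType == \"Container\"",
        "Child": "Container"
      },
      {
        "Label": "Node",
        "ShouldSelect": "C#|UserContext.ResourceType == \"Node\"",
        "Child": "Node"
      },
      {
        "Label": "Unknown",
        "Child": "UnknownResource"
      }
    ]
  },
  "UnknownResource": {
    "Type": "Leaf"
  }
}
```

These entries would sit inside the Tree dictionary. With a catch-all in place, an unexpected ResourceType ends on an explicit Leaf rather than leaving Root with no selectable child.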
"Container": {
"Type": "Action",
"Actions": {
"CollectDiagnosticsAction_Container": {
"Action": "CollectDiagnosticsAction"
}
},
"ChildSelector": [
{
"Label": "Tardigrade",
"ShouldSelect": "C#|Session.GetLastActionResponse().Status == \"Success\"",
"Child": "Tardigrade"
}
]
},
Our next TreeNode is an Action node. Action type nodes have the following behavior:
- Executes Actions. Must contain at least one Action.
- If multiple Actions are defined, executes them all in parallel.
- Cannot execute a SubroutineAction.
- Optionally, ChildSelector can be defined. Child selection happens after executing Actions, as long as there are no unhandled exceptions/timeouts.
- Optionally, TreeNode-level Timeout can be defined. This is the timeout in milliseconds for executing all TreeActions. If the timeout is hit, a TimeoutException will be thrown and the tree walker session will be cancelled.
- Shows up as a rectangle in ForgeEditor.
Actions is a dictionary that maps unique TreeActionKey strings to TreeActions. In the example we see several TreeActionKeys, such as CollectDiagnosticsAction_Container and TardigradeAction_Tardigrade.
TreeActionKeys must be unique within each ForgeTree. This is required because Forge uses the TreeActionKey when persisting some state. The application owner could also decide to enforce globally unique TreeActionKeys. Global uniqueness allows a TreeActionKey by itself to be a strong key, instead of having to couple it with TreeNode or TreeName.
"Tardigrade": {
"Type": "Action",
"Actions": {
"TardigradeAction_Tardigrade": {
"Action": "TardigradeAction",
"Input": {
"Context": "ContainerFault",
"EnableV2": true,
"DiagnosticData": "C#|Session.GetLastActionResponse().Output"
}
}
},
"ChildSelector": [
{
"Label": "Tardigrade_Success",
"ShouldSelect": "C#|(await Session.GetOutputAsync(\"TardigradeAction_Tardigrade\")).Status == \"Success\"",
"Child": "Tardigrade_Success"
},
{
"Label": "Tardigrade_Failure",
"Child": "Tardigrade_Failure"
}
]
},
TreeAction.Action is the string name that maps to a ForgeAction. In the example we see several ActionNames, such as CollectDiagnosticsAction, TardigradeAction, and LeafNodeSummaryAction. These all map to classes that have been tagged with the ForgeActionAttribute and inherit from Forge's BaseAction class.
More details here:
The Input property becomes the ActionInput object passed to the corresponding ForgeAction. Like other dynamic properties in the ForgeTree, this can be any supported JSON type, including object, string, number, dictionary, and array. Forge tree walker will dynamically evaluate the Input property while walking the tree, instantiate the object as the specified Type, and pass it to the ForgeAction.
The recommended way to utilize ActionInput is for the ForgeAction to specify the desired InputType in the ForgeActionAttribute. This allows Forge tree walker to create the desired Type object, and fill its properties from the TreeAction.Input. This has the benefit of type safety for the ForgeAction author and ForgeTree author, since they are using the same data contract.
The non-recommended way to utilize ActionInput is for the ForgeAction to not specify any InputType, but still allow the TreeAction.Input to be used. (Note: ForgeAction authors can choose which ForgeTree properties and values are allowed by utilizing the ForgeSchemaValidationRules if the application is utilizing that feature.) In this case, Forge will create a dynamic JObject from the TreeAction.Input. This is not recommended because you lose the type safety and data contract between ForgeAction author and ForgeTree author.
A ForgeAction can also declare no InputType at all and take no Input, as we saw with CollectDiagnosticsAction.
More details here:
In this example, we see the Input for TardigradeAction with 3 properties: Context, EnableV2, and DiagnosticData. So the TardigradeInput class could be defined like this:
[ForgeAction(InputType: typeof(TardigradeInput))]
public class TardigradeAction : BaseAction { ... }

public class TardigradeInput
{
    public string Context { get; set; }
    public bool EnableV2 { get; set; }
    public DiagnosticData DiagnosticData { get; set; }
    public long PollingIntervalInMilliseconds { get; set; } = 1000;
    public string AdditionalDetails { get; set; }
}
A few things to note:
- The types line up for the 3 properties specified in TreeAction.Input: Context is given as a string, EnableV2 as a bool, and DiagnosticData as a DiagnosticData object. Unexpected properties or types will likely result in an exception. For example, an exception will be thrown if the TreeAction.Input tried to set DiagnosticData as a string, or tried to add an undefined property.
- DiagnosticData came from a Roslyn expression that gets the ActionResponse.Output object from the CollectDiagnosticsAction. Not shown in the example is the ForgeAction defining the Output object as type DiagnosticData.
- It is a common pattern in Forge to use either the results of previous ForgeActions or data from the UserContext as ActionInput.
- Not all properties of TardigradeInput were used in the TreeAction.Input. Since Forge initializes the TardigradeInput object, the default value of PollingIntervalInMilliseconds will be set without having to specify in the TreeAction.Input.
- AdditionalDetails has no default value and was not specified in the TreeAction.Input, so it will be null. The ForgeAction is expected to handle this gracefully, or the author should require the property to be specified in TreeAction.Input by utilizing the ForgeSchemaValidationRules.
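For comparison, here is a sketch of the same TreeAction with every TardigradeInput property set explicitly, so neither the class default nor a null is relied on. The value 5000 and the AdditionalDetails expression are assumptions made for illustration; they do not come from the original sample.

```json
{
  "TardigradeAction_Tardigrade": {
    "Action": "TardigradeAction",
    "Input": {
      "Context": "ContainerFault",
      "EnableV2": true,
      "DiagnosticData": "C#|Session.GetLastActionResponse().Output",
      "PollingIntervalInMilliseconds": 5000,
      "AdditionalDetails": "C#|UserContext.ResourceType"
    },
    "Properties": {
      "Notes": "Illustrative values only: 5000 and the AdditionalDetails expression are assumptions."
    }
  }
}
```

Here PollingIntervalInMilliseconds overrides the default of 1000 declared on the class, and AdditionalDetails draws on the UserContext, following the common pattern noted above.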
"Tardigrade_Failure": {
"Type": "Leaf"
},
Our next TreeNode is a Leaf node. Leaf type nodes have the following behavior:
- The only TreeNodeType that does not allow ChildSelectors.
- Optionally can execute the native LeafNodeSummaryAction. No other ForgeActions can be executed on Leaf nodes, and only one LeafNodeSummaryAction can be executed.
- Does not consume the TreeNode Timeout.
- Shows up as a circle in ForgeEditor.
Since they cannot have children, Leaf nodes are used as a clear indicator for the end of tree paths.
"Tardigrade_Success": {
"Type": "Leaf",
"Actions": {
"LeafNodeSummaryAction_Tardigrade_Success": {
"Action": "LeafNodeSummaryAction",
"Input": {
"Status": "C#|string.Format(\"{0}_{1}\", \"ContainerFaultScenario\", (await Session.GetLastActionResponseAsync()).Status)",
"StatusCode": 0,
"Output": {
"ActionOutput": "C#|(await Session.GetLastActionResponseAsync()).Output",
"DiagnosticsOutput": "C#|(await Session.GetOutputAsync(\"CollectDiagnosticsAction_Container\")).Output"
}
}
}
}
},
The LeafNodeSummaryAction is a native ForgeAction, and can be optionally executed from Leaf type TreeNodes.
This Action takes an ActionResponse as its ActionInput, either as an object or as properties, and commits this object as its ActionResponse.
This Action is intended to give schema authors the ability to cleanly end a tree walking path with a customized summary.
The Input is of type ActionResponse, which defines a string Status, int StatusCode, and object Output. In the example, we see a few different ways to set these properties.
- Status is set as a Roslyn expression which combines the previous ActionResponse.Status with a hardcoded "scenario" string.
- StatusCode is simply set to 0.
- Output in this case is a dynamic object with 2 dynamically defined properties. ActionOutput is set to the previous ActionResponse.Output. DiagnosticsOutput is set to the ActionResponse.Output of the CollectDiagnosticsAction that ran on the Container node.
LeafNodeSummaryActions are particularly useful in Subroutines, since a SubroutineAction returns the result of GetLastActionResponse as its own ActionResponse. So ending paths in a Subroutine Tree with a LeafNodeSummaryAction allows you to predictably set the ActionResponse of the calling SubroutineAction.
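The sample above sets each ActionResponse property individually. As a sketch of the other form mentioned earlier, passing the ActionResponse as a single object, a Leaf could forward the response of one specific earlier action as its summary. The node key Container_DiagnosticsOnly is hypothetical, and the single-expression Input is an assumption based on the statement that the Input can be an ActionResponse object; verify it against your Forge version.

```json
{
  "Container_DiagnosticsOnly": {
    "Type": "Leaf",
    "Properties": {
      "Notes": "Hypothetical node: forwards the CollectDiagnosticsAction response as the summary."
    },
    "Actions": {
      "LeafNodeSummaryAction_Container_DiagnosticsOnly": {
        "Action": "LeafNodeSummaryAction",
        "Input": "C#|await Session.GetOutputAsync(\"CollectDiagnosticsAction_Container\")"
      }
    }
  }
}
```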
"Node": {
"Type": "Selection",
"Properties": {
"Notes": "Decision to Reboot or Evacuate is decided in UserContext.ShouldReboot()."
},
"ChildSelector": [
{
"Label": "Reboot",
"ShouldSelect": "C#|UserContext.ShouldReboot()",
"Child": "Reboot"
},
{
"Label": "Evacuate",
"Child": "Evacuate"
}
]
},
TreeNode.Properties is a dynamic object that gets evaluated by Forge and passed to the Before/AfterVisitNode callbacks. This property allows new functionality to be seamlessly piped into the tree model and consumed by the application.
In this example, the application isn't actually consuming the object. Properties is being used to add comments to the JSON tree.
More details here:
"Reboot": {
"Type": "Action",
"Actions": {
"RebootAction_Reboot": {
"Action": "RebootAction",
"RetryPolicy": {
"Type": "FixedCount",
"MaxRetryCount": 3
},
"ContinuationOnRetryExhaustion": true
}
}
},
More details here:
Our next TreeNode is an Action type node that defines a RetryPolicy. This is Forge's built-in retry functionality, and can be added to any ForgeAction. Whenever an exception is thrown while executing a ForgeAction, Forge tree walker will attempt to retry executing the Action according to the RetryPolicy and TreeAction.Timeout.
Note that there are several non-retriable exceptions that will cause Forge tree walker to immediately halt/fail, including:
- OperationCanceledException - Thrown when cancellation token is triggered.
- ActionTimeoutException - Thrown when TreeAction.Timeout is hit.
- EvaluateDynamicPropertyException - Thrown when Forge hits an exception while evaluating schema properties. This is usually thrown when evaluating TreeAction.Input or ShouldSelect statements. It is recommended to write unit tests to verify schema properties can be successfully evaluated.
Before adding a RetryPolicy to a ForgeAction, be sure to check with the ForgeAction author how they would like RetryPolicy to be utilized. Find more details in the above link regarding Forge behavior.
Type - There are several types of retry policy available:
- None - Do not retry. This is the default value.
- FixedInterval - Retry at a fixed interval every MinBackoffMs.
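- ExponentialBackoff - Retry with an exponential backoff. Start with MinBackoffMs, then wait Math.Min(MinBackoffMs * 2^(retryCount), MaxBackoffMs). For example, with MinBackoffMs of 1000 and MaxBackoffMs of 30000, successive waits would be 1000, 2000, 4000, 8000, 16000, and then 30000 ms, assuming retryCount starts at 0.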
- FixedCount - Retry a fixed number of times based on RetryPolicy.MaxRetryCount. Wait RetryPolicy.MinBackoffMs between retries. (Note that Timeout values can be used with this retry type as well.)
MinBackoffMs - Minimum backoff time in milliseconds. When retrying an action, wait at least this long before your next attempt. Set to 0 by default.
MaxBackoffMs - Maximum backoff time in milliseconds. When retrying an action, wait at most this long before your next attempt. Set to 0 by default.
MaxRetryCount - Maximum number of attempts to execute an action before failing. This property is only used when Type is RetryPolicyType.FixedCount. Default value is 1 (action runs only once and doesn't retry).
ContinuationOnRetryExhaustion is a boolean flag that controls how to handle the exit of an action due to retry exhaustion. If false (default), then the tree walking session will halt/fail once retries are exhausted or when no retries are specified. If true and retries are exhausted, the tree walking session will commit an ActionResponse with Status as "RetryExhaustedOnAction" before continuing on as if it were successful.
Use this flag when you expect a ForgeAction could fail, and you would like to continue walking the tree.
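One way to act on that committed status is to branch on it in the node's ChildSelector. The sketch below extends the Reboot node so that exhausted retries fall through to Evacuate. The MinBackoffMs value, the Reboot_Done node, and the routing itself are hypothetical choices for illustration.

```json
{
  "Reboot": {
    "Type": "Action",
    "Properties": {
      "Notes": "Illustration only: Reboot_Done and the routing to Evacuate are hypothetical."
    },
    "Actions": {
      "RebootAction_Reboot": {
        "Action": "RebootAction",
        "RetryPolicy": {
          "Type": "FixedCount",
          "MaxRetryCount": 3,
          "MinBackoffMs": 2000
        },
        "ContinuationOnRetryExhaustion": true
      }
    },
    "ChildSelector": [
      {
        "Label": "Reboot_RetriesExhausted",
        "ShouldSelect": "C#|Session.GetLastActionResponse().Status == \"RetryExhaustedOnAction\"",
        "Child": "Evacuate"
      },
      {
        "Label": "Reboot_Done",
        "Child": "Reboot_Done"
      }
    ]
  },
  "Reboot_Done": {
    "Type": "Leaf"
  }
}
```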
"Evacuate": {
"Type": "Action",
"Timeout": "C#|UserContext.GetTimeoutForEvacuatingAndNotifyingCustomer()",
"Actions": {
"EvacuateAction_Evacuate": {
"Action": "EvacuateAction",
"Timeout": 60000,
"RetryPolicy": {
"Type": "ExponentialBackoff",
"MinBackoffMs": 1000,
"MaxBackoffMs": 30000
},
"ContinuationOnTimeout": true
},
"NotifyCustomerAction_Evacuate": {
"Action": "NotifyCustomerAction",
"Properties": "Notify customer in parallel with the impact."
}
}
}
TreeNode.Timeout is the timeout in milliseconds for executing all of the TreeActions on the TreeNode. It defaults to -1 (infinite) if not specified. A TimeoutException is thrown when this timeout is hit, causing the tree walking session to halt/fail.
This can be useful when you have several TreeActions set, but would like to enforce an uber timeout value for the entire TreeNode. Note that the "ContinuationOn*" flags do not affect the TreeNode.Timeout. Do not use TreeNode.Timeout if you wish to continue walking the tree when TreeActions get timed out.
In this example, we also see the Timeout value is set by a Roslyn expression. This shows it is possible to dynamically decide the timeout value at runtime.
TreeAction.Timeout is the timeout in milliseconds for executing a single TreeAction. It defaults to -1 (infinite) if not specified.
This is Forge's built-in timeout functionality, and can be added to any ForgeAction. When the timeout is hit before the ForgeAction returns, an ActionTimeoutException will be thrown and the tree walker session will halt/fail. That is, unless the ContinuationOnTimeout flag is set.
Timeout and RetryPolicy can both be set on a TreeAction, or either can be set by itself. It is helpful to think of them as two separate concepts. In this example, the EvacuateAction is given a 60-second timeout, retries with exponential backoff until then, and sets ContinuationOnTimeout so that the tree walk continues if the timeout is hit.
ContinuationOnTimeout is a boolean flag that controls how to handle the exit of the action due to timeout. If false (default), then the tree walking session will halt/fail on the timeout. If true and a timeout is hit, the tree walking session will commit an ActionResponse with Status as "TimeoutOnAction" before continuing on as if it were successful.
One interesting behavior to note is when Timeout is set, RetryPolicy is not set, and the Action fails before the timeout. In this case, it is considered a retry exhausted failure and not a Timeout failure. The ContinuationOnRetryExhaustion flag will be checked in this case, not the ContinuationOnTimeout flag.
Use this flag when you want to set a timeout limit on the ForgeAction execution time, and would like to continue walking the tree if the timeout is hit.
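Because a failure before the timeout counts as retry exhaustion when no RetryPolicy is set, a TreeAction that should never halt the walk needs both flags. A sketch, assuming a hypothetical 10-second limit on the notification action:

```json
{
  "NotifyCustomerAction_Evacuate": {
    "Action": "NotifyCustomerAction",
    "Timeout": 10000,
    "ContinuationOnTimeout": true,
    "ContinuationOnRetryExhaustion": true,
    "Properties": {
      "Notes": "Illustration only: the 10000 ms limit is an assumed value."
    }
  }
}
```

With this configuration, a timeout would commit "TimeoutOnAction" and an early failure would commit "RetryExhaustedOnAction"; in both cases the tree walk continues.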
"NotifyCustomerAction_Evacuate": {
"Action": "NotifyCustomerAction",
"Properties": "Notify customer in parallel with the impact."
}
TreeAction.Properties is a dynamic object that gets evaluated by Forge and passed to the ForgeAction.
This is very similar to TreeNode.Properties except they are passed to different callbacks. There are fewer use cases for TreeAction.Properties, since you will typically use TreeAction.Input to pass objects to ForgeActions.
In this example, we see Properties is set to a string. Properties can be a dynamic object, dictionary, array, string, number, bool, etc.
In this section we'll talk about authoring Subroutine Trees and how to execute them with the native "SubroutineAction" ForgeAction.
A Subroutine Tree is just a ForgeTree that gets walked by Forge instead of the application. The ForgeTree that is walked from the application is called the "RootTree," and uses that as its TreeName by default. Additional ForgeTrees are referred to as Subroutine Trees, and are called via a SubroutineAction.
The SubroutineAction is a native ForgeAction that runs a tree walking session for a Subroutine Tree. The SubroutineInput is passed to the application to help it instantiate a tree walking session for the desired Subroutine Tree.
- Multi-file support - Use Subroutines to break up large ForgeTree schema files into multiple ForgeTrees over multiple files. If a ForgeTree gets too large, it can become more difficult to view in ForgeEditor and reason about intuitively. Large JSON files are also more prone to merge conflicts. Having multiple files also allows you to update the files independently, enabling independent flighting or hotpatching.
- Compartmentalization - Compartmentalize ForgeTree schema files by author or scenario. This is especially helpful when multiple contributors across multiple teams are authoring ForgeTrees. Having separate files per author or scenario helps reduce merge conflicts, and makes for simpler trees to view and comprehend.
- Uber-Actions - Use Subroutines to create uber-Actions. For example, you want to achieve some behavior and a single TreeNode/ForgeAction isn't going to cut it. Perhaps whenever you execute a particular ForgeAction, you first want to run some prechecks, and if the Action fails, you want to execute a fallback Action. By placing this scenario into a Subroutine, it becomes easier for users to simply call the Subroutine versus calling the same TreeNode path (or worse, duplicating all the TreeNodes!). The Subroutine author also has an easier time maintaining the behavior in a single location.
- Parallelization - SubroutineActions can run in parallel with other ForgeActions, including other SubroutineActions.