Package Conditions: determining when a package is ready for deployment #3455
Comments
Here is a more drawn-out description of how Conditions enable collaborative editing of the config, and decouple it from rollout: We have the concept of Proposed and Published packages against deployment repositories; this is the distinction between "ready for rollout" and "the intended state". With the concept of "Conditions" and "Condition Gates" for PackageRevisions, we can have multiple actors working on a package to converge on a "ready for rollout" state. The rollout controller ignores all packages that have not yet reached that state. For example:
As an alternative to the "Proposed" state, we could use another Condition, ReadyToApprove. But the Proposed state plays nicely with the human-based workflow.
kpt API
We want to expose the criteria for when a package is ready to be used as part of the package itself. This is useful for packages in general, so we expose it through the Kptfile rather than implementing it at the porch level. The Kptfile already contains an
The structure is chosen to align with the PodReadinessGate type already used in Kubernetes. Conditions for packages are handled just like regular conditions in Kubernetes, although since they live in the Kptfile it is possible that they will be set by users manually through an editor rather than by controllers. Even so, conditions being set by controllers through the porch API is probably the most likely use case here too. Both the ReadinessGates and the Conditions can be updated by actors (human or controller) working to make the package ready. For example:
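As an illustration, a readiness gate and its matching conditions might look roughly like the sketch below. The fields follow the PodReadinessGate/Condition pattern described above, but the exact schema and the condition types (config.injected, ip-allocated) are assumptions for illustration only.

```yaml
apiVersion: kpt.dev/v1
kind: Kptfile
metadata:
  name: my-package            # hypothetical package name
info:
  # Conditions that must be True before the package is considered ready.
  readinessGates:
    - conditionType: config.injected
    - conditionType: ip-allocated
status:
  # Conditions set by the actors (humans or controllers) working on the package.
  conditions:
    - type: config.injected
      status: "True"
      reason: ConfigInjected
      message: Cluster-specific values have been injected
    - type: ip-allocated
      status: "False"
      reason: PendingAllocation
      message: Waiting for an IP address to be allocated
```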
Porch API
To make it easy for controllers to interact with packages and automate updates to them, we will expose the readiness gate through the porch packagerevision API. The readiness gate will be exposed as a new field in the Spec of the PackageRevision resource:
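A minimal sketch of what that spec field could look like is below; the API group/version, names, and condition type are assumptions for illustration, not the confirmed API.

```yaml
apiVersion: porch.kpt.dev/v1alpha1   # assumed group/version for illustration
kind: PackageRevision
metadata:
  name: blueprints-my-package-v1     # hypothetical name
spec:
  packageName: my-package
  repository: blueprints
  # The new readiness gate field proposed here.
  readinessGates:
    - conditionType: config.injected
```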
This mirrors the field in the Kptfile. Conditions will be added to the status object of the PackageRevisions resource using the standard Condition type from Kubernetes. The conditions will be stored in the status object of the Kptfile. It feels a bit unusual to have Conditions on a client-side CR like the Kptfile, so I would like to hear what others think about it. Storing the readiness gate and the conditions in the Kptfile also has the benefit that the git history will leave an audit log.

Controller
We will provide an operator as part of porch that will watch packagerevisions and automatically use the porch API to propose them when the conditions listed in the readiness gates become True. The absence of any conditions listed in the readiness gates means the package is ready to be published, so the controller will immediately propose the revision. This is not desirable, at least initially while we figure out the best workflow. Therefore, at least in the initial iteration of the package, users will need to use the

Usage
The idea is that the readiness gates define the conditions that must be met for a package to be deployed. If we consider that a package might go through multiple levels of cloning and customization before being deployed, it will be common for abstract packages to be published with all or a subset of the conditions in the readiness gate still unresolved. It is also useful for cloned packages to add additional conditions as a result of either special requirements or customizations that lead to additional (or fewer) required conditions (see the sketch below).
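To make the Usage flow concrete, here is a hedged sketch (condition types are made up) of an abstract package published with an unresolved gate, and a cloned deployment package that adds a deployment-specific gate; once every gated condition is True, the proposed controller would move the revision to Proposed.

```yaml
# Abstract (blueprint) package: published even though its gate is unresolved.
info:
  readinessGates:
    - conditionType: config.injected
status:
  conditions:
    - type: config.injected
      status: "False"
---
# Cloned deployment package: inherits the gate and adds one during customization.
info:
  readinessGates:
    - conditionType: config.injected
    - conditionType: ip-allocated
status:
  conditions:
    - type: config.injected
      status: "True"
    - type: ip-allocated
      status: "True"    # with all gated conditions True, the controller proposes the revision
```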
I like the concepts here. I don't think we have certainty on all of this yet. We can take some first steps though: we have a well-established notion of validation functions; let's surface their results on the PackageRevision status. I like the idea of using Conditions, and if we can duck-type to metav1.Condition then this should feel very natural to users. We'll likely also want to surface more information - e.g. validation functions can surface structured error/info results, hence duck-typing rather than reuse of metav1.Conditions. I think from there we can add in human validations and address the broader use cases in this issue. I don't yet know how we should store those; I think it might depend on whether we end up moving PackageRevision to CRDs.
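As a rough illustration of the duck-typing idea (the field names here are hypothetical, not a settled API), a validation-function result surfaced on PackageRevision status could keep the metav1.Condition shape while carrying extra structured detail:

```yaml
status:
  conditions:
    # Same fields as metav1.Condition...
    - type: validation.kpt.dev/kubeval    # hypothetical condition type
      status: "False"
      reason: ValidationFailed
      message: 2 resources failed schema validation
      # ...plus structured results that metav1.Condition cannot carry,
      # which is why duck-typing rather than direct reuse is attractive.
      results:
        - severity: error
          resourceRef: {kind: Deployment, name: my-app}
          field: spec.replicas
```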
We do need something beyond validation function status being surfaced sooner rather than later. I would like to be able to demonstrate in the Nephio workflow the use of package conditions to gate a package's publishing, where that condition is resolved by a controller. In particular, I am thinking of an IPAM controller which looks at the type of workload, the region, and the type of interface, and uses that to allocate an IP and insert it in the configuration. This is an important integration point into existing systems. This requires some way for controllers to set the status (there are ways we could use a function to manage the condition true/false status based on resource content in the package, but I am not sure that will always be possible). Another possible use is coordinating cross-cluster dependencies, though I am not sure this is the right mechanism for that. An example is the one @sandeepAarna showed in the 9/21 Nephio SIG Automation call: one package provisions a cluster, another package provisions a workload on that cluster. The second package needs to wait for the cluster to achieve Ready state. Actually, as I write this down I do NOT think package conditions are the right place for that particular coordination. I'll have to think about how we can do that (see also #3448). Anyway, I think storing them in the status of a CR in the cluster works for porch, but storing them in a CR stored in the package works for both the CLI-only workflow and the Porch (UI or CLI) workflow. Do we think we need to support the CLI-only workflow? It does cause additional complexity.
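To make the IPAM example above concrete, the sketch below shows a condition such a controller might manage on a packagerevision. The condition type, reason, and values are hypothetical.

```yaml
# Before the IPAM controller has run: the gate blocks proposing the package.
status:
  conditions:
    - type: ip-allocated        # hypothetical condition type owned by the IPAM controller
      status: "False"
      reason: PendingAllocation
      message: Waiting for an address for interface n3
# After the controller allocates an address and mutates the package config,
# it flips the same condition:
#   - type: ip-allocated
#     status: "True"
#     reason: AddressAllocated
#     message: Allocated 10.0.0.12/32 for interface n3
```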
Context
In an in-person discussion around package conditions, the question came up about how package conditions should be handled when a new package revision is created from a previous one. Should all/any of the package conditions be removed, or set to

There are two properties involved here: the readinessGates and the conditions. Since the new packagerevision starts out as identical to the existing one (the revision is different of course, but I don't think that should matter here), it seems like we should keep both the readinessGates and the conditions intact for the next revision (if all those conditions were true for the previous revision, they should also be true for this one since it is identical). It feels like the problem is that we need some way to give controllers and users time to make changes to a newly created packagerevision (regardless of whether it is created through

Alternatives
We could introduce some kind of delay between when a packagerevision is created and when it is proposed by the automation, but it is unclear what a reasonable delay would be. Overall, this approach seems a bit fragile. We could also automatically add an additional readinessGate for new packagerevisions (it would then be cleared on clone/copy), which would prevent a newly created packagerevision from being proposed until other actors have had a chance to update it (for example by adding additional readinessGates). But this feels like we have only moved the problem, as we then have to decide who/what is responsible for clearing that readinessGate, and how would that controller/user know when it should happen?

Next steps
I think we need to dig a bit deeper into the Nephio use-case here to see how this should work.
This was fixed with #3614
Describe your problem
One issue that has come up for both the human and machine (#3452) consumption workflows is how to know when a package is ready for deployment. That is, knowing when all the required and/or expected customizations are complete; knowing when the package has moved from an abstract to a deployable package. On the human consumption side, this relates to providing a guided experience in the GUI or CLI (@droot did some experimentation with a CLI workflow). On the machine side, we want to automatically move a package through the life cycle. One of the goals is to enable humans and machines to do iterative, collaborative editing of the package configuration. If multiple actors are operating on the config, how do we know when we can propose the package for approval into the deployment repo? How do we know when a package is ready for the next step in the life cycle (#3422)?

Proposal
We face a similar issue in standard K8s controllers, and the solution has been Conditions. For example, we control whether a Pod is added to an EndpointSlice based upon its Ready condition. Can we allow packages to publish conditions in a similar way? They could be surfaced via the kpt CLI and via the PackageRevision CR in Porch. This can provide a "punch list" for what needs to be addressed in order for a package to be proposed for deployment.

Packages would need to be able to define and publish their own Conditions. For example, if we need to allocate a value from an external source, we can use a Condition to represent whether that has been done or not. It could be done manually or by a machine; either way should automatically flip the condition to "True". We then need some conventions around determining "readiness" for different parts of the life cycle; how do we aggregate a set of potentially package-specific Conditions into the one Condition that indicates readiness for the next phase? One solution would be a convention for a condition for each step in the life cycle (with flexibility to alter that life cycle), along with something like readiness gates; a sketch of how that could look follows below.
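Purely as an illustration of that convention (the condition and gate names below are hypothetical, not a proposed API), package-specific conditions could roll up into one per-step condition that the readiness gate points at:

```yaml
info:
  readinessGates:
    - conditionType: ReadyForProposal   # aggregate condition for the Draft -> Proposed step
status:
  conditions:
    - type: external-value-allocated    # package-specific condition
      status: "True"
    - type: config.injected             # package-specific condition
      status: "True"
    - type: ReadyForProposal            # aggregated from the package-specific conditions
      status: "True"
```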
Using this for rollouts
In #3348 there is a short discussion about using pinned syncs to control rollout. IIUC, that means controlling rollout at the ConfigSync step in the life cycle. This is too late in my opinion; at least it is when we have a 1:1 mapping between deployment repo and cluster. It means I can no longer just inspect the contents of the deployment repository to know what's being delivered in my cluster, but instead I have to look at the various syncs installed in the api server. Those may map to different tags and even different repos all over the place. This degrades the simplicity of the model.
Another approach would be to use status conditions on PackageRevisions. In this model, we can still have a simple 1:1 mapping between a deployment repository and a given cluster's API server. In the case of Porch, an additional condition controlled by external data provides a simple way to do rollouts that is decoupled from the config editing process. To do this, a controller can look at all "proposed" packages in a deployment repo matching some particular label selector, and can twiddle a status Condition (a rough sketch of this is at the end of this section). A different controller can then approve packages with the appropriate conditions, and the package merges. This decouples:
The pinned syncs may be the right approach if the cardinality of repo:cluster is 1:N rather than 1:1, but I think a 1:1 cardinality is simpler to understand and lends itself to this approach instead.
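A hedged sketch of that rollout gate follows; the API group/version, label, and condition type are invented for illustration.

```yaml
apiVersion: porch.kpt.dev/v1alpha1        # assumed group/version for illustration
kind: PackageRevision
metadata:
  name: deployment-repo-my-app-v3         # hypothetical name
  labels:
    rollout-wave: wave-2                  # hypothetical selector used by the rollout controller
spec:
  lifecycle: Proposed
status:
  conditions:
    - type: rollout-approved              # hypothetical condition driven by external data
      status: "False"
      reason: OutsideMaintenanceWindow
# A rollout controller flips rollout-approved to True based on external data
# (for example, a schedule); a separate approval controller then approves the
# revision, and the package merges into the deployment repo.
```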