diff --git a/docs/_design/architecture.md b/docs/_design/architecture.md
deleted file mode 100644
index d370fc9..0000000
--- a/docs/_design/architecture.md
+++ /dev/null
@@ -1,6 +0,0 @@
----
-title: Architecture
----
-
-This document describes the design rationale for the overall
-Metacontroller system architecture.
diff --git a/docs/_design/caching.md b/docs/_design/caching.md
deleted file mode 100644
index 00481b5..0000000
--- a/docs/_design/caching.md
+++ /dev/null
@@ -1,6 +0,0 @@
----
-title: Caching
----
-
-This document describes the design rationale for the use of shared
-informers in Metacontroller.
diff --git a/docs/_design/map-controller.md b/docs/_design/map-controller.md
new file mode 100644
index 0000000..675e84a
--- /dev/null
+++ b/docs/_design/map-controller.md
@@ -0,0 +1,448 @@
+---
+title: MapController
+---
+This is a design proposal for an API called MapController.
+
+## Background
+
+Metacontroller APIs are meant to represent common controller patterns.
+The goal of these APIs as a group is to strike a balance between being flexible
+enough to handle unforeseen use cases and providing strong enough "rails" to
+avoid pushing the hard parts onto users.
+The initial strategy is to target controller patterns that are analogous to
+proven design patterns in functional or object-oriented programming.
+
+For example, CompositeController lets you define the canonical relationship
+between some object (the parent node) and the objects that are directly under it
+in an ownership tree (child nodes).
+This is analogous to the [Composite pattern][] in that it lets you manage a
+group of child objects as if it were one object (by manipulating only the parent
+object).
+
+Similarly, DecoratorController lets you add new child nodes to a parent node
+that already has some other behavior.
+This is analogous to the [Decorator pattern][] in that it lets you dynamically
+wrap new behavior around select instances of an existing object type without
+having to create a new type.
+
+[Composite pattern]: https://en.wikipedia.org/wiki/Composite_pattern
+[Decorator pattern]: https://en.wikipedia.org/wiki/Decorator_pattern
+
+## Problem Statement
+
+The problem that MapController addresses is that neither CompositeController nor
+DecoratorController allows you to make decisions based on objects that
+aren't owned by the particular parent object being processed.
+That's because in the absence of a parent-child relationship, there are
+arbitrarily many ways you could pick what other objects you want to look at.
+
+To avoid having to send every object in a given resource (e.g. every Pod)
+on every hook invocation, there must be some way to tell Metacontroller which
+objects you need to see (that you don't own) to compute your desired state.
+Rather than try to embed various options for declaring these relationships
+(object name? label selector? field selector?) into each existing Metacontroller
+API, the goal of MapController is to provide a solution that's orthogonal to the
+existing APIs.
+
+In other words, we attempt to separate the task of looking at non-owned objects
+(MapController) from the task of defining objects that are composed of other
+objects (CompositeController) so that users can mix and match these APIs
+(and future APIs) as needed without being limited to the precise scenarios we're
+able to anticipate.
+
+## Proposed Solution
+
+MapController lets you define a collection of objects owned by a parent object,
+where each child object is generated by some mapping from a non-owned object.
+This is analogous to the general concept of a [map function][] in that it calls
+your hook for each object in some input list (of non-owned objects), and creates
+an output list (of child objects) containing the results of each call.
+
+A single `sync` pass for a MapController roughly resembles this pseudocode:
+
+```
+def sync_map_controller():
+  input_list = get_matching_objects(input_resource, input_selector)
+  output_list = list()
+
+  for input_object in input_list:
+    output_list.append(map_hook(input_object))
+
+  reconcile_objects(output_list)
+```
+
+where `map_hook()` is the only code that the MapController user writes,
+as a lambda hook.
+
+In general, MapController addresses use cases that can be described as,
+"For every matching X object that already exists, I want to create some number
+of Y objects according to the parameters stored in the parent object."
+
+[map function]: https://en.wikipedia.org/wiki/Map_(higher-order_function)
+
+## Alternatives Considered
+
+### Friend Resources
+
+Add a new type of "non-child" resource to CompositeController called
+"friend resources".
+Along with all the matching children, we would also send all matching objects of
+the friend resource types to the sync hook request.
+
+Matching would be determined with the parent's selector, just like for children.
+However, we would not require friends to have a ControllerRef pointing to the
+parent (the parent-friend relationship is non-exclusive), and the parent will
+not attempt to adopt friends.
+
+The sync hook response would not contain friends, because we don't want to force
+you to list a desired state for all your friends every time.
+This means you cannot edit or delete your friends.
+
+This approach was not chosen because:
+
+1. We have to send the entire list of matching friends as one big hook request.
+   This complicates the user's hook code because they probably need to loop over
+   each friend.
+   It's also inefficient for patterns like *"for every X (where there are a lot
+   of X's), create a Y"* since we have to sync every X if any one of them
+   changes, and we can't process any of them in parallel.
+1. It's tied in with the CompositeController API, and doing something similar
+   for other APIs like DecoratorController would require both duplicated and
+   different effort (see [Decorator Resources](#decorator-resources)).
+1. It either forces you to use the same selector to find friends as you use to
+   claim children, or it complicates the API with multiple selectors for
+   different resources, which becomes difficult to reason about.
+1. If we force the same selector to apply to both friends and children,
+   we also force you to explicitly specify a meaningful set of labels.
+   You can't use selector generation (`controller-uid: ###`) for cases when you
+   don't need orphaning and adoption; your friends won't match that selector.
+
+### Decorator Resources
+
+Add a new type of resource to DecoratorController called a decorator resource,
+which contains objects that inform the behavior of the decorator.
+This would allow controllers that look at non-owned resources as part of
+computing the desired state of their children.
+
+In particular, you could use DecoratorController to create attachments
+(extra children) on a parent object, while basing your desired state on
+information in another object (the decorator resource) that is not owned by that
+parent.
+
+This approach was not chosen because:
+
+1. It's unclear how we would "link" objects of the decorator resource to
+   particular parent objects being processed.
+   Would we apply the parent selector to find decorator objects?
+   Or apply a selector inside the decorator object to determine if it matches
+   the parent object?
+   Whatever we choose, it will likely be unintuitive and confusing for users.
+1. It's unclear what should happen if multiple decorator objects match a single
+   parent object.
+   We could send multiple decorator objects to the hook, but that just passes
+   the complexity on to the user.
+1. It's unclear whether decorator objects are expected to take part in ownership
+   of the objects created.
+   Depending on the use case, users might want attachments to be owned by just
+   the parent, just the decorator, or both.
+   This configuration adds to the cognitive overhead of using the API,
+   and there's no one default that's more intuitive than the others.
+
+## Example
+
+The example use case we'll consider in this doc is a controller called
+SnapshotSchedule that creates periodic backups of PVCs with the VolumeSnapshot
+API.
+Notice that it's natural to express this in the form we defined above:
+"For every matching PVC, I want to create some VolumeSnapshot objects."
+
+CompositeController doesn't fit this use case because the PVCs are created and
+potentially owned by something other than the SnapshotSchedule object.
+For example, the PVCs might have been created by a StatefulSet.
+Instead of creating PVCs, we want to look at all the PVCs that already exist and
+take action on certain ones.
+
+DecoratorController doesn't fit this use case because it doesn't make sense for
+the VolumeSnapshots we create to be owned by the PVC from which the snapshot was
+taken.
+The lifecycle of a VolumeSnapshot has to be separate from the PVC because the
+whole point is that you should be able to recover the data if the PVC goes away.
+Since the PVC doesn't own the VolumeSnapshots, it doesn't make sense to think of
+the snapshots as a decoration on PVC (an additional feature of the PVC API).
+
+An instance of SnapshotSchedule might look like this:
+
+```yaml
+apiVersion: snapshot.k8s.io/v1
+kind: SnapshotSchedule
+metadata:
+  name: my-app-snapshots
+spec:
+  snapshotInterval: 6h
+  snapshotTTL: 10d
+  selector:
+    matchLabels:
+      app: my-app
+```
+
+It contains a selector that determines which PVCs this schedule applies to,
+and some parameters that determine how often to take snapshots, as well as when
+to retire old snapshots.
+
+## API
+
+Below is a sample MapController spec that could be used to implement the
+SnapshotSchedule controller:
+
+```yaml
+apiVersion: metacontroller.k8s.io/v1alpha1
+kind: MapController
+metadata:
+  name: snapshotschedule-controller
+spec:
+  parentResource:
+    apiVersion: snapshot.k8s.io/v1
+    resource: snapshotschedules
+  inputResources:
+  - apiVersion: v1
+    resource: persistentvolumeclaims
+  outputResources:
+  - apiVersion: volumesnapshot.external-storage.k8s.io/v1
+    resource: volumesnapshots
+  resyncPeriodSeconds: 5
+  hooks:
+    map:
+      webhook:
+        url: http://snapshotschedule-controller.metacontroller/map
+    tombstone:
+      webhook:
+        url: http://snapshotschedule-controller.metacontroller/tombstone
+```
+
+### Parent Resource
+
+The parent resource is the SnapshotSchedule itself, and anything this controller
+creates will be owned by this parent.
+The schedule thus acts like a bucket containing snapshots: if you delete the
+schedule, the snapshots inside it will go away too, unless you specify to orphan
+them as part of the delete operation (e.g. with `--cascade=false` when using
+`kubectl delete`).
+Notably, this ties the lifecycles of snapshots to the reason they exist
+(the backup policy that the user defined), rather than tying them to the entity
+that they are about (the PVC).
+
+### Input Resources
+
+The input resources (in this case just PVC) are the inputs to the conceptual
+"map" function.
+We allow multiple input resources because users might want to write a controller
+that performs the same action for several different input types.
+We shouldn't force them to create multiple MapControllers with largely identical
+behavior.
+
+The duck-typed `spec.selector` field (assumed to be `metav1.LabelSelector`) in
+the parent object is used to filter which input objects to process.
+If the selector is empty, we will process all objects of the input types in the
+same namespace as the parent.
+
+We will also ignore input objects whose controllerRef points to the particular
+parent object being processed.
+That would imply that the same resource (e.g. ConfigMap) is listed as both an
+input and an output in a given MapController spec.
+This allows use cases such as generating ConfigMaps from other ConfigMaps by
+doing some transformation on the data, while protecting against accidental
+recursion if the label selector is chosen poorly.
+
+If there are multiple input resources, they are processed independently, with no
+attempt to correlate them.
+That is, the [map hook][] will still be called with only a single input object
+each time, although the kind of that object might be different from one call to
+the next.
+
+[map hook]: #map-hook
+
+### Output Resources
+
+The output resources (in this case just VolumeSnapshot) are the types of objects
+that the user intends to create and hold in the conceptual "bucket" that the
+parent object represents.
+We allow multiple output resources because users might think of their controller
+as spitting out a few different things.
+We shouldn't force them to create a CompositeController too just so they can
+emit multiple outputs, especially if those outputs are not conceptually part of
+one larger whole.
+
+For a given input object, the user can generate any number of output objects.
+We will tag those output objects in some way to associate them with
+the object that we sent as input.
+The tag makes it possible to group those objects and send them along with future
+[map hook requests](#map-hook-request).
+
+In pseudocode, a `sync` pass could be thought of like the following:
+
+```go
+// Get all matching objects from all input resources.
+inputObjects := []Object{}
+for _, inputResource := range inputResources {
+  inputObjects = append(inputObjects, getMatchingObjects(inputResource, parentSelector)...)
+}
+// Call the map hook for each input object.
+for _, inputObject := range inputObjects {
+  // Compute some opaque string identifying this input object.
+  mapKey := makeMapKey(inputObject)
+
+  // Gather observed objects of the output resources that are tagged with this key.
+  observedOutputs := []Object{}
+  for _, outputResource := range outputResources {
+    // Gather all outputs owned by this parent.
+    allOutputs := getOwnedObjects(outputResource, parent)
+    // Filter to only those tagged for this input.
+    observedOutputs = append(observedOutputs, filterByMapKey(allOutputs, mapKey)...)
+  }
+
+  // Call user's map hook, passing observed state.
+  mapResult := mapHook(parent, inputObject, observedOutputs)
+  for _, obj := range mapResult.Outputs {
+    // Tag outputs to identify which input they came from.
+    setMapKey(obj, mapKey)
+  }
+  // Manage child objects by reconciling observed and desired outputs.
+  manageChildren(observedOutputs, mapResult.Outputs)
+}
+```
+
+### Detached Outputs
+
+If an input object disappears, we may find that the parent owns one or more
+output objects that are tagged as having been generated from an input object
+that no longer exists.
+Note that this does not mean these objects have been orphaned, in the sense of
+having no ownerRef/controllerRef; the controllerRef will still point to the
+parent object.
+It's only our MapController-specific "tag" that has become a broken link.
+
+By default, we will delete any such *detached outputs* so that controller
+authors don't have to think about them.
+However, the SnapshotSchedule example shows that sometimes it will be important
+to give users control over what happens to these objects.
+In that example, the user would want to keep detached VolumeSnapshots since they
+might be needed to restore the now-missing PVC.
+
+We could offer a declarative knob to either always delete detached outputs,
+or always keep them, but that would be awkwardly restrictive.
+The controller author would have fine-grained control over the lifecycle of
+"live" outputs, but would suddenly lose that control when the outputs become
+detached.
+
+Instead, we propose to define an optional [tombstone hook][] that sends
+information about a particular group of detached outputs (belonging to a
+particular input object that is now gone), and asks the user to decide which
+ones to keep.
+For example, SnapshotSchedule would likely want to keep detached VolumeSnapshots
+around until the usual expiry timeout.
+
+For now, we will not allow the hook to edit detached outputs because we don't
+want to commit to sending the body of the missing input object, since it
+may not be available.
+Without that input object, the hook author presumably wouldn't have enough
+information to decide on an updated desired state anyway.
+We can reexamine this if users come up with compelling use cases.
+
+[tombstone hook]: #tombstone-hook
+
+### Status Aggregation
+
+One notable omission from the map hook, as compared with the sync hook from
+CompositeController, is that the user does not return any status object.
+That's because each map hook invocation only sends enough context to process a
+single input object and its associated output objects.
+The hook author therefore doesn't have enough information to compute the overall
+status of the parent object.
+
+We could define another hook to which we send all inputs and outputs for a given
+parent, and ask the user to return the overall status.
+However, that would defeat one of the main goals of MapController because such a
+monolithic hook request could get quite large for the type of use cases we
+expect for a controller that says, "do this for every X," and also because that
+would place the burden of aggregating status across the whole collection onto
+the user.
+
+Instead, Metacontroller will compute an aggregated status for the collection
+based on some generic rules:
+
+For each input resource, we will report the number of matching objects we
+observed as a status field on the parent object, named after the plural
+resource name.
+
+The exact format will be an implementation detail, but for example it might
+look like:
+
+```yaml
+status:
+  inputs:
+    persistentvolumeclaims:
+      total: 20
+  ...
+```
+
+For each output resource, we will report the total number of objects owned by
+this parent across all map keys.
+In addition, we will automatically aggregate conditions found on output objects,
+and report how many objects we own with that condition set to `True`.
+
+For example:
+
+```yaml
+status:
+  ...
+  outputs:
+    volumesnapshots:
+      total: 100
+      ready: 97
+  ...
+```
+
+## Hooks
+
+### Map Hook
+
+We call the map hook to translate an input object into zero or more output
+objects.
+
+#### Map Hook Request
+
+| Field | Description |
+| ----- | ----------- |
+| `controller` | The whole MapController object, like what you might get from `kubectl get mapcontroller -o json`. |
+| `parent` | The parent object, like what you might get from `kubectl get -o json`. |
+| `mapKey` | An opaque string that uniquely identifies the group of outputs that belong to this input object. |
+| `input` | The input object, like what you might get from `kubectl get -o json`. |
+| `outputs` | An associative array of output objects that the parent already created for the given input object. |
+
+#### Map Hook Response
+
+| Field | Description |
+| ----- | ----------- |
+| `outputs` | A list of JSON objects representing all the desired outputs for the given input object. |
+
+### Tombstone Hook
+
+We call the tombstone hook, if defined, to ask whether we should keep any of a
+group of output objects whose corresponding input object is gone.
+If no tombstone hook is defined, we will always delete any such detached outputs
+as soon as the input object disappears.
+
+#### Tombstone Hook Request
+
+| Field | Description |
+| ----- | ----------- |
+| `controller` | The whole MapController object, like what you might get from `kubectl get mapcontroller -o json`. |
+| `parent` | The parent object, like what you might get from `kubectl get -o json`. |
+| `mapKey` | An opaque string that uniquely identifies the group of outputs that belong to this input object. |
+| `outputs` | An associative array of output objects that the parent already created for the given input object. |
+
+#### Tombstone Hook Response
+
+| Field | Description |
+| ----- | ----------- |
+| `outputs` | A list of output objects to keep, even though the associated input object is gone. All other outputs belonging to this input will be deleted. |
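
To make the hook request and response tables above more concrete, here is a minimal sketch of what the two SnapshotSchedule hooks might look like, written as plain Python functions over decoded JSON. It is an illustration under several assumptions, not part of the proposal: `outputs` is assumed to be a flat mapping from object name to object (the proposal only says "associative array"), the helper names (`parse_duration`, `created_at`, `unexpired`, `new_snapshot`) are hypothetical, durations are assumed to be a number followed by `s`, `m`, `h`, or `d`, and the `spec.persistentVolumeClaimName` field on the VolumeSnapshot is assumed rather than taken from this document. The current time is passed in explicitly so the logic stays deterministic.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical duration format: an integer followed by one unit letter.
UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Turn a string such as '6h' or '10d' into a timedelta."""
    return timedelta(**{UNITS[text[-1]]: int(text[:-1])})


def created_at(obj):
    """Read metadata.creationTimestamp (RFC 3339, UTC) as a datetime."""
    stamp = obj["metadata"]["creationTimestamp"]
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def unexpired(request, now):
    """Return observed outputs that are younger than the parent's snapshotTTL."""
    ttl = parse_duration(request["parent"]["spec"]["snapshotTTL"])
    # Assumption: outputs is a flat {name: object} mapping.
    return [o for o in request["outputs"].values() if now - created_at(o) < ttl]


def new_snapshot(request, now):
    """Build a desired VolumeSnapshot for the input PVC (field names assumed)."""
    pvc_name = request["input"]["metadata"]["name"]
    return {
        "apiVersion": "volumesnapshot.external-storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": f"{pvc_name}-{now:%Y%m%d%H%M%S}"},
        "spec": {"persistentVolumeClaimName": pvc_name},
    }


def map_hook(request, now):
    """Keep unexpired snapshots; add one if the newest is older than the interval."""
    interval = parse_duration(request["parent"]["spec"]["snapshotInterval"])
    desired = unexpired(request, now)
    newest = max((created_at(o) for o in desired), default=None)
    if newest is None or now - newest >= interval:
        desired.append(new_snapshot(request, now))
    return {"outputs": desired}


def tombstone_hook(request, now):
    """The input PVC is gone: keep detached snapshots until their usual expiry."""
    return {"outputs": unexpired(request, now)}
```

Both functions share `unexpired()`, which mirrors the intent stated in the Detached Outputs section: detached VolumeSnapshots follow the same expiry rule as live ones, and the tombstone hook simply never creates anything new.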