Skip to content
This repository has been archived by the owner on Jul 22, 2022. It is now read-only.

MapController Design #59

Merged
merged 1 commit into from
Jul 9, 2018
Merged

Conversation

enisoc
Copy link

@enisoc enisoc commented Jun 15, 2018

This is a proposal for a new Metacontroller API that attempts to fill a gap in the set of supported controller patterns that has been identified based on user feedback. Namely, the proposed API offers a way to observe objects you don't own and tie them into your custom behavior without trying to adopt (take ownership of) those objects.

as a lambda hook.

In general, MapController addresses use cases that can be described as,
"For every matching X that already exists, I want to create a Y according to the
Copy link
Contributor

@crimsonfaith91 crimsonfaith91 Jun 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, does the parent object = X = non-owned object, and Y = child object? If so, perhaps ... I want to create a list of Ys according ... is clearer (I think it should be one-to-many mapping).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've updated the wording to clarify there can be any number of Y objects for each X. Thanks!

creates will by owned by this parent.
The schedule thus acts like a bucket containing snapshots: if you delete the
schedule, the snapshots inside it will go away too, unless you use
`--cascade=false` to orphan them.
Copy link
Member

@janetkuo janetkuo Jun 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this flag is specific to kubectl, but not everyone is using kubectl

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded to be more general.

one larger whole.

For a given input object, the user can generate any number of output objects.
We will [tag](#map-key) those output objects in some way to associate them with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link doesn't work

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! Removed the link for now. I was originally going to write a section about how to actually do the tagging (label vs annotation? name vs uid? etc.), but then I decided that's more of an implementation detail that isn't needed to decide whether MapController is the right abstraction for the set of problems we're targeting.

We will [tag](#map-key) those output objects in some way to associate them with
the object that we sent as input.
The tag makes it possible to group those objects and send them along with future
[map hook requests](#map-hook-request).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand the MxN input/output case. IIUC, we tag each output project on which input object they'll be generated from?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the input side, the fact that there can be multiple input resources is only used for selecting the pool of input objects. In pseudocode, something like this:

inputObjects := []Object{}
for _, inputResource := range inputResources {
  inputObjects = append(inputObjects, getMatchingObjects(inputResource, parentSelector)...)
}

Then we go through each of the input objects and call the user's hook to generate the output objects for that one input. We tag each output with the input that it came from, which also allows us to select the observed outputs belonging to this input. Continuing in pseudocode:

outputObjects := []Object{}
for _, inputObject := range inputObjects {
  // Compute some opaque string identifying this input object.
  mapKey := makeMapKey(inputObject)

  // Gather observed objects of the output resources that are tagged with this key.
  observedOutputs := []Object{}
  for _, outputResource := range outputResources {
    // Gather all outputs owned by this parent.
    allOutputs := getOwnedObjects(outputResource, parent)
    // Filter to only those tagged for this input.
    observedOutputs = append(observedOutputs, filterByMapKey(allOutputs, mapKey)...)
  }

  // Call user's map hook, passing observed state.
  mapResult := mapHook(parent, inputObject, observedOutputs)
  for _, obj := range mapResult.Outputs {
    // Tag outputs to identify which input they came from.
    setMapKey(obj, mapKey)
    outputObjects = append(outputObjects, obj)
  }
}
reconcileObjects(outputObjects)

Does that help? If so, I can add this to the doc somewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this helps.

[map hook requests](#map-hook-request).

If the associated input object is gone, we will detect that there is a group of
output objects that have become orphaned.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if I understand this part. Who orphaned the output objects?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some clarification that this doesn't refer to orphaning in the ownerRef/controllerRef sense. I wish there was another word I could use that doesn't have a collision with another common API machinery term, but I can't think of one yet. Any ideas?

Copy link
Contributor

@crimsonfaith91 crimsonfaith91 Jun 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will disconnected or disassociated be a good word? Perhaps can set up a new notion of connection or association (I found the later wording is used frequently in the doc).

Copy link
Contributor

@crimsonfaith91 crimsonfaith91 Jun 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another suggestion is the notion of decay, which may have a more vivid tone. If an input object is removed, the output objects that are not deleted are said to be decaying (tends to be deleted in future).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I did confuse "orphaning" with ownerRef orphaning. Because parent/child metaphor is used for parent & outputs, the use of the term "orphan" could cause confusion. The clarification helps.

Proposing some more alternative terms: "lost", "untraceable".

docs/_design/map-controller.md Show resolved Hide resolved
tombstone:
webhook:
url: http://snapshotschedule-controller.metacontroller/tombstone
```
Copy link
Member

@janetkuo janetkuo Jun 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, parent owns outputs (label/selector + ownerRef), and selects inputs (label/selector only)?

Can a user make map controller enter crazy creation loop by having the same resource in input and output?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point!

Since we put ownerRef on the outputs like you mentioned, we could ignore our own outputs when looking at inputs. Do you think that's enough to prevent the most likely class of mistakes?

Can you think of any other invariants we can enforce -- things that we're pretty sure the user would never actually want?

If possible, I'd rather not go as far as saying you can't have an output resource that's the same type as your input. I can imagine some legitimate uses for that, as long as it's not recursive (which we can help prevent with ownerRefs). We'll never prevent all possible hot loops (you can set up multiple controllers, for example) so I just see our responsibility as providing reasonable guardrails against common mistakes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any example of use cases involving same type of input and output resources? I think it will be a very infrequent use case for MapController. IIUC, both input and output resources will be different even when they are of same type (because the output resources are new resources created by the MapController). Hence, I think this use case may be better served by other meta-controllers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As synced by person, there are some use cases involving same type of input and output resources like:

  1. pod as input, another pod as output
  2. service as input, companion service as output
  3. ConfigMap with data A as input, ConfigMap with data B as output

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we put ownerRef on the outputs like you mentioned, we could ignore our own outputs when looking at inputs.

This stops resources from using their own children as inputs, which seems fine.

We'll never prevent all possible hot loops (you can set up multiple controllers, for example) so I just see our responsibility as providing reasonable guardrails against common mistakes.

Agree. Adding a note/caveat is probably enough.


## Example

The example use case we'll consider in this doc is a controller called
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has the potential to be integrated to stateful operators that we are working on. :)

### Parent Resource

The parent resource is the SnapshotSchedule itself, and anything this controller
creates will by owned by this parent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: will be owned

| ----- | ----------- |
| `outputs` | A list of JSON objects representing all the desired outputs for the given input object. |

### Tombstone Hook
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think tombstone hook is a more flexible design (e.g., user wants to retain type A output object, but not other types of output objects), but wish to verify whether its role can be technically achieved by existing propagation policy and finalizer feature of garbage collection.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean, could we use finalizers as an alternative to the tombstone hook? If so, I think the problem is that the objects would go into a "pending deletion" state, which cannot be undone if the controller later decides to keep the object (e.g. a matching input PVC gets recreated, so the SnapshotSchedule controller decides to keep the "detached" VolumeSnapshot output objects after all).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. Tombstone hook is necessary because "pending deletion" state cannot be undone.

@crimsonfaith91
Copy link
Contributor

/lgtm
The design is very nicely done. :)

@barney-s
Copy link

/lgtm

@janetkuo
Copy link
Member

janetkuo commented Jul 3, 2018

I replied to some of the comments above. Looks good to me in general.

Copy link
Author

@enisoc enisoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've incorporated the results of our discussions into the doc.

| ----- | ----------- |
| `outputs` | A list of JSON objects representing all the desired outputs for the given input object. |

### Tombstone Hook
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean, could we use finalizers as an alternative to the tombstone hook? If so, I think the problem is that the objects would go into a "pending deletion" state, which cannot be undone if the controller later decides to keep the object (e.g. a matching input PVC gets recreated, so the SnapshotSchedule controller decides to keep the "detached" VolumeSnapshot output objects after all).

@enisoc enisoc merged commit 5948d63 into GoogleCloudPlatform:master Jul 9, 2018
@enisoc enisoc deleted the map-design branch July 9, 2018 22:25
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants