Skip to content
This repository has been archived by the owner on Jul 15, 2024. It is now read-only.

Add duck type generator support to ApplicationSets #231

Merged
merged 18 commits into from
Jun 10, 2021

Conversation

jnpacker
Copy link
Contributor

  • Initial Commit
  • Adds a Duck Type Generator to ApplicationSets that allows the controller to read an external Resource Status.Decisions (array) to determine the list of Argo CD clusters to deploy to.
  • This approach uses the duck type Reference, that allows the user to define what resource will provide placement recommendations for target Argo CD clusters
  • Duck Type Status definition:
status:
  decisions:
  - clusterName: my-cluster01
  - clusterName: my-cluster02
  • Duck Type Reference:
spec:
  ducktType:
    apiVersion: my-group/v1beta1
    kind: MyKind
    name: my-resource-name
    requeueAfterSeconds: 60         # This is optional, default is 3 min

@CLAassistant
Copy link

CLAassistant commented May 13, 2021

CLA assistant check
All committers have signed the CLA.

@jnpacker
Copy link
Contributor Author

@jgwest jgwest self-requested a review May 13, 2021 17:42
@OmerKahani
Copy link
Contributor

Thanks for this PR, is it a general purpose generator or a specific for Open Cluster Management? both are welcome, and will be very usfaul! but we might want to clarify it in the docs

Copy link
Contributor

@OmerKahani OmerKahani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg/generators/duck_type.go Show resolved Hide resolved
pkg/generators/duck_type.go Outdated Show resolved Hide resolved
@jgwest
Copy link
Member

jgwest commented May 17, 2021

Thanks @jnpacker, I really like the idea of a generator based on using duck-types to provide values to ApplicationSet instances 🎉.

One concern I have, which it sounds like is echoed by @OmerKahani as well, is it may be too constrained to one specific 'shape' (eg that of OCM's placement rule).

If we wanted to implement a generator that was a fully generic duck-type based generator instance (that could be attached to ApplicationSets in order provide generic values) I would think we would want to change the expected shape...

For example, changing the expected status field from:

status:
  decisions:                # Duck Type that is expected on the referenced resource
  - clusterName: cluster-01

to:

status:
  parameters: 
    # parameters, rather than decisions, and untyped.
    # A generic []map[string]string, eg array of key-values 
    - name: cluster-01
      server: https://kubernetes.default.svc # note that requires the duck type's controller to read argo cd secrets to find the server url

and this same shape would also match:

status:
  parameters:
    - someOtherKey: some other value
      anotherKey: another value
    - someOtherKey2: some other value
      anotherKey2: another value

(because parameters is just an array of generic key-value pairs, not typed to any particular schema)

In this model, a CR can contribute any set of key-value parameters that they want, which would be useful for replacing not only the built-in cluster generator, but also for replacing the other generators as well. (Granted, this is not EXACTLY the same as what you are proposing, because your PR does some additional work on the values eg checks that the clusters exist, and gets the URL for them from Argo CD, but I think these are not 100% essential functions)

So that would be how to generically use duck types to provide values to an ApplicationSet.

However, if the goal of the PR is primarily to connect to OCM PlacementRules, then it makes more sense to change the generator name to eg ocmPlacementRule, unless a case can be made that placement rules are themselves conceptual generic enough to fully represent other cluster management tools/user scenarios. For example, SkyScanner how has their own custom ApplicationSet-based controller which handles cluster selection: https://github.com/Skyscanner/applicationset-progressive-sync

I'm 👍 with adding explicit support for OCM-style PlacementRule, and it sounds like @OmerKahani is as well, but we would probably want to bring it up at the Argo CD community meeting to ensure there is agreement on enabling integration with a specific project (which would be a first for the ApplicationSet project: previously 'generators' have been tools that perform a specific function for a specific scenario).

What do you think, @jnpacker? 🤔

jnpacker and others added 2 commits May 17, 2021 13:51
I split this into multiple lines out of habit. I'm used to Sonar Cloud marking a single line that exceeds 120 characters as a code smell.

Co-authored-by: Omer Kahani <8766193+OmerKahani@users.noreply.github.com>
@jnpacker
Copy link
Contributor Author

So being mindful of making the generator definition to complex, I might propose a default:

status:
  generated:                # Pay homage to its roots
    - name: a value
      url: a value
    - name: some other value
      url: yet another value

Then allow an override using dynamic keys, in the generator DuckType spec:

spec:
  ducktType:
    apiVersion: my-group/v1beta1
    kind: MyKind
    name: my-resource-name
    requeueAfterSeconds: 60         # This is optional, default is 3 min
    overrideParams:                 # OPTIONAL: 
      listKey: decisions             # OPTIONAL: Override for generated key
      nameKey: clusterName           # OPTIONAL: Override for name key
      urlKey: ""                     # OPTIONAL: Override for url key

Using something like this spec, you could define an ApplicationSet with generator DuckType and produce the target list of clusters in the ProgressiveSync Customer Resource status.

Its a little busy for the CLI, but should be fine for someone creating a UI that drives ApplicationSets.

@jnpacker
Copy link
Contributor Author

jnpacker commented May 17, 2021

Thinking about it a little more though, maybe to make it easier to consume:

spec:
  ducktType:
    configMapRef: my-duck-type
    name: my-resource-name
---
kind: ConfigMap
metadata:
  name: my-duck-type
data:
  apiVersion: my-group/v1beta1
  kind: MyKind
  listKey: decisions
  nameKey: clusterName
  urlKey: ""
  requeueAfterSeconds: 60

This way, you don't have to keep typing out the duckType parameters. since everything lives in the argocd namespace, you could have a configMap for each duckType you wanted to leverage.

It also means creating application sets only requires two parameters to leverage the external resource.

I would need to add Get to the serviceAccount role for ConfigMaps

What do you think @jgwest ?

Signed-off-by: Joshua Packer <jpacker@redhat.com>
Signed-off-by: Joshua Packer <jpacker@redhat.com>
Signed-off-by: Joshua Packer <jpacker@redhat.com>
Signed-off-by: Joshua Packer <jpacker@redhat.com>
@jnpacker
Copy link
Contributor Author

So I wrote a new branch using a ConfigMap to identify the DuckType resource. This actually works out to be very clean, as it now means you can dynamically choose a resource by specifying the resource name and the configMapRef (ConfigMap name) in the ApplicationSet.Spec.Generators.DuckType. This is really nice, as it means you can have multiple resources, and you just have to change the name you use for configMapRef.

spec:
  generators:
    - duckType:
        configMapRef: ocm-placement
        name: test-placement
        requeueAfterSeconds: 30

So with this in mind, I could have a configMapRef: ocm-placement and configMapRef: skyscanner, and create as many applicationSets as I want, re-using just those two configMaps.

apiVersion: v1
kind: ConfigMap
metadata:
  name: ocm-placement
  namespace: argocd
data:
  apiVersion: apps.open-cluster-management.io/v1
  kind: placementrules
  statusListKey: parameters
  matchKey: name

The statusListKey and the matchKey allow support of more dynamic name choices, the statusListKey is clusters by default, but could be parameters or decisions

...
status:
  parameters:
  - name: cluster1
  - name: cluster2

I did think for a while to allow skipping the validation code that compares the generated name to the ArgoCD clusters, but I noticed that if the Application created from the template is bad, the reconcile stops. So I opted to make sure the cluster coming from the duck resource always matches to an ArgoCD cluster by name. This makes sure the reconcile always continues, since we never have an invalid cluster name in the Application template.

Last part, is I decided to include any Key/Values in the cluster list element from the resource to the template. This allowed me to use clusterName from the OCM placement decision directly in the template, and will allow other resources to injected to the template.
https://github.com/argoproj-labs/applicationset/compare/master...jnpacker:duck-type-gen-cm?expand=1

Not sure if you'll have access to view the compare link above. Let me know what you think and I can replace this Pull with that one.

The other nice offshoot is someone tailoring a resource for use with this feature, just needs to include the ConfigMap with their resource bundle and make sure its created in the ArgoCD namespace. At that point someone could start using it with ApplicationSets.

Signed-off-by: Joshua Packer <jpacker@redhat.com>
Signed-off-by: Joshua Packer <jpacker@redhat.com>
@jnpacker
Copy link
Contributor Author

I'm also interested in your thoughts on using the ConfigMap vs included the 4x key / values in the duckType direcctly.

statusListKey and matchKey could be optional

statusListKey: clusters
matchKey: name

@jgwest
Copy link
Member

jgwest commented May 19, 2021

Looking good @jnpacker, a couple questions:

Q: I am correct that the principal difference between the ConfigMap option, and the ApplicationSet 'all fields in CR' option, is less reptition of fields (eg better DRY)?

I presume both options have the same expressive 'power' (because the configmap just contains most of the same values that would otherwise be in the applicationset CR) but it's just less repitition. The ConfigMap approach feels more elegant to me, but I'm also curious if there was any other flexibility it grants.

Q: Even with the changes, I think we should still declare that OCM is the only current existing consumer of this feature, but (paraphrasing) 'we hope it is flexible enough for other custom resources implementors that wish to define a set of clusters to target with their custom resource'.

So that would be:

  • Mention OCM as the current consumer in the docs
  • Mention how we expect it should be suitable for others to use, and sufficiently document how they can use it
  • I don't think we need to rename the generator to reference ocm... I had previously mentioned renaming the generator to explicitly mention ocm (eg ocmPlacementRule or something like that), but the generator is now generic enough that is not necessary. But, see my name ideas below.

Sound good?

If so, I will double check with Argo CD contributors that this integration is kosher (I don't expect any issues), and we should be good. It's good that the type has been made more generic, and hopefully this will help other teams implementing custom cluster-targeting CRs 😎 .

And comments:

Since this generator is principally for targeting Applications to Clusters (vs some of the other generators that are not cluster focused, like Git, and scm), I think we should change the name to reflect that, keeping in mind that naming things is hard. I'm thinking a cluster prefix, then something describing the fact that we are referencing a custom resource for the data:

  • clusterResourceRef
  • clusterCustomResource
  • clusterDuck 😛 😄 🤔
  • other ideas?

Last part, is I decided to include any Key/Values in the cluster list element from the resource to the template. This allowed me to use clusterName from the OCM placement decision directly in the template, and will allow other resources to injected to the template.

Sounds good.

I did think for a while to allow skipping the validation code that compares the generated name to the ArgoCD clusters, but I noticed that if the Application created from the template is bad, the reconcile stops.

Ah, this is a bug that others have reported, which I've targeted for the current milestone: #219
You can see the reasoning on that issue, but TLDR the controller should keep trying to reconcile on validation failure, and not give up.
(But, this is a moot point, as what you have in the PR is fine, keeping this check as is sounds good... I only brought this up in the context of exploring the idea of a more generic duck-type generator, which is a separate feature for another day)

Finally, I just want you to know that I LOL-ed at the mallard.io apiVersion and 'quacks' in the docs section of the PR 🦆 🦆 🦆 😄 .

@jnpacker
Copy link
Contributor Author

jnpacker commented May 19, 2021

A1. @jgwest, you are correct, it really just cuts down the repetition, and lowers the chance of a typo, as you only have to create one configMap for your target resource type per ArgoCD instance.

I thought it also made reading the ApplicationSet more elegant as you can leverage different resources with a simple descriptive name.

For the name.

generators:
  - clusterResourceList:
       configMapRef:
       name: 

By calling the generator Resource, I think name works much better. or clusterListResource or clusterCustomResource work. (If I thought we could get away with ClusterDuck, I would use it 😜)

A2. I'll add some OCM references to the generators.md

@jgwest
Copy link
Member

jgwest commented May 19, 2021

@jnpacker Good suggestions, personally I like clusterListResource generator, as it's a resource containing a cluster list (eg rather than a list of cluster resources, as in clusterResourceList).

…to include open-cluster-management.io reference

Signed-off-by: Joshua Packer <jpacker@redhat.com>
@jnpacker
Copy link
Contributor Author

@jgwest all questions and suggestions above incorporated and merged into this pull request.
Let me know if you feel more detail or call outs to how or that its a generalized approach are needed.
Also a quick thank you for all the time and help spent on this with me.

Signed-off-by: Joshua Packer <jpacker@redhat.com>
@jnpacker
Copy link
Contributor Author

@OmerKahani, included e2e tests. Was thinking I might try to fix the URL I used for the Resource CRD that is read by the duckType to a specific commit, as a bit of a gurantee, otherwise the e2e ran fine.

Signed-off-by: Joshua Packer <jpacker@redhat.com>
metadata:
name: guestbook
spec:
generators:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you think it will be more consistent to add clusterListResource under the existing cluster field? This looks more natural to me because resource is another source of clusters same as selector .

So the generator definition would look as follows:

 generators:
 - clusterList:
     resource:
       configMapRef: my-configmap
       name: quak

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point around the name choice. The name choice for the generator does overlap, and does not infer the capability I was trying to incorporate. There were some alternate names suggested from @jgwest maybe clusterDecisionResource is more true to what the duckType status on the targeted resource offers.

Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @alexmt, were you suggesting that we move this into the existing Cluster generator? (you used clusterList in your example, which isn't an existing field, so it was a bit ambigous).

But, presuming so, my preference is to keep them separate:

  • Once concern I have is that the learning curve for this generator is necessarily steeper (duck types, CRDs, external controllers) than the existing cluster generator. I want to keep the learning curve on the cluster generator as simple as possible, because the existing Cluster generator will the first generator, and likely most popular, that users use.
    • For example, re: generator complexity and learning curve, I carefully ordered the list of generators on the documentation page, from simplest to most complex, to gradually introduce users to the generator functionality. (Admittedly, it's a small thing, but I think has an outsized impact on user engagement.)
  • Generators thus far have been named for the source of the template parameters that they generate, rather than for their target: list provides a set of parameters from a literal list, cluster uses the argo cd cluster list as a source, git (from git, either file or directory), scmProvider (from an SCM provider like GitHub/GitLab) and so on.
    • Thus here it makes sense that this new generator would be named for it's data source, rather than it's target. It's datasouce is the fields from custom CRs (with the contents of those fields being mutable and potentially based on fancy, custom logic).
  • IMHO this generator's functionality is more than just a simple selector: the purpose is to support custom resources which decide (using the implementation of another controller) where Application resources should exist in a cluster fleet. With this approach the other controller can use quite fancy custom logic, which is more powerful than a simple selector.

However, as @jnpacker mentioned, the generator name being proposed could be misleading. His proposal of clusterDecisionResource I think better captures that purpose of the custom resource, which is to implement complex decision-making re: where cluster resources should live (rather than a simple, immutable list of clusters that the old name might imply.)

Signed-off-by: Joshua Packer <jpacker@redhat.com>
Copy link
Member

@jgwest jgwest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks @jnpacker! Just a few quick tweaks then good to go.

Makefile Show resolved Hide resolved
test/e2e/fixture/applicationsets/actions.go Outdated Show resolved Hide resolved
examples/clusterDecisionResource/README.md Outdated Show resolved Hide resolved
pkg/generators/duck_type.go Outdated Show resolved Hide resolved
Signed-off-by: Joshua Packer <jpacker@redhat.com>
@jnpacker
Copy link
Contributor Author

The failure seems to be a timeout in the Martix Generator e2e. Is there a way for me to restart the test without a NOOP?

Copy link
Member

@jgwest jgwest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks again @jnpacker 🎉 ! I'll be making some tweaks to the generator documentation (all the generators, not just this one), but I'll handle that in a separate PR.

@jgwest jgwest merged commit ea3aa7b into argoproj:master Jun 10, 2021
@jgwest jgwest changed the title Add duck type generator support to ApplicationSets (initial commit) Add duck type generator support to ApplicationSets Jun 10, 2021
@jnpacker jnpacker deleted the duck-type-generator branch June 10, 2021 18:06
@sbose78
Copy link
Contributor

sbose78 commented Jun 16, 2021

Thank you @jgwest and @jnpacker !

Have we presented this in the community meeting yet ?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants