Support the use-case to manage multiple kube-aws clusters' configurations optionally inheriting organization-specific customizations with a version control system like Git #238
@redbaron We've split templates:
Also, would you mind sharing example(s) of the "not insignificant code duplication" you've mentioned? Anyway, I guess we could fix that separately, without embedded/nested CF stacks.
Could you also share your thoughts on how the whole set of files and directories would be structured after the change you want, or does fixing just
OK, let's discuss that:
Well, having one
Thanks for the additional info!
Sorry for missing the context here, but anyway I'd like to point out that user-data/cloud-config-worker is managed separately for the main stack and each node pool, to allow customization per main stack/node pool.
However, for example, if duplicated blocks in
Just to be clear, I'm rather eager to make a step change like you've suggested if it does improve kube-aws, but IMHO that should be done only as long as the end result still meets our requirements; or rather, a discussion to narrow down our requirements would be the first step.
Ah, true. Even more headache. See, you cater for customization, but it is not customization-friendly. Once you modify render output you are on your own; all kube-aws can do is wipe out your changes on the next render. So we are developing a tool which uses git branching to keep track of customizations, where kube-aws renders into "vanilla" branches, then that gets merged into a "tailored" branch where all organization-wide customizations go, to be shared across all the clusters, and then that gets merged into individual "kluster" branches where per-cluster adjustments are possible. It is all fine and dandy when paths are stable; once there are random files popping up per cluster, they can't benefit from this workflow, as git can't keep track of changes "remounted" to different paths without clunky tricks like subtree merging or even more complicated branching. Hence my request to keep paths stable :)
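To make the branch dance concrete, here is a minimal sketch of the workflow described above; the branch names (`vanilla`, `tailored`, `kluster-a`) and the exact steps are assumptions for illustration, not anything kube-aws itself provides:

```bash
# Sketch only: branch names and steps are assumptions based on the workflow described above.

# 1. Render pristine kube-aws output into the "vanilla" branch.
git checkout vanilla
kube-aws render
git add -A && git commit -m "kube-aws render (vanilla)"

# 2. Merge it into the "tailored" branch, which carries organization-wide customizations.
git checkout tailored
git merge vanilla

# 3. Merge "tailored" into each per-cluster ("kluster") branch for per-cluster adjustments.
git checkout kluster-a
git merge tailored
```

Git can only carry each new render cleanly through this chain if the rendered files keep stable paths across clusters, which is exactly the request above.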
Yes 👍 Therefore the limit is now 460,800 bytes for a CFN stack template fed via S3, and kube-aws leaves approximately 400 KB for user customization per CFN stack today. Also beware of my guess that a resulting stack template rendered with
Yes, we've tried hard to reduce duplication among them; the result includes I'm now observing how the code for the main stack and the node pools changes over time, to plan further refactorings to make them less duplicated.
Hm, then having a single
I am really not concerned about
@redbaron I'm interested in your use-case 👍
Sounds good 👍 but could you clarify a bit? I guess the single root template nesting all the stacks, including the ones for the main cluster and the node pools, can itself be achieved outside of kube-aws. The issue is that the way to "propagate" outputs from a stack (probably a "main" stack or inline resources defined in the single root template) to a node-pool stack inside a "root" template is missing. It can be resolved via #195, right?
If all of the following branches:
have the same paths, then we can use the power of git to track updates in kube-aws without losing customizations at any level. If anywhere on the way from
Thanks for your replies!
OK, then we can discuss that separately 👍
@redbaron I assume that your use-case is: "how to efficiently manage multiple kube-aws clusters ("klusters") which may or may not read/inherit/apply organization-wide common configuration (including settings fed to If so, what kind of information is contained in the "set of common changes" in the I'm considering whether it would be possible for kube-aws to provide an out-of-the-box way to provide "common configuration" to e.g.
Yes, then
Yes, it is not the way kube-aws currently works, but it is doable. Looks like #195 has a much wider scope than just template nesting. Fundamentally, for that to work we'd need:
Yes, and still be able to benefit from the improvements which a new version of kube-aws provides, without labor-intensive cherry-picking of individual changes.
It would be nice, but I am not sure it is possible without becoming another Git :) The kinds of customizations we are going to have are mostly those which are not supported by kube-aws, and some of them never will be: existing subnets, multiple EFS, Vault secrets and PKI, non-standard company DNS, monitoring & audit daemons, etc.
How exactly are you going to apply those customizations?
or anything else?
We won't version-control kube-aws generated assets, but we could provide a framework to accept common customizations to those assets 😃 Then you could define your customizations entirely in dedicated files (scripts to modify kube-aws assets? diff patches?) and put those customizations, shared among all the klusters, under version control in Git repo(s)/branch(es).
The framework mentioned above could be something like @c-knowles suggested before in #96 (comment): hooks.
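As a purely hypothetical illustration (kube-aws has no hook mechanism today, and the file names below are made up), such a post-render hook could simply be an executable that kube-aws invokes right after `kube-aws render`, so customizations live in version-controlled scripts and patches rather than in hand-edited generated assets:

```bash
#!/bin/bash
# Hypothetical post-render hook; kube-aws does not currently support hooks.
# The idea: kube-aws would run this right after `kube-aws render`.
set -euo pipefail

# Apply pre-prepared diffs against the freshly rendered assets.
for p in customizations/*.patch; do
  patch -p1 < "$p"
done

# Scripted tweaks are possible too, e.g. adjusting the generated stack template with jq
# (the filter below is only an illustration, not a real kube-aws customization).
jq '.Description = "kube-aws cluster (org-customized)"' stack-template.json \
  > stack-template.json.tmp && mv stack-template.json.tmp stack-template.json
```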
The script simplifies all the dance around git, but it does nothing more than a sequence of checkouts, kube-aws calls and merges. Vault integration is done by adding/modifying systemd unit files in userdata. It doesn't matter what the changes are; there are always going to be features which a general-purpose tool doesn't support. Things like the recent
It is possible, but managing them would be an unimaginable pain. Right now a central team can do
I guessed you've been managing the customizations in your own form inside the I'd imagined that, with the framework, you could generate each
No, why? The first commit on
when new
Yes, but I guess you apply your further customizations to the contents of
That way, I guess your workflow could be implemented a bit more easily, plus kube-aws could support it via the framework by applying (1), (2) and (3) while running
This makes it hard to maintain, mainly because of (3): preparing/testing/applying these patches is going to be a bad experience for everybody. Imagine if a new version of In our workflow
each Then it is super easy:
Managing a bunch of .patch files or transform hooks can't beat it.
Aha, so now I guess what you'd need is: In the
Then in a A node pool's stack name is namespaced under the main stack's name, therefore you can choose stable names for node pools today, and it won't prevent you from creating node pools for each
...and if you care about duplication between the main's and the node pool's
That is one way to do it, one which I wanted to avoid, because these nodepools are not only separated by function, they'd also better be one per AZ, right? Now imagine we need to add a new daemon to all workers: how many files need to be edited with the same copy-paste? Doing #217 I had to make the same changes to 2 files, and that repetitive work already killed me :) That is not to mention that if in There are workarounds I am exploring to have Hence my request whether it is possible to have a set of fixed paths even for a "dynamic" amount of nodepools.
A bit of duplication between 2-4 files is fine; no need to normalize everything to the absolute extreme :)
I can't stop being impressed by your deep understanding of the problem 👍 Thanks for your patience here! OK, then introducing helpers would be a solution not for this but for other problems. How about the following in the
plus somehow adding to kube-aws the functionality to support This way I guess you won't need to deal with variable directories with dynamic names under the Or I'd appreciate it if you could share an example structure of the contents inside the
Or I'd suggest adding options in
The ideal layout for me would be:
where cluster.yaml controls both the "main" setup and the nodepools. Then we scrap all Individual nodepool config in cluster.yaml may choose to override the path to the userdata files, but by default it points to the same set. If there was a
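A rough, purely hypothetical sketch of what such a single `cluster.yaml` might look like at this point in the discussion (the `nodePools` and per-pool override keys are assumptions here; the schema that actually landed is shown later in the thread):

```yaml
# Hypothetical sketch of the proposed single cluster.yaml; key names are illustrative.
clusterName: mycluster
worker:
  nodePools:
  - name: pool1a              # uses the shared userdata/cloud-config-worker by default
  - name: pool1c
    # a per-pool override of the userdata path, if such a key were supported
    userdataWorker: userdata/cloud-config-worker-special
```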
Sounds good! Several things I'd like to add:
I'd also like to hear more feedback from the community before a PR is actually merged, but I'd be happy to review WIP PRs.
@redbaron Would you mind changing the subject of this issue to something like "Support the use-case to manage multiple kube-aws clusters' configurations optionally inheriting organization-specific customizations with a version control system like Git"? |
@mumoshu that sounds perfect for my use-case! I'll prepare something over the weekend to show.
@redbaron Thanks for the discussion! I'm really looking forward to it 👍
Hi @redbaron, I've begun to believe that making
Yes, it solves that and also eliminates the need to keep params like "sshAuthorizedKeys" in sync between all
@mumoshu I think the above means the top two remaining points from #44 (comment) will be dealt with here. Do you agree?
I still want to leave the
@c-knowles Those are not requirements which must be addressed before a future PR from @redbaron is merged anyway. They are definitely things I'd appreciate if he could include in a PR, but I'd recommend not doing so, at least in the initial PR, to keep each PR reviewable and hence able to be merged quickly! They can even be addressed elsewhere today, because kube-aws can anyway read
All that command will do is modify
@redbaron Agreed. I'm going to drop
@redbaron Btw, do you have any plan to publish the tool you're developing as OSS? 😃
This is an implementation of kubernetes-retired#238 from @redbaron, especially what I've described in my comment there (kubernetes-retired#238 (comment)), and an answer to the request "**3. Node pools should be more tightly integrated**" of kubernetes-retired#271 from @Sasso. I believe this also achieves what was requested by @andrejvanderzee in kubernetes-retired#176 (comment).

After applying this change:

1. All the `kube-aws node-pools` sub-commands are dropped
2. You can now bring up a main cluster and one or more node pools at once with `kube-aws up`
3. You can now update all the sub-clusters, including the main cluster and node pool(s), by running `kube-aws update`
4. You can now destroy all the AWS resources spanning the main cluster and node pools at once with `kube-aws destroy`
5. You can configure node pools by defining a `worker.nodePools` array in `cluster.yaml`
6. `workerCount` is dropped. Please migrate to `worker.nodePools[].count`
7. `node-pools/` and hence the `node-pools/<node pool name>` directories, along with the per-node-pool `cluster.yaml`, `stack-template.json` and `user-data/cloud-config-worker`, are dropped
8. A typical local file tree would now look like:
   - `cluster.yaml`
   - `stack-templates/` (generated on `kube-aws render`)
     - `root.json.tmpl`
     - `control-plane.json.tmpl`
     - `node-pool.json.tmpl`
   - `userdata/`
     - `cloud-config-worker`
     - `cloud-config-controller`
     - `cloud-config-etcd`
   - `credentials/`
     - `*.pem` (generated on `kube-aws render`)
     - `*.pem.enc` (generated on `kube-aws validate` or `kube-aws up`)
   - `exported/` (generated on `kube-aws up --export --s3-uri <s3uri>`)
     - `stacks/`
       - `control-plane/`
         - `stack.json`
         - `user-data-controller`
       - `<node pool name = stack name>/`
         - `stack.json`
         - `user-data-worker`
9. A typical object tree in S3 would now look like:
   - `<bucket and directory from s3URI>/`
     - `kube-aws/`
       - `clusters/`
         - `<cluster name>/`
           - `exported/`
             - `stacks/`
               - `control-plane/`
                 - `stack.json`
                 - `cloud-config-controller`
               - `<node pool name = stack name>/`
                 - `stack.json`

Implementation details: Under the hood, kube-aws utilizes CloudFormation nested stacks to delegate management of multiple stacks as a whole. kube-aws now creates 1 root stack and nested stacks, including 1 main (or, as currently named, "control plane") stack and 0 or more node pool stacks. kube-aws operates on S3 to upload all the assets required by all the stacks (root, main, node pools), and then on CloudFormation to create/update/destroy the root stack.
An example `cluster.yaml` I've been using to test this looks like:

```yaml
clusterName: <your cluster name>
externalDNSName: <your external dns name>
hostedZoneId: <your hosted zone id>
keyName: <your key name>
kmsKeyArn: <your kms key arn>
region: ap-northeast-1
createRecordSet: true
experimental:
  waitSignal:
    enabled: true
subnets:
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
- name: private2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
- name: public1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.3.0/24"
- name: public2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.4.0/24"
controller:
  subnets:
  - name: public1
  - name: public2
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: public1
  - name: public2
worker:
  nodePools:
  - name: pool1
    subnets:
    - name: asgPublic1a
  - name: pool2
    subnets: # former `worker.subnets` introduced in v0.9.4-rc.1 via kubernetes-retired#284
    - name: asgPublic1c
    instanceType: "c4.large" # former `workerInstanceType` in the top-level
    count: 2 # former `workerCount` in the top-level
    rootVolumeSize: ...
    rootVolumeType: ...
    rootVolumeIOPs: ...
    autoScalingGroup:
      minSize: 0
      maxSize: 10
    waitSignal:
      enabled: true
      maxBatchSize: 2
  - name: spotFleetPublic1a
    subnets:
    - name: public1
    spotFleet:
      targetCapacity: 1
      unitRootVolumeSize: 50
      unitRootvolumeIOPs: 100
      rootVolumeType: gp2
      spotPrice: 0.06
      launchSpecifications:
      - spotPrice: 0.12
        weightedCapacity: 2
        instanceType: m4.xlarge
        rootVolumeType: io1
        rootVolumeIOPs: 200
        rootVolumeSize: 100
```
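For readers unfamiliar with nested stacks, here is a minimal sketch of how such a root template references the sub-stacks, shown in CloudFormation's YAML syntax for readability; the logical resource names, output names, and URLs below are placeholders, not the exact content kube-aws generates:

```yaml
# Illustrative only: names and URLs are placeholders, not kube-aws's actual output.
AWSTemplateFormatVersion: "2010-09-09"
Description: Root stack nesting a control-plane stack and one node-pool stack
Resources:
  ControlPlane:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/<bucket>/.../stacks/control-plane/stack.json
  Pool1:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/<bucket>/.../stacks/pool1/stack.json
      Parameters:
        # Outputs of the control-plane stack (e.g. network or security group IDs)
        # can be propagated into each node-pool stack like this; the names are hypothetical.
        ControlPlaneStackName: !GetAtt ControlPlane.Outputs.StackName
```

Creating, updating, or deleting only the root stack then fans out to all the nested stacks, which is why a single `kube-aws up`/`update`/`destroy` can now manage the whole cluster.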
This is great! Will port our config to the new structure soon and test it.
We use git branching to track customizations on top of kube-aws generated CF/userdata. Having nodepools as a separate level with arbitrary paths complicates things for us. It also leads to not insignificant code duplication in `nodepool/`. What was the justification for such a split? Would you consider a PR which manages nodepools as part of stack-template.json, either embedded or as a nested CF stack?