feat: Multi-cluster architecture to increase resiliency and reduce inter-az data transfer charges #1802
Conversation
data "aws_availability_zones" "available" {} | ||
|
||
locals { | ||
cluster_name = format("%s-%s", basename(path.cwd), "shared") |
Let's follow the current norm used in the other patterns:
```diff
- cluster_name = format("%s-%s", basename(path.cwd), "shared")
+ name = basename(path.cwd)
```
Done
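For reference, a minimal sketch of the locals block after applying that convention; the `azs` slice and `Blueprint` tag are taken from the surrounding diff, the rest follows how the other patterns in this repo derive names:

```hcl
data "aws_availability_zones" "available" {}

locals {
  # Derive the pattern name from the working directory, per the repo convention
  name = basename(path.cwd)

  # Use the first three availability zones in the region
  azs = slice(data.aws_availability_zones.available.names, 0, 3)

  tags = {
    Blueprint = local.name
  }
}
```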
```hcl
  azs = slice(data.aws_availability_zones.available.names, 0, 3)

  tags = {
    Blueprint = local.cluster_name
```
```diff
- Blueprint = local.cluster_name
+ Blueprint = local.name
```
Done
source = "terraform-aws-modules/vpc/aws" | ||
version = "~> 5.0" | ||
|
||
name = local.cluster_name |
```diff
- name = local.cluster_name
+ name = local.name
```
Done
```diff
@@ -0,0 +1,47 @@
+provider "aws" {
```
I don't believe the cluster-per-AZ design requires splitting up the Terraform configurations into multiple directories. We should collapse this back down to a single directory, but have multiple cluster definitions - one for each AZ used. This can be shown with a set of definitions split into multiple files - for example:
az1.tf
az1.yaml
az2.tf
az2.yaml
az3.tf
az3.yaml
Within each of these AZ-specific Terraform files we'll have:
- EKS cluster definition
- Addons definition
- Kubernetes and Helm aliased providers scoped to that cluster and addon definition
Then each of the AZ-specific YAML files will contain the Karpenter manifests for the cluster in that AZ.
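A rough sketch of what one of these AZ-specific files could look like, assuming the terraform-aws-modules EKS module used by the other patterns; the module name, version, cluster version, and subnet indexing are illustrative only:

```hcl
# az1.tf -- EKS cluster pinned to the first AZ (illustrative sketch)
module "eks_az1" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.16"

  cluster_name    = "${local.name}-az1"
  cluster_version = "1.28"

  cluster_endpoint_public_access = true

  vpc_id = module.vpc.vpc_id
  # Only the private subnet in the first AZ, so all nodes stay in one AZ
  subnet_ids = [module.vpc.private_subnets[0]]

  tags = local.tags
}

# Kubernetes and Helm providers aliased to this cluster; the addon and
# Karpenter definitions for this AZ would reference these aliases.
provider "kubernetes" {
  alias                  = "az1"
  host                   = module.eks_az1.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks_az1.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks_az1.cluster_name]
  }
}

provider "helm" {
  alias = "az1"

  kubernetes {
    host                   = module.eks_az1.cluster_endpoint
    cluster_ca_certificate = base64decode(module.eks_az1.cluster_certificate_authority_data)

    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      args        = ["eks", "get-token", "--cluster-name", module.eks_az1.cluster_name]
    }
  }
}
```

az2.tf and az3.tf would repeat the same shape with their own subnet index and provider aliases.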
Thoughts?
Done, made changes as suggested. We were following the Istio multi-cluster pattern structure before.
This example shows how to provision a cell-based Amazon EKS cluster.

* Deploy EKS Cluster with one managed node group in a VPC and AZ
What is the motivation for mixing Fargate, managed node groups, and Karpenter in this design?
It was meant to show how to use them in a single-AZ pattern. Removed Fargate; now using one managed node group plus Karpenter.
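For illustration, the managed node group portion of the cluster definition might then look like the fragment below (instance type and sizes are placeholders); this group hosts the cluster-critical and Karpenter controller pods, while Karpenter provisions the rest of the capacity:

```hcl
  eks_managed_node_groups = {
    # Small fixed-size group for core add-ons and the Karpenter controller;
    # application capacity is provisioned by Karpenter.
    initial = {
      instance_types = ["m5.large"]

      min_size     = 2
      max_size     = 3
      desired_size = 2
    }
  }
```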
3. [terraform](https://learn.hashicorp.com/tutorials/terraform/install-cli)
4. [helm](https://helm.sh/docs/helm/helm_install/)

## Deploy
Please see the other pattern READMEs for the "standard" README structure.
Done
```hcl
  }
}

provider "kubectl" {
```
Removed the kubectl provider and added instructions in README.md for installing Karpenter and the sample app.
```hcl
# Karpenter
################################################################################

resource "aws_security_group" "karpenter_sg" {
```
Are these Karpenter security group resources required?
Nope, cleaned them up.
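For completeness, the approach the other Karpenter patterns take is to reuse the cluster-managed node security group and make it discoverable via tags rather than define a separate aws_security_group; a fragment of the EKS module inputs, assuming the discovery-tag convention:

```hcl
  # Tag the node security group created by the EKS module so Karpenter can
  # discover it, instead of managing a separate security group resource.
  node_security_group_tags = {
    "karpenter.sh/discovery" = local.name
  }
```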
```hcl
resource "kubectl_manifest" "karpenter_provisioner" {
  yaml_body = <<-YAML
    apiVersion: karpenter.sh/v1alpha5
```
We'll want to use Karpenter v0.32 and the v1beta1 APIs.
Done
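As a sketch of that change, the v1alpha5 Provisioner above maps to a v1beta1 NodePool plus EC2NodeClass in Karpenter v0.32. It is shown here in the same kubectl_manifest form as the original snippet for illustration; the role name, cluster name, and discovery tags are placeholders:

```hcl
resource "kubectl_manifest" "karpenter_node_pool" {
  yaml_body = <<-YAML
    apiVersion: karpenter.sh/v1beta1
    kind: NodePool
    metadata:
      name: default
    spec:
      template:
        spec:
          requirements:
            - key: karpenter.sh/capacity-type
              operator: In
              values: ["on-demand"]
          nodeClassRef:
            name: default
      limits:
        cpu: 1000
  YAML
}

resource "kubectl_manifest" "karpenter_node_class" {
  yaml_body = <<-YAML
    apiVersion: karpenter.k8s.aws/v1beta1
    kind: EC2NodeClass
    metadata:
      name: default
    spec:
      amiFamily: AL2
      role: example-karpenter-node-role    # illustrative node IAM role name
      subnetSelectorTerms:
        - tags:
            karpenter.sh/discovery: example-cluster-az1    # illustrative cluster name
      securityGroupSelectorTerms:
        - tags:
            karpenter.sh/discovery: example-cluster-az1
  YAML
}
```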
```diff
@@ -0,0 +1,12 @@
+variable "name" {
```
We don't provide variables unless we absolutely require something from users in order to deploy (e.g., a domain name or hosted zone); these are not consumed in place, they are references.
Removed the references
This PR has been automatically marked as stale because it has been open for 30 days.
I'll work on the changes and get back with an updated PR.
Updates made based on the above feedback.
This PR has been automatically marked as stale because it has been open for 30 days.
A gentle reminder to review this.
This PR has been automatically marked as stale because it has been open for 30 days.
Pull request closed due to inactivity.
Another gentle reminder to re-open this PR and review the latest changes.
Hi @ashoksrirama, so in this pattern we create one EKS cluster per AZ and deploy an app in each cluster. I am not sure I understand what is new here and why it needs another pattern. The title says multi-cluster with the objective of reducing inter-AZ costs, so basically you achieve that just by having three clusters. What about the complexity of managing three clusters instead of one? What if you deploy microservices that need to talk to each other, and how do you ensure high availability in case of a problem in one AZ? I think I'm missing something here.
Description
Motivation and Context
How was this change tested?
Ran pre-commit run -a with this PR.
Additional Notes