Skip to content
This repository was archived by the owner on Jun 10, 2024. It is now read-only.

Conversation

@tshaynik
Copy link
Contributor

@tshaynik tshaynik commented Sep 28, 2023

This PR is to keep track of the work that @ce0la and I have done to create the new Kubernetes 1.26 cluster.

In addition to @ce0la's earlier commits, I've attempted to make the minimal changes necessary to deploy a new cluster here, but there are more substantial changes that need to happen, that will follow in subsequent PRs, which will aim to resolve terraform issues, issues of cluster functionality, and incorporate aspects of the deployment process that are currently imperative and undocumented.

Summary by CodeRabbit

  • New Feature: Added support for deploying and managing postgres-operator and otel-operator in a Kubernetes cluster, enhancing the system's capabilities.
  • New Feature: Introduced a Helm release for "cardano", enabling more efficient deployment and management of Cardano-related resources.
  • New Feature: Added new variables for customizing the deployment of kubevela and default Helm behavior, providing users with more control over their deployments.
  • New Feature: Added configurations for various addons for an EKS cluster, including Cluster Autoscaler, External-DNS, Cert-Manager, Traefik Load Balancer, and KubeVela Controller, enhancing the functionality and flexibility of the EKS cluster.
  • New Feature: Introduced a Terraform configuration for creating an EKS cluster and an RDS database instance, streamlining the infrastructure setup process.
  • Update: Upgraded the "kubevela" Go module from v1.7.4 to v1.9.6, ensuring users have access to the latest features and improvements.
  • Chore: Updated .gitignore to exclude certain file patterns, improving the cleanliness of the codebase.

@tshaynik tshaynik requested review from ce0la and shlevy September 28, 2023 06:12
@tshaynik tshaynik marked this pull request as ready for review October 3, 2023 15:03
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 3, 2023

Walkthrough

This pull request introduces significant changes to the infrastructure setup, primarily focusing on the deployment and configuration of Kubernetes resources using Terraform. It includes updates to the EKS cluster setup, addition of various EKS addons, and the creation of an RDS database instance. The changes also involve updates to the kubevela Go module.

Changes

File(s) Summary
.gitignore Added new file patterns to be ignored by version control.
infra/account.hcl, infra/terragrunt.hcl Updated Terraform configuration variables and added new ones.
infra/modules/kubevela/..., infra/modules/kubevela-addons/... Updated Kubernetes resources and added new ones, including Helm releases and operators.
infra/modules/eks/outputs.tf, infra/modules/eks/variables.tf Added output declarations and variables for EKS cluster configuration.
infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl, infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl Introduced Terraform configurations for EKS cluster and addons setup.
infra/us-east-1/prod/rds/prod-marlowe-runtime-db/terragrunt.hcl Added a Terraform configuration for creating an RDS database instance.
packages/kubevela.nix Updated the version of the kubevela Go module and renamed the generated binaries.

"In the land of code, where the shadows lie,

We've made some changes, oh me, oh my! 🐰

Kubernetes resources, now neatly arrayed,

With Terraform's magic, the foundation is laid. 🎩✨

EKS and RDS, standing tall and proud,

In the cloud they whisper, silent yet loud. ☁️🗣️

So here's to the changes, may they run smooth and quick,

In the world of DevOps, that's the ultimate trick! 🥂🎉"


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
  • Note: Review comments are made on code diffs or files, not on the PR overview.
  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai help to get help.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.json

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 12

Commits Files that changed from the base of the PR and between 6153c38 and e7af10e.
Files ignored due to filter (4)
  • infra/prod-new-us-east-1/k8s/eks-addons/gateway_crds.yaml
  • infra/prod-new-us-east-1/k8s/eks-addons/letsencrypt_issuer.yaml
  • infra/prod-new-us-east-1/secrets.yaml
  • infra/prod-us-east-1/k8s/rds-postgres/key-pair/bastion-key-enc.yaml
Files selected for processing (18)
  • .gitignore (1 hunks)
  • infra/account.hcl (1 hunks)
  • infra/modules/kubevela/addons.tf (1 hunks)
  • infra/modules/kubevela/definitions.tf (5 hunks)
  • infra/prod-new-us-east-1/encryption-config/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/env.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks/max-pods-calculator.sh (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/grafana-agent/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/kubevela/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/vpc/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/bastion/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/key-pair/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/rds/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/security-group/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/vpc/terragrunt.hcl (2 hunks)
  • infra/provider-configs/k8s.hcl (1 hunks)
Files skipped from review due to trivial changes (7)
  • .gitignore
  • infra/account.hcl
  • infra/modules/kubevela/addons.tf
  • infra/prod-new-us-east-1/k8s/kubevela/terragrunt.hcl
  • infra/prod-us-east-1/k8s/rds-postgres/key-pair/terragrunt.hcl
  • infra/prod-us-east-1/k8s/rds-postgres/security-group/terragrunt.hcl
  • infra/provider-configs/k8s.hcl
Additional comments (Suppressed): 50
infra/prod-new-us-east-1/env.hcl (1)
  • 1-10: The new hunk introduces local variables for the Terraform configuration. These variables include aws_region, environment, project, namespaces, clustername, and domain. The clustername is constructed using the project, environment, and aws_region variables. This setup seems to be well-structured and follows best practices for managing Terraform configurations.

However, please ensure that the domain "dapps.aws.iohkdev.io" is correct and available for use in this context. Also, verify that the namespaces "preprod-prod" and "mainnet-prod" are correctly defined and match with your Kubernetes cluster's namespace configuration.

infra/prod-new-us-east-1/encryption-config/terragrunt.hcl (3)
  • 1-16: The use of locals to define and manage common variables is a good practice. It enhances code readability and maintainability by avoiding hard-coded values and repetition. The naming convention used for the variables is clear and consistent.

  • 18-20: The source URL for the Terraform module points to a specific version (v1.2.0) of the terraform-aws-kms repository. This is a good practice as it ensures that the infrastructure is built using a known, stable version of the module. However, make sure that this version of the module supports all the features required by your infrastructure and is compatible with your current Terraform version.

  • 26-31: The inputs block provides values for the Terraform module's input variables. The use of string interpolation (${local.name}) to generate dynamic values based on local variables is a good practice. However, ensure that the policy_flavor value ("eks_root_volume_encryption") is valid and corresponds to a policy defined within the terraform-aws-kms module.

infra/prod-new-us-east-1/vpc/terragrunt.hcl (2)
  • 1-10: The use of locals to define and reuse common variables is a good practice. It enhances readability and maintainability of the code.

  • 55-59: Good use of tags to identify resources related to Terraform, environment, and project.

infra/prod-us-east-1/vpc/terragrunt.hcl (3)
  • 14-16: The source of the Terraform configurations has been updated from version v3.14.4 to v5.1.1. Ensure that this new version is compatible with the rest of your infrastructure and that it supports all the features you are using in your configuration.

  • 50-53: New settings have been introduced to manage the default network ACL, route table, and security group. By setting these to false, you're opting out of letting the VPC module manage these resources. Make sure this is intended and that these resources are managed elsewhere in your infrastructure setup.

  • 55-57: The tags block has been moved under a new set of configurations. This should not cause any issues as long as the tags are still applied to the correct resources. However, verify that this change does not affect the tagging of your resources.

infra/prod-new-us-east-1/k8s/grafana-agent/terragrunt.hcl (6)
  • 1-3: The include block is used to include the root terragrunt.hcl configuration. This is a common practice in Terragrunt to avoid duplicating Terraform configurations.

  • 5-15: Local variables are being defined here, which includes reading provider configs, environment variables, and secrets. The use of sops_decrypt_file function ensures that encrypted secrets are decrypted before use. However, ensure that the decryption keys are securely managed and not exposed.

  • 17-18: The generate block merges the provider blocks from k8s.hcl and helm.hcl. This is a good way to manage multiple providers in a single Terragrunt configuration.

  • 20-22: The source attribute points to the grafana-agent module. Ensure that this path is correct and the module exists at this location.

  • 24-31: A dependency on the eks module is declared here. Mock outputs are provided for testing purposes. In production, these values should be replaced with actual outputs from the eks module.

  • 33-40: Inputs for the grafana-agent module are defined here. The cluster-name and k8s-cluster-name are fetched from the outputs of the eks module. The grafana-password is fetched from the decrypted secrets. Ensure that the grafana-username is correct and valid.

infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl (5)
  • 1-22: The use of read_terragrunt_config and find_in_parent_folders functions to load environment-level variables and account-specific variables is a good practice. It helps in maintaining modularity and reusability of the code. However, ensure that the paths provided are correct and the required .hcl files exist at those locations.

  • 24-24: The merge function is used to combine the generate blocks from different provider configurations. This is a good way to consolidate configurations for different providers. Ensure that there are no conflicting keys in these blocks.

  • 26-44: Dependencies on eks and vpc modules are well defined with mock outputs. These mock outputs will be useful during testing when actual outputs are not available. Make sure that the actual outputs from these modules match the structure and type of the mock outputs.

  • 46-136: The inputs block contains configuration for various Kubernetes addons. Each addon has its own set of configurations which are well structured. However, make sure that all the values provided are valid and compatible with the versions of the addons being used. For example, the version of cluster-autoscaler is set to v1.25.0. Verify if this version is compatible with your Kubernetes cluster version.

  • 138-156: The generate blocks are used to create Terraform resources from YAML files. This is a good way to manage Kubernetes resources using Terraform. However, ensure that the YAML files (letsencrypt_issuer.yaml and gateway_crds.yaml) exist at the specified location and they contain valid Kubernetes resource definitions.

infra/prod-us-east-1/k8s/rds-postgres/bastion/terragrunt.hcl (2)
  • 28-28: The Terraform AWS EC2 instance module version v5.3.1 is being used here. Please ensure that this version is compatible with the current Terraform version and other modules in use. Also, consider using a more recent stable version if available to benefit from any bug fixes or improvements.

  • 49-57: The inputs for the EC2 instance are well defined and clear. However, please ensure that the t2.micro instance type (line 50) is sufficient for the intended workload of this bastion host. If the workload is expected to be high, consider choosing a larger instance type.

infra/modules/kubevela/definitions.tf (5)
  • 1-12: No changes have been made in this hunk. The traitdefinition_https_route resource definition remains the same.

  • 27-38: No changes have been made in this hunk. The traitdefinition_http_route resource definition remains the same.

  • 54-65: No changes have been made in this hunk. The traitdefinition_resource resource definition remains the same.

  • 81-94: No changes have been made in this hunk. The workflowtdefinition_build_nix_image resource definition remains the same.

  • 103-110: No changes have been made in this hunk. The componentdefinition_helmrelease resource definition remains the same.

infra/prod-new-us-east-1/k8s/eks/max-pods-calculator.sh (6)
  • 1-161: This script is well-written and follows good practices for bash scripting. It includes error handling, usage help, and argument parsing. The logic seems sound, and it uses AWS CLI and jq to fetch and parse data from AWS. However, there are a few points that could be improved or verified.

  • 84-86: These lines use the EC2 instance metadata service to fetch the instance type and region. Ensure that this script is intended to run on an EC2 instance and that the necessary permissions are in place to access the metadata service.

  • 121-130: These lines use the AWS CLI to fetch information about the specified instance type. Ensure that the necessary permissions are in place to make these API calls.

  • 137-139: This block reduces the number of ENIs used for pods if custom networking is enabled. Verify that this behavior is expected and correctly reflects your networking setup.

  • 141-145: This block calculates the maximum number of pods based on whether prefix delegation is supported and enabled. Ensure that these calculations are correct for your specific use case.

  • 157-161: This block limits the total number of pods based on the CPU count of the instance type. Verify that these limits are appropriate for your use case.

infra/prod-us-east-1/k8s/rds-postgres/rds/terragrunt.hcl (6)
  • 29-31: Ensure that the referenced Terraform module version (v6.1.1) is compatible with your Terraform version and meets your requirements.

  • 38-64: The RDS instance configuration looks fine, but ensure that the chosen instance_class and max_allocated_storage meet your performance and storage requirements. Also, verify that the major_engine_version and family are compatible with your application's PostgreSQL requirements.

  • 51-54: Ensure that the snapshot_identifier exists and is accessible. The skip_final_snapshot is set to false, which means a final DB snapshot will be created before the DB instance is deleted. Make sure this is intended. The backup_retention_period is set to 21 days, verify if this aligns with your backup policy.

  • 56-57: Performance insights are enabled with a retention period of 31 days. Ensure this aligns with your monitoring needs and cost considerations.

  • 59-60: The ca_cert_identifier is set to "rds-ca-rsa2048-g1". Verify that this certificate is available and valid for your RDS instances.

  • 61-61: apply_immediately is set to true, which means that any changes will be applied immediately instead of waiting for the maintenance window. This could cause downtime if not handled properly. Ensure this is intended.

infra/prod-new-us-east-1/k8s/eks/terragrunt.hcl (11)
  • 1-46: The local variables and dependencies are well defined. The use of read_terragrunt_config and find_in_parent_folders functions ensures that the configuration is modular and reusable. The extraction of common variables for reuse improves readability and maintainability.

  • 41-43: This command calculates the maximum number of pods that can run on a Kubernetes worker node. Ensure that the script max-pods-calculator.sh exists in the directory ${get_terragrunt_dir()} and has the correct permissions to execute. Also, verify that the instance type t3a.large and CNI version 1.11.4 are appropriate for your setup.

  • 49-56: The source URL points to a specific version (v19.16.0) of the Terraform AWS EKS module. Make sure this version is compatible with your infrastructure. The after_hook command updates the kubeconfig file after applying the changes. This is a good practice as it ensures that the kubeconfig file is always up-to-date.

  • 65-71: Dependencies on VPC and encryption_config are declared. This ensures that these resources are created before the EKS cluster. It's a good practice to declare dependencies explicitly to avoid any race conditions during resource creation.

  • 77-288: The inputs block contains all the necessary configurations for the EKS cluster, including the cluster name, version, endpoint access settings, encryption config, addons, VPC and subnet IDs, security group rules, managed node groups, and aws-auth configmap. These configurations seem comprehensive and cover most aspects of an EKS cluster setup. However, ensure that these configurations align with your infrastructure requirements.

  • 81-82: The Kubernetes cluster version is set to 1.26. Ensure that this version is compatible with your applications and other infrastructure components.

  • 103-105: The VPC and subnet IDs are fetched from the outputs of the VPC dependency. This ensures that the EKS cluster is created in the correct VPC and subnets.

  • 107-178: These ingress and egress rules seem to be well defined with specific ports and protocols. However, some of the ingress rules allow traffic from all IP addresses (0.0.0.0/0 and ::/0). Review these rules to ensure they meet your security requirements.

  • 181-218: The default settings for the EKS managed node groups are defined here. These include tags, desired size, min/max size, capacity type, platform, AMI release version, additional IAM role policies, EBS optimization, update config, and block device mappings. Ensure that these settings align with your infrastructure requirements.

  • 220-276: The configurations for individual managed node groups are defined here. Each node group has its own settings for AMI type, instance types, subnet IDs, bootstrap user data, labels, and taints. The use of local.bootstrap_cmd for bootstrap_extra_args ensures that the maximum number of pods is set correctly for each node group. However, one of the node groups (cardano-128) is commented out. If this node group is not needed, consider removing it to keep the code clean.

  • 278-288: The aws-auth configmap is managed by Terraform, which is a good practice as it allows you to control access to the EKS cluster. The KMS key owners and administrators are set to the list of users, and the aws_auth_users are mapped from the list of users. The tags are set to local.tags. Ensure that these settings meet your access control and tagging requirements.

Comment on lines 14 to 16
terraform {
source = "git::https://github.com/terraform-aws-modules/terraform-aws-vpc.git//.?ref=v5.1.1"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source URL for the Terraform module is hardcoded. Consider parameterizing the version (v5.1.1) to make it easier to upgrade or downgrade the module in the future.

Comment on lines 12 to 23
function print_help {
echo "usage: $0 <instance(s)> [options]"
echo "Calculates maxPods value to be used when starting up the kubelet."
echo "-h,--help print this help."
echo "--instance-type Specify the instance type to calculate max pods value."
echo "--instance-type-from-imds Use this flag if the instance type should be fetched from IMDS."
echo "--cni-version Specify the version of the CNI (example - 1.7.5)."
echo "--cni-custom-networking-enabled Use this flag to indicate if CNI custom networking mode has been enabled."
echo "--cni-prefix-delegation-enabled Use this flag to indicate if CNI prefix delegation has been enabled."
echo "--cni-max-eni specify how many ENIs should be used for prefix delegation. Defaults to using all ENIs per instance."
echo "--show-max-allowed Use this flag to show max number of Pods allowed to run in Worker Node. Otherwise the script will show the recommended value"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The print_help function provides a useful summary of the script's usage and options. However, it would be beneficial to include more detailed descriptions for each option, especially for those that might not be immediately clear to users (e.g., --cni-max-eni, --show-max-allowed).

Comment on lines 35 to 60
INSTANCE_TYPE=$2
shift
shift
;;
--instance-type-from-imds)
INSTANCE_TYPE_FROM_IMDS=true
shift
;;
--cni-version)
CNI_VERSION=$2
shift
shift
;;
--cni-custom-networking-enabled)
CNI_CUSTOM_NETWORKING_ENABLED=true
shift
;;
--cni-prefix-delegation-enabled)
CNI_PREFIX_DELEGATION_ENABLED=true
shift
;;
--cni-max-eni)
CNI_MAX_ENI=$2
shift
shift
;;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script assumes that if an option requires a value, that value will be provided in the next argument ($2). This could lead to unexpected behavior if a user forgets to provide a value. Consider adding checks to ensure that a value is provided for these options.

Comment on lines 94 to 95
echo "You must specify a CNI Version to use. Example - 1.7.5"
exit 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines exit the script if certain required values are not provided. However, they do not provide a non-zero exit code. In bash, a non-zero exit code typically indicates an error. Consider providing a non-zero exit code when exiting due to an error.

-  exit 1
+  exit 1

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Commits Files that changed from the base of the PR and between e7af10e and aea3721.
Files selected for processing (1)
  • infra/prod-new-us-east-1/rds-postgres/terragrunt.hcl (1 hunks)
Additional comments (Suppressed): 5
infra/prod-new-us-east-1/rds-postgres/terragrunt.hcl (5)
  • 1-18: The use of locals to define common variables is a good practice for code reusability and maintainability. However, ensure that the account.hcl file exists in one of the parent folders, as it's being referenced in line 4.

  • 20-26: Dependencies on "vpc" and "eks" are well defined. Ensure that the relative paths provided are correct and the required outputs are available in the respective modules.

  • 28-30: The source for the Terraform module is specified correctly. However, ensure that the version of the module (v6.1.1) is compatible with your Terraform version and meets your requirements.

  • 32-35: The inclusion of settings from the root terragrunt.hcl file is a good practice for maintaining consistency across modules. Ensure that the root terragrunt.hcl file exists in one of the parent folders.

  • 37-75: The input variables for the RDS module are well defined. However, ensure that the monitoring_role_arn in line 47 exists and has the necessary permissions for RDS monitoring. Also, verify that the snapshot_identifier in line 60 is correct and the snapshot exists if you intend to create the RDS instance from a snapshot.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Commits Files that changed from the base of the PR and between aea3721 and f17b272.
Files ignored due to filter (3)
  • infra/modules/before_addons/gateway_crds.yaml
  • infra/modules/before_addons/gateway_system.yaml
  • infra/modules/before_addons/letsencrypt_issuer.yaml
Files selected for processing (4)
  • infra/modules/before_addons/main.tf (1 hunks)
  • infra/modules/before_addons/versions.tf (1 hunks)
  • infra/prod-new-us-east-1/k8s/before_addons/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl (1 hunks)
Files skipped from review due to trivial changes (1)
  • infra/modules/before_addons/versions.tf
Additional comments (Suppressed): 12
infra/modules/before_addons/main.tf (1)
  • 1-23: The Terraform code looks good and follows best practices. It reads YAML files and applies them as Kubernetes manifests. However, ensure that the YAML files (letsencrypt_issuer.yaml, gateway_crds.yaml, gateway_system.yaml) exist at the specified path and are valid Kubernetes manifests. Also, verify that the order of resource creation doesn't cause any dependencies issues.
infra/prod-new-us-east-1/k8s/before_addons/terragrunt.hcl (1)
  • 1-30: The new hunk seems to be setting up the Terraform configuration for Kubernetes addons. It includes the parent configuration, sets the source to the before_addons module, and reads in several configuration files for Kubernetes providers and environment variables. It also sets up a dependency on the eks module. The mock outputs for the eks dependency might need to be updated to reflect actual values during the actual deployment.

Overall, the code seems to follow Terraform best practices and is well-structured. However, ensure that the paths provided in the read_terragrunt_config function calls are correct and the required .hcl files exist at those locations. Also, verify that the eks module is correctly set up and provides the outputs that this configuration expects.

infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl (10)
  • 1-22: The code is using Terragrunt's read_terragrunt_config function to read configuration from parent directories. This is a good practice for reusing configuration and keeping the code DRY. However, ensure that the paths provided are correct and the required .hcl files exist in the parent directories.

  • 26-33: The eks dependency is mocked with a cluster_id and cluster_oidc_issuer_url. Ensure that these mock values are replaced with actual values in the production environment.

  • 35-44: The vpc dependency is mocked with private_subnets_cidr_blocks. Similar to the eks dependency, ensure that these mock values are replaced with actual values in the production environment.

  • 46-49: The before_addons dependency is set to skip outputs. Ensure that this is the intended behavior and that no other modules depend on the outputs of before_addons.

  • 51-141: The inputs block is configuring various Kubernetes addons. Ensure that the versions specified for these addons are compatible with the Kubernetes version being used. Also, ensure that the extra_values provided for each addon are correct and meet the requirements of your Kubernetes cluster.

  • 65-73: The cluster-autoscaler addon is enabled with a scale-down-utilization-threshold of 0.7. This means that the autoscaler will start to scale down when the CPU utilization falls below 70%. Ensure that this threshold meets your requirements.

  • 75-78: The metrics-server addon is enabled with allowed_cidrs set to 10.10.0.0/16. Ensure that this CIDR block is correct and that it includes all the IP addresses that should be allowed to access the metrics server.

  • 80-88: The external-dns addon is enabled with domainFilters set to scdev-test.aws.iohkdev.io and play-test.marlowe.iohk.io. Ensure that these domains are correct and that the DNS records for these domains are properly configured.

  • 91-107: The traefik addon is enabled with an image tag of 3.0 and a service annotation for external-dns.alpha.kubernetes.io/hostname. Ensure that the image tag is correct and that the hostname annotation is properly configured.

  • 122-137: The cert-manager addon is enabled with a chart version of v1.9.1 and acme_dns01_enabled and acme_http01_enabled set to false. Ensure that the chart version is correct and that the ACME DNS-01 and HTTP-01 challenges are not required for your use case.

@tshaynik tshaynik force-pushed the infra/eks-cluster-migration branch from a7c7c78 to 31c61f3 Compare October 10, 2023 13:58
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Commits Files that changed from the base of the PR and between f17b272 and a7c7c78.
Files ignored due to filter (1)
  • infra/modules/before_addons/letsencrypt_issuer.yaml
Files selected for processing (2)
  • infra/modules/kubevela/definitions/http-route.cue (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl (1 hunks)
Files skipped from review due to trivial changes (1)
  • infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl
Additional comments (Suppressed): 1
infra/modules/kubevela/definitions/http-route.cue (1)
  • 14-20: The namespace for the gateway has been changed from "vela-system" to "traefik". Ensure that the "traefik" namespace exists and is the correct namespace for the gateway. Also, verify that this change does not affect other parts of the system that may be using the old namespace.
-				namespace: "vela-system"
+				namespace: "traefik"

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 8

Commits Files that changed from the base of the PR and between f7ee4bb and d4c2670.
Files ignored due to filter (6)
  • flake.lock
  • infra/modules/before_addons/gateway_crds.yaml
  • infra/modules/before_addons/gateway_system.yaml
  • infra/modules/before_addons/letsencrypt_issuer.yaml
  • infra/prod-new-us-east-1/secrets.yaml
  • infra/prod-us-east-1/k8s/rds-postgres/key-pair/bastion-key-enc.yaml
Files selected for processing (24)
  • .gitignore (1 hunks)
  • flake.nix (1 hunks)
  • infra/kubeconfig-dapps-prod-us-east-1 (3 hunks)
  • infra/modules/before_addons/main.tf (1 hunks)
  • infra/modules/before_addons/versions.tf (1 hunks)
  • infra/modules/kubevela/addons.tf (1 hunks)
  • infra/modules/kubevela/definitions.tf (5 hunks)
  • infra/modules/kubevela/definitions/http-route.cue (1 hunks)
  • infra/prod-new-us-east-1/encryption-config/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/env.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/before_addons/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks/max-pods-calculator.sh (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/grafana-agent/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/kubevela/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/rds-postgres/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/vpc/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/bastion/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/key-pair/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/rds/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/security-group/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/vpc/terragrunt.hcl (2 hunks)
  • infra/provider-configs/k8s.hcl (1 hunks)
Files skipped from review due to trivial changes (10)
  • .gitignore
  • flake.nix
  • infra/kubeconfig-dapps-prod-us-east-1
  • infra/modules/before_addons/versions.tf
  • infra/modules/kubevela/addons.tf
  • infra/prod-new-us-east-1/env.hcl
  • infra/prod-us-east-1/k8s/rds-postgres/bastion/terragrunt.hcl
  • infra/prod-us-east-1/k8s/rds-postgres/key-pair/terragrunt.hcl
  • infra/prod-us-east-1/k8s/rds-postgres/security-group/terragrunt.hcl
  • infra/provider-configs/k8s.hcl
Additional comments (Suppressed): 52
infra/modules/before_addons/main.tf (1)
  • 1-23: The Terraform code looks good and follows best practices. It uses the kubectl_manifest resource and kubectl_file_documents data source to manage Kubernetes resources. However, ensure that the YAML files referenced (letsencrypt_issuer.yaml, gateway_crds.yaml, gateway_system.yaml) are correctly formatted and contain the expected Kubernetes resources. Also, verify that these resources should indeed be created before the EKS addons, as the filename suggests.
infra/prod-new-us-east-1/encryption-config/terragrunt.hcl (4)
  • 1-16: The use of locals to define and reuse common variables is a good practice. It enhances readability and maintainability of the code.

  • 18-20: The source for the Terraform module is a specific version (v1.2.0) from a GitHub repository. Ensure that this version is compatible with your current Terraform version and other modules.

  • 22-24: The include block is used to inherit settings from the root terragrunt.hcl file. Make sure that the root file contains all the necessary configurations and that there are no conflicts with the current file.

  • 26-31: The inputs block is used to pass variables to the Terraform module. Ensure that these variables are correctly used in the module. The policy_flavor is set to "eks_root_volume_encryption", verify that this policy is suitable for your use case.

infra/prod-new-us-east-1/k8s/before_addons/terragrunt.hcl (3)
  • 1-3: Ensure that the parent folders contain the necessary configuration files for the include directive to work correctly. This is important for the correct functioning of the Terragrunt configuration.

  • 9-21: The use of read_terragrunt_config and find_in_parent_folders functions to load environment-level variables and account variables is a good practice. It helps in maintaining modularity and reusability of the code. However, ensure that the referenced files (env.hcl and account.hcl) exist in the parent folders and contain the required variables.

  • 23-30: The dependency block is used to express the dependencies between Terraform modules. The mock_outputs are used to provide default values during the plan phase of Terraform execution. Ensure that the cluster_name and cluster_oidc_issuer_url are replaced with actual values before applying the configuration.

infra/prod-new-us-east-1/rds-postgres/terragrunt.hcl (4)
  • 1-18: The local variables are well defined and the naming conventions are consistent. The use of read_terragrunt_config and find_in_parent_folders functions to load environment-level variables is a good practice. The extraction of common variables for reuse improves code maintainability.

  • 20-26: Dependencies on the VPC and EKS modules are clearly defined. This ensures that the RDS instance is created after the VPC and EKS cluster.

  • 28-35: The source for the Terraform module is clearly defined and the version is pinned, which is a good practice for reproducibility. The inclusion of settings from the root terragrunt.hcl file ensures that any global configurations are applied here as well.

  • 37-75: The input variables for the RDS instance are well defined. The use of dependency to reference outputs from other modules is a good practice. However, ensure that the snapshot identifier in line 60 is correct and up-to-date. Also, verify that the monitoring role ARN in line 47 has the necessary permissions.

infra/prod-new-us-east-1/k8s/kubevela/terragrunt.hcl (6)
  • 1-3: The include block is used to include the root terragrunt.hcl configuration. This is a common practice in Terragrunt to avoid duplicating Terragrunt configuration in each module.

  • 5-13: The locals block is used to define local variables. Here, it reads the configuration from the parent directory and assigns it to local variables. This is a good practice as it promotes code reusability and maintainability.

  • 15-16: The generate block is used to generate provider blocks for Kubernetes, Helm, and Kubectl. This is a good practice as it promotes code reusability and maintainability.

  • 18-20: The terraform block is used to specify the source of the Terraform module. Ensure that the path to the module is correct.

  • 22-29: The dependency block is used to specify the dependency on the eks module. The mock_outputs block is used to provide mock outputs for testing. Ensure that the mock outputs match the actual outputs of the eks module.

  • 31-36: The inputs block is used to specify the inputs to the Terraform module. Ensure that the inputs match the expected inputs of the module.

infra/prod-new-us-east-1/k8s/eks/terragrunt.hcl (4)
  • 1-46: The local variables are well defined and the use of read_terragrunt_config and find_in_parent_folders functions ensures that the configuration is modular and reusable. The use of string interpolation to construct the name and kubeconfigPath is also a good practice. However, ensure that the users variable in line 14 is an array of user names as expected by the list_users and map_users constructs in lines 19-27.

  • 42-43: The bootstrap_cmd is running a shell script to calculate the maximum number of pods allowed on a Kubernetes worker node. Ensure that the script max-pods-calculator.sh exists in the get_terragrunt_dir() directory and that it has the correct permissions to be executed. Also, verify that the AWS_REGION and AWS_PROFILE environment variables are correctly set before running the script.

  • 50-55: The source attribute in the terraform block is set to a specific version of the terraform-aws-eks module. This is a good practice as it ensures that the infrastructure is created using a known and tested version of the module. The after_hook block is used to update the kubeconfig file after the EKS cluster is created, which is a necessary step to be able to interact with the cluster using kubectl.

  • 77-288: The inputs block is well defined and covers a wide range of configurations for the EKS cluster, including the cluster version, endpoint access, encryption config, addons, VPC and subnet IDs, security group rules, managed node groups, and AWS auth configmap. The use of dependency to reference outputs from other Terraform modules is a good practice. However, ensure that the referenced modules are correctly configured and that they output the expected values.

infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl (11)
  • 1-22: The code is using Terragrunt's read_terragrunt_config function to read configuration from other files. Ensure that these files exist and are in the correct location. Also, verify that the variables being read are correctly defined in those files.

  • 26-33: The mock_outputs for the eks dependency are hardcoded. If these values are meant to be placeholders, ensure they are replaced with actual values during runtime or testing.

  • 35-44: The mock_outputs for the vpc dependency are hardcoded. If these values are meant to be placeholders, ensure they are replaced with actual values during runtime or testing.

  • 46-49: The before_addons dependency does not have any mock_outputs defined. If this is intentional, ignore this comment. Otherwise, consider adding mock_outputs for testing purposes.

  • 51-141: The inputs block contains configurations for various Kubernetes addons. Ensure that these configurations are correct and meet your requirements. Also, verify that the versions of the addons specified are compatible with your Kubernetes version.

  • 65-73: The cluster-autoscaler is configured with a scale-down-utilization-threshold of 0.7. This means that the autoscaler will start to scale down when the CPU utilization falls below 70%. Ensure that this value is appropriate for your use case.

  • 75-78: The metrics-server is configured to allow connections from the CIDR block 10.10.0.0/16. Ensure that this CIDR block is correct and that it includes all the nodes that need to connect to the metrics server.

  • 80-88: The external-dns addon is configured with specific domain filters. Ensure that these domains are correct and that they are owned by you.

  • 91-107: The traefik addon is configured with specific service annotations and a redirect rule. Ensure that these configurations are correct and meet your requirements.

  • 114-120: The aws-ebs-csi-driver addon is configured to use encryption and KMS. Ensure that the necessary KMS keys are available and that the IAM roles used by the EKS cluster have the necessary permissions to use KMS.

  • 122-137: The cert-manager addon is configured with specific feature gates and issuer settings. Ensure that these configurations are correct and meet your requirements.

infra/prod-new-us-east-1/vpc/terragrunt.hcl (5)
  • 1-10: The use of locals to define and reuse common variables is a good practice. It enhances readability and maintainability of the code.

  • 14-16: The source of the Terraform module is specified correctly with a version tag. This ensures that the same version of the module is used across all environments, providing consistency.

  • 40-48: The subnet tags are correctly set for Kubernetes. This allows Kubernetes to discover and use these subnets.

  • 50-53: The default network ACL, route table, and security group are not being managed by this configuration. Ensure that these resources are managed elsewhere to maintain security and routing rules.

  • 55-59: The use of tags is a good practice for resource management and cost tracking.

infra/prod-new-us-east-1/k8s/grafana-agent/terragrunt.hcl (6)
  • 1-3: The include block is used to include the configuration from the parent directory. This is a common practice in Terragrunt to avoid duplicating common configuration.

  • 5-15: The locals block is used to define local variables. The read_terragrunt_config function is used to read the configuration from the specified HCL files. The sops_decrypt_file function is used to decrypt the secrets file. Ensure that the secrets file is properly encrypted and that the decryption key is securely managed.

  • 17-18: The generate block is used to generate the provider blocks for Kubernetes and Helm. This is a good practice as it avoids duplicating the provider configuration.

  • 20-22: The source attribute in the terraform block specifies the path to the Terraform module for the Grafana agent. Ensure that the module exists at the specified path.

  • 24-31: The dependency block is used to express the dependency on the EKS module. The mock_outputs attribute is used to provide mock outputs for the dependency during testing. Ensure that the actual outputs of the EKS module match the mock outputs.

  • 33-40: The inputs block is used to provide inputs to the Terraform module. The dependency.eks.outputs.cluster_name expression is used to get the output of the EKS module. Ensure that the EKS module provides this output. The grafana-password input is set to the decrypted Grafana API key. Ensure that the API key is valid.

infra/modules/kubevela/definitions.tf (5)
  • 1-12: No changes detected in the new hunk compared to the old hunk. The traitdefinition_https_route resource definition remains the same.

  • 27-38: No changes detected in the new hunk compared to the old hunk. The traitdefinition_http_route resource definition remains the same.

  • 54-65: No changes detected in the new hunk compared to the old hunk. The traitdefinition_resource resource definition remains the same.

  • 81-94: No changes detected in the new hunk compared to the old hunk. The workflowtdefinition_build_nix_image resource definition remains the same.

  • 103-110: No changes detected in the new hunk compared to the old hunk. The componentdefinition_helmrelease resource definition remains the same.

infra/prod-us-east-1/vpc/terragrunt.hcl (2)
  • 14-16: The source of the Terraform module has been updated to a newer version (v5.1.1). Ensure that this version is compatible with the rest of your infrastructure and that it doesn't introduce breaking changes.

  • 50-53: New settings have been introduced to manage the default network ACL, route table, and security group. By setting these to false, you're opting out of Terraform managing these resources. Ensure that these resources are managed elsewhere or that they don't need to be managed.

infra/modules/kubevela/definitions/http-route.cue (1)
  • 14-20: The namespace for the gateway has been changed from "vela-system" to "traefik". Ensure that the "traefik" namespace exists and is the correct one for the gateway. Also, verify that this change does not affect other parts of the system that may be using the old namespace.
-				namespace: "vela-system"
+				namespace: "traefik"

Comment on lines +13 to +14
vpc_subnet_group_name = "default-vpc-0a68a5d196ce5e1d1"
vpc_security_group_ids = ["sg-0c2ff87114e5f6ec1"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The vpc_subnet_group_name and vpc_security_group_ids are hardcoded. This could lead to issues if the VPC or security group configuration changes. Consider parameterizing these values or retrieving them dynamically from your infrastructure.

Comment on lines +29 to +31
terraform {
source = "github.com/terraform-aws-modules/terraform-aws-rds//.?ref=v6.1.1"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Terraform module source is hardcoded to a specific version (v6.1.1). While this ensures consistency, it may prevent you from benefiting from updates, bug fixes, or security patches in newer versions. Consider a strategy for updating this reference periodically.

@tshaynik tshaynik force-pushed the infra/eks-cluster-migration branch from d4c2670 to 84ada86 Compare October 16, 2023 17:08
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 15

Commits Files that changed from the base of the PR and between 1a43596 and 84ada86.
Files ignored due to filter (6)
  • flake.lock
  • infra/modules/before_addons/gateway_crds.yaml
  • infra/modules/before_addons/gateway_system.yaml
  • infra/modules/before_addons/letsencrypt_issuer.yaml
  • infra/prod-new-us-east-1/secrets.yaml
  • infra/prod-us-east-1/k8s/rds-postgres/key-pair/bastion-key-enc.yaml
Files selected for processing (24)
  • .gitignore (1 hunks)
  • flake.nix (1 hunks)
  • infra/kubeconfig-dapps-prod-us-east-1 (3 hunks)
  • infra/modules/before_addons/main.tf (1 hunks)
  • infra/modules/before_addons/versions.tf (1 hunks)
  • infra/modules/kubevela/addons.tf (1 hunks)
  • infra/modules/kubevela/definitions.tf (5 hunks)
  • infra/modules/kubevela/definitions/http-route.cue (1 hunks)
  • infra/prod-new-us-east-1/encryption-config/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/env.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/before_addons/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks/max-pods-calculator.sh (1 hunks)
  • infra/prod-new-us-east-1/k8s/eks/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/grafana-agent/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/k8s/kubevela/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/rds-postgres/terragrunt.hcl (1 hunks)
  • infra/prod-new-us-east-1/vpc/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/bastion/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/key-pair/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/rds/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/k8s/rds-postgres/security-group/terragrunt.hcl (1 hunks)
  • infra/prod-us-east-1/vpc/terragrunt.hcl (2 hunks)
  • infra/provider-configs/k8s.hcl (1 hunks)
Files skipped from review due to trivial changes (6)
  • .gitignore
  • flake.nix
  • infra/modules/before_addons/versions.tf
  • infra/modules/kubevela/addons.tf
  • infra/prod-new-us-east-1/k8s/eks-addons/terragrunt.hcl
  • infra/provider-configs/k8s.hcl
Additional comments (Suppressed): 68
infra/prod-new-us-east-1/vpc/terragrunt.hcl (5)
  • 1-10: The use of locals to define and reuse common variables is a good practice. It enhances readability and maintainability of the code. However, ensure that the files env.hcl and account.hcl exist in the parent folders and contain the required variables.

  • 14-16: The source for the Terraform module is a specific version (v5.1.1) from the GitHub repository. This is a good practice as it ensures that the code is not broken by future updates to the module. However, verify that this version of the module supports all the features you are using in your configuration.

  • 40-48: The subnet tags are correctly configured for integration with Kubernetes. However, ensure that the Kubernetes cluster is configured to recognize these tags.

  • 50-53: The default network ACL, route table, and security group are not being managed by this configuration. Ensure that these resources are being managed appropriately elsewhere.

  • 55-59: The use of tags is a good practice for managing and identifying resources. However, ensure that these tags are used consistently across all resources.

infra/prod-new-us-east-1/k8s/grafana-agent/terragrunt.hcl (4)
  • 1-15: The use of read_terragrunt_config and find_in_parent_folders functions to fetch configuration from parent directories is a good practice for modularity and reusability. However, ensure that the paths provided are correct and the required files exist in the parent directories. Also, verify that the secrets.yaml file is properly encrypted and the decryption key is securely managed.

  • 17-18: The merge function is used to combine the generate blocks from k8s.hcl and helm.hcl. Ensure that there are no conflicting configurations between these two files that could cause issues when merged.

  • 24-31: The dependency block is used to fetch outputs from the eks module. The mock_outputs are used for testing and should be replaced with actual outputs in the production environment. Ensure that the cluster_name and cluster_oidc_issuer_url are correctly set in the eks module.

  • 33-40: The inputs block is used to pass variables to the Terraform module. Ensure that the cluster-name and k8s-cluster-name are correctly set from the eks module outputs. Also, verify that the grafana-username and grafana-password are correct and securely managed.

infra/prod-new-us-east-1/k8s/eks/max-pods-calculator.sh (1)
  • 1-161: The script is well-structured and follows good practices for bash scripting. However, there are a few areas that could be improved for better maintainability and error handling.
infra/prod-new-us-east-1/k8s/kubevela/terragrunt.hcl (5)
  • 1-13: The code reads configuration from parent directories and assigns them to local variables. This is a common pattern in Terragrunt to promote code reuse and maintainability. However, ensure that the paths to the configuration files are correct and that the files exist.

  • 16-16: The generate block merges the provider configurations from k8s, helm, and kubectl. This is a good practice as it allows for a single source of truth for provider configurations.

  • 18-20: The source attribute in the terraform block points to the kubevela module. Ensure that the relative path is correct and the module exists at the specified location.

  • 22-29: The dependency block declares a dependency on the eks module. The mock_outputs are used for testing and should match the actual outputs of the eks module. Ensure that the mock_outputs are updated if the eks module's outputs change.

  • 31-36: The inputs block sets the input variables for the kubevela module. The cluster-name and k8s-cluster-name are set to the cluster_name output of the eks module. The namespace is hardcoded to "vela-system". If the namespace is expected to change across environments, consider making it a variable.

infra/modules/kubevela/definitions/http-route.cue (1)
  • 14-20: The namespace for the gateway has been changed from "vela-system" to "traefik". Ensure that the "traefik" namespace exists and is the correct one for the gateway. Also, verify that all references to the old namespace "vela-system" have been updated accordingly.
-				namespace: "vela-system"
+				namespace: "traefik"
infra/prod-new-us-east-1/env.hcl (1)
  • 1-10: The new hunk introduces environment-specific variables for the new Kubernetes cluster. The variables include AWS region, environment name, project name, namespaces, cluster name, and domain. The cluster name is constructed using the project, environment, and AWS region variables. Ensure that the domain name is correct and that the namespaces are appropriate for the new environment. Also, verify that the AWS region is the intended one for the deployment of the new cluster.
infra/prod-new-us-east-1/k8s/before_addons/terragrunt.hcl (1)
  • 1-30: The new hunk seems to be well-structured and follows the Terraform best practices. It includes the parent configuration, sets the Terraform source, defines local variables, and sets up a dependency on the eks module. However, the mock_outputs in the eks dependency are hardcoded. If these values are meant to be placeholders, ensure they are replaced with actual values during runtime or consider using Terraform variables instead of hardcoded values.
infra/prod-new-us-east-1/rds-postgres/terragrunt.hcl (14)
  • 1-18: The account_vars are loaded from the parent folder's account.hcl file. The project and app variables are defined for reuse. The database_name is constructed using the project and app variables. Tags are defined for the resources.

  • 20-26: Dependencies on the vpc and eks modules are declared. Ensure that these modules are correctly configured and their outputs are as expected.

  • 28-30: The source for the Terraform module is specified as a GitHub repository. Ensure that the specified version (v6.1.1) of the module is compatible with your Terraform version and meets your requirements.

  • 32-35: The settings from the root terragrunt.hcl file are included. Ensure that the root file contains the necessary and correct configurations.

  • 37-75: The inputs for the Terraform module are defined. These include the RDS instance configuration, monitoring settings, database settings, and tags. Ensure that these settings meet your requirements and are correctly configured.

  • 39-40: The instance_class is set to db.m6g.2xlarge and multi_az is set to true. This means that the database will be a large instance and will be deployed across multiple availability zones for high availability.

  • 42-43: The allocated_storage is set to 1500GB and the max_allocated_storage is set to 3000GB. Ensure that these storage settings meet your requirements and are within your budget.

  • 45-47: The publicly_accessible is set to false, which means the database will not be accessible from the internet. The monitoring_interval is set to 60 seconds and a specific IAM role is used for monitoring. Ensure that the IAM role has the necessary permissions for RDS monitoring.

  • 49-54: The username is set to postgres. A new DB subnet group is created and the subnets from the vpc module are used. The security group from the eks module is used. Ensure that these settings are correctly configured and meet your requirements.

  • 56-59: The engine is set to postgres and the major_engine_version is set to 15. The family is set to postgres15. Ensure that this PostgreSQL version meets your requirements and is compatible with your application.

  • 60-64: A specific snapshot is used to create the database. The skip_final_snapshot is set to false, which means a final snapshot will be created when the database is deleted. The backup_retention_period is set to 21 days. Ensure that these settings meet your requirements.

  • 65-67: Performance insights are enabled and the retention period is set to 31 days. Ensure that these settings meet your requirements and are within your budget.

  • 68-73: The ca_cert_identifier is set to rds-ca-rsa2048-g1. The auto_minor_version_upgrade is set to false, which means that minor version upgrades will not be applied automatically. The apply_immediately is set to true, which means that changes will be applied immediately instead of during the maintenance window. The deletion_protection is set to true, which means that the database cannot be deleted without modifying this option. Ensure that these settings meet your requirements.

  • 74-75: The tags defined earlier are applied to the resources. Ensure that these tags meet your requirements.

infra/prod-new-us-east-1/encryption-config/terragrunt.hcl (3)
  • 1-16: The use of locals to define and manage variables is a good practice in Terraform. It enhances readability and maintainability of the code. However, ensure that the files env.hcl and account.hcl exist in the parent folders as expected, and they contain the required variables.

  • 18-20: The source for the Terraform module is a GitHub repository. Ensure that the version v1.2.0 of the module terraform-aws-kms is compatible with your Terraform version and meets your requirements.

  • 26-31: The inputs block is used to pass arguments into the Terraform module. Ensure that the policy_flavor value "eks_root_volume_encryption" is a valid option for the terraform-aws-kms module and meets your encryption requirements.

infra/modules/before_addons/main.tf (1)
  • 1-23: The code looks good overall. However, ensure that the files letsencrypt_issuer.yaml, gateway_crds.yaml, and gateway_system.yaml exist at the specified paths and that they contain valid Kubernetes manifests. Also, verify that the order of resource creation is correct and that dependencies between resources are properly managed.
infra/prod-us-east-1/vpc/terragrunt.hcl (3)
  • 15-16: The Terraform source version has been updated from v3.14.4 to v5.1.1. Ensure that this version is compatible with the rest of your infrastructure and that it supports all the features you're using. Also, verify that all necessary changes have been made to accommodate any breaking changes introduced between these versions.

  • 50-53: New configurations have been added to manage the default network ACL, route table, and security group. By setting these to false, you're opting out of Terraform managing these resources. Ensure that these resources are being managed appropriately elsewhere in your infrastructure.

  • 55-57: The tags block has been moved. This change doesn't affect the functionality, but ensure that it aligns with your team's coding style and conventions.

infra/prod-us-east-1/k8s/rds-postgres/key-pair/terragrunt.hcl (2)
  • 27-29: The source for the Terraform module is a GitHub repository with a specific version reference. This is a good practice as it ensures the code is using a stable version of the module. However, verify that the version v2.0.2 of the terraform-aws-key-pair module is compatible with the current Terraform and AWS provider versions being used.

  • 36-41: The inputs for the Terraform module are well defined. The create_private_key is set to true which means a new private key will be created if it does not exist. This could potentially overwrite an existing key. Ensure that this is the intended behavior and that it won't cause any issues with existing infrastructure.

infra/prod-us-east-1/k8s/rds-postgres/bastion/terragrunt.hcl (2)
  • 27-29: The Terraform module source is pinned to a specific version (v5.3.1). This is good for ensuring consistent behavior, but make sure to regularly update this to benefit from bug fixes and new features in the module.

  • 48-57: The inputs block is well-structured and makes good use of dependencies to fetch values dynamically. This enhances maintainability and reduces the risk of configuration errors.

infra/prod-new-us-east-1/k8s/eks/terragrunt.hcl (5)
  • 1-46: The local variables are well defined and the use of read_terragrunt_config and find_in_parent_folders functions ensures that the configuration is modular and reusable. The use of list and map comprehensions for list_users and map_users is efficient and readable. However, ensure that the users variable in account.hcl is correctly defined and contains the expected data.

  • 41-43: The bootstrap_cmd is running a shell script max-pods-calculator.sh with some parameters. Ensure that this script exists at the expected location and that it's executable. Also, verify that the script correctly handles the parameters and that the output is as expected for the max-pods value.

  • 50-56: The source for the Terraform module is a GitHub repository with a specific version. Ensure that this version is compatible with your Terraform version and that it contains all the necessary resources for your configuration. The after_hook is used to update the kubeconfig after applying the changes, which is a good practice to ensure that the local kubeconfig is always up-to-date.

  • 64-71: Dependencies on vpc and encryption_config are correctly defined. Ensure that these configurations exist and that they output the expected values.

  • 77-288: The inputs for the EKS module are well defined and cover a wide range of configurations, including cluster version, endpoint access, encryption config, addons, VPC and subnet IDs, security group rules, and managed node groups. Ensure that all these configurations meet your requirements and that the values are correctly defined in the local variables or dependencies. The commented out cardano-128 managed node group could be removed if it's not needed.

infra/prod-us-east-1/k8s/rds-postgres/rds/terragrunt.hcl (6)
  • 30-31: The Terraform AWS RDS module is being sourced from a specific version (v6.1.1). Ensure that this version is compatible with the rest of your infrastructure and that it supports all the features you need.

  • 39-64: The RDS instance configuration seems to be well defined with appropriate settings for a production environment. However, ensure that the instance_class and max_allocated_storage are suitable for your workload. Also, verify that the major_engine_version and family are compatible with your application.

  • 51-54: The snapshot settings seem to be well defined. However, ensure that the snapshot_identifier exists and that the backup_retention_period is in line with your data retention policy.

  • 56-57: Performance insights are enabled with a retention period of 31 days. Ensure that this is in line with your performance monitoring needs and that the associated costs are acceptable.

  • 59-60: The ca_cert_identifier is hardcoded. Ensure that this certificate exists and is valid. Consider fetching this value dynamically from the respective AWS resource.

  • 61-61: apply_immediately is set to true. This means that any changes to the RDS instance will be applied immediately instead of during the maintenance window. Ensure that this is acceptable for your production environment.

infra/modules/kubevela/definitions.tf (5)
  • 1-12: No significant changes detected in this hunk. The code remains the same as before.

  • 27-38: No significant changes detected in this hunk. The code remains the same as before.

  • 54-65: No significant changes detected in this hunk. The code remains the same as before.

  • 81-94: No significant changes detected in this hunk. The code remains the same as before.

  • 103-110: No significant changes detected in this hunk. The code remains the same as before.

infra/kubeconfig-dapps-prod-us-east-1 (5)
  • 1-9: The new hunk introduces a new Kubernetes cluster configuration. The certificate-authority-data and server fields have been updated with new values. Ensure that the new certificate-authority-data is valid and the server URL is correct and accessible. Also, verify that the new cluster name is unique and does not conflict with existing cluster names.

  • 12-19: The context section has been updated to include a new context for the new Kubernetes cluster. Ensure that the cluster and user fields match the name field in the clusters section. Also, verify that the new context name is unique and does not conflict with existing context names.

  • 20-20: The current-context has been updated to point to the new Kubernetes cluster. Ensure that this change is intentional and that the new context is ready to be used as the current context.

  • 24-43: A new user section has been added for the new Kubernetes cluster. The exec command has been configured to use the AWS CLI to get a token for the new cluster. Ensure that the AWS_PROFILE environment variable is set to the correct AWS profile. Also, verify that the cluster-name argument matches the name field in the clusters section.

  • 51-61: The exec command for the existing Kubernetes cluster has been updated. The --output and json arguments have been added to the command. Ensure that this change is intentional and that the AWS CLI supports these arguments.

infra/prod-us-east-1/k8s/rds-postgres/security-group/terragrunt.hcl (4)
  • 1-25: The locals block is used to define local values that can be used throughout the module. The values are well-structured and the use of read_terragrunt_config and find_in_parent_folders functions to load environment-level variables is a good practice. The naming convention for bastion_name and bastion_key_pair_name is consistent and clear. The tags are well-defined and follow a consistent naming convention.

  • 27-29: The Terraform source is set to a specific version of the terraform-aws-security-group module. This is a good practice as it ensures that the infrastructure is built using a known, stable version of the module.

  • 31-34: The include block is used to include all settings from the root terragrunt.hcl file. This is a good practice as it allows for code reuse and ensures that all modules follow the same base configuration.

  • 36-38: The dependency block is used to define a dependency on the vpc module. This is a good practice as it ensures that the vpc module is applied before this module.

Comment on lines 1 to 161
#!/usr/bin/env bash

set -o pipefail
set -o nounset
set -o errexit

err_report() {
echo "Exited with error on line $1"
}
trap 'err_report $LINENO' ERR

function print_help {
echo "usage: $0 <instance(s)> [options]"
echo "Calculates maxPods value to be used when starting up the kubelet."
echo "-h,--help print this help."
echo "--instance-type Specify the instance type to calculate max pods value."
echo "--instance-type-from-imds Use this flag if the instance type should be fetched from IMDS."
echo "--cni-version Specify the version of the CNI (example - 1.7.5)."
echo "--cni-custom-networking-enabled Use this flag to indicate if CNI custom networking mode has been enabled."
echo "--cni-prefix-delegation-enabled Use this flag to indicate if CNI prefix delegation has been enabled."
echo "--cni-max-eni specify how many ENIs should be used for prefix delegation. Defaults to using all ENIs per instance."
echo "--show-max-allowed Use this flag to show max number of Pods allowed to run in Worker Node. Otherwise the script will show the recommended value"
}

POSITIONAL=()

while [[ $# -gt 0 ]]; do
key="$1"
case $key in
-h | --help)
print_help
exit 1
;;
--instance-type)
INSTANCE_TYPE=$2
shift
shift
;;
--instance-type-from-imds)
INSTANCE_TYPE_FROM_IMDS=true
shift
;;
--cni-version)
CNI_VERSION=$2
shift
shift
;;
--cni-custom-networking-enabled)
CNI_CUSTOM_NETWORKING_ENABLED=true
shift
;;
--cni-prefix-delegation-enabled)
CNI_PREFIX_DELEGATION_ENABLED=true
shift
;;
--cni-max-eni)
CNI_MAX_ENI=$2
shift
shift
;;
--show-max-allowed)
SHOW_MAX_ALLOWED=true
shift
;;
*) # unknown option
POSITIONAL+=("$1") # save it in an array for later
shift # past argument
;;
esac
done

CNI_VERSION="${CNI_VERSION:-}"
CNI_CUSTOM_NETWORKING_ENABLED="${CNI_CUSTOM_NETWORKING_ENABLED:-false}"
CNI_PREFIX_DELEGATION_ENABLED="${CNI_PREFIX_DELEGATION_ENABLED:-false}"
CNI_MAX_ENI="${CNI_MAX_ENI:-}"
INSTANCE_TYPE="${INSTANCE_TYPE:-}"
INSTANCE_TYPE_FROM_IMDS="${INSTANCE_TYPE_FROM_IMDS:-false}"
SHOW_MAX_ALLOWED="${SHOW_MAX_ALLOWED:-false}"

PREFIX_DELEGATION_SUPPORTED=false
IPS_PER_PREFIX=16

if [ "$INSTANCE_TYPE_FROM_IMDS" = true ]; then
TOKEN=$(curl -m 10 -X PUT -H "X-aws-ec2-metadata-token-ttl-seconds: 600" -s "http://169.254.169.254/latest/api/token")
export AWS_DEFAULT_REGION=$(curl -s --retry 5 -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/dynamic/instance-identity/document | jq .region -r)
INSTANCE_TYPE=$(curl -m 10 -H "X-aws-ec2-metadata-token: $TOKEN" -s http://169.254.169.254/latest/meta-data/instance-type)
elif [ -z "$INSTANCE_TYPE" ]; then
# There's no reasonable default for an instanceType so force one to be provided to the script.
echo "You must specify an instance type to calculate max pods value."
exit 1
fi

if [ -z "$CNI_VERSION" ]; then
echo "You must specify a CNI Version to use. Example - 1.7.5"
exit 1
fi

calculate_max_ip_addresses_prefix_delegation() {
enis=$1
instance_max_eni_ips=$2
echo $((enis * ((instance_max_eni_ips - 1) * IPS_PER_PREFIX) + 2))
}

calculate_max_ip_addresses_secondary_ips() {
enis=$1
instance_max_eni_ips=$2
echo $((enis * (instance_max_eni_ips - 1) + 2))
}

min_number() {
printf "%s\n" "$@" | sort -g | head -n1
}

VERSION_SPLIT=(${CNI_VERSION//./ })
CNI_MAJOR_VERSION="${VERSION_SPLIT[0]}"
CNI_MINOR_VERSION="${VERSION_SPLIT[1]}"
if [[ $CNI_MAJOR_VERSION -gt 1 ]] || ([[ $CNI_MAJOR_VERSION == 1 ]] && [[ $CNI_MINOR_VERSION -gt 8 ]]); then
PREFIX_DELEGATION_SUPPORTED=true
fi

DESCRIBE_INSTANCES_RESULT=$(aws ec2 describe-instance-types --instance-type $INSTANCE_TYPE --query 'InstanceTypes[0].{Hypervisor: Hypervisor, EniCount: NetworkInfo.MaximumNetworkInterfaces, PodsPerEniCount: NetworkInfo.Ipv4AddressesPerInterface, CpuCount: VCpuInfo.DefaultVCpus'} --output json)

HYPERVISOR_TYPE=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.Hypervisor')
IS_NITRO=false
if [[ $HYPERVISOR_TYPE == "nitro" ]]; then
IS_NITRO=true
fi
INSTANCE_MAX_ENIS=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.EniCount')
INSTANCE_MAX_ENIS_IPS=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.PodsPerEniCount')

if [ -z "$CNI_MAX_ENI" ]; then
enis_for_pods=$INSTANCE_MAX_ENIS
else
enis_for_pods="$(min_number $CNI_MAX_ENI $INSTANCE_MAX_ENIS)"
fi

if [ "$CNI_CUSTOM_NETWORKING_ENABLED" = true ]; then
enis_for_pods=$((enis_for_pods - 1))
fi

if [ "$IS_NITRO" = true ] && [ "$CNI_PREFIX_DELEGATION_ENABLED" = true ] && [ "$PREFIX_DELEGATION_SUPPORTED" = true ]; then
max_pods=$(calculate_max_ip_addresses_prefix_delegation $enis_for_pods $INSTANCE_MAX_ENIS_IPS)
else
max_pods=$(calculate_max_ip_addresses_secondary_ips $enis_for_pods $INSTANCE_MAX_ENIS_IPS)
fi

# Limit the total number of pods that can be launched on any instance type based on the vCPUs on that instance type.
MAX_POD_CEILING_FOR_LOW_CPU=110
MAX_POD_CEILING_FOR_HIGH_CPU=250
CPU_COUNT=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.CpuCount')

if [ "$SHOW_MAX_ALLOWED" = true ]; then
echo $max_pods
exit 0
fi

if [ "$CPU_COUNT" -gt 30 ]; then
echo $(min_number $MAX_POD_CEILING_FOR_HIGH_CPU $max_pods)
else
echo $(min_number $MAX_POD_CEILING_FOR_LOW_CPU $max_pods)
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script uses a lot of global variables which can lead to unexpected behavior and make the script harder to maintain. Consider encapsulating the logic in functions to limit the scope of variables.

Comment on lines 1 to 161
#!/usr/bin/env bash

set -o pipefail
set -o nounset
set -o errexit

err_report() {
echo "Exited with error on line $1"
}
trap 'err_report $LINENO' ERR

function print_help {
echo "usage: $0 <instance(s)> [options]"
echo "Calculates maxPods value to be used when starting up the kubelet."
echo "-h,--help print this help."
echo "--instance-type Specify the instance type to calculate max pods value."
echo "--instance-type-from-imds Use this flag if the instance type should be fetched from IMDS."
echo "--cni-version Specify the version of the CNI (example - 1.7.5)."
echo "--cni-custom-networking-enabled Use this flag to indicate if CNI custom networking mode has been enabled."
echo "--cni-prefix-delegation-enabled Use this flag to indicate if CNI prefix delegation has been enabled."
echo "--cni-max-eni specify how many ENIs should be used for prefix delegation. Defaults to using all ENIs per instance."
echo "--show-max-allowed Use this flag to show max number of Pods allowed to run in Worker Node. Otherwise the script will show the recommended value"
}

POSITIONAL=()

while [[ $# -gt 0 ]]; do
key="$1"
case $key in
-h | --help)
print_help
exit 1
;;
--instance-type)
INSTANCE_TYPE=$2
shift
shift
;;
--instance-type-from-imds)
INSTANCE_TYPE_FROM_IMDS=true
shift
;;
--cni-version)
CNI_VERSION=$2
shift
shift
;;
--cni-custom-networking-enabled)
CNI_CUSTOM_NETWORKING_ENABLED=true
shift
;;
--cni-prefix-delegation-enabled)
CNI_PREFIX_DELEGATION_ENABLED=true
shift
;;
--cni-max-eni)
CNI_MAX_ENI=$2
shift
shift
;;
--show-max-allowed)
SHOW_MAX_ALLOWED=true
shift
;;
*) # unknown option
POSITIONAL+=("$1") # save it in an array for later
shift # past argument
;;
esac
done

CNI_VERSION="${CNI_VERSION:-}"
CNI_CUSTOM_NETWORKING_ENABLED="${CNI_CUSTOM_NETWORKING_ENABLED:-false}"
CNI_PREFIX_DELEGATION_ENABLED="${CNI_PREFIX_DELEGATION_ENABLED:-false}"
CNI_MAX_ENI="${CNI_MAX_ENI:-}"
INSTANCE_TYPE="${INSTANCE_TYPE:-}"
INSTANCE_TYPE_FROM_IMDS="${INSTANCE_TYPE_FROM_IMDS:-false}"
SHOW_MAX_ALLOWED="${SHOW_MAX_ALLOWED:-false}"

PREFIX_DELEGATION_SUPPORTED=false
IPS_PER_PREFIX=16

if [ "$INSTANCE_TYPE_FROM_IMDS" = true ]; then
TOKEN=$(curl -m 10 -X PUT -H "X-aws-ec2-metadata-token-ttl-seconds: 600" -s "http://169.254.169.254/latest/api/token")
export AWS_DEFAULT_REGION=$(curl -s --retry 5 -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/dynamic/instance-identity/document | jq .region -r)
INSTANCE_TYPE=$(curl -m 10 -H "X-aws-ec2-metadata-token: $TOKEN" -s http://169.254.169.254/latest/meta-data/instance-type)
elif [ -z "$INSTANCE_TYPE" ]; then
# There's no reasonable default for an instanceType so force one to be provided to the script.
echo "You must specify an instance type to calculate max pods value."
exit 1
fi

if [ -z "$CNI_VERSION" ]; then
echo "You must specify a CNI Version to use. Example - 1.7.5"
exit 1
fi

calculate_max_ip_addresses_prefix_delegation() {
enis=$1
instance_max_eni_ips=$2
echo $((enis * ((instance_max_eni_ips - 1) * IPS_PER_PREFIX) + 2))
}

calculate_max_ip_addresses_secondary_ips() {
enis=$1
instance_max_eni_ips=$2
echo $((enis * (instance_max_eni_ips - 1) + 2))
}

min_number() {
printf "%s\n" "$@" | sort -g | head -n1
}

VERSION_SPLIT=(${CNI_VERSION//./ })
CNI_MAJOR_VERSION="${VERSION_SPLIT[0]}"
CNI_MINOR_VERSION="${VERSION_SPLIT[1]}"
if [[ $CNI_MAJOR_VERSION -gt 1 ]] || ([[ $CNI_MAJOR_VERSION == 1 ]] && [[ $CNI_MINOR_VERSION -gt 8 ]]); then
PREFIX_DELEGATION_SUPPORTED=true
fi

DESCRIBE_INSTANCES_RESULT=$(aws ec2 describe-instance-types --instance-type $INSTANCE_TYPE --query 'InstanceTypes[0].{Hypervisor: Hypervisor, EniCount: NetworkInfo.MaximumNetworkInterfaces, PodsPerEniCount: NetworkInfo.Ipv4AddressesPerInterface, CpuCount: VCpuInfo.DefaultVCpus'} --output json)

HYPERVISOR_TYPE=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.Hypervisor')
IS_NITRO=false
if [[ $HYPERVISOR_TYPE == "nitro" ]]; then
IS_NITRO=true
fi
INSTANCE_MAX_ENIS=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.EniCount')
INSTANCE_MAX_ENIS_IPS=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.PodsPerEniCount')

if [ -z "$CNI_MAX_ENI" ]; then
enis_for_pods=$INSTANCE_MAX_ENIS
else
enis_for_pods="$(min_number $CNI_MAX_ENI $INSTANCE_MAX_ENIS)"
fi

if [ "$CNI_CUSTOM_NETWORKING_ENABLED" = true ]; then
enis_for_pods=$((enis_for_pods - 1))
fi

if [ "$IS_NITRO" = true ] && [ "$CNI_PREFIX_DELEGATION_ENABLED" = true ] && [ "$PREFIX_DELEGATION_SUPPORTED" = true ]; then
max_pods=$(calculate_max_ip_addresses_prefix_delegation $enis_for_pods $INSTANCE_MAX_ENIS_IPS)
else
max_pods=$(calculate_max_ip_addresses_secondary_ips $enis_for_pods $INSTANCE_MAX_ENIS_IPS)
fi

# Limit the total number of pods that can be launched on any instance type based on the vCPUs on that instance type.
MAX_POD_CEILING_FOR_LOW_CPU=110
MAX_POD_CEILING_FOR_HIGH_CPU=250
CPU_COUNT=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.CpuCount')

if [ "$SHOW_MAX_ALLOWED" = true ]; then
echo $max_pods
exit 0
fi

if [ "$CPU_COUNT" -gt 30 ]; then
echo $(min_number $MAX_POD_CEILING_FOR_HIGH_CPU $max_pods)
else
echo $(min_number $MAX_POD_CEILING_FOR_LOW_CPU $max_pods)
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script does not validate the input arguments. For example, it does not check if the instance type provided by the user is valid or if the CNI version is in the correct format. Consider adding input validation to prevent errors and improve user experience.

Comment on lines 1 to 161
#!/usr/bin/env bash

set -o pipefail
set -o nounset
set -o errexit

err_report() {
echo "Exited with error on line $1"
}
trap 'err_report $LINENO' ERR

function print_help {
echo "usage: $0 <instance(s)> [options]"
echo "Calculates maxPods value to be used when starting up the kubelet."
echo "-h,--help print this help."
echo "--instance-type Specify the instance type to calculate max pods value."
echo "--instance-type-from-imds Use this flag if the instance type should be fetched from IMDS."
echo "--cni-version Specify the version of the CNI (example - 1.7.5)."
echo "--cni-custom-networking-enabled Use this flag to indicate if CNI custom networking mode has been enabled."
echo "--cni-prefix-delegation-enabled Use this flag to indicate if CNI prefix delegation has been enabled."
echo "--cni-max-eni specify how many ENIs should be used for prefix delegation. Defaults to using all ENIs per instance."
echo "--show-max-allowed Use this flag to show max number of Pods allowed to run in Worker Node. Otherwise the script will show the recommended value"
}

POSITIONAL=()

while [[ $# -gt 0 ]]; do
key="$1"
case $key in
-h | --help)
print_help
exit 1
;;
--instance-type)
INSTANCE_TYPE=$2
shift
shift
;;
--instance-type-from-imds)
INSTANCE_TYPE_FROM_IMDS=true
shift
;;
--cni-version)
CNI_VERSION=$2
shift
shift
;;
--cni-custom-networking-enabled)
CNI_CUSTOM_NETWORKING_ENABLED=true
shift
;;
--cni-prefix-delegation-enabled)
CNI_PREFIX_DELEGATION_ENABLED=true
shift
;;
--cni-max-eni)
CNI_MAX_ENI=$2
shift
shift
;;
--show-max-allowed)
SHOW_MAX_ALLOWED=true
shift
;;
*) # unknown option
POSITIONAL+=("$1") # save it in an array for later
shift # past argument
;;
esac
done

CNI_VERSION="${CNI_VERSION:-}"
CNI_CUSTOM_NETWORKING_ENABLED="${CNI_CUSTOM_NETWORKING_ENABLED:-false}"
CNI_PREFIX_DELEGATION_ENABLED="${CNI_PREFIX_DELEGATION_ENABLED:-false}"
CNI_MAX_ENI="${CNI_MAX_ENI:-}"
INSTANCE_TYPE="${INSTANCE_TYPE:-}"
INSTANCE_TYPE_FROM_IMDS="${INSTANCE_TYPE_FROM_IMDS:-false}"
SHOW_MAX_ALLOWED="${SHOW_MAX_ALLOWED:-false}"

PREFIX_DELEGATION_SUPPORTED=false
IPS_PER_PREFIX=16

if [ "$INSTANCE_TYPE_FROM_IMDS" = true ]; then
TOKEN=$(curl -m 10 -X PUT -H "X-aws-ec2-metadata-token-ttl-seconds: 600" -s "http://169.254.169.254/latest/api/token")
export AWS_DEFAULT_REGION=$(curl -s --retry 5 -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/dynamic/instance-identity/document | jq .region -r)
INSTANCE_TYPE=$(curl -m 10 -H "X-aws-ec2-metadata-token: $TOKEN" -s http://169.254.169.254/latest/meta-data/instance-type)
elif [ -z "$INSTANCE_TYPE" ]; then
# There's no reasonable default for an instanceType so force one to be provided to the script.
echo "You must specify an instance type to calculate max pods value."
exit 1
fi

if [ -z "$CNI_VERSION" ]; then
echo "You must specify a CNI Version to use. Example - 1.7.5"
exit 1
fi

calculate_max_ip_addresses_prefix_delegation() {
enis=$1
instance_max_eni_ips=$2
echo $((enis * ((instance_max_eni_ips - 1) * IPS_PER_PREFIX) + 2))
}

calculate_max_ip_addresses_secondary_ips() {
enis=$1
instance_max_eni_ips=$2
echo $((enis * (instance_max_eni_ips - 1) + 2))
}

min_number() {
printf "%s\n" "$@" | sort -g | head -n1
}

VERSION_SPLIT=(${CNI_VERSION//./ })
CNI_MAJOR_VERSION="${VERSION_SPLIT[0]}"
CNI_MINOR_VERSION="${VERSION_SPLIT[1]}"
if [[ $CNI_MAJOR_VERSION -gt 1 ]] || ([[ $CNI_MAJOR_VERSION == 1 ]] && [[ $CNI_MINOR_VERSION -gt 8 ]]); then
PREFIX_DELEGATION_SUPPORTED=true
fi

DESCRIBE_INSTANCES_RESULT=$(aws ec2 describe-instance-types --instance-type $INSTANCE_TYPE --query 'InstanceTypes[0].{Hypervisor: Hypervisor, EniCount: NetworkInfo.MaximumNetworkInterfaces, PodsPerEniCount: NetworkInfo.Ipv4AddressesPerInterface, CpuCount: VCpuInfo.DefaultVCpus'} --output json)

HYPERVISOR_TYPE=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.Hypervisor')
IS_NITRO=false
if [[ $HYPERVISOR_TYPE == "nitro" ]]; then
IS_NITRO=true
fi
INSTANCE_MAX_ENIS=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.EniCount')
INSTANCE_MAX_ENIS_IPS=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.PodsPerEniCount')

if [ -z "$CNI_MAX_ENI" ]; then
enis_for_pods=$INSTANCE_MAX_ENIS
else
enis_for_pods="$(min_number $CNI_MAX_ENI $INSTANCE_MAX_ENIS)"
fi

if [ "$CNI_CUSTOM_NETWORKING_ENABLED" = true ]; then
enis_for_pods=$((enis_for_pods - 1))
fi

if [ "$IS_NITRO" = true ] && [ "$CNI_PREFIX_DELEGATION_ENABLED" = true ] && [ "$PREFIX_DELEGATION_SUPPORTED" = true ]; then
max_pods=$(calculate_max_ip_addresses_prefix_delegation $enis_for_pods $INSTANCE_MAX_ENIS_IPS)
else
max_pods=$(calculate_max_ip_addresses_secondary_ips $enis_for_pods $INSTANCE_MAX_ENIS_IPS)
fi

# Limit the total number of pods that can be launched on any instance type based on the vCPUs on that instance type.
MAX_POD_CEILING_FOR_LOW_CPU=110
MAX_POD_CEILING_FOR_HIGH_CPU=250
CPU_COUNT=$(echo $DESCRIBE_INSTANCES_RESULT | jq -r '.CpuCount')

if [ "$SHOW_MAX_ALLOWED" = true ]; then
echo $max_pods
exit 0
fi

if [ "$CPU_COUNT" -gt 30 ]; then
echo $(min_number $MAX_POD_CEILING_FOR_HIGH_CPU $max_pods)
else
echo $(min_number $MAX_POD_CEILING_FOR_LOW_CPU $max_pods)
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script does not handle potential errors from the aws and curl commands. For example, if the aws command fails due to incorrect AWS credentials, the script will continue to execute and may produce incorrect results. Consider adding error handling for these commands to improve robustness.


bastion_name = "${local.project}-${local.app}-database-bastion"

subnet_id = "subnet-087617570cd189b3e"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding the subnet_id can lead to issues in case the subnet ID changes or if the infrastructure needs to be replicated in a different environment. Consider fetching the subnet ID dynamically from the VPC configuration or passing it as a variable.

Comment on lines +1 to +27
locals {

# Automatically load environment-level variables
account_vars = read_terragrunt_config(find_in_parent_folders("account.hcl"))

# Extract out common variables for reuse
project = local.account_vars.locals.project
app = "marlowe-runtime"

database_name = "${local.project}-${local.app}-database"
snapshot_identifier = "rds:dapps-marlowe-runtime-database-2023-08-26-08-06"

vpc_subnet_group_name = "default-vpc-0a68a5d196ce5e1d1"
vpc_security_group_ids = ["sg-0c2ff87114e5f6ec1"]

tags = {
organization = "iog"
tribe = "smart-contracts"
environment = "prod"
function = "database"
Environment = "prod"
Terraform = "true"
Project = local.project
Resource = local.database_name
}

}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of locals to define and manage variables is a good practice for code readability and maintainability. However, the vpc_subnet_group_name and vpc_security_group_ids are hardcoded. This could lead to issues if the VPC or security group configuration changes. Consider fetching these values dynamically from the respective AWS resources.

Comment on lines +10 to +11
database_name = "${local.project}-${local.app}-database"
snapshot_identifier = "rds:dapps-marlowe-runtime-database-2023-08-26-08-06"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The database_name and snapshot_identifier are constructed with hardcoded strings. This could lead to issues if the project or app name changes. Consider using variables for these values to make the code more flexible.

Comment on lines +40 to +46
inputs = {
vpc_id = dependency.vpc.outputs.vpc_id

computed_ingress_cidr_blocks = ["0.0.0.0/0"]
ingress_cidr_blocks = ["0.0.0.0/0"]
tags = local.tags
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inputs block is used to define input variables for the module. The vpc_id is correctly sourced from the vpc module's outputs. However, the computed_ingress_cidr_blocks and ingress_cidr_blocks are set to allow all IP addresses (0.0.0.0/0), which could pose a security risk. Consider restricting access to a specific IP range or using a security group to control access.

-  computed_ingress_cidr_blocks = ["0.0.0.0/0"]
-  ingress_cidr_blocks          = ["0.0.0.0/0"]
+  computed_ingress_cidr_blocks = ["<restricted-ip-range>"]
+  ingress_cidr_blocks          = ["<restricted-ip-range>"]

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1

Commits Files that changed from the base of the PR and between f4fb19e and 074d416.
Files selected for processing (2)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (1 hunks)
  • infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (1 hunks)
Additional comments (Suppressed): 11
infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (4)
  • 1-10: The use of locals to define and manage environment-specific variables is a good practice. It enhances code readability and maintainability. However, ensure that the files env.hcl and account.hcl exist in the parent folders as expected.

  • 16-18: The Terraform module source is referenced from a GitHub repository. Ensure that the referenced version (v0.0.2) of the module is stable and has all the necessary features for your use case.

  • 34-39: The inputs block is used to pass values into the Terraform module. The values are correctly sourced from the eks module outputs and local variables.

  • 41-75: The eks_addons block is used to enable and configure various addons for the EKS cluster. The use of templatefile function to render values from a template file is a good practice. However, ensure that the template files cert-manager.tpl and traefik.tpl exist and are correctly formatted.

infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (7)
  • 1-32: The use of locals to define and manage variables is a good practice. It improves readability and maintainability of the code. However, the tags variable at line 31 is defined but not assigned any value. If it's not used anywhere else, consider removing it to avoid confusion. If it's used, ensure it's assigned the correct value.

  • 34-38: The source for the Terraform module is a GitHub repository. Ensure that the specified version (v0.0.2) of the module is the intended one and that it's compatible with the rest of your infrastructure code.

  • 46-54: The dependency block is used to express the dependency on the VPC. The mock_outputs are used for testing and should be replaced with actual outputs in the production environment. Ensure that this is done before deploying to production.

  • 69-140: The security group rules are well defined, covering a wide range of ports and protocols. However, the ingress rules are open to the entire internet (0.0.0.0/0 and ::/0). This could potentially expose your cluster to unnecessary risks. Consider limiting the CIDR ranges to only those necessary.

  • 143-178: The configuration for the EKS managed node group is well defined, including the use of Bottlerocket as the platform and the configuration of the EBS volumes. However, ensure that the specified AMI release version (1.15.1-264e294c) is the intended one and that it's compatible with your EKS cluster version (1.26).

  • 181-189: The configuration for the worker node group is well defined. However, ensure that the instance type (t3a.xlarge) is suitable for your workload requirements.

  • 192-198: The aws-auth ConfigMap is managed by the module, and the KMS key owners and administrators are set to the list of users. This is a good practice for managing access to the EKS cluster. However, ensure that the list of users is correctly defined and that they have the necessary permissions.

Comment on lines 20 to 30
dependency "eks" {
config_path = "../eks"

mock_outputs = {
cluster_name = "quick_brown_fox"
cluster_version = "1.26"
cluster_endpoint = "https://abcdef.gr7.us-east-1.eks.amazonaws.com"
cluster_certificate_authority_data = "bGF6eS1icm93bi1mb3gK"
oidc_provider_arn = "arn:aws:iam::677160962006:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/abcdef"
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dependency block is used to express the dependencies of the current configuration on other configurations. The mock_outputs are used for testing and development. In a production environment, these should be replaced with actual outputs from the eks module.

- 23:   mock_outputs = {
- 24:     cluster_name                       = "quick_brown_fox"
- 25:     cluster_version                    = "1.26"
- 26:     cluster_endpoint                   = "https://abcdef.gr7.us-east-1.eks.amazonaws.com"
- 27:     cluster_certificate_authority_data = "bGF6eS1icm93bi1mb3gK"
- 28:     oidc_provider_arn                  = "arn:aws:iam::677160962006:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/abcdef"
- 29:   }
+ 23:   # mock_outputs should be removed in production environment

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Commits Files that changed from the base of the PR and between 074d416 and ca0b18a.
Files selected for processing (2)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (1 hunks)
  • infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (1 hunks)
Files skipped from review due to trivial changes (1)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl
Additional comments (Suppressed): 7
infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (7)
  • 1-32: The extraction of common variables for reuse is a good practice. It enhances code readability and maintainability. However, the tags variable is declared but not assigned any value. If it's not used, consider removing it to avoid confusion. If it's intended to be used, ensure it's assigned the appropriate value.

  • 37-38: The Terraform source is being pulled from a specific GitHub repository and tag. Ensure that the repository and tag are correct and that the Terraform configurations in that repository are compatible with your infrastructure setup.

  • 49-54: The mock_outputs are hardcoded values. These are typically used for testing or development purposes. Ensure that these values are replaced with actual outputs in the production environment.

  • 59-60: The Kubernetes cluster version is hardcoded to "1.26". Ensure that this version is compatible with your applications and other infrastructure components. Consider parameterizing this value for flexibility.

  • 69-140: The security group rules are well defined, covering a wide range of ports and protocols. However, ensure that these rules align with your security requirements. Be cautious about rules that allow all traffic (0.0.0.0/0 and ::/0), as they can expose your resources to potential threats.

  • 143-178: The configuration for the EKS managed node group is well defined, including the instance type, capacity type, and block device mappings. However, ensure that these configurations align with your workload requirements and cost considerations.

  • 192-198: The aws_auth_users and kms_key_owners are set to local.map_users and local.list_users respectively. Ensure that these users have the necessary permissions and that their access aligns with your security policies.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Commits Files that changed from the base of the PR and between ca0b18a and 81c4012.
Files selected for processing (2)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (1 hunks)
  • infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (1 hunks)
Files skipped from review due to trivial changes (1)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl
Additional comments (Suppressed): 8
infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (8)
  • 1-32: The use of locals to define and manage variables is a good practice. It improves readability and maintainability of the code. However, it's important to ensure that the files env.hcl and account.hcl exist in the parent folders and contain the necessary variables. If not, read_terragrunt_config will fail.

  • 37-38: The source for the Terraform configurations is a specific version of a GitHub repository. This is a good practice as it ensures that the same version of the code is used across different environments. However, it's important to verify that the version v0.0.3 of the repository github.com/renebarbosafl/terraform-aws-eks.git is the intended one and it contains all the necessary modules and resources.

  • 46-48: The VPC is defined as a dependency. This is a good practice as it ensures that the VPC is created before the EKS cluster. However, it's important to verify that the path ../../../vpc is correct and the VPC is defined there.

  • 52-60: The cluster configuration looks good. However, it's important to verify that the VPC and the subnets are correctly defined and they have the necessary resources and permissions for the EKS cluster.

  • 62-133: The security group rules are well defined. However, it's important to verify that these rules meet the security requirements of the EKS cluster and they don't expose any unnecessary ports or protocols.

  • 136-171: The configuration for the EKS managed node groups looks good. However, it's important to verify that the instance types, AMI version, and other configurations meet the requirements of the EKS cluster.

  • 174-182: The configuration for the worker node group looks good. However, it's important to verify that the instance type t3a.xlarge and the private network meet the requirements of the EKS cluster.

  • 185-191: The configuration for the aws-auth configmap and the KMS key looks good. However, it's important to verify that the users have the necessary permissions to manage the aws-auth configmap and the KMS key.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 2
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 81c4012 and 736416e.
Files selected for processing (1)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (1 hunks)
Additional comments (Suppressed): 5
infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (5)
  • 20-20: The Terraform module source is pointing to a specific version (v0.0.3). Ensure that this version is compatible with your Terraform version and AWS provider version. Also, consider the implications of using a fixed version. If the module is updated with security patches or new features, you won't get those updates unless you manually change the version.

  • 46-59: The External-DNS addon is enabled and configured to use a specific Route53 hosted zone. Ensure that the necessary permissions are in place for External-DNS to manage records in this hosted zone. Also, verify that the EXTERNAL_DNS_DRY_RUN environment variable is set to the desired value.

  • 62-75: The Cert-Manager addon is enabled with a default issuer of letsencrypt. Ensure that the necessary permissions and configurations are in place for Cert-Manager to request certificates from Let's Encrypt. Also, the ExperimentalGatewayAPISupport feature gate is enabled, which might have implications for stability or compatibility.

  • 77-108: The Traefik Load Balancer addon is enabled with specific annotations for the AWS Load Balancer. Ensure that these annotations are correct and that the necessary permissions are in place for the Load Balancer to function correctly. Also, verify that the image tag (v3.0) is available and compatible with your cluster.

  • 110-112: The KubeVela Controller addon is enabled. Ensure that the necessary permissions and configurations are in place for the KubeVela Controller to function correctly.

Comment on lines +1 to +13
locals {
# Automatically load environment-level variables
environment_vars = read_terragrunt_config(find_in_parent_folders("env.hcl"))
account_vars = read_terragrunt_config(find_in_parent_folders("account.hcl"))
# Extract out common variables for reuse
env = local.environment_vars.locals.environment
region = local.environment_vars.locals.aws_region
hostnames = local.environment_vars.locals.hostnames
profile = local.account_vars.locals.aws_profile

# Hosted Zone ARN for scdev-test.aws.iohkdev.io
hostedzone_arn = "arn:aws:route53:::hostedzone/Z10147571DRRDCJXSER5Y"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded ARN for the hosted zone (line 12) might cause issues if the hosted zone changes or if this code is used in a different environment. Consider making this a variable that can be passed in, or retrieving it dynamically using a data source.

-   hostedzone_arn = "arn:aws:route53:::hostedzone/Z10147571DRRDCJXSER5Y"
+   hostedzone_arn = var.hostedzone_arn

Comment on lines 36 to 112
eks_addons = {

# Cluster Autoscaler
cluster_autoscaler = {
set = [{
name = "extraArgs.scale-down-utilization-threshold"
value = "0.7"
}]
}

# External-DNS
enable_external_dns = true
external_dns_route53_zone_arns = [local.hostedzone_arn]
external_dns = {
values = [
<<-EOT
env:
# Don't change anything, useful for debugging purposes.
- name: EXTERNAL_DNS_DRY_RUN
value: "0"
txtOwnerId: "${dependency.eks.outputs.cluster_name}"
EOT
]
}

# Cert-Manager
enable_cert_manager = true
cert_manager = {
values = [
<<-EOT
ingressShim:
defaultIssuerName: letsencrypt
defaultIssuerKind: ClusterIssuer
defaultIssuerGroup: cert-manager.io

extraArgs:
- --feature-gates=ExperimentalGatewayAPISupport=true
EOT
]
}

# Traefik Load Balancer
enable_traefik_load_balancer = true
traefik_load_balancer = {
values = [
<<-EOT
image:
tag: "v3.0"

experimental:
kubernetesGateway:
enabled: true
namespacePolicy: All

ports:
web:
redirectTo:
port: websecure
priority: 10

service:
annotations:
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": "external"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": "instance"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": "traefik"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": "internet-facing"
"external-dns.alpha.kubernetes.io/hostname": "${join(",", local.hostnames)}"
"external-dns.alpha.kubernetes.io/ttl": "60"
"external-dns.alpha.kubernetes.io/aws-weight": "100"
"external-dns.alpha.kubernetes.io/set-identifier": "traefik-blue"
EOT
]
}

# KubeVela Controller
enable_kubevela_controller = true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The configuration for various Kubernetes addons is hardcoded. This might not be flexible enough for different environments or use cases. Consider making these configurations more dynamic, for example by using variables or data sources. Also, ensure that sensitive data is not hardcoded and is instead securely stored and accessed.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 2
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 736416e and fc5d7cc.
Files selected for processing (1)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (1 hunks)
Additional comments (Suppressed): 2
infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (2)
  • 1-13: The hardcoded ARN for the hosted zone (line 12) might still cause issues if the hosted zone changes or if this code is used in a different environment. Consider making this a variable that can be passed in, or retrieving it dynamically using a data source.
-   hostedzone_arn = "arn:aws:route53:::hostedzone/Z10147571DRRDCJXSER5Y"
+   hostedzone_arn = var.hostedzone_arn
  • 20-21: Ensure that the Terraform module source is reliable and maintained. Using a personal GitHub repository might introduce risks in terms of stability and security. Consider using a module from the Terraform Registry or a well-maintained, trusted source.

Comment on lines 36 to 111
eks_addons = {

# Cluster Autoscaler
cluster_autoscaler = {
set = [{
name = "extraArgs.scale-down-utilization-threshold"
value = "0.7"
}]
}

# External-DNS
enable_external_dns = true
external_dns_route53_zone_arns = [local.hostedzone_arn]
external_dns = {
values = [
<<-EOT
env:
# Don't change anything, useful for debugging purposes.
- name: EXTERNAL_DNS_DRY_RUN
value: "0"
txtOwnerId: "${dependency.eks.outputs.cluster_name}"
EOT
]
}

# Cert-Manager
enable_cert_manager = true
cert_manager = {
values = [
<<-EOT
ingressShim:
defaultIssuerName: letsencrypt
defaultIssuerKind: ClusterIssuer
defaultIssuerGroup: cert-manager.io

extraArgs:
- --feature-gates=ExperimentalGatewayAPISupport=true
EOT
]
}

# Traefik Load Balancer
enable_traefik_load_balancer = true
traefik_load_balancer = {
values = [
<<-EOT
image:
tag: "v3.0"

experimental:
kubernetesGateway:
enabled: true
namespacePolicy: All

ports:
web:
redirectTo:
port: websecure
priority: 10

service:
annotations:
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": "external"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": "instance"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": "traefik"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": "internet-facing"
"external-dns.alpha.kubernetes.io/hostname": "${join(",", local.hostnames)}"
"external-dns.alpha.kubernetes.io/aws-weight": "100"
"external-dns.alpha.kubernetes.io/set-identifier": "traefik-blue"
EOT
]
}

# KubeVela Controller
enable_kubevela_controller = true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The configuration of the EKS addons is hardcoded in the Terragrunt configuration. This might limit the flexibility of the setup and make it harder to adapt to different environments or requirements. Consider moving these configurations to separate files or modules, and passing them as variables or file inputs to the Terragrunt configuration.

-   eks_addons = {
-     # Cluster Autoscaler
-     cluster_autoscaler = {
-       set = [{
-         name  = "extraArgs.scale-down-utilization-threshold"
-         value = "0.7"
-       }]
-     }
-     # ...
-   }
+   eks_addons = var.eks_addons

Comment on lines 78 to 107
enable_traefik_load_balancer = true
traefik_load_balancer = {
values = [
<<-EOT
image:
tag: "v3.0"

experimental:
kubernetesGateway:
enabled: true
namespacePolicy: All

ports:
web:
redirectTo:
port: websecure
priority: 10

service:
annotations:
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": "external"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": "instance"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": "traefik"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": "internet-facing"
"external-dns.alpha.kubernetes.io/hostname": "${join(",", local.hostnames)}"
"external-dns.alpha.kubernetes.io/aws-weight": "100"
"external-dns.alpha.kubernetes.io/set-identifier": "traefik-blue"
EOT
]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Traefik Load Balancer configuration includes hardcoded values for the service annotations (lines 97-104). These annotations might need to be different for different environments or use cases. Consider making these annotations configurable through variables.

-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": "external"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": "instance"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": "traefik"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": "internet-facing"
-             "external-dns.alpha.kubernetes.io/hostname": "${join(",", local.hostnames)}"
-             "external-dns.alpha.kubernetes.io/aws-weight": "100"
-             "external-dns.alpha.kubernetes.io/set-identifier": "traefik-blue"
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": var.aws_load_balancer_type
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": var.aws_load_balancer_nlb_target_type
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": var.aws_load_balancer_name
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": var.aws_load_balancer_scheme
+             "external-dns.alpha.kubernetes.io/hostname": "${join(",", var.hostnames)}"
+             "external-dns.alpha.kubernetes.io/aws-weight": var.aws_weight
+             "external-dns.alpha.kubernetes.io/set-identifier": var.set_identifier

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between fc5d7cc and ce04062.
Files selected for processing (1)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (1 hunks)
Additional comments (Suppressed): 2
infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (2)
  • 12-12: The hardcoded ARN for the hosted zone might cause issues if the hosted zone changes or if this code is used in a different environment. Consider making this a variable that can be passed in, or retrieving it dynamically using a data source.
-   hostedzone_arn = "arn:aws:route53:::hostedzone/Z10147571DRRDCJXSER5Y"
+   hostedzone_arn = var.hostedzone_arn
  • 97-104: The Traefik Load Balancer configuration includes hardcoded values for the service annotations. These annotations might need to be different for different environments or use cases. Consider making these annotations configurable through variables.
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": "external"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": "instance"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": "traefik"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": "internet-facing"
-             "external-dns.alpha.kubernetes.io/hostname": "${join(",", local.hostnames)}"
-             "external-dns.alpha.kubernetes.io/aws-weight": "100"
-             "external-dns.alpha.kubernetes.io/set-identifier": "traefik-blue"
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": var.aws_load_balancer_type
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": var.aws_load_balancer_nlb_target_type
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": var.aws_load_balancer_name
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": var.aws_load_balancer_scheme
+             "external-dns.alpha.kubernetes.io/hostname": "${join(",", var.hostnames)}"
+             "external-dns.alpha.kubernetes.io/aws-weight": var.aws_weight
+             "external-dns.alpha.kubernetes.io/set-identifier": var.set_identifier

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 5
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between ce04062 and ea401d8.
Files ignored due to filter (3)
  • infra/modules/eks/addons/manifests/gateway_crds.yaml
  • infra/modules/eks/addons/manifests/gateway_system.yaml
  • infra/modules/eks/addons/manifests/letsencrypt_issuer.yaml
Files selected for processing (7)
  • infra/modules/eks/addons/locals.tf (1 hunks)
  • infra/modules/eks/addons/main.tf (1 hunks)
  • infra/modules/eks/addons/variables.tf (1 hunks)
  • infra/modules/eks/locals.tf (1 hunks)
  • infra/modules/eks/main.tf (1 hunks)
  • infra/modules/eks/outputs.tf (1 hunks)
  • infra/modules/eks/variables.tf (1 hunks)
Files skipped from review due to trivial changes (4)
  • infra/modules/eks/addons/variables.tf
  • infra/modules/eks/locals.tf
  • infra/modules/eks/outputs.tf
  • infra/modules/eks/variables.tf
Additional comments (Suppressed): 5
infra/modules/eks/main.tf (2)
  • 1-8: The Kubernetes provider is configured to use the AWS EKS cluster endpoint and certificate, and to authenticate using the AWS CLI. Ensure that the AWS CLI is installed and properly configured on the machine where Terraform will be run. Also, ensure that the AWS profile specified in var.aws_profile has the necessary permissions to retrieve the token for the EKS cluster.

  • 11-39: The EKS module is being used to create a Kubernetes cluster. The module version is specified as "19.17.2". Ensure that this version of the module is compatible with the current version of Terraform and the AWS provider. Also, verify that all the variables used in this module are defined and have appropriate default values if necessary. The module is highly configurable, so make sure that the configuration matches the desired state of the EKS cluster.

infra/modules/eks/addons/locals.tf (2)
  • 63-71: These blocks define the configuration for various Kubernetes addons. Ensure that the chart versions specified are compatible with your Kubernetes version and that the repositories are accessible and contain the specified charts. Also, verify that the namespaces specified exist in your cluster or are created elsewhere in your code.

  • 37-37: The IAM role ARN for the AWS EBS CSI Driver is being interpolated from a module output. Ensure that the module module.eks_addon_aws_ebs_csi_driver_iam_role is correctly defined and outputs the iam_role_arn attribute.

infra/modules/eks/addons/main.tf (1)
  • 174-192: Ensure that the modules and resources being used are compatible with the versions of the providers specified at the beginning of the file. Also, verify that the modules and resources are available and accessible from the specified sources.

Comment on lines +3 to +4
helm_wait = true
helm_create_namespace = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helm_wait and helm_create_namespace variables are declared but not used anywhere in this hunk. If they are not used elsewhere in the code, consider removing them to avoid confusion.

repository = "https://kubernetes-sigs.github.io/aws-ebs-csi-driver"
description = "A Helm chart for AWS EBS CSI Driver"
namespace = "kube-system"
create_namespace = "false"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The create_namespace attribute is set to "false" as a string, not a boolean. In Terraform, boolean values should be true or false, not "true" or "false". If this attribute is expected to be a string elsewhere in your code, this is fine. Otherwise, consider changing these to boolean values.

-       create_namespace = "false"
+       create_namespace = false

Comment on lines +1 to +43
terraform {
required_providers {
kubectl = {
source = "gavinbunney/kubectl"
version = "~> 1.14"
}
}
}

provider "kubernetes" {
host = var.cluster_endpoint
cluster_ca_certificate = base64decode(var.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", var.cluster_name, "--profile", var.aws_profile]
}
}

provider "helm" {
kubernetes {
host = var.cluster_endpoint
cluster_ca_certificate = base64decode(var.cluster_certificate_authority_data)

exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
# This requires the awscli to be installed locally where Terraform is executed
args = ["eks", "get-token", "--cluster-name", var.cluster_name, "--profile", var.aws_profile]
}
}
}

provider "kubectl" {
host = var.cluster_endpoint
cluster_ca_certificate = base64decode(var.cluster_certificate_authority_data)

exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", var.cluster_name, "--profile", var.aws_profile]
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The providers for "kubernetes", "helm", and "kubectl" are all configured in the same way. This is a repetition of code and can be refactored to a local function that returns the configuration. This will make the code more maintainable and easier to change in the future.

+ locals {
+   provider_config = {
+     host                   = var.cluster_endpoint
+     cluster_ca_certificate = base64decode(var.cluster_certificate_authority_data)
+     exec {
+       api_version = "client.authentication.k8s.io/v1beta1"
+       command     = "aws"
+       args        = ["eks", "get-token", "--cluster-name", var.cluster_name, "--profile", var.aws_profile]
+     }
+   }
+ }

- provider "kubernetes" {
-   host                   = var.cluster_endpoint
-   cluster_ca_certificate = base64decode(var.cluster_certificate_authority_data)
-   exec {
-     api_version = "client.authentication.k8s.io/v1beta1"
-     command     = "aws"
-     args        = ["eks", "get-token", "--cluster-name", var.cluster_name, "--profile", var.aws_profile]
-   }
- }

+ provider "kubernetes" {
+   config = local.provider_config
+ }

- provider "helm" {
-   kubernetes {
-     host                   = var.cluster_endpoint
-     cluster_ca_certificate = base64decode(var.cluster_certificate_authority_data)
-     exec {
-       api_version = "client.authentication.k8s.io/v1beta1"
-       command     = "aws"
-       args = ["eks", "get-token", "--cluster-name", var.cluster_name, "--profile", var.aws_profile]
-     }
-   }
- }

+ provider "helm" {
+   kubernetes = local.provider_config
+ }

- provider "kubectl" {
-   host                   = var.cluster_endpoint
-   cluster_ca_certificate = base64decode(var.cluster_certificate_authority_data)
-   exec {
-     api_version = "client.authentication.k8s.io/v1beta1"
-     command     = "aws"
-     args        = ["eks", "get-token", "--cluster-name", var.cluster_name, "--profile", var.aws_profile]
-   }
- }

+ provider "kubectl" {
+   config = local.provider_config
+ }

Comment on lines +152 to +172
module "eks_addon_traefik_load_balancer" {
count = try(var.eks_addons.enable_traefik_load_balancer, false) ? 1 : 0
source = "aws-ia/eks-blueprints-addon/aws"
version = "1.1.1"

chart = local.eks_addons.traefik_load_balancer.chart
chart_version = try(var.eks_addons.traefik_load_balancer.chart_version, local.eks_addons.traefik_load_balancer.chart_version)
repository = try(var.eks_addons.traefik_load_balancer.repository, local.eks_addons.traefik_load_balancer.repository)
description = try(var.eks_addons.traefik_load_balancer.description, local.eks_addons.traefik_load_balancer.description)
namespace = try(var.eks_addons.traefik_load_balancer.namespace, local.eks_addons.traefik_load_balancer.namespace)
create_namespace = try(var.eks_addons.traefik_load_balancer.create_namespace, local.helm_create_namespace)
values = try(var.eks_addons.traefik_load_balancer.values, local.eks_addons.traefik_load_balancer.values)
set = try(var.eks_addons.traefik_load_balancer.set, local.eks_addons.traefik_load_balancer.set)
wait = try(var.eks_addons.traefik_load_balancer.wait, local.helm_wait)

depends_on = [
module.eks_addons,
kubectl_manifest.gateway_crds,
kubectl_manifest.gateway_system
]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 28 and the depends_on block on lines 167-170 indicate that the AWS CLI must be installed where Terraform is executed. This is a potential issue as it introduces a dependency on the local environment. Consider using a Docker container or another form of environment isolation to ensure that the AWS CLI is available regardless of the local environment.

Comment on lines +174 to +192
module "eks_addon_kubevela_controller" {
count = try(var.eks_addons.enable_kubevela_controller, false) ? 1 : 0
source = "aws-ia/eks-blueprints-addon/aws"
version = "1.1.1"

chart = local.eks_addons.kubevela_controller.chart
chart_version = try(var.eks_addons.kubevela_controller.chart_version, local.eks_addons.kubevela_controller.chart_version)
repository = try(var.eks_addons.kubevela_controller.repository, local.eks_addons.kubevela_controller.repository)
description = try(var.eks_addons.kubevela_controller.description, local.eks_addons.kubevela_controller.description)
namespace = try(var.eks_addons.kubevela_controller.namespace, local.eks_addons.kubevela_controller.namespace)
create_namespace = try(var.eks_addons.kubevela_controller.create_namespace, local.helm_create_namespace)
values = try(var.eks_addons.kubevela_controller.values, local.eks_addons.kubevela_controller.values)
set = try(var.eks_addons.kubevela_controller.set, local.eks_addons.kubevela_controller.set)
wait = try(var.eks_addons.kubevela_controller.wait, local.helm_wait)

depends_on = [
module.eks_addons
]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try function is used extensively throughout the code. While this is not necessarily a problem, it can make the code harder to read and debug. Consider simplifying the code or adding comments to clarify the expected behavior in case of errors.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 4
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between ea401d8 and c423f77.
Files selected for processing (2)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (1 hunks)
  • infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (1 hunks)
Additional comments (Suppressed): 9
infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (9)
  • 1-14: The use of locals to define and manage variables is a good practice. It improves readability and maintainability of the code. However, ensure that the files env.hcl and account.hcl exist in the parent folders as expected, and they contain all the required variables.

  • 26-29: The tags for the autoscaling group are well defined. However, ensure that the cluster autoscaler is properly configured and enabled in the Kubernetes cluster to make use of these tags.

  • 34-38: The source for the Terraform module is a specific commit in a GitHub repository. This is a good practice as it ensures the module used is a known version. However, ensure that this commit is the intended one and it contains all the necessary changes.

  • 46-48: The dependency on the VPC is well defined. This ensures that the VPC is created before the EKS cluster. However, ensure that the VPC configuration at the specified path is correct and creates the necessary resources.

  • 52-53: The Kubernetes cluster version is hardcoded to 1.26. It's recommended to make this a variable so that it can be easily updated in the future.

-  cluster_version = "1.26"
+  cluster_version = var.cluster_version
  • 62-133: The security group rules are well defined, but they seem to allow all traffic (0.0.0.0/0 and ::/0) for certain ports. This could potentially expose the cluster to unnecessary risks. Consider restricting the CIDR blocks to only necessary IP ranges.

  • 136-171: The configuration for the EKS managed node group is well defined. However, the ami_release_version is hardcoded. Consider making this a variable for easier updates in the future.

-  ami_release_version = "1.15.1-264e294c"
+  ami_release_version = var.ami_release_version
  • 174-182: The configuration for the worker node group is well defined. However, the instance type (t3a.xlarge) is hardcoded. Consider making this a variable to allow for easier scaling or changes in the future.

  • 185-191: The management of the aws-auth configmap and the KMS key is well done. However, ensure that the users listed in local.list_users have the necessary permissions to manage these resources.

Comment on lines +16 to +24
list_users = [for user in local.users :
"arn:aws:iam::${local.aws_account_id}:user/${user}"
]

map_users = [for user in local.users : {
userarn = "arn:aws:iam::${local.aws_account_id}:user/${user}"
username = user
groups = ["system:masters"]
}]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The creation of IAM user ARNs and user maps is well done. However, all users are being assigned to the system:masters group. This gives them full control over all resources in the cluster. Consider assigning users to more specific roles or groups based on the principle of least privilege.

profile = local.account_vars.locals.aws_profile

# Hosted Zone ARN for scdev-test.aws.iohkdev.io
hostedzone_arn = "arn:aws:route53:::hostedzone/Z10147571DRRDCJXSER5Y"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded ARN for the hosted zone might cause issues if the hosted zone changes or if this code is used in a different environment. Consider making this a variable that can be passed in, or retrieving it dynamically using a data source.

-   hostedzone_arn = "arn:aws:route53:::hostedzone/Z10147571DRRDCJXSER5Y"
+   hostedzone_arn = var.hostedzone_arn

# Cert-Manager
enable_cert_manager = true
cert_manager = {
chart_version = "v1.9.1" #FIXME
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chart_version for Cert-Manager is hardcoded and marked as FIXME. This could lead to issues with version compatibility in the future. Consider making this a variable that can be passed in.

-       chart_version = "v1.9.1" #FIXME
+       chart_version = var.cert_manager_chart_version

Comment on lines 97 to 104
service:
annotations:
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": "external"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": "instance"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": "traefik"
"service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": "internet-facing"
"external-dns.alpha.kubernetes.io/hostname": "${join(",", local.hostnames)}"
"external-dns.alpha.kubernetes.io/aws-weight": "100"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Traefik Load Balancer configuration includes hardcoded values for the service annotations. These annotations might need to be different for different environments or use cases. Consider making these annotations configurable through variables.

-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": "external"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": "instance"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": "traefik"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": "internet-facing"
-             "external-dns.alpha.kubernetes.io/hostname": "${join(",", local.hostnames)}"
-             "external-dns.alpha.kubernetes.io/aws-weight": "100"
-             "external-dns.alpha.kubernetes.io/set-identifier": "traefik-blue"
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": var.aws_load_balancer_type
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": var.aws_load_balancer_nlb_target_type
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": var.aws_load_balancer_name
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": var.aws_load_balancer_scheme
+             "external-dns.alpha.kubernetes.io/hostname": "${join(",", var.hostnames)}"
+             "external-dns.alpha.kubernetes.io/aws-weight": var.aws_weight
+             "external-dns.alpha.kubernetes.io/set-identifier": var.set_identifier

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between c423f77 and 2e8c2ca.
Files selected for processing (1)
  • infra/modules/eks/outputs.tf (1 hunks)
Files skipped from review due to trivial changes (1)
  • infra/modules/eks/outputs.tf

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 3
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 2e8c2ca and 1d227f9.
Files selected for processing (3)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (1 hunks)
  • infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (1 hunks)
  • infra/us-east-1/prod/rds/prod-marlowe-runtime-db/terragrunt.hcl (1 hunks)
Additional comments (Suppressed): 12
infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (9)
  • 52-53: The Kubernetes cluster version is still hardcoded to 1.26. As mentioned in the previous review, it's recommended to make this a variable so that it can be easily updated in the future.
-  cluster_version = "1.26"
+  cluster_version = var.cluster_version
  • 62-133: The security group rules still seem to allow all traffic (0.0.0.0/0 and ::/0) for certain ports. This could potentially expose the cluster to unnecessary risks. As suggested in the previous review, consider restricting the CIDR blocks to only necessary IP ranges.

  • 136-171: The ami_release_version is still hardcoded. As mentioned in the previous review, consider making this a variable for easier updates in the future.

-  ami_release_version = "1.15.1-264e294c"
+  ami_release_version = var.ami_release_version
  • 174-182: The instance type (t3a.xlarge) for the worker node group is still hardcoded. As suggested in the previous review, consider making this a variable to allow for easier scaling or changes in the future.

  • 16-24: All users are still being assigned to the system:masters group. This gives them full control over all resources in the cluster. As suggested in the previous review, consider assigning users to more specific roles or groups based on the principle of least privilege.

  • 1-14: The extraction of common variables for reuse is well done. This improves the readability and maintainability of the code.

  • 34-38: The use of a specific commit hash in the source URL ensures that the exact version of the module is used. This is a good practice for reproducibility and avoiding unexpected changes.

  • 45-48: The use of the dependency block to reference the VPC configuration is a good practice. This ensures that the VPC is created before the EKS cluster, establishing the correct dependency order.

  • 185-191: The management of the aws-auth configmap and the KMS key is well done. This ensures that only the specified users have access to the EKS cluster and the KMS key.

infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (3)
  • 12-12: The hardcoded ARN for the hosted zone might cause issues if the hosted zone changes or if this code is used in a different environment. Consider making this a variable that can be passed in, or retrieving it dynamically using a data source.
-   hostedzone_arn = "arn:aws:route53:::hostedzone/Z10147571DRRDCJXSER5Y"
+   hostedzone_arn = var.hostedzone_arn
  • 64-64: The chart_version for Cert-Manager is hardcoded and marked as FIXME. This could lead to issues with version compatibility in the future. Consider making this a variable that can be passed in.
-       chart_version = "v1.9.1" #FIXME
+       chart_version = var.cert_manager_chart_version
  • 97-104: The Traefik Load Balancer configuration includes hardcoded values for the service annotations. These annotations might need to be different for different environments or use cases. Consider making these annotations configurable through variables.
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": "external"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": "instance"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": "traefik"
-             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": "internet-facing"
-             "external-dns.alpha.kubernetes.io/hostname": "${join(",", local.hostnames)}"
-             "external-dns.alpha.kubernetes.io/aws-weight": "100"
-             "external-dns.alpha.kubernetes.io/set-identifier": "traefik-blue"
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-type": var.aws_load_balancer_type
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-nlb-target-type": var.aws_load_balancer_nlb_target_type
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-name": var.aws_load_balancer_name
+             "service.annotations.service.beta.kubernetes.io/aws-load-balancer-scheme": var.aws_load_balancer_scheme
+             "external-dns.alpha.kubernetes.io/hostname": "${join(",", var.hostnames)}"
+             "external-dns.alpha.kubernetes.io/aws-weight": var.aws_weight
+             "external-dns.alpha.kubernetes.io/set-identifier": var.set_identifier

Comment on lines +1 to +15
locals {
# Automatically load environment-level variables
environment_vars = read_terragrunt_config(find_in_parent_folders("env.hcl"))
account_vars = read_terragrunt_config(find_in_parent_folders("account.hcl"))

# Extract out common variables for reuse
project = local.account_vars.locals.project
env = local.environment_vars.locals.environment
app = "marlowe-runtime"

database_name = "${local.env}-${local.app}-db"

tags = {}

}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tags local variable is defined but not populated with any values. If you plan to use AWS resource tagging, consider populating this variable with appropriate key-value pairs. If not, consider removing this unused variable to avoid confusion.

domain = ""
publicly_accessible = false
monitoring_interval = 60
monitoring_role_arn = "arn:aws:iam::${local.account_vars.locals.aws_account_id}:role/rds-monitoring-role"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The monitoring_role_arn is still hardcoded. As mentioned in the previous review, this could potentially lead to issues if the role does not exist or is not accessible. Consider making this a variable that can be passed in, or add error handling to ensure that the role exists and has the necessary permissions.

apply_immediately = true
deletion_protection = true

tags = local.tags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tags input is set to local.tags which is an empty map. If you plan to use AWS resource tagging, consider populating this variable with appropriate key-value pairs. If not, consider removing this unused input to avoid confusion.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 1d227f9 and 2e7e497.
Files selected for processing (1)
  • packages/kubevela.nix (1 hunks)
Files skipped from review due to trivial changes (1)
  • packages/kubevela.nix

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 1
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 2e7e497 and 1819050.
Files selected for processing (1)
  • infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (1 hunks)
Additional comments: 4
infra/us-east-1/prod/eks/blue/eks-addons/terragrunt.hcl (4)
  • 12-12: The hardcoded ARN for the hosted zone might cause issues if the hosted zone changes or if this code is used in a different environment. Consider making this a variable that can be passed in, or retrieving it dynamically using a data source.
-   hostedzone_arn = "arn:aws:route53:::hostedzone/Z10147571DRRDCJXSER5Y"
+   hostedzone_arn = var.hostedzone_arn
  • 64-64: The chart_version for Cert-Manager is hardcoded and marked as FIXME. This could lead to issues with version compatibility in the future. Consider making this a variable that can be passed in.
-       chart_version = "v1.9.1" #FIXME
+       chart_version = var.cert_manager_chart_version
  • 20-20: Ensure that the specific commit hash referenced in the Terraform source URL is the intended one. Using a specific commit hash can lead to outdated code if not updated regularly.

  • 103-103: The hostnames for the external DNS are being joined from a local variable. Ensure that this local variable is being correctly populated and that the resulting string matches the expected format for the DNS hostnames.

values = [
<<-EOT
image:
tag: "v3.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The image tag for the Traefik Load Balancer is hardcoded. This could lead to issues with version compatibility in the future. Consider making this a variable that can be passed in.

-           tag: "v3.0"
+           tag: var.traefik_load_balancer_image_tag

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 1819050 and 27d0bc9.
Files selected for processing (1)
  • infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl (1 hunks)
Files skipped from review due to trivial changes (1)
  • infra/us-east-1/prod/eks/blue/eks/terragrunt.hcl

Copy link
Contributor

@renebarbosafl renebarbosafl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tshaynik
Copy link
Contributor Author

Merging now, since this cluster is deployed, and we can get back to smaller PRs targeting main.

@tshaynik tshaynik merged commit fbbd34f into main Oct 24, 2023
@renebarbosafl renebarbosafl deleted the infra/eks-cluster-migration branch October 24, 2023 19:50
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants