Merge pull request #4 from anyscale/gke
Add example for existing GKE cluster
brent-anyscale authored Sep 25, 2024
2 parents 583c7cb + afccf7a commit de71dae
Showing 8 changed files with 160 additions and 61 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -1,5 +1,5 @@
# Local .terraform directories
**/.terraform/*
.terraform

# Terraform lockfile
.terraform.lock.hcl
54 changes: 37 additions & 17 deletions examples/gcp/gke-existing_cluster/README.md
@@ -3,18 +3,29 @@
[![Google Provider Version][badge-tf-google]](https://github.com/terraform-providers/terraform-provider-google/releases)

# Anyscale GCP GKE Example - Existing Cluster
This example creates the resources to run Anyscale on GCP GKE with an existing cluster
**Work in progress**

## Needs to Create:
- DONE - filestore
- DONE - IAM Service Accounts for ControlPlane
- DONE - Firewall
- IAM Service Accounts for Dataplane (?) (needs a cluster role for GKE)
- DONE - storage bucket
- namespace
- helm charts
- configmap

This example creates the resources to run Anyscale on GCP GKE with an existing GKE cluster.

## Known Issues on GKE

- Autopilot GKE clusters are not supported.
- Node auto-provisioning for GKE failing with GPU nodes: https://github.com/GoogleCloudPlatform/container-engine-accelerators/issues/407
- When choosing "GPU Driver installation", select "Google-managed".

## terraform.tfvars

```hcl
anyscale_deploy_env = "..."
anyscale_org_id = "..." # Troubleshooting Org Id
google_region = "..."
google_project_id = "..."
existing_vpc_name = "..."
existing_subnet_name = "..."
customer_ingress_cidr_ranges = "0.0.0.0/0"
gke_endpoint = "..."
gke_ca_certificate = "..."
```
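
The `gke_endpoint` and `gke_ca_certificate` values can be copied from the cluster details, or they could be looked up with Terraform. The following is a minimal sketch, assuming the `google_container_cluster` data source from the Google provider. The cluster name `my-existing-cluster` and the local value names are hypothetical and are not part of this example.

```hcl
# Sketch only: look up connection details for an existing GKE cluster.
# "my-existing-cluster" is a placeholder; replace it with the real cluster name.
data "google_container_cluster" "existing" {
  name     = "my-existing-cluster"
  location = var.google_region # use the zone instead for a zonal cluster
  project  = var.google_project_id
}

locals {
  # Value suitable for gke_endpoint (an address without the https:// prefix).
  gke_endpoint = data.google_container_cluster.existing.endpoint

  # Value suitable for gke_ca_certificate (base64-encoded PEM).
  gke_ca_certificate = data.google_container_cluster.existing.master_auth[0].cluster_ca_certificate
}
```

With a lookup like this, the two variables would no longer need to be set by hand in `terraform.tfvars`, at the cost of requiring read access to the cluster at plan time.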

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements
@@ -28,7 +39,10 @@ This example creates the resources to run Anyscale on GCP GKE with an existing c

## Providers

No providers.
| Name | Version |
|------|---------|
| <a name="provider_google"></a> [google](#provider\_google) | 5.44.1 |
| <a name="provider_helm"></a> [helm](#provider\_helm) | 2.15.0 |

## Modules

@@ -41,22 +55,28 @@ No providers.

## Resources

No resources.
| Name | Type |
|------|------|
| [helm_release.ingress_nginx](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource |
| [google_client_config.provider](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_config) | data source |
| [google_compute_network.existing_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/compute_network) | data source |
| [google_compute_subnetwork.exising_subnet](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/compute_subnetwork) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_anyscale_org_id"></a> [anyscale\_org\_id](#input\_anyscale\_org\_id) | (Required) Anyscale Organization ID | `string` | n/a | yes |
| <a name="input_customer_ingress_cidr_ranges"></a> [customer\_ingress\_cidr\_ranges](#input\_customer\_ingress\_cidr\_ranges) | The IPv4 CIDR blocks that are allowed to access Anyscale clusters.<br>These are added to the firewall and allow port 443 (https) and 22 (ssh) access.<br>ex: `52.1.1.23/32,10.1.0.0/16`<br> | `string` | n/a | yes |
| <a name="input_existing_subnet_cidr"></a> [existing\_subnet\_cidr](#input\_existing\_subnet\_cidr) | The CIDR range of the existing subnet | `string` | n/a | yes |
| <a name="input_existing_vpc_id"></a> [existing\_vpc\_id](#input\_existing\_vpc\_id) | The ID of the existing VPC | `string` | n/a | yes |
| <a name="input_existing_subnet_name"></a> [existing\_subnet\_name](#input\_existing\_subnet\_name) | The name of the existing Subnet | `string` | n/a | yes |
| <a name="input_existing_vpc_name"></a> [existing\_vpc\_name](#input\_existing\_vpc\_name) | The name of the existing VPC | `string` | n/a | yes |
| <a name="input_gke_ca_certificate"></a> [gke\_ca\_certificate](#input\_gke\_ca\_certificate) | Base64 encoded PEM certificate for the cluster | `string` | n/a | yes |
| <a name="input_gke_endpoint"></a> [gke\_endpoint](#input\_gke\_endpoint) | The endpoint for the GKE cluster | `string` | n/a | yes |
| <a name="input_google_project_id"></a> [google\_project\_id](#input\_google\_project\_id) | ID of the Project to put these resources in | `string` | n/a | yes |
| <a name="input_google_region"></a> [google\_region](#input\_google\_region) | The Google region in which all resources will be created. | `string` | n/a | yes |
| <a name="input_anyscale_cloud_id"></a> [anyscale\_cloud\_id](#input\_anyscale\_cloud\_id) | (Optional) Anyscale Cloud ID | `string` | `null` | no |
| <a name="input_anyscale_deploy_env"></a> [anyscale\_deploy\_env](#input\_anyscale\_deploy\_env) | (Optional) Anyscale deploy environment. Used in resource names and tags.<br><br>ex:<pre>anyscale_deploy_env = "production"</pre> | `string` | `"production"` | no |
| <a name="input_labels"></a> [labels](#input\_labels) | (Optional) A map of labels to all resources that accept labels. | `map(string)` | <pre>{<br> "environment": "test",<br> "test": true<br>}</pre> | no |
| <a name="input_labels"></a> [labels](#input\_labels) | (Optional) A map of labels to all resources that accept labels. | `map(string)` | <pre>{<br> "environment": "test"<br>}</pre> | no |

## Outputs

51 changes: 45 additions & 6 deletions examples/gcp/gke-existing_cluster/main.tf
@@ -5,12 +5,14 @@
# - Filestore
# - IAM Service Accounts
# - Firewall Policy
# - Helm Charts
# - Nginx ingress controller (Helm Chart)
# It expects the following to be already created:
# - GCP Project
# - GKE Cluster
# - GKE Node Pool
# - VPC
# - GKE cluster
# - VPC and Subnet
# - Dataplane service account: See https://docs.anyscale.com/administration/cloud-deployment/deploy-gcp-cloud
# - Workload Identity Provider
#
# ---------------------------------------------------------------------------------------------------------------------
locals {
full_labels = merge(tomap({
@@ -29,6 +31,8 @@ module "anyscale_cloudstorage" {

anyscale_project_id = var.google_project_id
labels = local.full_labels

bucket_force_destroy = true
}

module "anyscale_iam" {
@@ -58,16 +62,25 @@ module "anyscale_filestore" {
labels = local.full_labels
}

data "google_compute_network" "existing_vpc" {
name = var.existing_vpc_name
}

data "google_compute_subnetwork" "exising_subnet" {
name = var.existing_subnet_name
region = var.google_region
}

module "anyscale_firewall" {
#checkov:skip=CKV_TF_1: Example code should use the latest version of the module
#checkov:skip=CKV_TF_2: Example code should use the latest version of the module
source = "github.com/anyscale/terraform-google-anyscale-cloudfoundation-modules//modules/google-anyscale-vpc-firewall"
module_enabled = true

vpc_name = var.existing_vpc_name
vpc_id = var.existing_vpc_id
vpc_id = data.google_compute_network.existing_vpc.id

ingress_with_self_cidr_range = [var.existing_subnet_cidr]
ingress_with_self_cidr_range = [data.google_compute_subnetwork.exising_subnet.ip_cidr_range]
ingress_from_cidr_map = [
{
rule = "https-443-tcp"
@@ -81,3 +94,29 @@ module "anyscale_firewall" {

anyscale_project_id = var.google_project_id
}

resource "helm_release" "ingress_nginx" {
name = "ingress-nginx"
repository = "https://kubernetes.github.io/ingress-nginx"
chart = "ingress-nginx"
version = "4.11.2"
namespace = "ingress-nginx"

create_namespace = true
wait = false

set {
name = "controller.service.type"
value = "LoadBalancer"
}

set {
name = "controller.service.annotations.cloud\\.google\\.com/load-balancer-type"
value = "External"
}

set {
name = "controller.service.externalTrafficPolicy"
value = "Local"
}
}
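
The registration command in `outputs.tf` leaves `--kubernetes-ingress-external-address` as a placeholder. One way to surface that address from Terraform is sketched below. It assumes the `kubernetes` provider is configured for this cluster and that the chart creates its Service under the default name `ingress-nginx-controller`; the data source label and the output name `ingress_external_ip` are hypothetical.

```hcl
# Sketch only: read the Service created by the ingress-nginx chart.
# Assumes the chart's default Service name for a release called "ingress-nginx".
data "kubernetes_service" "ingress_nginx_controller" {
  metadata {
    name      = "ingress-nginx-controller"
    namespace = helm_release.ingress_nginx.namespace
  }

  depends_on = [helm_release.ingress_nginx]
}

# Hypothetical output; evaluates to null until the load balancer has an address.
output "ingress_external_ip" {
  value = try(
    data.kubernetes_service.ingress_nginx_controller.status[0].load_balancer[0].ingress[0].ip,
    null
  )
}
```

Because the release sets `wait = false`, the address may not be assigned yet on the first apply; a second `terraform apply` or `terraform refresh` would likely be needed before the output is populated.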
4 changes: 2 additions & 2 deletions examples/gcp/gke-existing_cluster/outputs.tf
@@ -12,9 +12,9 @@ output "anyscale_registration_command" {
--filestore-location ${module.anyscale_filestore.anyscale_filestore_location} \
--anyscale-service-account-email ${module.anyscale_iam.iam_anyscale_access_service_acct_email} \
--provider-name ${module.anyscale_iam.iam_workload_identity_provider_name} \
--kubernetes-namespaces <kubernetes-namespaces>
--kubernetes-namespaces <kubernetes-namespaces> \
--kubernetes-ingress-external-address <kubernetes-ingress-external-address-or-ip> \
--kubernetes-zones <comma-separated-zones> \
--kubernetes-dataplane-identity <data-plane-service-account-email>
--kubernetes-dataplane-identity <gke-service-account>
EOT
}
29 changes: 17 additions & 12 deletions examples/gcp/gke-existing_cluster/variables.tf
@@ -30,13 +30,16 @@ variable "anyscale_org_id" {
}
}

# -----------------
# Kubernetes
# -----------------
variable "gke_endpoint" {
description = "The endpoint for the GKE cluster"
type = string
}

variable "customer_ingress_cidr_ranges" {
description = <<-EOT
The IPv4 CIDR blocks that are allowed to access Anyscale clusters.
These are added to the firewall and allow port 443 (https) and 22 (ssh) access.
ex: `52.1.1.23/32,10.1.0.0/16`
EOT
variable "gke_ca_certificate" {
description = "Base64 encoded PEM certificate for the cluster"
type = string
}

@@ -48,17 +51,20 @@ variable "existing_vpc_name" {
type = string
}

variable "existing_vpc_id" {
description = "The ID of the existing VPC"
variable "existing_subnet_name" {
description = "The name of the existing Subnet"
type = string
}

variable "existing_subnet_cidr" {
description = "The CIDR range of the existing subnet"
variable "customer_ingress_cidr_ranges" {
description = <<-EOT
The IPv4 CIDR blocks that are allowed to access Anyscale clusters.
These are added to the firewall and allow port 443 (https) and 22 (ssh) access.
ex: `52.1.1.23/32,10.1.0.0/16`
EOT
type = string
}
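
Since `customer_ingress_cidr_ranges` is a comma-separated string, a malformed entry would only surface when the firewall rules are created. The sketch below shows how a `validation` block could catch this earlier. It is a variation on the variable above, not part of this commit, and would replace the existing declaration rather than sit alongside it. Note that `cidrhost` also accepts IPv6 prefixes, so this checks CIDR syntax only.

```hcl
# Sketch only: the same variable with an added validation rule.
variable "customer_ingress_cidr_ranges" {
  description = "Comma-separated CIDR blocks that are allowed to access Anyscale clusters."
  type        = string

  validation {
    # Split on commas, trim whitespace, and confirm each entry parses as a CIDR block.
    condition = alltrue([
      for cidr in split(",", var.customer_ingress_cidr_ranges) :
      can(cidrhost(trimspace(cidr), 0))
    ])
    error_message = "Each entry must be a valid CIDR block, for example 10.1.0.0/16."
  }
}
```

With this in place, a value such as `52.1.1.23/32,10.1.0.0/16` passes, while an entry without a prefix length is rejected at plan time.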


# ------------------------------------------------------------------------------
# OPTIONAL PARAMETERS
# These variables have defaults, but may be overridden.
@@ -102,7 +108,6 @@ variable "labels" {
description = "(Optional) A map of labels to all resources that accept labels."
type = map(string)
default = {
"test" : true,
"environment" : "test"
}
}
40 changes: 19 additions & 21 deletions examples/gcp/gke-existing_cluster/versions.tf
@@ -18,33 +18,31 @@ terraform {
}
}


provider "helm" {
kubernetes {
host = module.anyscale_eks_cluster.eks_kubeconfig.endpoint
cluster_ca_certificate = base64decode(module.anyscale_eks_cluster.eks_kubeconfig.cluster_ca_certificate)

# https://registry.terraform.io/providers/hashicorp/helm/latest/docs#exec-plugins
exec {
api_version = "client.authentication.k8s.io/v1beta1"
args = ["eks", "get-token", "--cluster-name", module.anyscale_eks_cluster.eks_cluster_name]
command = "aws"
}
}
provider "google" {
project = var.google_project_id
region = var.google_region
}

provider "kubernetes" {
host = module.anyscale_eks_cluster.eks_kubeconfig.endpoint
cluster_ca_certificate = base64decode(module.anyscale_eks_cluster.eks_kubeconfig.cluster_ca_certificate)

host = "https://${var.gke_endpoint}"
token = data.google_client_config.provider.access_token
cluster_ca_certificate = base64decode(var.gke_ca_certificate)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
args = ["eks", "get-token", "--cluster-name", module.anyscale_eks_cluster.eks_cluster_name]
command = "aws"
command = "gke-gcloud-auth-plugin"
}
}

provider "google" {
project = var.google_project_id
region = var.google_region
data "google_client_config" "provider" {}

provider "helm" {
kubernetes {
host = "https://${var.gke_endpoint}"
token = data.google_client_config.provider.access_token
cluster_ca_certificate = base64decode(var.gke_ca_certificate)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "gke-gcloud-auth-plugin"
}
}
}
4 changes: 2 additions & 2 deletions modules/anyscale-k8s-helm/README.md
@@ -22,10 +22,10 @@ This module creates Kubernetes helm charts for Anyscale applications and workloa

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | 5.63.0 |
| <a name="provider_aws"></a> [aws](#provider\_aws) | 5.68.0 |
| <a name="provider_helm"></a> [helm](#provider\_helm) | 2.15.0 |
| <a name="provider_kubernetes"></a> [kubernetes](#provider\_kubernetes) | 2.32.0 |
| <a name="provider_time"></a> [time](#provider\_time) | 0.12.0 |
| <a name="provider_time"></a> [time](#provider\_time) | 0.12.1 |

## Modules

37 changes: 37 additions & 0 deletions modules/anyscale-k8s-namespace/README.md
@@ -12,6 +12,43 @@ This module creates a Kubernetes Namespace for Anyscale.
The Anyscale Namespace can also be created via the Anyscale Helm Chart.

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.0 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | ~> 2.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_kubernetes"></a> [kubernetes](#provider\_kubernetes) | 2.32.0 |

## Modules

No modules.

## Resources

| Name | Type |
|------|------|
| [kubernetes_namespace.anyscale](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/namespace) | resource |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_cloud_provider"></a> [cloud\_provider](#input\_cloud\_provider) | (Required) The cloud provider (aws or gcp)<br><br>ex:<pre>cloud_provider = "aws"</pre> | `string` | n/a | yes |
| <a name="input_anyscale_kubernetes_namespace"></a> [anyscale\_kubernetes\_namespace](#input\_anyscale\_kubernetes\_namespace) | (Optional) The name of the Kubernetes namespace.<br><br>ex:<pre>anyscale_kubernetes_namespace = "anyscale-k8s"</pre> | `string` | `"anyscale-k8s"` | no |
| <a name="input_kubernetes_cluster_name"></a> [kubernetes\_cluster\_name](#input\_kubernetes\_cluster\_name) | (Optional) The name of the Kubernetes cluster.<br><br>ex:<pre>kubernetes_cluster_name = "my-cluster"</pre> | `string` | `null` | no |
| <a name="input_module_enabled"></a> [module\_enabled](#input\_module\_enabled) | (Optional) Determines if this module should create resources.<br><br>If set to true, `eks_role_arn`, `anyscale_subnet_ids`, and `anyscale_security_group_id` must be provided.<br>ex:<pre>module_enabled = true</pre> | `bool` | `true` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_anyscale_kubernetes_namespace_name"></a> [anyscale\_kubernetes\_namespace\_name](#output\_anyscale\_kubernetes\_namespace\_name) | The name of the Kubernetes namespace. |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->

<!-- References -->