Refactor GCP terraform code
- Set up the cluster with the [terraform google provider][1] instead
  of the higher level [gke module][2]. The code gets simpler, and
  terraform features like for_each become easier to use.
- Allow multiple notebook and dask nodepools to be set up. Most
  research hubs want 2-3 options of notebook sizes to optimize for
  spend. I attempted to use [gke node autoprovisioning][3] instead of
  requiring manual nodepool provisioning, but it consistently
  provisioned nodes bigger than required. We should re-evaluate it
  later.
- Expose the GCP SA used by the k8s nodes to the user pods. A highly
  restricted SA is used for this, to prevent damage as much as possible.
  Users can then make requests to GCS buckets in other
  projects on behalf of this project.
- Dask nodepools default to matching the sizes of the notebook
  nodepools. This can be overridden if necessary.
- Move tfvars files into a subdirectory and split terraform code
  into multiple files for easier maintenance
- Remove unused terraform variables
- Add some inline terraform docs
- Set up MEOM-IGE cluster + hub with new terraform code

[1]: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_cluster
[2]: https://registry.terraform.io/modules/terraform-google-modules/kubernetes-engine/google/latest
[3]: https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-provisioning
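
The nodepool behaviour described in the list above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this commit: the variable names `notebook_nodes`, `dask_nodes` and `zone`, the object shape, the `nb-` / `dask-` name prefixes and the resource names are all hypothetical. Only `var.prefix` and `var.project_id` also appear in `terraform/cd.tf`.

```hcl
// Hypothetical variables; names and shapes are assumptions for illustration
variable "prefix" { type = string }
variable "project_id" { type = string }
variable "zone" { type = string }

variable "notebook_nodes" {
  type = map(object({
    min          = number
    max          = number
    machine_type = string
  }))
  description = "Notebook nodepools to create, keyed by a short name"
  default     = {}
}

variable "dask_nodes" {
  type = map(object({
    min          = number
    max          = number
    machine_type = string
  }))
  description = "Dask nodepools to create. Falls back to notebook_nodes when empty"
  default     = {}
}

locals {
  // Dask nodepools mirror the notebook nodepools unless explicitly overridden
  dask_nodes = length(var.dask_nodes) == 0 ? var.notebook_nodes : var.dask_nodes
}

// Cluster created directly with the google provider; the default
// nodepool is removed so that all nodepools are managed separately
resource "google_container_cluster" "cluster" {
  name     = "${var.prefix}-cluster"
  project  = var.project_id
  location = var.zone

  initial_node_count       = 1
  remove_default_node_pool = true
}

// One notebook nodepool per entry in var.notebook_nodes
resource "google_container_node_pool" "notebook" {
  for_each = var.notebook_nodes

  name     = "nb-${each.key}"
  cluster  = google_container_cluster.cluster.name
  project  = var.project_id
  location = google_container_cluster.cluster.location

  autoscaling {
    min_node_count = each.value.min
    max_node_count = each.value.max
  }

  node_config {
    machine_type = each.value.machine_type
  }
}

// One dask nodepool per entry in local.dask_nodes
resource "google_container_node_pool" "dask_worker" {
  for_each = local.dask_nodes

  name     = "dask-${each.key}"
  cluster  = google_container_cluster.cluster.name
  project  = var.project_id
  location = google_container_cluster.cluster.location

  autoscaling {
    min_node_count = each.value.min
    max_node_count = each.value.max
  }

  node_config {
    machine_type = each.value.machine_type
  }
}
```

With this shape, a tfvars file would only need to list the notebook sizes once; the dask nodepools follow them unless `dask_nodes` is set.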
yuvipanda committed May 30, 2021
1 parent c3a7ef4 commit 1f9d739
Showing 9 changed files with 376 additions and 160 deletions.
@@ -1,9 +1,9 @@
name: meom
name: meom-ige
provider: gcp
gcp:
key: secrets/meom.json
project: meom-ige-cnrs
cluster: meom-cluster
cluster: meom-ige-cluster
zone: us-central1-b
hubs:
- name: staging
@@ -32,7 +32,7 @@ hubs:
org:
name: "SWOT Ocean Pangeo Team"
logo_url: https://2i2c.org/media/logo.png
url: https://2i2c.org
url: https://meom-group.github.io/
designed_by:
name: 2i2c
url: https://2i2c.org
@@ -41,8 +41,40 @@ hubs:
url: https://2i2c.org
funded_by:
name: SWOT Ocean Pangeo Team
url: https://2i2c.org
url: https://meom-group.github.io/
singleuser:
profileList:
# The mem-guarantees are here so k8s doesn't schedule other pods
# on these nodes. They need to be just under total allocatable
# RAM on a node, not total node capacity
- display_name: "Small"
description: "~2 CPU, ~8G RAM"
kubespawner_override:
mem_limit: 8G
mem_guarantee: 5.5G
node_selector:
node.kubernetes.io/instance-type: e2-standard-2
- display_name: "Medium"
description: "~8 CPU, ~32G RAM"
kubespawner_override:
mem_limit: 32G
mem_guarantee: 25G
node_selector:
node.kubernetes.io/instance-type: e2-standard-8
- display_name: "Large"
description: "~16 CPU, ~64G RAM"
kubespawner_override:
mem_limit: 64G
mem_guarantee: 55G
node_selector:
node.kubernetes.io/instance-type: e2-standard-16
- display_name: "Very Large"
description: "~32 CPU, ~128G RAM"
kubespawner_override:
mem_limit: 128G
mem_guarantee: 115G
node_selector:
node.kubernetes.io/instance-type: e2-standard-32
defaultUrl: /lab
image:
name: pangeo/pangeo-notebook
@@ -58,7 +90,20 @@ hubs:
type: LoadBalancer
https:
enabled: true
chp:
resources:
requests:
# FIXME: We want no guarantees here!!!
# This is lowest possible value
cpu: 0.01
memory: 1Mi
hub:
resources:
requests:
# FIXME: We want no guarantees here!!!
# This is lowest possible value
cpu: 0.01
memory: 1Mi
config:
Authenticator:
allowed_users: &users
31 changes: 31 additions & 0 deletions terraform/cd.tf
@@ -0,0 +1,31 @@
/**
* Setup Service Accounts for authentication during continuous deployment
*/

// Service account used by GitHub Actions to deploy to the cluster
resource "google_service_account" "cd_sa" {
account_id = "${var.prefix}-cd-sa"
display_name = "Continuous Deployment SA for ${var.prefix}"
project = var.project_id
}

// Roles the service account needs to deploy hubs to the cluster
resource "google_project_iam_member" "cd_sa_roles" {
for_each = var.cd_sa_roles

project = var.project_id
role = each.value
member = "serviceAccount:${google_service_account.cd_sa.email}"
}

// JSON encoded private key, to be kept in secrets/*, for the
// deployment script to authenticate to the cluster
resource "google_service_account_key" "cd_sa" {
service_account_id = google_service_account.cd_sa.name
public_key_type = "TYPE_X509_PEM_FILE"
}

output "ci_deployer_key" {
value = base64decode(google_service_account_key.cd_sa.private_key)
sensitive = true
}
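
`for_each` on a resource accepts a map or a set of strings, so `var.cd_sa_roles` has to be declared as one of those. A minimal sketch of such a declaration follows; the description and the default roles are illustrative assumptions, not values taken from this commit.

```hcl
// Hypothetical declaration for the variable consumed by
// google_project_iam_member.cd_sa_roles above
variable "cd_sa_roles" {
  type        = set(string)
  description = "Project level IAM roles granted to the continuous deployment SA"

  // Illustrative default only; pick the narrowest roles that still
  // allow deploying hubs to the cluster
  default = [
    "roles/container.admin",
  ]
}
```

Because the `ci_deployer_key` output is marked sensitive, it is redacted in the summary that `terraform apply` prints. Requesting it by name, for example with `terraform output -raw ci_deployer_key` on Terraform versions that support the `-raw` flag, should print the decoded JSON key so it can be saved under `secrets/`.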