This repo contains Terraform code for running a Kubernetes cluster on Google Cloud Platform (GCP) with Google Kubernetes Engine (GKE), and a GitHub Actions workflow to build, publish, and deploy an application. Application metrics are collected by Prometheus and can be viewed in Grafana.
- Urban Task
- Table of contents
- Quickstart
- Destroy infrastructure
- Homework task for Urban
- List of decisions/compromises
Please review the Requirements before starting.
Requirements
- Terraform and kubectl are installed on the machine where Terraform is executed.
- The Compute Engine and Kubernetes Engine APIs are active on the project you will launch the cluster in.
Google Cloud Account
- Log in to your Google Cloud account
- Create a new project
- Enable billing for this project
CLI gcloud
- Some submodules use the terraform-google-gcloud module. By default, this module assumes you already have gcloud installed in your $PATH.
- See the module documentation for more information.
Enable APIs
- In order to operate with the Service Account you must activate the following APIs on the project where the Service Account was created:
- Compute Engine API - compute.googleapis.com
- Kubernetes Engine API - container.googleapis.com
Software Dependencies
- kubectl >= 1.9.x
Terraform and Plugins
- Terraform >= 1.0
- Terraform Provider for GCP >= v3.41
Google Cloud Account and New Project
- Log in to your Google Cloud account
- Create a new project
- Enable billing for this project
You can create the GCP infrastructure with the script start.sh.
Run the script from the scripts folder (it takes about 25-30 minutes).
- Before you start, connect the gcloud CLI in a terminal:
  - gcloud init - connect to your Google account and choose your Google project
  - gcloud auth application-default login - create Application Default Credentials for Terraform
- Install the gke-gcloud-auth-plugin binary (Ubuntu):
sudo apt-get install google-cloud-sdk-gke-gcloud-auth-plugin
- Clone repository
git clone git@github.com:Aleh-Mudrak/urban.git
- You can change the initial parameters that the script uses:
  - tf-code/variables/infr.tfvars - cluster Terraform variables, including project_id and region
  - tf-code/infrustructure/main.tf - bucket name and prefix
  - tf-code/variables/deploy.tfvars - deployment to the cluster: namespaces, Ingress, and Prometheus
  - tf-code/deploy/main.tf - deploy prefix
- Run the script start.sh from the scripts folder
cd scripts
./start.sh
- Add GitHub Secrets to your repository
  - When the start.sh script finishes, it prints the secrets in the terminal
  - Add these secrets to your GitHub repository
The second way is to build infrastructure step by step. (Tested on Ubuntu 20)
Use Google Cloud CLI
- Go to the Google Cloud Console and authorize.
- Install the gcloud CLI
# install gcloud CLI for Ubuntu
sudo apt-get install apt-transport-https ca-certificates gnupg
echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
sudo apt-get update && sudo apt-get install google-cloud-cli
# Connect to Google CLI
gcloud init
- Create new Project
# Choose default Project
gcloud auth application-default login
# Enable the Cloud Storage API:
gcloud services enable storage.googleapis.com
# Create Bucket to save tfstate-files
region="us-central1" # please check in file `tf-code/variables/infr.tfvars`
bucket="tfstate_files"
gsutil mb -p taskurban -c REGIONAL -l $region -b on gs://$bucket
# Clone repository urban-test
git clone git@github.com:Aleh-Mudrak/urban.git
Use Terraform code
The variables for creating the cloud infrastructure are in the file tf-code/variables/infr.tfvars
# Go to folder `tf-code/infrustructure` and run commands:
cd tf-code/infrustructure
terraform init
terraform apply -var-file ../variables/infr.tfvars -auto-approve
After that, deploy Ingress and Prometheus. The parameters are in the file tf-code/variables/deploy.tfvars
# Go to folder `tf-code/deploy` and run commands:
cd ../deploy
terraform init
terraform apply -var-file ../variables/deploy.tfvars -auto-approve
Connect to Urban-Cluster
Then connect to the cluster:
# Install the gke-gcloud-auth-plugin binary
sudo apt-get install google-cloud-sdk-gke-gcloud-auth-plugin
# Update the kubectl configuration to use the plugin:
cd ../infrustructure
CLUSTER_NAME=$(terraform output -raw cluster_name)
cluster_location=$(terraform output -raw cluster_location)
gcloud container clusters get-credentials $CLUSTER_NAME --region $cluster_location
# Test the connection
kubectl get nodes
Terraform code in folders:
- tf-code/infrustructure - creates the infrastructure: Google Kubernetes Engine (GKE) cluster, network with firewall rules, Google Container Registry (GCR), and service account
- tf-code/deploy - creates the Kubernetes namespaces test, dev, and prod, and deploys Nginx Ingress and Prometheus with Grafana via Helm
- tf-code/modules/service-account - module that creates the service account
Infrastructure
- container-registry.tf - GCR to store Docker images
- k8s-cluster.tf - GKE cluster
- main.tf - Terraform requirements: backend, required providers, provider configuration (google, kubernetes, helm), and data sources
- network.tf - VPC, Subnet, Router, NAT, Firewall
- outputs.tf - Output data
- service-account.tf - Service account used to create the GKE cluster and to deploy from GitHub Actions. Uses the module modules/service-account to create the service account and add roles (see the module documentation)
- variables.tf - Variable declarations. Set the values in a file such as infr.tfvars
Deploy
- ingress.tf - Deploys the Ingress controller in the namespace ingress
- main.tf - Terraform requirements: backend, required providers, provider configuration (google, kubernetes, helm), and data sources
- namaspaces.tf - Creates the namespaces test, dev, and prod in the cluster
- prometheus.tf - Deploys Prometheus in the namespace metrics
- variables.tf - Variable declarations. Set the values in a file such as deploy.tfvars
Module service-account
- main.tf - Create Service Account and Add Roles, Create SA-KEY
- outputs.tf - Output data
- variables.tf - Variable declarations. Set the values in a file such as infr.tfvars
When the infrastructure is ready, you can use GitHub Actions to deploy the application to the Kubernetes cluster.
GitHub Secrets link like this: https://github.com/<Your-Account-Name>/<Your-Repository>/settings/secrets/actions
- GCP_SA_KEY - Service account key used to connect to the cluster
- GKE_PROJECT - Your project_id in Google Cloud
- GKE_CLUSTER - Cluster name
- GKE_ZONE - Region of your cluster
- SLACK_WEBHOOK_URL - Webhook URL used to connect to the Slack API and send messages
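The secrets listed above can also be set from a terminal with the GitHub CLI command gh secret set, instead of through the web page. The following is only a sketch: the helper name set_secret, the DRY_RUN switch, and the repository slug are hypothetical and not part of this repo. By default it only prints the command with the value masked.

```shell
# Hypothetical helper: set one GitHub Actions secret with the GitHub CLI.
# $1 = secret name, $2 = secret value, $3 = repository as OWNER/REPO.
# With DRY_RUN=1 (the default in this sketch) nothing is sent to GitHub;
# the command is printed with the value masked.
set_secret() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "gh secret set $1 --repo $3 --body ***"
  else
    gh secret set "$1" --repo "$3" --body "$2"
  fi
}
```

For example, set_secret GKE_CLUSTER "$CLUSTER_NAME" your-account/your-repository prints the command that would run; calling it with DRY_RUN=0 requires an authenticated gh CLI.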
Screenshots and commands to get the GitHub repository secrets
- You can get the secrets with the script output.sh. Run the script from the scripts/ folder.
- Example of the script output:
- GitHub Secrets link, like this: https://github.com/<Your-Account-Name>/<Your-Repository>/settings/secrets/actions
- Screenshot of the GitHub repository Secrets page
Deploy App
Go to the GitHub Actions page and run the workflow Build and Deploy to GKE.
- Choose the Environment (test|dev|prod)
- Choose the number of Replicas of the application (1-5)
The workflow runs the following steps:
- Checkout - Clone the GitHub repository
- Check_input_Variables - Check the entered data
- Slack_Notification_Start - Send a Slack message announcing the start of the deployment and its initial parameters
- Setup_gcloud - Set up the gcloud CLI and configure Docker to use the gcloud command-line tool as a credential helper
- get_gke_credentials - Get the GKE credentials so we can deploy to the cluster
- Setting_Environment_Variables - Set the environment variables used to build, push, and deploy the application
- Build - Build the application
- Publish - Push the application Docker image to GCR
- Deploy - Deploy the application to the cluster
- Slack_Notification_Finish - Send a Slack message with the deployment results and a link
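The input check in the workflow is not reproduced in this README. As an illustration only, a check of the two inputs (Environment in test|dev|prod, Replicas in 1-5) could look like the sketch below; the function name check_inputs and the error messages are hypothetical.

```shell
# Hypothetical input check in the spirit of the Check_input_Variables step.
# $1 = environment, $2 = replicas.
# Returns 0 only when the environment is test, dev, or prod
# and replicas is a single digit from 1 to 5.
check_inputs() {
  case "$1" in
    test|dev|prod) ;;
    *) echo "Environment must be test, dev, or prod" >&2; return 1 ;;
  esac
  case "$2" in
    [1-5]) ;;
    *) echo "Replicas must be a number from 1 to 5" >&2; return 1 ;;
  esac
  return 0
}
```

Failing early on bad input avoids building and pushing an image for a deployment that would be rejected later.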
The Docker image is named:
gcr.io/$PROJECT_ID/$APP_NAME:$PROJECT_VERSION
Where
- PROJECT_ID - Google Cloud project ID
- APP_NAME - Application name
- PROJECT_VERSION - Built as branch_name-commit_hash:
  - branch_name - The branch the GitHub Actions workflow was started from
  - commit_hash - Short commit hash
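The naming scheme above can be sketched as a small shell function. This is not the workflow's actual code: the function name image_name is hypothetical, the 7-character length of the short hash is an assumption, and so is the replacement of slashes in branch names (a slash is not valid inside a Docker tag).

```shell
# Build the image reference gcr.io/$PROJECT_ID/$APP_NAME:$PROJECT_VERSION.
# $1 = project id, $2 = app name, $3 = branch name, $4 = full commit hash.
image_name() {
  # Assumption: "short" means the first 7 characters of the hash.
  short_hash=$(printf '%s' "$4" | cut -c1-7)
  # Assumption: "/" in a branch name is replaced with "-",
  # because "/" is not allowed in a Docker tag.
  branch=$(printf '%s' "$3" | tr '/' '-')
  printf 'gcr.io/%s/%s:%s-%s\n' "$1" "$2" "$branch" "$short_hash"
}
```

In a local checkout the last two arguments could come from git rev-parse --abbrev-ref HEAD and git rev-parse HEAD.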
The deployment configuration files are in the folder application/deploy-app/
- deploy.yml - Deploys the application
- ingress.yml - Ingress service to reach the application from the Internet
- promMetrics.yml - Deploys a service monitor to get metrics from the application
- service.yml - Service to connect to the application pods
View the application from the Internet
Add a line like this to your hosts file: 34.69.160.165 taskurban.com
Command to edit the file on Linux: sudo vim /etc/hosts
Where
- 34.69.160.165 - IP address from the Slack message
- taskurban.com - URL from the Slack message
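Instead of editing the file by hand, the hosts entry can be appended from a script. The sketch below uses a hypothetical helper, add_host_entry, that adds the line only when the host name is not already listed; the file path is a parameter so it can be tried on a copy first.

```shell
# Append "IP HOST" to a hosts file unless the host name is already listed.
# $1 = IP address, $2 = host name, $3 = hosts file (defaults to /etc/hosts).
add_host_entry() {
  hosts_file="${3:-/etc/hosts}"
  # -F: match the name literally, -w: match it as a whole word.
  if grep -qwF -- "$2" "$hosts_file" 2>/dev/null; then
    echo "$2 is already present in $hosts_file"
  else
    printf '%s %s\n' "$1" "$2" >> "$hosts_file"
  fi
}
```

Writing to /etc/hosts itself requires root privileges, so the function has to run in a root shell for that file.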
- Added a line to the application/package.json file to run the application with the command npm start
  - Line 7: "start": "node app/index.js",
- Added Prometheus metrics code to the file application/app/index.ts to expose metrics to Prometheus
  - Lines 5-32: code added to application/app/index.ts
const express = require('express')
const metrics = require('express-prometheus-metrics')
const app = express();
app.use(
  metrics({
    // The route to expose the metrics on
    metricsPath: '/metrics',
    // How often prometheus should collect the metrics
    interval: 60 * 1000,
    // Any routes that should be ignored
    excludeRoutes: [],
    // Percentiles for request duration summary
    requestDurationBuckets: [0.5, 0.9, 0.95, 0.99],
    // Time buckets for request duration histogram
    requestDurationHistogramBuckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10],
    // Size buckets for request
    requestSizeBuckets: [5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000],
    // Size buckets for response
    responseSizeBuckets: [5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000],
  }),
)
- Created a Dockerfile to build the image
  - Added commands for Prometheus metrics:
RUN npm add express-prometheus-metrics
RUN npm add pkginfo
To destroy the infrastructure, use the script destroy.sh.
Run the script from the scripts/ folder (it takes about 15-20 minutes).
The goal of the task is to demonstrate how a candidate can create an environment with terraform. You should commit little and often to show your ways of working
- The environment should get created in Google Cloud Platform
- Create a VPC native Kubernetes cluster
- Host the provided Node.js application provided in the app folder in the created cluster with 3 replicas
- Expose the provided application to the public internet
- Include at least 1 custom module in Terraform
- Add the prometheus-client to the provided application and expose one metric on a /metrics endpoint
- Write down some thoughts about what compromises you've applied (if any) and how would you like to improve the solution
- Code quality
- Solution architecture
- Whether the code is "production-ready" (i.e. the environment starts and works as expected)
Any solution can be improved, but usually we don't have free time for this and we have to choose a more effective way to solve our tasks. In this task, I created the GKE infrastructure and described two ways to deploy it, and added scripts to get variables for GitHub Actions and to destroy it. I prefer to create easy-to-understand solutions by adding comments to the code and documentation where possible.
- The folders in the repo have been sorted and moved by category and logic.
- All parameters have default values in variables.tf.
  - Unimportant parameters were removed from infr.tfvars:
    - Cluster parameters
    - Network parameters
    - Service Account parameters
  - The Bash script gets its initial parameters from infr.tfvars
- Used for_each to create multiple node pools in a cluster.
- Used for_each to create multiple firewall rules.
- Data sources in the main.tf file are used to connect to the cluster during the Deploy step.
- The same output data is used to connect to the cluster during the Deploy step and in GitHub Actions.
- The Prometheus scrape issue has been resolved. The problem was in the service labels.
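One item in the list above is that the Bash script gets its initial parameters from infr.tfvars. The scripts themselves are not shown in this README, so the following is only a sketch of one way to read a flat string variable such as project_id or region; the function name tfvar is hypothetical, and lists, maps, and heredocs are not handled.

```shell
# Print the value of a simple  key = "value"  assignment from a .tfvars file.
# $1 = variable name, $2 = path to the .tfvars file.
# Prints nothing when the variable is not found.
tfvar() {
  sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$2" | head -n 1
}
```

For example, region=$(tfvar region ../tf-code/variables/infr.tfvars) would pick up the region when run from the scripts folder, assuming the file uses plain quoted string assignments.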
Compromises:
- Start scripts can be improved:
- Get variables from Google Secret Manager;
- Add Secret GKE_SA_KEY in the GitHub Repository;
- Add more checks
- Terraform:
- Terraform Cloud is a good solution to use with a GitHub repository;
- The application and the GitHub Actions workflow should be in one repo, the Terraform code in another;
- The Infrastructure and Deploy Terraform code should be separated into different git repositories;
- Output variables can be added to the Deploy part;
- An option to disable the Prometheus deployment can be added;
- Firewall rules can be moved to the Deploy part of the Terraform code;
- Can add more modules:
- Create GKE Cluster and Nodes;
- Network with VPC, Subnet, NAT, and Router;
- Firewall
- You can use Terragrunt if you manage many GKE clusters.
- Variables in the Terraform code can be grouped into objects.
- GitHub Actions can be improved with:
- steps: test-application, cache, deploy by git tag-version;
- Helm charts;
- Some Terraform Secrets can be moved to GitHub Secrets by GH CLI.
- Prod and test+dev deployments should be in different clusters.