Kubernetes configs and utilities for managing Plants of the World Online on Google Cloud. For the POWO source code see the powop repository.
There are two main deployed components:
- The POWO builder which orchestrates weekly rebuilds of the POWO site
- The POWO site itself
- Overview
- POWO Site
- POWO Builder
- Reference
The POWO site deployment is the collection of services that makes up POWO (and related sites). It combines the Helm configuration in powo/ with the images built by the powop repo build process.
The site is redeployed from scratch every week by the POWO builder - Helm release, data, everything! However, this build takes time, only runs once a week, and is aimed more at keeping data up to date than at making releases. For making releases there are two more convenient options than waiting for the weekly build: an upgrade or a manually triggered build.
If you are just making changes to the portal/dashboard and there are no data changes, you can deploy without rebuilding all the data.
To do this you need to:
- Build and push Docker images:
mvn clean deploy
- Update the image tags (in powo/prod.yaml for prod or powo/uat.yaml for uat) and commit these changes
- Push the image tags to Github origin (this is required so that when the builder next runs it uses the same image) - a combined example of these steps is sketched after the upgrade commands below
- Work out required variables:
  - $RELEASE_CONTEXT - powo-dev for UAT, powo-prod for production (or what you have set up in your Kubernetes context to access the relevant cluster)
  - $RELEASE - the latest built release, get it using helm ls
  - $ENVIRONMENT - uat for UAT, prod for production
- Upgrade the current release with the latest tags
helm upgrade $RELEASE powo/ --kube-context $RELEASE_CONTEXT -f secrets/$ENVIRONMENT/secrets.yaml -f powo/$ENVIRONMENT.yaml
For UAT:
helm upgrade $RELEASE powo/ -f secrets/uat/secrets.yaml -f powo/uat.yaml
For Prod:
helm upgrade $RELEASE powo/ -f secrets/prod/secrets.yaml -f powo/prod.yaml
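Putting these steps together, a minimal end-to-end sketch for a UAT upgrade might look like the following (illustrative only - the commit message is arbitrary and the real release name should be taken from helm ls):
# Build and push the Docker images
mvn clean deploy
# Commit and push the updated image tags (assumes powo/uat.yaml has already been edited)
git add powo/uat.yaml
git commit -m "Bump POWO image tags"
git push
# Find the current release, then upgrade it with the new tags
helm ls --kube-context powo-dev
helm upgrade $RELEASE powo/ --kube-context powo-dev -f secrets/uat/secrets.yaml -f powo/uat.yaml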
Occasionally, upgrading the portal does not work properly. The ingress and portal containers become out of sync, so the CSS and JS assets are not loaded. TODO: figure out exactly why this happens.
To fix this, the steps are as follows:
- Delete the bad container images in the Google Cloud Container Registry
- Re-create the images with
mvn clean deploy
- Restart the broken pods as follows:
- Get the namespace of the broken pods:
kubectl get ns
- Scale the portal and ingress deployments to 0 pods:
kubectl scale --replicas=0 deployments/portal -n uat-nneom
kubectl scale --replicas=0 deployments/ingress -n uat-nneom
- Scale the portal and ingress deployments to 1 pod:
kubectl scale --replicas=1 deployments/portal -n uat-nneom
kubectl scale --replicas=1 deployments/ingress -n uat-nneom
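Alternatively, on kubectl 1.15 or newer you can restart the deployments in place instead of scaling them down and up (same effect, but not part of the original steps above):
kubectl rollout restart deployments/portal -n uat-nneom
kubectl rollout restart deployments/ingress -n uat-nneom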
If you want to deploy new images and also rebuild the data, you can trigger a build job immediately based on the CronJob
defined by the POWO builder.
- Build and push Docker images:
mvn clean deploy
- Update the image tags (in powo/prod.yaml for prod or powo/uat.yaml for uat) and commit these changes
- Push the image tags to Github origin (this is required so that when the builder next runs it uses the same image)
- Work out required variables:
  - $RELEASE_CONTEXT - powo-dev for UAT, powo-prod for production (or what you have set up in your Kubernetes context to access the relevant cluster)
  - $NAMESPACE - builder-uat for UAT, builder-prod for production
- Create the job from the builder CronJob:
kubectl create job deploy-manual --from=cronjob/builder --namespace=$NAMESPACE --context $RELEASE_CONTEXT
You may need to change the job name from deploy-manual if a job with that name already exists.
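To confirm the manually triggered build is running, you can check the job's pods and follow its logs (not part of the original steps, just a sanity check; adjust the namespace and job name if you changed them):
kubectl get pods -n builder-uat --context $RELEASE_CONTEXT
kubectl logs -f job/deploy-manual -n builder-uat --context $RELEASE_CONTEXT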
The automated builds of the POWO site are executed based on the schedule defined in powo-builder/prod.yaml and powo-builder/uat.yaml.
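To check which schedule is actually active on the cluster, you can inspect the builder CronJob (UAT shown; adjust the namespace and context for prod):
kubectl get cronjob builder -n builder-uat --context powo-dev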
You may need to manage the automated builds, for example if one has failed or was triggered with incorrect configuration.
To cancel a job we need to:
- Stop the job and any pods it has created
- Remove the Helm deployment created as part of the job
- Remove the namespace created as part of the job.
- Get the job name (you will probably be looking for the youngest job):
kubectl get jobs -n builder-uat
- Delete the job:
kubectl delete jobs/<job_name> -n builder-uat
- Get the name of the Helm release (you want the release which was created later):
helm ls
- Delete the Helm release (the release name is also the name of the namespace - you will need it in the following step):
helm delete --purge <release_name>
- Get the namespace name based on the previous step, or use the youngest namespace listed by:
kubectl get ns
- Delete the old namespace (this step is required since the builder raises an error if there would be more than 2 namespaces at one time):
kubectl delete ns <namespace_name>
Reference documentation for one-time setup etc.
Both methods use Helm to deploy the necessary components to your cluster. See Helm installation instructions to get set up.
Once Helm is installed on your machine and the Kubernetes cluster, you have to bootstrap a few cluster-wide resources by running the bootstrap.sh script. This initialises storage classes and RBAC roles needed for installation.
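For example (assuming your kubectl context already points at the target cluster and that the script takes no arguments):
kubectl config use-context powo-dev
./bootstrap.sh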
The POWO builder allows you to re-build the entire stack, and re-load a set of data, automatically on a fixed schedule. It does this by running a "builder" script on a cron schedule that:
- deploys a new release in its own namespace
- loads a data configuration file
- harvests all data
- swaps DNS from old release to new when new release is complete
- deletes old release
This process allows automated data updates to happen in the background without impacting the performance and functioning of the live site.
When deploying on GCP, this will require a service account with the "Kubernetes Engine Developer", "DNS Administrator", and "Storage Admin" roles. The service account key is then deployed in a secret to the builder.
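As a rough sketch of how such a service account might be created with gcloud (the project ID and account name here are placeholders, and the resulting key would then go into the deployer secrets file used below):
gcloud iam service-accounts create powo-builder --project my-project
gcloud projects add-iam-policy-binding my-project \
  --member serviceAccount:powo-builder@my-project.iam.gserviceaccount.com \
  --role roles/container.developer
gcloud projects add-iam-policy-binding my-project \
  --member serviceAccount:powo-builder@my-project.iam.gserviceaccount.com \
  --role roles/dns.admin
gcloud projects add-iam-policy-binding my-project \
  --member serviceAccount:powo-builder@my-project.iam.gserviceaccount.com \
  --role roles/storage.admin
gcloud iam service-accounts keys create key.json \
  --iam-account powo-builder@my-project.iam.gserviceaccount.com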
As an example, to deploy a builder that will build dev
environment releases, run:
$ helm install -f secrets/dev/secrets.yaml -f secrets/deployer/secrets.yaml --namespace builder-dev --name builder-dev powo-builder/
For more details on production operations, please see the production operations manual.
To upgrade the builder:
helm upgrade builder-uat powo-builder/ -f powo-builder/uat.yaml -f secrets/deployer/secrets.yaml -f secrets/uat/secrets.yaml
Then you can install POWO by running
$ helm install -f [ path to secrets file ] powo/
Parallel releases can be installed in the same cluster by specifying a --namespace
$ helm install --namespace uat -f [ path to secrets file ] powo/
Any namespace-specific overrides are in values files named the same as the namespace
$ helm install --namespace uat -f uat.yaml -f [path to uat secrets] powo/
Upgrade
$ helm upgrade --namespace uat -f uat.yaml -f [path to uat secrets] uat powo/