DevOps and Running the Application
This document explains the DevOps setup and utilities for the AEST/SIMS project.
N.B: ROOT means repository root directory
- DevOps
- OpenShift namespace
- OpenShift Client (OC CLI)
- Keycloak realm
- Docker (for local development only)
- Make cmd (for local development only - windows users)
- 5.1 Install Chocolatey first in order to install 'make'. In a CMD terminal execute:
@"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "[System.Net.ServicePointManager]::SecurityProtocol = 3072; iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"
or refer to: https://docs.chocolatey.org/en-us/choco/setup
- 5.2 Install make using the command:
choco install make
- Test cmd (you can comment out when using as well - windows users)
- Clone repo to the local machine
git clone https://github.com/bcgov/SIMS
- Create a .env file in the repository root directory. For reference, check /config/env-example in the Microsoft Teams files. Then run all make commands from the /sources directory (see below).
- To build the application:
make local-build
- To run all web+api+DB:
make local
- To stop the whole application stack:
make stop
- To clean all applications including storage:
make local-clean
- To run database only:
make postgres
- To run api with database:
make local-api
- Shell into local api container:
make api
- Run api test on Docker:
make test-api
- To run local redis (local redis is required to run the application):
make local-redis
or
make redis
- To run queue-consumers in local docker:
make queue-consumers
- To run forms in local docker:
make forms
- To run clamav in local docker:
make clamav
- To run Camunda, run
make camunda
from sources.
- Run
npm i
from packages/backend/workflows.
- To deploy workflows, run
make deploy-camunda-definitions
from SIMS/sources, or
npm run deploy
from the packages/backend/workflows folder.
- Run
npm i
from the packages/web folder.
- To run web, run
npm run serve
from the packages/web folder.
- Run
npm i
from packages/backend, then run
npm run start:[dev][debug] [workers][api][other]
OpenShift is a cloud-native deployment platform that runs all our application stacks. The OpenShift CLI (oc) is required to run any OpenShift operation from a local machine or from the OpenShift web console.
- Developers need an account on the OpenShift 4 cluster managed by BC Gov.
- Copy the temporary token from the web console and use it to log in:
oc login --token=#Token --server=https://api.silver.devops.gov.bc.ca:6443
- After login, verify all the attached namespaces:
oc projects
- Select any project:
oc project #ProjectName
- Application images are built in a single namespace (the tools namespace).
- Images are promoted to the different environments using Deployment Config.
- All application secrets and configs are kept in OpenShift Secrets and ConfigMaps. These values are injected into the target application through the deployment config.
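The login steps above can be wrapped in a small helper. This is only a sketch: the server URL is the one from the login command in this document, while the `run` and `oc_session` function names and the `DRY_RUN` switch are assumptions introduced for illustration. By default the helper only prints the commands, so it can be reviewed before anything is executed.

```shell
#!/usr/bin/env bash
# Sketch: log in, list the attached namespaces, and select one project.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
# The token and project name are placeholders supplied by the caller.

OC_SERVER="https://api.silver.devops.gov.bc.ca:6443"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

oc_session() {
  token="$1"
  project="$2"
  if [ -z "$token" ] || [ -z "$project" ]; then
    echo "usage: oc_session <token> <project>" >&2
    return 1
  fi
  run oc login --token="$token" --server="$OC_SERVER" &&
    run oc projects &&
    run oc project "$project"
}
```

For example, `oc_session "$TOKEN" my-project` prints the three commands, and `DRY_RUN=0 oc_session "$TOKEN" my-project` runs them.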
All the OpenShift-related template files are stored under ROOT/devops/openshift/.
- api-deploy.yml: API deployment config template.
- db-migrations-job.yml: DB migrations job template.
- docker-build.yml: Generic builder template.
- forms-build.yml: Formio builder template.
- forms-deploy.yml: Formio deployment config template.
- init-secrets.yml: Init secrets for the initial environment setup on OpenShift.
- networkpolicy.yml: Network policy security template for OpenShift.
- queue-consumers-deploy.yml: Queue Consumers deployment config template.
- security-init.yml: Network and security policies template to enable any namespace for application dev.
- web-deploy.yml: Web app deployment config template.
- workers-deploy.yml: Workers deployment config template.
All the database-related template files are stored under ROOT/devops/openshift/database/.
- mongo-ha-param.yml: Parameter file to run the mongo template file mongo-ha.yml.
- mongo-ha.yml: HA Mongo StatefulSet deployment config template.
- redis-ha-deploy.yml: Redis StatefulSet deployment config template.
- redis-secrets.yml: Redis secrets template.
We have created a set of make helper commands. Use the following steps to set up any namespace.
- Set up your env variables in the ROOT/.env file or in ROOT/devops/Makefile; a sample env file is available under ROOT/configs/env-example. The essential env variables are:
- NAMESPACE
- BUILD_NAMESPACE
- HOST_PREFIX (optional)
- BUILD_REF (optional, git branch/tag to use for building images)
- BUILD_ID (optional, default is 1)
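Before running the make commands, it can help to confirm that the required variables are present in the env file. The sketch below assumes the file uses plain `KEY=value` lines and treats only NAMESPACE and BUILD_NAMESPACE as required, since the list above marks the others as optional. The function name `check_env_file` is hypothetical.

```shell
#!/usr/bin/env bash
# check_env_file FILE: report which required variables are missing or empty in FILE.
# Assumes plain KEY=value lines; quoting and "export" prefixes are not handled.
check_env_file() {
  env_file="$1"
  missing=""
  for name in NAMESPACE BUILD_NAMESPACE; do
    # A variable counts as set only if it has a non-empty value.
    if ! grep -Eq "^${name}=.+" "$env_file" 2>/dev/null; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}
```

Calling `check_env_file .env` from the repository root prints `ok` or the names that still need a value.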
- Log in and select the namespace.
- Set up the OpenShift network policies, security policies, ClusterRole addition for image puller, and github-action rolebinding:
make init-oc NAMESPACE=$namespace
- Run the Github action ClamAV - Install/Upgrade/Remove to create the ClamAV server in OpenShift.
- Select the Environment to create the server.
- Input the action: install, upgrade, or uninstall.
- Input the ClamAV Image Tag for the version of the ClamAV server in the workflow and click 'Run workflow'.
- Run the Github action Crunchy DB - Install/Upgrade to create the Crunchy server in OpenShift.
- Select the Environment to create the server.
- Input the action install.
- Click 'Run workflow'.
- Add the following to the existing env variables in the ROOT/.env file or in ROOT/devops/Makefile; a sample env file is available under ROOT/configs/env-example. The essential env variables are:
- INIT_ZONE_B_SFTP_SERVER=
- INIT_ZONE_B_SFTP_SERVER_PORT=
- INIT_ZONE_B_SFTP_USER_NAME=
- INIT_ZONE_B_SFTP_PRIVATE_KEY_PASSPHRASE=
- Add the private key for the Zone B SFTP server to the file ROOT/devops/openshift/zone-b-private-key.cer
- Set up the Zone B SFTP secrets:
make init-zone-b-sftp-secret NAMESPACE=$namespace
- Add the appropriate OpenShift secrets in Github Secrets -> Environments.
- Run the Github action Env Setup - Deploy SIMS Secrets to Openshift to create all the secrets in OpenShift.
- Select the Environment to create the secrets.
- Input the tag as Build Ref in the workflow and click 'Run workflow'.
- Populate mongo-ha-param.yml with the values required for Mongo DB creation.
- Create the Mongo DB:
make oc-deploy-ha-mongo NAMESPACE=$namespace
Note: For a fresh install, Forms may need to be built in the tools namespace and then deployed. When deploying into a new environment where the build is already available, Deploy Forms is enough.
- Run the Github action Env Setup - Build Forms Server to build the formio (Forms server) in the tools namespace of OpenShift.
- The minimum version of the formio server to be deployed is v2.5.3. Please refer to the formio tag url for any updates needed.
- Input the tag as Build Ref in the workflow and click 'Run workflow'.
- Fetch the mongo-url secret from the mongodb-ha-creds created by the previous oc-deploy-ha-mongo make command and update GITHUB secrets -> Environment -> MONGODB_URI.
- Run the Github action Env Setup - Deploy Forms Server to deploy the formio (Forms server) in the tools namespace of OpenShift.
- Select the Environment to build and deploy the Forms server and its related secrets, service and routes.
- The minimum version of the formio server to be deployed is v2.5.3. Please refer to the formio tag url for any updates needed.
- Input the tag as Build Ref in the workflow and click 'Run workflow'.
- Fetch the secrets from {HOST-PREFIX}-forms, created as part of the previous Github action, and update GITHUB secrets -> Environment with:
- FORMIO_ROOT_EMAIL : FORMS_SA_USER_NAME
- FORMIO_ROOT_PASSWORD : FORMS_SA_PASSWORD
- FORMS_URL : FORMS_URL
- FORMS_SECRET_NAME : {HOST-PREFIX}-forms
- Run the Github action Release - Deploy Form.io resources to deploy the forms resources to the formio server.
- Select the Environment to build and deploy the Forms server and its related secrets, service and routes.
- Input the tag as Build Ref in the workflow and click 'Run workflow'.
Redis Setup through Make
- Set up the Redis secrets:
make init-redis NAMESPACE=$namespace
- Deploy Redis with 6 replicas:
make deploy-redis NAMESPACE=$namespace
- Initialize the Redis Cluster:
- Make sure that all the redis pods are up and running before initializing the cluster:
make init-redis-cluster NAMESPACE=$namespace REDIS_PORT=$redis_port
- When prompted, type 'yes'.
Redis Setup through Github Actions
- Set up Redis:
- Go to Env Setup - Deploy Redis in Openshift. Select the branch or tag and the environment. Click on "Run Workflow".
- Initialize the Redis Cluster:
- Make sure that all the redis pods are up and running before initializing the cluster.
- Go to Env Setup - Initialize Redis Cluster in Openshift. Select the branch or tag and the environment. Click on "Run Workflow".
- Run the Github action Release - Deploy to deploy the API, Web, Workers, and Queue-consumers in the namespace.
- Input the tag as Git Ref in the workflow.
- Select the Environment to deploy the API, Web, Workers, and Queue-consumers.
- Click 'Run workflow'.
- Note: Crunchy has automatic backup jobs, so the database can be restored to a particular timestamp. A make file has been created for ease of execution.
- Run the Github action Crunchy DB - Install/Upgrade to upgrade the crunchy helm chart in the OpenShift server.
- Select the Environment in which to restore crunchy.
- Input the action upgrade.
- Check the checkbox Enable restore.
- Input the timestamp to which the DB has to be restored, in 'YYYY-MM-DD HH:MM:SS' format.
- Click 'Run workflow'.
- To run the helm restore, the postgres operator requires a command to be run from your local machine as a security measure.
- Connect to the OpenShift server locally using the oc commands and the oc token.
- Get proper approvals before the restore command is executed.
- Run the following command, replacing <namespace> with the appropriate environment namespace, to start the restore:
kubectl annotate -n <namespace> postgrescluster simsdb --overwrite postgres-operator.crunchydata.com/pgbackrest-restore="$(date)"
- Disable Restore
- Update crunchy in OpenShift with the restore disabled by running the helm upgrade Github action Crunchy DB - Install/Upgrade again.
- This disables the restore in the helm chart, so even if the helm restore command from the previous steps is run again locally, the restore will not happen.
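Because the restore workflow expects the timestamp in 'YYYY-MM-DD HH:MM:SS' format, a quick local check can catch typos before the workflow is started. The following is a minimal sketch: the function name `is_restore_timestamp` is hypothetical, and it validates only the shape and basic ranges of the value, not whether the date exists or a backup covers it.

```shell
#!/usr/bin/env bash
# is_restore_timestamp VALUE: succeed only if VALUE looks like 'YYYY-MM-DD HH:MM:SS'.
# Checks digits and basic ranges (month 01-12, day 01-31, hour 00-23);
# it does not reject impossible dates such as 2024-02-31.
is_restore_timestamp() {
  printf '%s\n' "$1" | grep -Eq \
    '^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01]) ([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]$'
}
```

For example, `is_restore_timestamp '2024-03-15 13:45:00' && echo valid` prints `valid`, while a value without seconds is rejected.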
Steps to Perform in Master Node of Postgres
- Wait around 20 minutes after the Helm upgrade for the helm chart to deploy completely.
- Run connect-[ENV]-db-superuser for each environment from ~/sources/makefile, e.g. make connect-dev-db-superuser MASTER_POD=Pod_id
- For superuser credentials, look in the OpenShift secrets. Secret name: simsdb-pguser-postgres. Secret key names: user and password.
- Once connected to the database as superuser, run the following commands:
GRANT USAGE ON SCHEMA information_schema TO "read-only-user";
GRANT USAGE ON SCHEMA sims TO "read-only-user";
GRANT SELECT ON ALL TABLES IN SCHEMA sims TO "read-only-user";
ALTER DEFAULT PRIVILEGES IN SCHEMA sims GRANT SELECT ON TABLES TO "read-only-user";
- Delete the resources associated with the Mongo database (PVCs are not deleted):
oc-db-backup-delete-mongodb
Follow the steps below when the redis cluster is restarted, or its pods are brought down and started again, and the cluster does not recover gracefully in the OpenShift environment, or when a Redis node does not connect to another node and join the cluster.
- STEP 1: Bring down the slave pods: scale the redis pods from 6 to 3 in OpenShift.
- STEP 2: Run the Github action Env Setup - Redis recovery in Openshift, then check that the masters have completed Cluster meet successfully.
- STEP 3: Scale the redis pod replicas back to 6. The queue-consumers should then recover automatically.
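The scaling in steps 1 and 3 can also be done from the command line instead of the web console. The sketch below assumes the Redis StatefulSet is named `redis` (inferred from pod names such as redis-0, not confirmed by this document); the `scale_redis` function and the `DRY_RUN` switch are hypothetical. By default it only prints the command.

```shell
#!/usr/bin/env bash
# scale_redis NAMESPACE REPLICAS: scale the Redis StatefulSet.
# Assumption: the StatefulSet is named "redis" (inferred from pod names like redis-0).
# DRY_RUN=1 (the default) only prints the command; set DRY_RUN=0 to execute it.
scale_redis() {
  namespace="$1"
  replicas="$2"
  # Reject empty or non-numeric replica counts.
  case "$replicas" in
    ''|*[!0-9]*) echo "replicas must be a non-negative integer" >&2; return 1 ;;
  esac
  if [ -z "$namespace" ]; then
    echo "usage: scale_redis <namespace> <replicas>" >&2
    return 1
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "oc scale statefulset redis --replicas=$replicas -n $namespace"
  else
    oc scale statefulset redis --replicas="$replicas" -n "$namespace"
  fi
}
```

A recovery run would then call `scale_redis "$namespace" 3`, trigger the recovery Github action, and finish with `scale_redis "$namespace" 6`.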
When the Redis Cluster is not able to connect the slaves or the masters, or in most scenarios where redis is failing due to slave-master or master-master connection issues, follow the steps below.
- To check, go to one of the redis pods (redis-0 in this case) and open 'Terminal'.
- Run the command redis-cli to enter the redis command line terminal, where you can run cluster commands.
- Run cluster nodes to see the IPs and the slave status.
- Check the IP of one of the master redis pods (redis-1/redis-2) by going into the details of that pod. Note: sometimes those might not be the masters; check the logs to verify that at least 3 masters are up before proceeding.
- If the IPs in the cluster nodes output and the pod IPs are not the same, the redis cluster is trying to connect to a master that has a different IP.
- Steps to recover the master pods
- Bring down the slave pods: scale the redis pods from 6 to 3 in OpenShift. (Note: sometimes those might not be the masters; check the logs to verify that at least 3 masters are up and, if needed, bring the pods down to 4.)
- Run the Github action Env Setup - Redis recovery in Openshift, then check that the masters have completed Cluster meet successfully.
- Scale the redis pod replicas to 6 again. This should update the nodes.conf file in the redis cluster, and when you run the cluster nodes command again in redis-0, you should see that the masters are connected but the slaves are failing.
- This is because the masters are up and connected in the cluster but there are no slaves to serve their slots. This can be verified by checking one of the master redis pods.
- The logs of the slave redis pods also show them trying, without success, to connect to the master.
- Steps to recover the slave pods
- Delete the slave pods manually.
- This should automatically update nodes.conf so that new slaves connect to the cluster, and the address is updated for some slave pods.
- Now, when you run the same cluster nodes command in one of the masters, you can see that all the slaves and masters are connected successfully and the IPs match those of the pods.
- Now the queue-consumers pods will recover automatically.
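The IP comparison used in this diagnosis can be scripted once the output of cluster nodes and the current pod IPs have been saved to files. This is a sketch under assumptions: it relies on the second field of each cluster nodes line having the form ip:port@cport, handles IPv4 addresses only, and expects the pod IP file to contain one IP per line. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# cluster_node_ips: read `cluster nodes` output on stdin and print one IP per line.
# The second field of each line looks like ip:port@cport (IPv4 only in this sketch).
cluster_node_ips() {
  awk '{ split($2, addr, ":"); print addr[1] }'
}

# stale_cluster_ips NODES_FILE PODS_FILE: print the IPs the cluster still
# references that no pod currently has. PODS_FILE holds one pod IP per line.
stale_cluster_ips() {
  nodes_file="$1"
  pods_file="$2"
  cluster_node_ips < "$nodes_file" | sort -u | while read -r ip; do
    # -F: fixed string, -x: whole-line match, so 10.0.0.1 does not match 10.0.0.10.
    if ! grep -Fxq "$ip" "$pods_file"; then
      echo "$ip"
    fi
  done
}
```

Any IP printed by `stale_cluster_ips nodes.txt pods.txt` is an address the cluster remembers but no running pod owns, which matches the mismatch described above. The pod IPs can be read from the IP column of `oc get pods -o wide`.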
When nothing above works: Important, this is the last resort. We will not have any backups of the db, as it also clears the PVC.
When the redis nodes are unable to connect to the cluster, we have to delete and redeploy the redis instances and create the cluster again to get redis up and running normally.
Note: This is just a temporary solution, as we do not have a permanent solution in place to recover the redis cluster. This process deletes all the redis data, because the stateful set has to be deleted.
Follow the instructions below to deploy the redis instances and create the cluster.
- STEP 1: Run the following command to delete the stateful set and other redis dependencies:
make delete-redis NAMESPACE=$namespace
Or go to Env Setup - Delete Redis in Openshift, select a branch or tag, select the environment and click on "Run Workflow".
- STEP 2: Follow the instructions from the sections above, Redis Setup through Make or Redis Setup through Github Actions, to set up redis.
- STEP 3: Restart queue-consumers and api.
- STEP 4: If this activity is performed in the DEV environment (!!ONLY DEV ENVIRONMENT!!), pause the schedulers used for file integrations.
- All pods are created with 2 replicas and will go to a maximum of 10 replicas when load increases.
- Pod disruption budget is set for all the deployment config pods (except the db backup container) with a maxUnavailable value of 1.
- For the databases, mongo has a maxUnavailable value of 1, while redis has a maxUnavailable value of 2.
- With this configuration, when nodes are drained or maintenance is happening, only one node is drained at a time and the application stays live.
- Sample PDB for our API is shown below:
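As a rough illustration of such a budget, the helper below prints a PodDisruptionBudget manifest with a maxUnavailable value of 1. It is a sketch, not the project's actual manifest: the `print_pdb` function, the `<name>-pdb` resource name, and the `app` label selector are all assumptions that would need to match the labels on the real pods.

```shell
#!/usr/bin/env bash
# print_pdb NAME [MAX_UNAVAILABLE]: print a PodDisruptionBudget manifest.
# Assumptions: pods carry the label app=NAME and the PDB is called NAME-pdb.
# MAX_UNAVAILABLE defaults to 1, the value described for the application pods.
print_pdb() {
  name="$1"
  max_unavailable="${2:-1}"
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ${name}-pdb
spec:
  maxUnavailable: ${max_unavailable}
  selector:
    matchLabels:
      app: ${name}
EOF
}
```

The output can be reviewed first and then applied with, for example, `print_pdb api | oc apply -n "$namespace" -f -`. A redis budget with the higher limit would be `print_pdb redis 2`.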