Openshift support (hashicorp#600)
Helm chart values and template changes:

* Add a new Helm value global.openshift.enabled
* Add a SecurityContextConstraint for the consul clients when global.openshift.enabled is set to true
* Don't set fsGroup for the servers when global.openshift.enabled is set to true
* Remove server.disableFsGroupSecurityContext value and fail in the chart if someone tries to set it
* Increase memory limits and requests from 25Mi to 50Mi for all jobs and for the service-init containers in the terminating and ingress gateway deployments. The new value was determined mostly by running the tests. Other containers still have the 25Mi memory request and limit, but they were not causing failures on OpenShift.

Acceptance test changes:

* Add a new flag -enable-openshift to the framework which will set global.openshift.enabled to true for all helm installs/upgrades.
* Increase timeouts in various places because it takes longer for things to be created on OpenShift.
* Change consul-dns test to retry and to not use TTY since it's not always available.

CI changes:

* Add a new job to run acceptance tests against OpenShift. Note that it currently runs against pre-created Azure Red Hat OpenShift clusters. I was not able to get Terraform to create them before each run, mainly because the clusters take an unpredictable amount of time to become ready. That readiness has been hard to detect from a script or from CI, which resulted in intermittent test failures. We will have to automate OpenShift cluster creation further at a later time.
* Add a new workflow to run acceptance tests nightly.
* Add Azure CLI and OpenShift CLI to the Docker image used in CI
ishustava authored Oct 5, 2020
1 parent f1286d7 commit 74eeb15
Showing 24 changed files with 413 additions and 76 deletions.
65 changes: 64 additions & 1 deletion .circleci/config.yml
@@ -22,7 +22,7 @@ jobs:
command: bats ./test/unit
unit-helm3:
docker:
- image: hashicorpdev/consul-helm-test:0.5.0
- image: hashicorpdev/consul-helm-test:0.6.0

steps:
- checkout
@@ -153,6 +153,59 @@ jobs:
terraform destroy -var project=${CLOUDSDK_CORE_PROJECT} -auto-approve
when: always

acceptance-openshift:
environment:
- TEST_RESULTS: /tmp/test-results
# Primary and secondary Azure OpenShift clusters (created manually) that are used to run acceptance tests.
- OC_PRIMARY_NAME: consul-helm-test-2757871175
- OC_SECONDARY_NAME: consul-helm-test-3737660519
docker:
# This image is build from test/docker/Test.dockerfile
- image: hashicorpdev/consul-helm-test:0.6.0

steps:
- checkout

- run:
name: openshift login
command: |
az login --service-principal -u "$ARM_CLIENT_ID" -p "$ARM_CLIENT_SECRET" --tenant "$ARM_TENANT_ID" > /dev/null
for cluster_name in "$OC_PRIMARY_NAME" "$OC_SECONDARY_NAME"; do
apiServer=$(az aro show -g "$cluster_name" -n "$cluster_name" --query apiserverProfile.url -o tsv)
kubeUser=$(az aro list-credentials -g "$cluster_name" -n "$cluster_name" | jq -r .kubeadminUsername)
kubePassword=$(az aro list-credentials -g "$cluster_name" -n "$cluster_name" | jq -r .kubeadminPassword)
KUBECONFIG="$HOME/.kube/$cluster_name" oc login "$apiServer" -u "$kubeUser" -p "$kubePassword"
KUBECONFIG="$HOME/.kube/$cluster_name" oc project consul
done
# Restore go module cache if there is one
- restore_cache:
keys:
- consul-helm-modcache-v1-{{ checksum "test/acceptance/go.mod" }}

- run: mkdir -p $TEST_RESULTS

- run:
name: Run acceptance tests
working_directory: test/acceptance/tests
no_output_timeout: 30m
command: |
gotestsum --junitfile "$TEST_RESULTS/gotestsum-report.xml" -- ./... -p 1 -timeout 30m -failfast \
-enable-openshift \
-enable-enterprise \
-enable-multi-cluster \
-kubeconfig="$HOME/.kube/$OC_PRIMARY_NAME" \
-secondary-kubeconfig="$HOME/.kube/$OC_SECONDARY_NAME" \
-debug-directory="$TEST_RESULTS/debug" \
-consul-k8s-image=hashicorpdev/consul-k8s:latest
- store_test_results:
path: /tmp/test-results
- store_artifacts:
path: /tmp/test-results

update-helm-charts-index:
docker:
- image: circleci/golang:latest
@@ -196,6 +249,16 @@ workflows:
- unit-helm2
- unit-helm3
- unit-acceptance-framework
nightly-acceptance-tests:
triggers:
- schedule:
cron: "0 0 * * *"
filters:
branches:
only:
- master
jobs:
- acceptance-openshift
update-helm-charts-index:
jobs:
- update-helm-charts-index:
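The "openshift login" step in the acceptance-openshift job calls `az aro list-credentials` twice per cluster and extracts one field from each call with `jq`. As a minimal sketch of the same extraction done once, the Go snippet below decodes both fields from a single JSON document. The field names `kubeadminUsername` and `kubeadminPassword` come from the job above; the type name, function name, and sample payload are hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// aroCredentials holds the two fields the CI job reads with jq.
// The struct itself is a hypothetical helper, not part of the commit.
type aroCredentials struct {
	KubeadminUsername string `json:"kubeadminUsername"`
	KubeadminPassword string `json:"kubeadminPassword"`
}

// parseCredentials decodes the JSON once and rejects payloads that
// are missing either field, so a bad response fails early.
func parseCredentials(raw []byte) (aroCredentials, error) {
	var creds aroCredentials
	if err := json.Unmarshal(raw, &creds); err != nil {
		return aroCredentials{}, fmt.Errorf("decoding credentials: %w", err)
	}
	if creds.KubeadminUsername == "" || creds.KubeadminPassword == "" {
		return aroCredentials{}, fmt.Errorf("credentials are missing a username or password")
	}
	return creds, nil
}

func main() {
	// Made-up sample payload; real output comes from the Azure CLI.
	sample := []byte(`{"kubeadminUsername": "kubeadmin", "kubeadminPassword": "example-password"}`)
	creds, err := parseCredentials(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("user:", creds.KubeadminUsername)
}
```

Decoding once avoids a second round trip to the Azure API and keeps the username and password consistent with each other.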
10 changes: 9 additions & 1 deletion templates/client-role.yaml
@@ -9,7 +9,7 @@ metadata:
chart: {{ template "consul.chart" . }}
heritage: {{ .Release.Service }}
release: {{ .Release.Name }}
{{- if (or .Values.global.acls.manageSystemACLs .Values.global.enablePodSecurityPolicies) }}
{{- if (or .Values.global.acls.manageSystemACLs .Values.global.enablePodSecurityPolicies .Values.global.openshift.enabled) }}
rules:
{{- if .Values.global.enablePodSecurityPolicies }}
- apiGroups: ["policy"]
@@ -28,6 +28,14 @@ rules:
verbs:
- get
{{- end }}
{{- if .Values.global.openshift.enabled}}
- apiGroups: ["security.openshift.io"]
resources: ["securitycontextconstraints"]
resourceNames:
- {{ template "consul.fullname" . }}-client
verbs:
- use
{{- end}}
{{- else}}
rules: []
{{- end }}
54 changes: 54 additions & 0 deletions templates/client-securitycontextconstraints.yaml
@@ -0,0 +1,54 @@
{{- if (and .Values.global.openshift.enabled (or (and (ne (.Values.client.enabled | toString) "-") .Values.client.enabled) (and (eq (.Values.client.enabled | toString) "-") .Values.global.enabled))) }}
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
name: {{ template "consul.fullname" . }}-client
labels:
app: {{ template "consul.name" . }}
chart: {{ template "consul.chart" . }}
heritage: {{ .Release.Service }}
release: {{ .Release.Name }}
annotations:
kubernetes.io/description: {{ template "consul.fullname" . }}-client are the security context constraints required
to run the consul client.
{{- if .Values.client.dataDirectoryHostPath }}
allowHostDirVolumePlugin: true
{{- else }}
allowHostDirVolumePlugin: false
{{- end}}
allowHostIPC: false
allowHostNetwork: {{ .Values.client.hostNetwork }}
allowHostPID: false
allowHostPorts: true
allowPrivilegeEscalation: true
allowPrivilegedContainer: false
allowedCapabilities: null
defaultAddCapabilities: null
fsGroup:
type: MustRunAs
groups: []
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities:
- KILL
- MKNOD
- SETUID
- SETGID
runAsUser:
type: MustRunAsRange
seLinuxContext:
type: MustRunAs
supplementalGroups:
type: MustRunAs
users: []
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- projected
- secret
{{- if .Values.client.dataDirectoryHostPath }}
- hostPath
{{- end }}
{{- end}}
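The condition on the first line of the SecurityContextConstraints template is dense: the resource is rendered only when global.openshift.enabled is true and the client component is enabled, where a client.enabled value of "-" means "fall back to global.enabled". The Go sketch below restates that decision as plain functions. The function names are hypothetical, and the sketch simplifies Helm's truthiness rules by treating only a boolean `true` as enabled.

```go
package main

import "fmt"

// componentEnabled restates the chart's convention for a component's
// `enabled` value: the string "-" defers to the global setting, and
// anything else must be the boolean true. (Simplification: Helm would
// also treat other non-empty values as truthy.)
func componentEnabled(value interface{}, globalEnabled bool) bool {
	if fmt.Sprint(value) == "-" {
		return globalEnabled
	}
	enabled, ok := value.(bool)
	return ok && enabled
}

// sccRendered reports whether the client SecurityContextConstraints
// would be rendered for the given values.
func sccRendered(openshiftEnabled bool, clientEnabled interface{}, globalEnabled bool) bool {
	return openshiftEnabled && componentEnabled(clientEnabled, globalEnabled)
}

func main() {
	fmt.Println(sccRendered(true, "-", true))   // true: client defers to global.enabled
	fmt.Println(sccRendered(true, false, true)) // false: client explicitly disabled
	fmt.Println(sccRendered(false, true, true)) // false: OpenShift support is off
}
```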
4 changes: 2 additions & 2 deletions templates/create-federation-secret-job.yaml
@@ -129,9 +129,9 @@ spec:
-server-ca-key-file=/consul/tls/server/ca/tls.key
resources:
requests:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
limits:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
{{- end }}
4 changes: 2 additions & 2 deletions templates/ingress-gateways-deployment.yaml
@@ -264,10 +264,10 @@ spec:
{{- end }}
resources:
requests:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
limits:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
containers:
- name: ingress-gateway
4 changes: 2 additions & 2 deletions templates/server-acl-init-cleanup-job.yaml
@@ -54,10 +54,10 @@ spec:
- {{ template "consul.fullname" . }}-server-acl-init
resources:
requests:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
limits:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
{{- end }}
{{- end }}
4 changes: 2 additions & 2 deletions templates/server-acl-init-job.yaml
@@ -238,10 +238,10 @@ spec:
{{- end }}
resources:
requests:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
limits:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
{{- end }}
{{- end }}
3 changes: 2 additions & 1 deletion templates/server-statefulset.yaml
@@ -1,6 +1,7 @@
{{- if (or (and (ne (.Values.server.enabled | toString) "-") .Values.server.enabled) (and (eq (.Values.server.enabled | toString) "-") .Values.global.enabled)) }}
{{- if and .Values.global.federation.enabled (not .Values.global.tls.enabled) }}{{ fail "If global.federation.enabled is true, global.tls.enabled must be true because federation is only supported with TLS enabled" }}{{ end }}
{{- if and .Values.global.federation.enabled (not .Values.meshGateway.enabled) }}{{ fail "If global.federation.enabled is true, meshGateway.enabled must be true because mesh gateways are required for federation" }}{{ end }}
{{- if .Values.server.disableFsGroupSecurityContext }}{{ fail "server.disableFsGroupSecurityContext has been removed. Please use global.openshift.enabled instead." }}{{ end }}
# StatefulSet to run the actual Consul server cluster.
apiVersion: apps/v1
kind: StatefulSet
@@ -58,7 +59,7 @@ spec:
{{- end }}
terminationGracePeriodSeconds: 30
serviceAccountName: {{ template "consul.fullname" . }}-server
{{- if not .Values.server.disableFsGroupSecurityContext }}
{{- if not .Values.global.openshift.enabled}}
securityContext:
fsGroup: 1000
{{- end }}
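The fsGroup change in the StatefulSet can be tried out in isolation. Helm templates are built on Go's `text/template`, so the sketch below renders a cut-down version of the pod spec snippet with the standard library alone. The template text is abbreviated from the diff (the service account name is hard-coded here), and the `render` helper and nested map are stand-ins for Helm's real values handling.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// podSpecSnippet is an abbreviated copy of the StatefulSet lines that
// changed; the service account name is hard-coded for the sketch.
const podSpecSnippet = `serviceAccountName: consul-server
{{- if not .Values.global.openshift.enabled }}
securityContext:
  fsGroup: 1000
{{- end }}
`

// render executes the snippet with global.openshift.enabled set to the
// given value and returns the resulting YAML fragment.
func render(openshiftEnabled bool) (string, error) {
	tmpl, err := template.New("podspec").Parse(podSpecSnippet)
	if err != nil {
		return "", err
	}
	data := map[string]interface{}{
		"Values": map[string]interface{}{
			"global": map[string]interface{}{
				"openshift": map[string]interface{}{"enabled": openshiftEnabled},
			},
		},
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	for _, enabled := range []bool{false, true} {
		out, err := render(enabled)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("openshift.enabled=%v:\n%s\n", enabled, out)
	}
}
```

With OpenShift enabled the securityContext block is omitted, which leaves the choice of fsGroup to the cluster instead of pinning it to 1000.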
4 changes: 2 additions & 2 deletions templates/terminating-gateways-deployment.yaml
@@ -211,10 +211,10 @@ spec:
{{- end }}
resources:
requests:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
limits:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
containers:
- name: terminating-gateway
4 changes: 2 additions & 2 deletions templates/tls-init-cleanup-job.yaml
@@ -54,10 +54,10 @@ spec:
-H "Authorization: Bearer $( cat /var/run/secrets/kubernetes.io/serviceaccount/token )"
resources:
requests:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
limits:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
{{- end }}
{{- end }}
5 changes: 3 additions & 2 deletions templates/tls-init-job.yaml
@@ -56,6 +56,7 @@ spec:
# Note that in the subsequent runs of the job, POST requests will
# return a 409 because these secrets would already exist;
# we are ignoring these response codes.
workingDir: /tmp
command:
- "/bin/sh"
- "-ec"
@@ -116,10 +117,10 @@ spec:
{{- end }}
resources:
requests:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
limits:
memory: "25Mi"
memory: "50Mi"
cpu: "50m"
{{- end }}
{{- end }}
6 changes: 6 additions & 0 deletions test/acceptance/framework/config.go
@@ -23,6 +23,8 @@ type TestConfig struct {
EnterpriseLicenseSecretName string
EnterpriseLicenseSecretKey string

EnableOpenshift bool

ConsulImage string
ConsulK8SImage string

@@ -52,6 +54,10 @@ func (t *TestConfig) HelmValuesFromConfig() (map[string]string, error) {
setIfNotEmpty(helmValues, "server.enterpriseLicense.secretKey", t.EnterpriseLicenseSecretKey)
}

if t.EnableOpenshift {
setIfNotEmpty(helmValues, "global.openshift.enabled", "true")
}

setIfNotEmpty(helmValues, "global.image", t.ConsulImage)
setIfNotEmpty(helmValues, "global.imageK8S", t.ConsulK8SImage)
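The hunk above only shows the call sites of `setIfNotEmpty`, not its body. The sketch below assumes the helper does what its name suggests (skip empty values) and shows how EnableOpenshift would then surface as a Helm value. The lower-case `testConfig` type and `helmValues` method are hypothetical, trimmed-down stand-ins for the framework's TestConfig and HelmValuesFromConfig.

```go
package main

import "fmt"

// setIfNotEmpty is an assumed reimplementation: it stores the value
// only when it is non-empty, so unset options leave chart defaults alone.
func setIfNotEmpty(values map[string]string, key, value string) {
	if value != "" {
		values[key] = value
	}
}

// testConfig is a trimmed-down stand-in for the framework's TestConfig.
type testConfig struct {
	EnableOpenshift bool
	ConsulImage     string
}

// helmValues maps the config onto the --set style key/value pairs
// passed to every Helm install or upgrade.
func (c testConfig) helmValues() map[string]string {
	values := map[string]string{}
	if c.EnableOpenshift {
		setIfNotEmpty(values, "global.openshift.enabled", "true")
	}
	setIfNotEmpty(values, "global.image", c.ConsulImage)
	return values
}

func main() {
	cfg := testConfig{EnableOpenshift: true}
	fmt.Println(cfg.helmValues()) // map[global.openshift.enabled:true]
}
```

Because the value is only set when the flag is true, test runs on other Kubernetes distributions keep using the chart's default for global.openshift.enabled.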

9 changes: 9 additions & 0 deletions test/acceptance/framework/config_test.go
@@ -67,6 +67,15 @@ func TestConfig_HelmValuesFromConfig(t *testing.T) {
},
map[string]string{},
},
{
"sets openshift value when EnableOpenshift is set",
TestConfig{
EnableOpenshift: true,
},
map[string]string{
"global.openshift.enabled": "true",
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
7 changes: 7 additions & 0 deletions test/acceptance/framework/flags.go
@@ -20,6 +20,8 @@ type TestFlags struct {
flagEnterpriseLicenseSecretName string
flagEnterpriseLicenseSecretKey string

flagEnableOpenshift bool

flagConsulImage string
flagConsulK8sImage string

@@ -64,6 +66,9 @@ func (t *TestFlags) init() {
flag.StringVar(&t.flagEnterpriseLicenseSecretKey, "enterprise-license-secret-key", "",
"The key of the Kubernetes secret containing the enterprise license.")

flag.BoolVar(&t.flagEnableOpenshift, "enable-openshift", false,
"If true, the tests will automatically add Openshift Helm value for each Helm install.")

flag.BoolVar(&t.flagNoCleanupOnFailure, "no-cleanup-on-failure", false,
"If true, the tests will not cleanup Kubernetes resources they create when they finish running."+
"Note this flag must be run with -failfast flag, otherwise subsequent tests will fail.")
@@ -105,6 +110,8 @@ func (t *TestFlags) testConfigFromFlags() *TestConfig {
EnterpriseLicenseSecretName: t.flagEnterpriseLicenseSecretName,
EnterpriseLicenseSecretKey: t.flagEnterpriseLicenseSecretKey,

EnableOpenshift: t.flagEnableOpenshift,

ConsulImage: t.flagConsulImage,
ConsulK8SImage: t.flagConsulK8sImage,
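The new -enable-openshift option is an ordinary boolean flag from Go's standard `flag` package. The framework registers it on the global flag set; the sketch below uses a private `flag.FlagSet` instead so that the parsing can be exercised without touching process-wide state. The function name and the flag set name are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// parseEnableOpenshift parses a command line and reports whether
// -enable-openshift was given. The flag defaults to false, matching
// the default in the diff.
func parseEnableOpenshift(args []string) (bool, error) {
	fs := flag.NewFlagSet("acceptance", flag.ContinueOnError)
	enable := fs.Bool("enable-openshift", false,
		"If true, the tests will add the OpenShift Helm value to each Helm install.")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *enable, nil
}

func main() {
	enabled, err := parseEnableOpenshift([]string{"-enable-openshift"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("openshift enabled:", enabled)
}
```

A boolean flag needs no argument, which is why the CI job passes `-enable-openshift` on its own; `-enable-openshift=false` switches it off explicitly.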

6 changes: 3 additions & 3 deletions test/acceptance/helpers/helpers.go
@@ -38,8 +38,8 @@ func WaitForAllPodsToBeReady(t *testing.T, client kubernetes.Interface, namespac

t.Log("Waiting for pods to be ready.")

// Wait up to 3m.
counter := &retry.Counter{Count: 36, Wait: 5 * time.Second}
// Wait up to 5m.
counter := &retry.Counter{Count: 60, Wait: 5 * time.Second}
retry.RunWith(counter, t, func(r *retry.R) {
pods, err := client.CoreV1().Pods(namespace).List(context.Background(), metav1.ListOptions{LabelSelector: podLabelSelector})
require.NoError(r, err)
@@ -105,7 +105,7 @@ func DeployKustomize(t *testing.T, options *k8s.KubectlOptions, noCleanupOnFailu
KubectlDeleteK(t, options, kustomizeDir)
})

RunKubectl(t, options, "wait", "--for=condition=available", fmt.Sprintf("deploy/%s", deployment.Name))
RunKubectl(t, options, "wait", "--for=condition=available", "--timeout=1m", fmt.Sprintf("deploy/%s", deployment.Name))
}

// CheckStaticServerConnection execs into a pod of the deployment given by deploymentName