Sample app to demo a Red Hat helm-controller bug on OpenShift.
Bug report - operator-framework/operator-sdk#6494
When an Operand upgrade fails and the bundled helm chart has one or more PersistentVolumeClaim and ServiceAccount resources, the secrets generated by the openshift-controller-manager for each ServiceAccount keep increasing in number until a successful Operand update is made. This was observed on OpenShift 4.12 with operator-sdk version v1.28.0-ocp.
- The secrets that increase over time are not defined by the helm chart. OpenShift automatically generates 2 secrets for every service account: the API token secret and the dockercfg secret with credentials to access the Red Hat container registry. The openshift-controller-manager creates the secrets and updates the spec.imagePullSecrets and spec.secrets fields of the service account with the dockercfg secret reference.
- Any change made to the Operand is applied as a helm upgrade by the Red Hat helm-controller using the helm chart. When the helm upgrade fails, the helm-controller attempts a helm rollback, which clears the spec.imagePullSecrets and spec.secrets fields of the service account since they are not part of the manifests rendered by the helm chart. Clearing these fields causes the openshift-controller-manager to generate a new set of secrets.
- The rollback initiated by the helm-controller fails with errors for the PersistentVolumeClaim, since the manifests used for the rollback do not have the auto-populated fields such as spec.volumeName and spec.storageClassName. These fields cannot be cleared after creation.
- The helm-controller clears any helm revision that does not have a successful deployment status, to ensure that a failed helm upgrade is retried. This results in an upgrade and a rollback on every reconciliation attempt until the upgrade succeeds (when the user corrects the Operand yaml). Each upgrade/rollback attempt results in a fresh pair of token and dockercfg secrets being created for every service account.
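The growth described in the list above can be sketched as a toy model. This is not helm-controller or openshift-controller-manager code; the function name and the assumption that previously generated secrets are never cleaned up between attempts are illustrative only.

```python
from __future__ import annotations

# Per the description above: one token secret and one dockercfg secret
# are generated for every service account.
SECRETS_PER_SERVICE_ACCOUNT = 2


def simulate_failed_reconciles(service_accounts: int, failed_attempts: int) -> list[int]:
    """Return the total number of generated secrets after the initial
    install and after each failed upgrade/rollback attempt.

    Assumption (hypothetical model): older secrets are left in place,
    so the total only grows.
    """
    total = service_accounts * SECRETS_PER_SERVICE_ACCOUNT  # initial install
    history = [total]
    for _ in range(failed_attempts):
        # Upgrade fails, rollback clears the service account's secret
        # references, and a fresh pair is generated per service account.
        total += service_accounts * SECRETS_PER_SERVICE_ACCOUNT
        history.append(total)
    return history
```

Under this model, one service account and three failed attempts give totals of 2, 4, 6 and 8 secrets, which matches the linear growth described above until the Operand is corrected.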
Prerequisite: Access to an OpenShift cluster.
- Create the Sample Catalog using the catalogsource.yaml file.
  $ kubectl apply -f catalogsource.yaml
- Install the Sample Operator from the Operator Hub using the OpenShift cluster console. The Operator installation defaults to the sample namespace.
- Create the Sample Operand using the Operator and ensure the deployment is successful.
- Watch the secrets in the sample namespace:
  $ watch kubectl get secrets -n sample
  There should be only 2 sample-sa-* secrets.
- Watch the helm revision history:
  $ watch helm history sample -n sample
  There should be only 1 revision listed, which has deployed status and Install completed as description.
- Update the Operand from the YAML view. Reduce the pvc.size from the default 2Gi to 1Gi. This upgrade will fail since the persistent volume size cannot be reduced.
Logs for a test run are recorded in the debug folder.
In the watch windows you will observe that the number of sample-sa-* secrets increases rapidly at first and then more slowly as the reconciliation backoff takes effect. The revision history switches between listing 1 revision and 3 revisions. The second revision is an upgrade failure and the third revision is a rollback failure. Revision 1 always has the description Install completed.
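To track the growth as numbers rather than by eye, the output of `kubectl get secrets -n sample -o json` can be summarised per secret type. The sketch below is an illustration only: the function name is hypothetical, and it assumes the generated secrets keep the sample-sa- name prefix mentioned above.

```python
import json
from collections import Counter


def count_sa_secrets(secret_list_json: str, prefix: str = "sample-sa-") -> Counter:
    """Count secrets whose name starts with `prefix`, grouped by secret type.

    `secret_list_json` is expected to be the JSON text printed by
    `kubectl get secrets -n sample -o json` (a List object with `items`).
    """
    items = json.loads(secret_list_json).get("items", [])
    counts = Counter()
    for item in items:
        name = item.get("metadata", {}).get("name", "")
        if name.startswith(prefix):
            counts[item.get("type", "unknown")] += 1
    return counts
```

Running this periodically against the saved JSON output should show the per-type counts climbing while the upgrade keeps failing.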
Error from Revision 2 for the upgrade failure (pulled from the sh.helm.release.v1.sample.v2 secret):
Upgrade "sample" failed: cannot patch "sample-pvc" with kind PersistentVolumeClaim: PersistentVolumeClaim "sample-pvc" is invalid: spec.resources.requests.storage: Forbidden: field can not be less than previous value
Error from Revision 3 for the rollback failure (pulled from the sh.helm.release.v1.sample.v3 secret):
Rollback "sample" failed: failed to replace object: PersistentVolumeClaim "sample-pvc" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
  core.PersistentVolumeClaimSpec{
      AccessModes: {"ReadWriteOnce"},
      Selector: nil,
      Resources: {Requests: {s"storage": {i: {...}, s: "2Gi", Format: "BinarySI"}}},
-     VolumeName: "pvc-a5cf5a2c-ae5b-4e4a-bbe8-a25b75a4b9be",
+     VolumeName: "",
-     StorageClassName: &"gp3-csi",
+     StorageClassName: nil,
      VolumeMode: &"Filesystem",
      DataSource: nil,
      DataSourceRef: nil,
  }
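The error messages above were read from the helm release secrets. As a minimal sketch of how such a secret can be decoded: Helm 3 stores the release in the secret's release key as base64-encoded, gzip-compressed JSON, and Kubernetes base64-encodes the secret data once more. The function names below are hypothetical, and the input is assumed to be the value printed by `kubectl get secret sh.helm.release.v1.sample.v2 -n sample -o jsonpath='{.data.release}'`.

```python
from __future__ import annotations

import base64
import gzip
import json


def decode_helm_release(secret_data_release: str) -> dict:
    """Decode the `.data.release` value of a Helm 3 release secret."""
    helm_payload = base64.b64decode(secret_data_release)  # Kubernetes layer
    compressed = base64.b64decode(helm_payload)           # Helm's own base64 layer
    return json.loads(gzip.decompress(compressed))        # gzip, then JSON


def release_status(release: dict) -> tuple[str, str]:
    """Return (status, description) from a decoded release."""
    info = release.get("info", {})
    return info.get("status", ""), info.get("description", "")
```

For a failed revision, the description returned by release_status is where an error text like the ones quoted above would be expected to appear.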