
K8SSAND-954 ⁃ Unable to create CassandraDatacenter if Setup containers.securityContext.readOnlyRootFilesystem: true #196

Closed
Tracked by #199
zhimsun opened this issue Oct 8, 2021 · 5 comments · Fixed by #661
Labels: bug, done

Comments


zhimsun commented Oct 8, 2021

What happened?
I tried to create a CassandraDatacenter with containers.securityContext.readOnlyRootFilesystem: true, but the pod stays in CrashLoopBackOff.

The pods run normally if I change containers.securityContext.readOnlyRootFilesystem to false.

The YAML:

# Sized to work on 3 k8s workers nodes with 1 core / 4 GB RAM
# See neighboring example-cassdc-full.yaml for docs for each parameter
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc21
spec:
  nodeAffinityLabels:
    beta.kubernetes.io/arch: amd64
  clusterName: cluster2
  serverType: dse
  serverVersion: "6.8.14"
  systemLoggerImage: 
  serverImage: 
  configBuilderImage: 
  managementApiAuth:
    insecure: {}
  size: 1
  resources:
    requests:
      cpu: 1
      memory: 4Gi
    limits:
      cpu: 1
      memory: 4Gi
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: nfs-client
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  dockerImageRunsAsCassandra: false
  podTemplateSpec:
    spec:
      initContainers:
      - name: server-config-init
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
      containers:
      - name: "cassandra"
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
      hostIPC: false
      hostNetwork: false
      hostPID: false
      securityContext:
        runAsNonRoot: true
  config:
    jvm-server-options:
      initial_heap_size: "800M"
      max_heap_size: "800M"
      additional-jvm-opts:
        # As the database comes up for the first time, set system keyspaces to RF=3
        - "-Ddse.system_distributed_replication_dc_names=dc21"
        - "-Ddse.system_distributed_replication_per_dc=3"

The pod status:

MacBook-Pro-3:db zhiminsun$ oc get pod 
NAME                                                 READY   STATUS             RESTARTS   AGE
cluster2-dc21-default-sts-0                          1/2     CrashLoopBackOff   213        17h

The error from the pod events:

Events:
  Warning  BackOff         62s (x7 over 103s)  kubelet, worker2.zhim.cp.fyre.ibm.com  Back-off restarting failed container

Did you expect to see something different?
I expect the pod to run normally with containers.securityContext.readOnlyRootFilesystem: true set.

┆Issue is synchronized with this Jira Task by Unito
┆Reviewer: Michael Burman
┆friendlyId: K8SSAND-954
┆priority: Medium

zhimsun added the bug label Oct 8, 2021

jsanda commented Oct 8, 2021

Hi @zhimsun

What version of cass-operator are you using?

The pods are running normally if I change the containers.securityContext.readOnlyRootFilesystem: false

Did you make this change for all containers? If not, which one(s)?

I am trying to test and reproduce this with CodeReady Containers, but cass-operator is crashing. It looks like it is happening during initialization. I'll try some more.


jsanda commented Oct 8, 2021

I tested against my local kind cluster with a slightly modified manifest. Here is mine:

# Sized to work on 3 k8s workers nodes with 1 core / 4 GB RAM
# See neighboring example-cassdc-full.yaml for docs for each parameter
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc21
spec:
#  nodeAffinityLabels:
#    beta.kubernetes.io/arch: amd64
  clusterName: cluster2
  serverType: dse
  serverVersion: "6.8.14"
  systemLoggerImage:
  serverImage:
  configBuilderImage:
  managementApiAuth:
    insecure: {}
  size: 1
#  resources:
#    requests:
#      cpu: 1
#      memory: 4Gi
#    limits:
#      cpu: 1
#      memory: 4Gi
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  dockerImageRunsAsCassandra: false
  podTemplateSpec:
    spec:
      initContainers:
        - name: server-config-init
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
      containers:
        - name: "cassandra"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
      hostIPC: false
      hostNetwork: false
      hostPID: false
      securityContext:
        runAsNonRoot: true
        runAsUser: 65533
        runAsGroup: 65533
        fsGroup: 65533
  config:
    jvm-server-options:
      initial_heap_size: "800M"
      max_heap_size: "800M"
      additional-jvm-opts:
        # As the database comes up for the first time, set system keyspaces to RF=3
        - "-Ddse.system_distributed_replication_dc_names=dc21"
        - "-Ddse.system_distributed_replication_per_dc=3"

I had to update the pod-level securityContext. Without setting the user and group, the pod was failing to initialize with this error:

    state:
      waiting:
        message: 'container has runAsNonRoot and image has non-numeric user (cassandra),
          cannot verify user is non-root (pod: "cluster2-dc21-default-sts-0_cass-operator(7a5fc807-2b54-4751-9c08-497470fa0ef1)",
          container: server-config-init)'
        reason: CreateContainerConfigError
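For anyone hitting the same CreateContainerConfigError: the kubelet cannot verify runAsNonRoot when the image only names a non-numeric user, so the change relative to the manifest in the issue description is the numeric user/group on the pod-level securityContext, as in the manifest above:

      securityContext:
        runAsNonRoot: true
        runAsUser: 65533
        runAsGroup: 65533
        fsGroup: 65533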

I deleted my CassandraDatacenter, changed the securityContext as above, and now I do end up with a CrashLoopBackOff due to the cassandra container. Here is the error in the logs:

ln: failed to create symbolic link '/opt/dse/resources/spark/conf/hive-site.xml': Read-only file system
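The entrypoint is trying to create that symlink under /opt/dse, which sits on the container's root filesystem, so readOnlyRootFilesystem blocks the write. A possible workaround, untested and assuming the Spark conf directory is the path that needs to be writable, would be to overlay it with an emptyDir volume via podTemplateSpec (these are standard PodSpec fields, so the operator should pass them through):

  podTemplateSpec:
    spec:
      volumes:
        # volume name is arbitrary; emptyDir makes the path writable
        - name: spark-conf
          emptyDir: {}
      containers:
        - name: "cassandra"
          volumeMounts:
            - name: spark-conf
              mountPath: /opt/dse/resources/spark/conf

There may well be other paths the DSE image writes to at startup, so this would need to be repeated for each one that shows up in the logs.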

I need to pull in someone who is more familiar with DSE for some help.

cc @bradfordcp


zhimsun commented Oct 8, 2021

@jsanda my cass-operator version is v1.7.1, and I only have the one container, cassandra.

For the initContainers, I can set readOnlyRootFilesystem: true:

  podTemplateSpec:
    spec:
      initContainers:
      - name: server-config-init
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true

but for the containers I cannot set readOnlyRootFilesystem: true:

      containers:
      - name: "cassandra"
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true


jsanda commented Oct 8, 2021

@zhimsun can you share the logs from the cassandra container?
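Even while it is crash looping, you should be able to pull the output from the last failed start with something like (pod name and namespace adjusted to yours):

kubectl logs cluster2-dc21-default-sts-0 -c cassandra --previous

oc logs should accept the same flags.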


zhimsun commented Oct 8, 2021

@jsanda The cassandra container was never created:

oc exec -it cluster2-dc21-default-sts-0 -n zen bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulting container name to cassandra.
Use 'oc describe pod/cluster2-dc21-default-sts-0 -n zen' to see all of the containers in this pod.
error: unable to upgrade connection: container not found ("cassandra")

You can reproduce this on your cluster by filling in the systemLoggerImage, serverImage, and configBuilderImage values:

# Sized to work on 3 k8s workers nodes with 1 core / 4 GB RAM
# See neighboring example-cassdc-full.yaml for docs for each parameter
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc21
spec:
  nodeAffinityLabels:
    beta.kubernetes.io/arch: amd64
  clusterName: cluster2
  serverType: dse
  serverVersion: "6.8.14"
  systemLoggerImage: <image>
  serverImage:  <image>
  configBuilderImage:  <image>
  managementApiAuth:
    insecure: {}
  size: 1
  resources:
    requests:
      cpu: 1
      memory: 4Gi
    limits:
      cpu: 1
      memory: 4Gi
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: nfs-client
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
  dockerImageRunsAsCassandra: false
  podTemplateSpec:
    spec:
      initContainers:
      - name: server-config-init
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
      containers:
      - name: "cassandra"
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
      hostIPC: false
      hostNetwork: false
      hostPID: false
      securityContext:
        runAsNonRoot: true
  config:
    jvm-server-options:
      initial_heap_size: "800M"
      max_heap_size: "800M"
      additional-jvm-opts:
        # As the database comes up for the first time, set system keyspaces to RF=3
        - "-Ddse.system_distributed_replication_dc_names=dc21"
        - "-Ddse.system_distributed_replication_per_dc=3"

jsanda self-assigned this Oct 9, 2021
sync-by-unito bot changed the title to "K8SSAND-954 ⁃ Unable to create CassandraDatacenter if Setup containers.securityContext.readOnlyRootFilesystem: true" Oct 19, 2021
jsanda self-assigned this Oct 21, 2021
adejanovski moved this to To Groom in K8ssandra Nov 8, 2022
burmanm moved this to Ready For Review in K8ssandra Jul 16, 2024
adejanovski added the ready-for-review label Jul 16, 2024
github-project-automation bot moved this from Ready For Review to Done in K8ssandra Jul 19, 2024
adejanovski added the done label and removed the ready-for-review label Jul 19, 2024