panic: runtime error: invalid memory address or nil pointer dereference #1624

Closed
artemkurylev opened this issue Jun 6, 2023 · 25 comments · Fixed by #1625 or #1634
Labels: bug (Something isn't working), community, fixed

Comments

@artemkurylev

Expected Behavior

MinIO Operator runs smoothly even after a reboot of the cluster.

Current Behavior

After a forced reboot, minio-operator fails to work, resulting in panic: runtime error: invalid memory address or nil pointer dereference.

Possible Solution

Steps to Reproduce (for bugs)

I0606 14:08:10.684364 1 controller.go:70] Starting MinIO Operator
I0606 14:08:10.822737 1 main-controller.go:272] Setting up event handlers
I0606 14:08:11.059053 1 main-controller.go:481] Using Kubernetes CSR Version: v1beta1
I0606 14:08:11.059069 1 main-controller.go:501] STS Api server is not enabled, not starting
I0606 14:08:11.059091 1 leaderelection.go:248] attempting to acquire leader lease minio-operator/minio-operator-lock...
I0606 14:08:11.102238 1 main-controller.go:548] minio-operator-6c68464fc4-kxdgs: is the leader, removing any leader labels that I 'minio-operator-6c68464fc4-m8qdf' might have
I0606 14:09:13.760393 1 leaderelection.go:258] successfully acquired lease minio-operator/minio-operator-lock
I0606 14:09:13.760453 1 main-controller.go:530] minio-operator-6c68464fc4-m8qdf: I am the leader, applying leader labels on myself
I0606 14:09:13.760472 1 main-controller.go:385] Waiting for Upgrade Server to start
I0606 14:09:13.760481 1 main-controller.go:389] Starting Tenant controller
I0606 14:09:13.760485 1 main-controller.go:392] Waiting for informer caches to sync
I0606 14:09:13.760493 1 main-controller.go:397] Starting workers
I0606 14:09:13.760501 1 main-controller.go:425] Console TLS is not enabled
I0606 14:09:13.760543 1 main-controller.go:347] Starting HTTP Upgrade Tenant Image server
I0606 14:09:13.895457 1 decomission.go:66] minio-tenant-1/s3-local Detected we are removing a pool
E0606 14:09:13.895570 1 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 320 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x22a51e0?, 0x53b1f60})
k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:75 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000a12070?})
k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:49 +0x75
panic({0x22a51e0, 0x53b1f60})
runtime/panic.go:884 +0x212
github.com/minio/operator/pkg/apis/minio.min.io/v2.(*Tenant).EnsureDefaults(0xc0002b52c0?)
github.com/minio/operator/pkg/apis/minio.min.io/v2/helper.go:299 +0x29
github.com/minio/operator/pkg/controller.(*Controller).syncHandler(0xc0002b52c0, {0xc000c98030, 0x17})
github.com/minio/operator/pkg/controller/main-controller.go:742 +0x516
github.com/minio/operator/pkg/controller.(*Controller).processNextWorkItem.func1({0x21885c0?, 0xc000a12070})
github.com/minio/operator/pkg/controller/main-controller.go:653 +0x24f
github.com/minio/operator/pkg/controller.(*Controller).processNextWorkItem(0xc0002b52c0)
github.com/minio/operator/pkg/controller/main-controller.go:665 +0x62
github.com/minio/operator/pkg/controller.(*Controller).runWorker(0xc00053e6a0?)
github.com/minio/operator/pkg/controller/main-controller.go:605 +0x47
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x10000000000?)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:157 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x101000000?, {0x40ccc60, 0xc000b88300}, 0x1, 0xc000114720)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:158 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00053e758?, 0x3b9aca00, 0x0, 0x40?, 0x0?)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:135 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(0xc000c880c0?, 0x0?, 0xc00053e778?)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:92 +0x25
created by github.com/minio/operator/pkg/controller.leaderRun
github.com/minio/operator/pkg/controller/main-controller.go:400 +0x287
E0606 14:09:13.895655 1 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 320 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x22a51e0?, 0x53b1f60})
k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:75 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000d86a00?})
k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:49 +0x75
panic({0x22a51e0, 0x53b1f60})
runtime/panic.go:884 +0x212
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000a12070?})
k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:56 +0xd7
panic({0x22a51e0, 0x53b1f60})
runtime/panic.go:884 +0x212
github.com/minio/operator/pkg/apis/minio.min.io/v2.(*Tenant).EnsureDefaults(0xc0002b52c0?)
github.com/minio/operator/pkg/apis/minio.min.io/v2/helper.go:299 +0x29
github.com/minio/operator/pkg/controller.(*Controller).syncHandler(0xc0002b52c0, {0xc000c98030, 0x17})
github.com/minio/operator/pkg/controller/main-controller.go:742 +0x516
github.com/minio/operator/pkg/controller.(*Controller).processNextWorkItem.func1({0x21885c0?, 0xc000a12070})
github.com/minio/operator/pkg/controller/main-controller.go:653 +0x24f
github.com/minio/operator/pkg/controller.(*Controller).processNextWorkItem(0xc0002b52c0)
github.com/minio/operator/pkg/controller/main-controller.go:665 +0x62
github.com/minio/operator/pkg/controller.(*Controller).runWorker(0xc00053e6a0?)
github.com/minio/operator/pkg/controller/main-controller.go:605 +0x47
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x10000000000?)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:157 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x101000000?, {0x40ccc60, 0xc000b88300}, 0x1, 0xc000114720)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:158 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00053e758?, 0x3b9aca00, 0x0, 0x40?, 0x0?)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:135 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(0xc000c880c0?, 0x0?, 0xc00053e778?)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:92 +0x25
created by github.com/minio/operator/pkg/controller.leaderRun
github.com/minio/operator/pkg/controller/main-controller.go:400 +0x287
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x158 pc=0xdf6e29]

goroutine 320 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000d86a00?})
k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:56 +0xd7
panic({0x22a51e0, 0x53b1f60})
runtime/panic.go:884 +0x212
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc000a12070?})
k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:56 +0xd7
panic({0x22a51e0, 0x53b1f60})
runtime/panic.go:884 +0x212
github.com/minio/operator/pkg/apis/minio.min.io/v2.(*Tenant).EnsureDefaults(0xc0002b52c0?)
github.com/minio/operator/pkg/apis/minio.min.io/v2/helper.go:299 +0x29
github.com/minio/operator/pkg/controller.(*Controller).syncHandler(0xc0002b52c0, {0xc000c98030, 0x17})
github.com/minio/operator/pkg/controller/main-controller.go:742 +0x516
github.com/minio/operator/pkg/controller.(*Controller).processNextWorkItem.func1({0x21885c0?, 0xc000a12070})
github.com/minio/operator/pkg/controller/main-controller.go:653 +0x24f
github.com/minio/operator/pkg/controller.(*Controller).processNextWorkItem(0xc0002b52c0)
github.com/minio/operator/pkg/controller/main-controller.go:665 +0x62
github.com/minio/operator/pkg/controller.(*Controller).runWorker(0xc00053e6a0?)
github.com/minio/operator/pkg/controller/main-controller.go:605 +0x47
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x10000000000?)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:157 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x101000000?, {0x40ccc60, 0xc000b88300}, 0x1, 0xc000114720)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:158 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00053e758?, 0x3b9aca00, 0x0, 0x40?, 0x0?)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:135 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(0xc000c880c0?, 0x0?, 0xc00053e778?)
k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:92 +0x25
created by github.com/minio/operator/pkg/controller.leaderRun
github.com/minio/operator/pkg/controller/main-controller.go:400 +0x287

Context

The MinIO tenant cannot operate without the operator (it reports its own error, but it seems to fail because the operator has failed).
And yes, there is an upgrade command in the logs, but that was just an attempt to fix this. Without upgrading, the result was the same.

Your Environment

  • Version used (minio-operator): 4.5.8, 5.0.5 (both led to this error)
  • Environment name and version (e.g. kubernetes v1.17.2):
  • Server type and version:
  • Operating System and version (uname -a):
  • Link to your deployment file:
@jiuker
Contributor

jiuker commented Jun 6, 2023

@artemkurylev Could you post the tenant? It will help me. Thanks

@artemkurylev
Author

Here are the logs from the tenant pod:
[screenshot: tenant pod logs]
And this is the result of kubectl describe:
[screenshot: kubectl describe output]

@jiuker
Contributor

jiuker commented Jun 6, 2023

@artemkurylev I mean kubectl get tenant -A -oyaml

@artemkurylev
Author

apiVersion: v1
items:
- apiVersion: minio.min.io/v2
  kind: Tenant
  metadata:
    creationTimestamp: "2022-06-15T08:13:37Z"
    generation: 75
    name: s3-local
    namespace: minio-tenant-1
    resourceVersion: "144093988"
    uid: a19e6bf0-b1e3-407a-a4cb-115120b1f48d
  scheduler:
    name: ""
  spec:
    certConfig:
      commonName: '*.s3-local-hl.minio-tenant-1.svc.cluster.local'
      dnsNames:
      - s3-local-pool-0-0.s3-local-hl.minio-tenant-1.svc.cluster.local
      organizationName:
      - system:nodes
    configuration:
      name: s3-local-env-configuration
    credsSecret:
      name: s3-local-secret
    exposeServices:
      console: true
      minio: true
    features:
      domains:
    image: minio/minio:RELEASE.2023-05-27T05-56-19Z
    imagePullPolicy: IfNotPresent
    imagePullSecret: {}
    mountPath: /export
    podManagementPolicy: Parallel
    pools:
    - affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: role
                operator: In
                values:
                - service
      name: pool-1
      resources: {}
      securityContext:
        fsGroup: 1000
        fsGroupChangePolicy: Always
        runAsGroup: 1000
        runAsNonRoot: false
        runAsUser: 1000
      servers: 1
      volumeClaimTemplate:
        metadata:
          name: data
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: "536870912e3"
          storageClassName: direct-csi-min-io
        status: {}
      volumesPerServer: 8
    requestAutoCert: false
    users:
    - name: s3-local-user-0
    - name: s3-local-user-1
  status:
    availableReplicas: 2
    certificates:
      autoCertEnabled: false
    currentState: Initialized
    drivesOnline: 8
    healthStatus: green
    pools:
    - legacySecurityContext: false
      ssName: s3-local-pool-1
      state: PoolInitialized
    - legacySecurityContext: false
      ssName: s3-local-pool-1
      state: PoolInitialized
    provisionedUsers: true
    revision: 0
    syncVersion: v5.0.0
    usage:
      capacity: 4294967296000
      rawCapacity: 4294967296000
      rawUsage: 762208948224
      usage: 748983581290
    writeQuorum: 8
kind: List
metadata:
  resourceVersion: ""
@jiuker
Contributor

jiuker commented Jun 6, 2023

@artemkurylev Did you want to remove a pool, add a new pool, or update a pool when the error occurred?

@artemkurylev
Author

@artemkurylev Did you want to remove a pool, add a new pool, or update a pool when the error occurred?

Neither. It occurred just after the cluster restart; I tried to update it hoping that it could solve the problem.

@jiuker
Contributor

jiuker commented Jun 6, 2023

@artemkurylev How did you restart the cluster?

@artemkurylev
Author

@artemkurylev How did you restart the cluster?

It happened due to electricity problems. Do you mean that restarting it manually could help solve the problem?

@jiuker
Contributor

jiuker commented Jun 6, 2023

@artemkurylev You can delete the related MinIO instance StatefulSet and restart minio-operator. The PVCs will not be lost.
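
For reference, a rough sketch of those commands, assuming the StatefulSet name from .status.pools[].ssName above and the usual operator deployment name (both are assumptions, so verify with kubectl get first):

kubectl get statefulsets -n minio-tenant-1
kubectl delete statefulset s3-local-pool-1 -n minio-tenant-1
kubectl rollout restart deployment minio-operator -n minio-operator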

@jiuker
Contributor

jiuker commented Jun 6, 2023

@artemkurylev If it exists, run kubectl get tenant s3-local -n minio-tenant-1 -oyaml and post the full YAML. I will check it soon. Use

k: v

formatting. Thank you.

@artemkurylev
Author

kind: Tenant
metadata:
  creationTimestamp: "2022-06-15T08:13:37Z"
  generation: 75
  name: s3-local
  namespace: minio-tenant-1
  resourceVersion: "144093988"
  uid: a19e6bf0-b1e3-407a-a4cb-115120b1f48d
scheduler:
  name: ""
spec:
  certConfig:
    commonName: '*.s3-local-hl.minio-tenant-1.svc.cluster.local'
    dnsNames:
    - s3-local-pool-0-0.s3-local-hl.minio-tenant-1.svc.cluster.local
    organizationName:
    - system:nodes
  configuration:
    name: s3-local-env-configuration
  credsSecret:
    name: s3-local-secret
  exposeServices:
    console: true
    minio: true
  features:
    domains: <research-pods>
  image: minio/minio:RELEASE.2023-05-27T05-56-19Z
  imagePullPolicy: IfNotPresent
  imagePullSecret: {}
  mountPath: /export
  podManagementPolicy: Parallel
  pools:
  - affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role
              operator: In
              values:
              - service
    name: pool-1
    resources: {}
    securityContext:
      fsGroup: 1000
      fsGroupChangePolicy: Always
      runAsGroup: 1000
      runAsNonRoot: false
      runAsUser: 1000
    servers: 1
    volumeClaimTemplate:
      metadata:
        name: data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "536870912e3"
        storageClassName: direct-csi-min-io
      status: {}
    volumesPerServer: 8
  requestAutoCert: false
  users:
  - name: s3-local-user-0
  - name: s3-local-user-1
status:
  availableReplicas: 2
  certificates:
    autoCertEnabled: false
  currentState: Initialized
  drivesOnline: 8
  healthStatus: green
  pools:
  - legacySecurityContext: false
    ssName: s3-local-pool-1
    state: PoolInitialized
  - legacySecurityContext: false
    ssName: s3-local-pool-1
    state: PoolInitialized
  provisionedUsers: true
  revision: 0
  syncVersion: v5.0.0
  usage:
    capacity: 4294967296000
    rawCapacity: 4294967296000
    rawUsage: 762208948224
    usage: 748983581290
  writeQuorum: 8

@jiuker
Contributor

jiuker commented Jun 6, 2023

@artemkurylev You can edit this CR with kubectl edit tenant s3-local -n minio-tenant-1. You can see that .status.pools has two identical items. Please delete one and you should recover. Thanks.
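
After removing the duplicate, the status should contain a single entry for the pool, roughly like this (taken from the YAML posted above):

status:
  pools:
  - legacySecurityContext: false
    ssName: s3-local-pool-1
    state: PoolInitialized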

@harshavardhana
Member

I am surprised that we added a duplicate entry in the first place.
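
To illustrate why a duplicated status entry can crash the defaulting path, here is a hypothetical, self-contained Go sketch; it is not the operator's actual EnsureDefaults code, only an illustration of the failure mode when a routine assumes spec.pools and status.pools stay in sync:

// Hypothetical illustration only; NOT the MinIO Operator's code.
// It shows how a routine that pairs spec pools with status pools by index
// can hit a nil pointer dereference once status.pools drifts out of sync
// with spec.pools (e.g. the duplicated status entry seen in this issue).
package main

import "fmt"

type PoolSpec struct{ Name string }

type PoolStatus struct{ SSName string }

type Tenant struct {
    SpecPools   []PoolSpec
    StatusPools []PoolStatus
}

// ensureDefaults is a stand-in for a defaults routine that assumes a 1:1,
// same-order mapping between spec pools and status pools.
func ensureDefaults(t *Tenant) {
    for i := range t.StatusPools {
        var spec *PoolSpec
        if i < len(t.SpecPools) {
            spec = &t.SpecPools[i]
        }
        // With the duplicated status entry there is no matching spec pool,
        // so spec stays nil and this dereference panics with
        // "invalid memory address or nil pointer dereference".
        fmt.Println("defaulting pool", spec.Name, "->", t.StatusPools[i].SSName)
    }
}

func main() {
    t := &Tenant{
        SpecPools: []PoolSpec{{Name: "pool-1"}},
        StatusPools: []PoolStatus{
            {SSName: "s3-local-pool-1"},
            {SSName: "s3-local-pool-1"}, // duplicated entry, as in this Tenant's status
        },
    }
    ensureDefaults(t)
}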

harshavardhana added the bug (Something isn't working) label and removed the triage label on Jun 6, 2023
@jiuker
Contributor

jiuker commented Jun 7, 2023

@artemkurylev Did it work when you deleted one item?

@artemkurylev
Author

@jiuker Unfortunately no

@jiuker
Contributor

jiuker commented Jun 7, 2023

@jiuker Unfortunately no

@artemkurylev What does the CR look like now? kubectl get tenant s3-local -n minio-tenant-1 -oyaml

@artemkurylev
Author

It seems that the changes (deletion of one of the pools in status.pools) that I applied with kubectl edit were actually not applied, even though the console showed that the tenant was edited.
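
A likely explanation, assuming the Tenant CRD has the status subresource enabled (which the /status endpoint used further below suggests): kubectl edit writes to the main resource endpoint, where changes to .status are silently ignored, so the edit appears to succeed without changing the status. One way to confirm the duplicate is still present:

kubectl get tenant s3-local -n minio-tenant-1 -o jsonpath='{.status.pools[*].ssName}'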

@jiuker
Contributor

jiuker commented Jun 7, 2023

@artemkurylev Post the log, please.

@artemkurylev
Author

kind: Tenant
metadata:
  creationTimestamp: "2022-06-15T08:13:37Z"
  generation: 75
  name: s3-local
  namespace: minio-tenant-1
  resourceVersion: "144093988"
  uid: a19e6bf0-b1e3-407a-a4cb-115120b1f48d
scheduler:
  name: ""
spec:
  certConfig:
    commonName: '*.s3-local-hl.minio-tenant-1.svc.cluster.local'
    dnsNames:
    - s3-local-pool-0-0.s3-local-hl.minio-tenant-1.svc.cluster.local
    organizationName:
    - system:nodes
  configuration:
    name: s3-local-env-configuration
  credsSecret:
    name: s3-local-secret
  exposeServices:
    console: true
    minio: true
  features:
    domains: <>
  image: minio/minio:RELEASE.2023-05-27T05-56-19Z
  imagePullPolicy: IfNotPresent
  imagePullSecret: {}
  mountPath: /export
  podManagementPolicy: Parallel
  pools:
  - affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role
              operator: In
              values:
              - service
    name: pool-1
    resources: {}
    securityContext:
      fsGroup: 1000
      fsGroupChangePolicy: Always
      runAsGroup: 1000
      runAsNonRoot: false
      runAsUser: 1000
    servers: 1
    volumeClaimTemplate:
      metadata:
        name: data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "536870912e3"
        storageClassName: direct-csi-min-io
      status: {}
    volumesPerServer: 8
  requestAutoCert: false
  users:
  - name: s3-local-user-0
  - name: s3-local-user-1
status:
  availableReplicas: 2
  certificates:
    autoCertEnabled: false
  currentState: Initialized
  drivesOnline: 8
  healthStatus: green
  pools:
  - legacySecurityContext: false
    ssName: s3-local-pool-1
    state: PoolInitialized
  - legacySecurityContext: false
    ssName: s3-local-pool-1
    state: PoolInitialized
  provisionedUsers: true
  revision: 0
  syncVersion: v5.0.0
  usage:
    capacity: 4294967296000
    rawCapacity: 4294967296000
    rawUsage: 762208948224
    usage: 748983581290
  writeQuorum: 8

@jiuker
Contributor

jiuker commented Jun 7, 2023

@artemkurylev Did you delete it without saving? I mean one item of .status.pools.

@artemkurylev
Author

artemkurylev commented Jun 7, 2023

@artemkurylev Did you delete it without saving? I mean one item of .status.pools.

[screenshot: kubectl edit session]

No,

@jiuker
Contributor

jiuker commented Jun 7, 2023

@artemkurylev Use two shells.
In one, run:

kubectl proxy

In the other, run:

curl -XPATCH  -H "Accept: application/json" -H "Content-Type: application/json-patch+json"  --data '[{"op": "remove", "path": "/status/pools/0"}]' http://127.0.0.1:8001/apis/minio.min.io/v2/namespaces/minio-tenant-1/tenants/s3-local/status
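
(On newer kubectl versions that support --subresource for patch, roughly v1.24 and later, an equivalent single command may also work; this is a suggestion, not part of the original comment:

kubectl patch tenant s3-local -n minio-tenant-1 --subresource=status --type=json -p '[{"op": "remove", "path": "/status/pools/0"}]'

Either way, check afterwards that .status.pools contains only one entry.)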

@jiuker
Contributor

jiuker commented Jun 7, 2023

@artemkurylev Let me know if it works for you.

@artemkurylev
Author

Thanks, minio-operator now runs successfully without crashing. However, there is a problem: it fails to mount volumes for the created tenant pool.

@jiuker
Contributor

jiuker commented Jun 7, 2023

Thanks, minio-operator now runs successfully without crashing. However, there is a problem: it fails to mount volumes for the created tenant pool.

Please open another issue if it is related to MinIO, and post the logs. We will follow up on it there. @artemkurylev
