BUG: Pool creation name conflict when Expanding Tenant #1626

Closed
bobacarOrpheo opened this issue Jun 7, 2023 · 18 comments · Fixed by #1628 or #1633
Labels: bug (Something isn't working), community


Expected Behavior

When expanding a tenant, the new pool should get a name different from the existing pools (by incrementing the pool index), so that the MinIO tenant restarts and the creation of the new pool proceeds.

I also suggest adding the ability to delete a pool directly from the Pools interface.

Current Behavior

When expanding the tenant with a new pool, nothing happens, because the new pool "pool-1" has the same name as an already active pool "pool-1".
If I then add another pool on top of that, this one gets a different name, "pool-2", and should make the tenant restart.
But that restart never happens, since the tenant is stuck.

Possible Solution

My workaround was to avoid the duplicate "pool-1" name that breaks the tenant expansion.
So instead of decommissioning "pool-0", I decommission "pool-1" and delete it once the decommission completes.

That way I have no name conflict.

This approach works for the PoC I'm doing, but if I ever want to decommission "pool-0", it may become a problem later, because the "pool-1" name would be reused when expanding the tenant.
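
For reference, a minimal sketch of that workaround with mc; the alias myminio and the exact pool endpoint here are placeholders (the decommission output later in this thread shows the real endpoint format):

mc admin decommission status myminio
mc admin decommission start myminio/ https://myminio-pool-1-{0...3}.myminio-hl.tenant-lite.svc.cluster.local/export{0...1}
mc admin decommission status myminio
# once the pool's Status shows Complete, remove the pool-1 block from the Tenant spec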

Steps to Reproduce (for bugs)

I prepared the cluster with these commands:

kubectl krew install directpv
kubectl directpv install
kubectl directpv discover
kubectl directpv init drives.yaml --dangerous
kubectl krew install minio
kubectl minio init
kubectl minio proxy -n minio-operator

Once in the Operator Console web interface, I first create a tenant named test with 4 servers, 36 Gi in total, and 1 volume per server.
That gives me my first pool, named "pool-0".

After creating a bucket and uploading one file, I go back to the Operator Console web interface.
In the Pools section, I click Expand Tenant.

There I set Number of Servers to 4, Volume Size to 9 Gi, Volumes per Server to 1, and choose directpv-min-io as the Storage Class.
That gives me my second pool, named "pool-1".
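
The new pool's pods can be verified with a label selector; a quick sketch (the label keys are the ones visible in the Tenant spec further below, and the operator is assumed to set them on the pods):

kubectl get pods -n test -l v1.min.io/tenant=test,v1.min.io/pool=pool-1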

I decommission "pool-0", then delete it once the decommission completes, using several different approaches:

  • Editing the tenant configuration inside the Operator Console with the pen icon at the top right

  • Getting the tenant.yaml with this command:

kubectl get tenants test -n test -o yaml > tenant.yaml

Then I remove the corresponding pools block (from - affinity down to volumesPerServer) from the YAML, along with several other fields, ending up with the spec below (see the sketch after this list for pushing the edited spec back to the cluster):

apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: test
  namespace: test
  uid: 42296fa6-b7b5-4e78-924d-c7170f8fa160
scheduler:
  name: ""
spec:
  configuration:
    name: test-env-configuration
  credsSecret:
    name: test-secret
  exposeServices:
    console: true
    minio: true
  image: minio/minio:RELEASE.2023-06-02T23-17-26Z
  imagePullSecret: {}
  mountPath: /export
  pools:
  - affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: v1.min.io/tenant
              operator: In
              values:
              - test
            - key: v1.min.io/pool
              operator: In
              values:
              - pool-1
          topologyKey: kubernetes.io/hostname
    name: pool-1
    resources: {}
    runtimeClassName: ""
    servers: 4
    volumeClaimTemplate:
      metadata:
        name: data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: "9663676416"
        storageClassName: directpv-min-io
      status: {}
    volumesPerServer: 1
  requestAutoCert: true
  users:
  - name: test-user-0
  • Using this command and removing approximately the same parts as before:
kubectl edit tenants test -n test
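
When going through the exported tenant.yaml instead of kubectl edit, the edited spec still has to be pushed back to the cluster; a sketch of one way to do that (not necessarily the exact command used here):

kubectl replace -f tenant.yaml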

Context

I'm trying to find a way to expand the storage size of a tenant without having to create many additional servers just to accumulate capacity.

Your Environment

A Kubernetes cluster from OVH composed of 8 nodes, each with a 10 GB HDD attached.

  • Version used (minio-operator): 5.0.4
  • Environment name and version (e.g. kubernetes v1.17.2): kubernetes 1.26.4-0
  • Server type and version: OVH provider, 4 CPU with 8 GB RAM
  • Operating System and version (uname -a): Arch Linux
  • Link to your deployment file: See the commands I used; no deployment file.

bobacarOrpheo (Author) commented Jun 7, 2023

I even tried creating the pool manually with this command:

kubectl minio tenant expand test --pool pool-0 --servers 4 --volumes 4 --capacity 12Gi --namespace test --storage-class directpv-min-io

Even with the pool name I chose (to avoid reusing "pool-1"), it doesn't seem to have any effect on the tenant: I don't see it restart, nor does it provision the pods and volumes.

EDIT:

Out of curiosity, I tried starting again from scratch.

This time I avoided the Operator Console web interface for the second expansion of the tenant (after the "pool-1" creation and the deletion of "pool-0"), i.e. for the creation of the new pool.

Doing it directly with the following command works:

kubectl minio tenant expand test --pool pool-0 --servers 4 --volumes 4 --capacity 48Gi --namespace test --storage-class directpv-min-io

The tenant restarts and creates "pool-0", whose pods and volumes come up.
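
To confirm the expansion was accepted, the pool names in the spec can be listed; a sketch assuming the same tenant and namespace as above:

kubectl get tenants test -n test -o jsonpath='{.spec.pools[*].name}'
# expected output after this expansion: pool-1 pool-0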

jiuker (Contributor) commented Jun 8, 2023

@bobacarOrpheo How did you remove pool-0? By editing the tenant?

bobacarOrpheo (Author) commented Jun 8, 2023

@jiuker Yes, as I said, I tried to remove "pool-0" in multiple ways:

  • kubectl edit tenants test -n test and removing the corresponding pool block from the YAML.
  • The same, but with the edit button in the web interface of the Operator Console:
    [screenshot: Edit tenant web interface]

jiuker (Contributor) commented Jun 8, 2023

> @jiuker Yes, as I said, I tried to remove "pool-0" in multiple ways:
>
>   • kubectl edit tenants test -n test and removing the corresponding pool block from the YAML.
>   • The same, but with the edit button in the web interface of the Operator Console:
>     [screenshot: Edit tenant web interface]

OK, it seems the Console edit works like kubectl edit; they are the same. @bobacarOrpheo There is a fix; I will commit it. Thanks for your reply.

jiuker (Contributor) commented Jun 8, 2023

When you add a pool with the same name, the Operator just keeps logging the error in a loop, so it can't do anything. @bobacarOrpheo
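
A sketch of how to see that error loop, assuming the default operator deployment name and namespace created by kubectl minio init:

kubectl logs -n minio-operator deploy/minio-operator -f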

jiuker (Contributor) commented Jun 8, 2023

If you want to fix this, there is no way, unless you rename it back by editing. But the data that has been stored will be gone. @bobacarOrpheo

jiuker added the bug label and removed the triage label Jun 8, 2023
bobacarOrpheo (Author) commented Jun 8, 2023

No problem @jiuker.
Thank you for the fix.

So now, when we click Expand Tenant in the web interface of the Operator, it will automatically assign a name that is not already used, right?

Do you know how I will be able to update later to get the fix applied?

For the moment I'm using the manual way with this command:

# pool-0 is different from the already active pool named pool-1
kubectl minio tenant expand test \
    --pool pool-0 \
    --servers 4 \
    --volumes 4 \
    --capacity 36Gi \
    --namespace test \
    --storage-class directpv-min-io

jiuker (Contributor) commented Jun 8, 2023

@bobacarOrpheo If you want to fix it: the generated name will be pool-{pools.length}, so it will work for you.
Add the pool back to the tenant, before pool-1, manually, by editing it.
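
To make the collision concrete: after pool-0 is decommissioned and removed, the spec holds a single pool, so a name derived from the pool count points at an existing pool again. A sketch:

kubectl get tenants test -n test -o jsonpath='{.spec.pools[*].name}'
# prints: pool-1   (one pool left, so the next generated name would be pool-1 again)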

cniackz (Contributor) commented Jun 8, 2023

I was learning about and reproducing this issue; my steps:

  1. Started with two pools: pool-0 and pool-1
  2. Decommissioned pool-0
$ mc admin decommission status myminio --insecure
┌─────┬───────────────────────────────────────────────────────────────────────────────────────┬──────────────────────────────────┬────────┐
│ ID  │ Pools                                                                                 │ Capacity                         │ Status │
│ 1st │ https://myminio-pool-0-{0...3}.myminio-hl.tenant-lite.svc.cluster.local/export{0...1} │ 121 GiB (used) / 2.3 TiB (total) │ Active │
│ 2nd │ https://myminio-pool-1-{0...3}.myminio-hl.tenant-lite.svc.cluster.local/export{0...1} │ 121 GiB (used) / 2.3 TiB (total) │ Active │
└─────┴───────────────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────┴────────┘
$ mc admin decommission start myminio/ https://myminio-pool-0-{0...3}.myminio-hl.tenant-lite.svc.cluster.local/export{0...1} --insecure
Decommission started successfully for `https://myminio-pool-0-{0...3}.myminio-hl.tenant-lite.svc.cluster.local/export{0...1}`.
$ mc admin decommission status myminio --insecure
┌─────┬───────────────────────────────────────────────────────────────────────────────────────┬──────────────────────────────────┬──────────┐
│ ID  │ Pools                                                                                 │ Capacity                         │ Status   │
│ 1st │ https://myminio-pool-0-{0...3}.myminio-hl.tenant-lite.svc.cluster.local/export{0...1} │ 121 GiB (used) / 2.3 TiB (total) │ Complete │
│ 2nd │ https://myminio-pool-1-{0...3}.myminio-hl.tenant-lite.svc.cluster.local/export{0...1} │ 121 GiB (used) / 2.3 TiB (total) │ Active   │
└─────┴───────────────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────┴──────────┘
  3. Removed the decommissioned pool:
$ mc admin decommission status myminio --insecure
┌─────┬───────────────────────────────────────────────────────────────────────────────────────┬──────────────────────────────────┬────────┐
│ ID  │ Pools                                                                                 │ Capacity                         │ Status │
│ 1st │ https://myminio-pool-1-{0...3}.myminio-hl.tenant-lite.svc.cluster.local/export{0...1} │ 121 GiB (used) / 2.3 TiB (total) │ Active │
└─────┴───────────────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────┴────────┘
  4. Expanded via the UI, restarted all possible pods, and even deleted the MinIO StatefulSet:
API: SYSTEM()
Time: 12:47:10 UTC 06/08/2023
Error: lookup myminio-pool-1-2.myminio-hl.tenant-lite.svc.cluster.local on 10.96.0.10:53: no such host (*net.DNSError)
       host="myminio-pool-1-2.myminio-hl.tenant-lite.svc.cluster.local"
       8: internal/logger/logonce.go:118:logger.(*logOnceType).logOnceIf()
       7: internal/logger/logonce.go:149:logger.LogOnceIf()
       6: cmd/endpoint.go:430:cmd.hostResolveToLocalhost()
       5: cmd/endpoint.go:477:cmd.Endpoints.UpdateIsLocal()
       4: cmd/endpoint.go:675:cmd.CreateEndpoints()
       3: cmd/endpoint-ellipses.go:385:cmd.createServerEndpoints()
       2: cmd/server-main.go:222:cmd.serverHandleCmdArgs()
       1: cmd/server-main.go:531:cmd.serverMain()
ERROR Invalid command line arguments: duplicate endpoints found
  5. One problem here is that we shouldn't allow two pools with the same name, as MinIO will treat them as duplicate endpoints and won't start:
  pools:
  - name: pool-1
    resources: {}
    servers: 4
    volumeClaimTemplate:
      metadata:
        name: data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2Gi
      status: {}
    volumesPerServer: 2
  - affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: v1.min.io/tenant
              operator: In
              values:
              - myminio
            - key: v1.min.io/pool
              operator: In
              values:
              - pool-1
          topologyKey: kubernetes.io/hostname
    name: pool-1
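
A quick way to spot such a duplicate in the spec; a sketch using the tenant and namespace names from this reproduction:

kubectl get tenants myminio -n tenant-lite -o jsonpath='{.spec.pools[*].name}' | tr ' ' '\n' | sort | uniq -d
# prints any pool name that appears more than once, here: pool-1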

cniackz (Contributor) commented Jun 8, 2023

I am not sure what the best pathway for the solution is, but I will review @jiuker's solution in #1628

jiuker (Contributor) commented Jun 8, 2023

> I am not sure what the best pathway for the solution is, but I will review @jiuker's solution in #1628

@cniackz I mean the way pool-0 was deleted is also the way to recover: add pool-0 back via the UI edit.

cniackz (Contributor) commented Jun 8, 2023

Yes, that is correct: data is never lost, the PVCs are persisted. All you need to do is put pool-0 back in the YAML of the tenant spec, so you are right. The point here is that there is a workaround (WA) by just editing the YAML, and everything will be fine. But the part I don't like, and that I agree has to be fixed at some point, is that we shouldn't assign or allow the same name for two pools after a decommission; that is not correct.
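
A sketch of checking that the old pool's PVCs are still there before adding pool-0 back; PVC names follow the StatefulSet pattern data-<tenant>-pool-0-N in this reproduction:

kubectl get pvc -n tenant-lite | grep pool-0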

jiuker (Contributor) commented Jun 8, 2023

@cniackz So the PR protects against that, but it can't protect against edits done via the Console.

cniackz (Contributor) commented Jun 8, 2023

Oh OK, got it; one step at a time. If the PR protects against that, then we have half of the equation resolved.
Later on we can look for a solution for the UI Console edit, I think.

cniackz (Contributor) commented Jun 8, 2023

We can consult @dvaldivia or @bexsoft on this. The right solution is that the UI should check the existing pool names and automatically assign the next proper name for this to work.

cniackz (Contributor) commented Jun 8, 2023

No, thank you for the quick fix @jiuker 👍

jiuker (Contributor) commented Jun 8, 2023

> We can consult @dvaldivia or @bexsoft on this. The right solution is that the UI should check the existing pool names and automatically assign the next proper name for this to work.

That's right @cniackz, it's a UI issue. We just do the check; that's enough.

jiuker (Contributor) commented Jun 8, 2023

> No, thank you for the quick fix @jiuker 👍

You're welcome
