
[BUG] OpenSearch Dashboards is NOT Deployed after the cluster(master 3, data2) configuration finished. #853

Open
YeonghyeonKO opened this issue Jul 5, 2024 · 11 comments
Labels
bug Something isn't working

Comments


YeonghyeonKO commented Jul 5, 2024

What is the bug?

  • After the OpenSearchCluster resource finishes its configuration (manifest below), the pod for the Dashboards instance is NOT deployed.
  • The securityconfig-update job completed, so the .opendistro_security index exists. The bootstrap pod also terminated normally after leader election among the master nodes.
  • I can't find any error logs or any mention of 'dashboards', even though the manifest below should deploy ONE replica of the opensearch-dashboards pod.

[screenshots attached]

opensearchCluster:
  enabled: true
  general:
    httpPort: "9200"
    version: v2.14.0
    image: harbor-xxx.xxx.com/library/opensearchproject/opensearch:v2.14.0
    serviceName: "test-opensearch-cluster"
    drainDataNodes: true
    setVMMaxMapCount: true 
    # This doesn't work here. The option makes an init container run "sysctl -w vm.max_map_count=262144",
    # but max_map_count is a kernel setting, i.e. it depends on the Kubernetes worker node where the pods run
    # (see the sysctl sketch after this manifest).
    podSecurityContext:
      runAsUser: 1000
      runAsGroup: 1000
    securityContext:
      allowPrivilegeEscalation: true
      privileged: true
  initHelper:
    image: "harbor-xxx.xxx.com/nexus/docker-mig/library/busybox:1.31.1"
    imagePullPolicy: IfNotPresent
    # Needed for setVMMaxMapCount=true, but this init container is ineffective here for the kernel-related reason noted above.
  dashboards:
    enable: true
    replicas: 1
    version: v2.14.0
    image: harbor-xxx.xxx.com/library/opensearchproject/opensearch-dashboards:v2.14.0
    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "1Gi"
        cpu: "500m"
    tls:
      enable: false
      generate: false
  nodePools:
    - component: master
      replicas: 3
      diskSize: "10Gi"
      persistence:
        pvc:
          storageClass: "sc-nfs-app-retain"
          accessModes:
           - ReadWriteOnce
      roles:
        - "cluster_manager"
        - "master"
      resources:
        requests:
          memory: "4Gi"
          cpu: "1"
        limits:
          memory: "4Gi"
          cpu: "2"
      env:
        - name: OPENSEARCH_INITIAL_ADMIN_PASSWORD
          value: "PASSWORD"
    - component: data
      replicas: 2
      diskSize: "100Gi"
      persistence:
        pvc:
          storageClass: "sc-nfs-app-retain"
          accessModes:
           - ReadWriteOnce
      roles:
        - "data"
        - "ingest"
        - "ml"
      resources:
        requests:
          memory: "8Gi"
          cpu: "2"
        limits:
          memory: "8Gi"
          cpu: "4"
      env:
        - name: OPENSEARCH_INITIAL_ADMIN_PASSWORD
          value: "PASSWORD"
  security:
    tls:
      transport:
        generate: true
        perNode: true
      http:
        generate: true
    config:
      adminCredentialsSecret:
         name: admin-credentials-secret
      securityConfigSecret:
         name: securityconfig-secret
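
Regarding the setVMMaxMapCount comment in the manifest above: vm.max_map_count is a node-level kernel parameter, so a minimal sketch of setting it directly and persistently on each Kubernetes worker node (the sysctl.d file name is just a convention, not something from this issue) would be:

# Run as root on every worker node that may schedule OpenSearch pods
sysctl -w vm.max_map_count=262144
# Persist the setting across reboots
echo 'vm.max_map_count=262144' > /etc/sysctl.d/99-opensearch.conf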

Here are the logs (viewed in ArgoCD):

# StatefulSet : test-opensearch-cluster-master
...
[2024-07-05T07:27:59,116][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-2] Running full sweep 
[2024-07-05T07:28:32,508][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-0] Running full sweep 
[2024-07-05T07:31:05,202][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-1] Running full sweep 
[2024-07-05T07:32:59,117][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-2] Running full sweep 
[2024-07-05T07:33:32,509][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-0] Running full sweep 
[2024-07-05T07:34:25,122][INFO ][o.o.s.s.c.FlintStreamingJobHouseKeeperTask] [test-opensearch-cluster-master-1] Starting housekeeping task for auto refresh streaming jobs. 
[2024-07-05T07:34:25,123][INFO ][o.o.s.s.c.FlintStreamingJobHouseKeeperTask] [test-opensearch-cluster-master-1] Finished housekeeping task for auto refresh streaming jobs. 
[2024-07-05T07:36:05,203][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-1] Running full sweep 
[2024-07-05T07:37:59,117][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-2] Running full sweep 
[2024-07-05T07:38:32,509][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-0] Running full sweep 
[2024-07-05T07:41:05,203][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-1] Running full sweep 
[2024-07-05T07:42:59,118][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-2] Running full sweep 
[2024-07-05T07:43:32,510][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-0] Running full sweep 
[2024-07-05T07:46:05,204][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-1] Running full sweep 
[2024-07-05T07:47:59,118][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-2] Running full sweep 
[2024-07-05T07:48:32,510][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-0] Running full sweep 
[2024-07-05T07:49:25,101][INFO ][o.o.a.c.HourlyCron       ] [test-opensearch-cluster-master-1] Hourly maintenance succeeds 
[2024-07-05T07:49:25,123][INFO ][o.o.s.s.c.FlintStreamingJobHouseKeeperTask] [test-opensearch-cluster-master-1] Starting housekeeping task for auto refresh streaming jobs. 
[2024-07-05T07:49:25,123][INFO ][o.o.s.s.c.FlintStreamingJobHouseKeeperTask] [test-opensearch-cluster-master-1] Finished housekeeping task for auto refresh streaming jobs. 
[2024-07-05T07:51:05,204][INFO ][o.o.j.s.JobSweeper       ] [test-opensearch-cluster-master-1] Running full sweep
# Pod : opensearch-operator-controller-manager-6c44cc9df5-69j4l
...
{"level":"debug","ts":"2024-07-05T07:19:51.212Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"089fa76b-dca0-4780-9e18-289477965d12","name":"test-opensearch-cluster-data","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Service"}
{"level":"error","ts":"2024-07-05T07:19:51.262Z","msg":"Reconciler error","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"089fa76b-dca0-4780-9e18-289477965d12","error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource; failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource","errorCauses":[{"error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource","errorVerbose":"failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource\ngetting resource failed\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).delete\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:651\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).ReconcileResource\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:523\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers/k8s.K8sClientImpl.ReconcileResource\n\t/workspace/pkg/reconcilers/k8s/client.go:198\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).handlePDB\n\t/workspace/pkg/reconcilers/cluster.go:410\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).reconcileNodeStatefulSet\n\t/workspace/pkg/reconcilers/cluster.go:240\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).Reconcile\n\t/workspace/pkg/reconcilers/cluster.go:111\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning\n\t/workspace/controllers/opensearchController.go:328\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).Reconcile\n\t/workspace/controllers/opensearchController.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695\nfailed to delete resource"},{"error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server 
APIs: policy/v1: the server could not find the requested resource","errorVerbose":"failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource\ngetting resource failed\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).delete\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:651\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).ReconcileResource\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:523\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers/k8s.K8sClientImpl.ReconcileResource\n\t/workspace/pkg/reconcilers/k8s/client.go:198\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).handlePDB\n\t/workspace/pkg/reconcilers/cluster.go:410\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).reconcileNodeStatefulSet\n\t/workspace/pkg/reconcilers/cluster.go:240\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).Reconcile\n\t/workspace/pkg/reconcilers/cluster.go:111\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning\n\t/workspace/controllers/opensearchController.go:328\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).Reconcile\n\t/workspace/controllers/opensearchController.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695\nfailed to delete resource"}],"stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226"}
{"level":"info","ts":"2024-07-05T07:36:31.263Z","msg":"Reconciling OpenSearchCluster","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","cluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"}}
{"level":"info","ts":"2024-07-05T07:36:31.276Z","msg":"Generating certificates","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","interface":"transport"}
{"level":"info","ts":"2024-07-05T07:36:31.277Z","msg":"Generating certificates","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","interface":"http"}
{"level":"debug","ts":"2024-07-05T07:36:31.277Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","name":"test-opensearch-cluster-config","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"ConfigMap"}
{"level":"debug","ts":"2024-07-05T07:36:31.278Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","name":"test-opensearch-cluster","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Service"}
{"level":"debug","ts":"2024-07-05T07:36:31.279Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","name":"test-opensearch-cluster-discovery","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Service"}
{"level":"debug","ts":"2024-07-05T07:36:31.279Z","msg":"resource diff","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","name":"test-opensearch-cluster-admin-password","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Secret"}
{"level":"debug","ts":"2024-07-05T07:36:31.280Z","msg":"updating resource","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","name":"test-opensearch-cluster-admin-password","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Secret"}
{"level":"debug","ts":"2024-07-05T07:36:31.283Z","msg":"resource updated","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","name":"test-opensearch-cluster-admin-password","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Secret"}
{"level":"debug","ts":"2024-07-05T07:36:31.284Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","name":"test-opensearch-cluster-master","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Service"}
{"level":"debug","ts":"2024-07-05T07:36:31.286Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","name":"test-opensearch-cluster-data","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Service"}
{"level":"error","ts":"2024-07-05T07:36:31.316Z","msg":"Reconciler error","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"a0edb563-35ee-43af-9a19-dbcc9da7808f","error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource; failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource","errorCauses":[{"error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource","errorVerbose":"failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource\ngetting resource failed\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).delete\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:651\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).ReconcileResource\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:523\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers/k8s.K8sClientImpl.ReconcileResource\n\t/workspace/pkg/reconcilers/k8s/client.go:198\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).handlePDB\n\t/workspace/pkg/reconcilers/cluster.go:410\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).reconcileNodeStatefulSet\n\t/workspace/pkg/reconcilers/cluster.go:240\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).Reconcile\n\t/workspace/pkg/reconcilers/cluster.go:111\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning\n\t/workspace/controllers/opensearchController.go:328\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).Reconcile\n\t/workspace/controllers/opensearchController.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695\nfailed to delete resource"},{"error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server 
APIs: policy/v1: the server could not find the requested resource","errorVerbose":"failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource\ngetting resource failed\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).delete\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:651\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).ReconcileResource\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:523\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers/k8s.K8sClientImpl.ReconcileResource\n\t/workspace/pkg/reconcilers/k8s/client.go:198\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).handlePDB\n\t/workspace/pkg/reconcilers/cluster.go:410\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).reconcileNodeStatefulSet\n\t/workspace/pkg/reconcilers/cluster.go:240\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).Reconcile\n\t/workspace/pkg/reconcilers/cluster.go:111\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning\n\t/workspace/controllers/opensearchController.go:328\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).Reconcile\n\t/workspace/controllers/opensearchController.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695\nfailed to delete resource"}],"stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226"}
{"level":"info","ts":"2024-07-05T07:53:11.317Z","msg":"Reconciling OpenSearchCluster","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","cluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"}}
{"level":"info","ts":"2024-07-05T07:53:11.331Z","msg":"Generating certificates","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","interface":"transport"}
{"level":"info","ts":"2024-07-05T07:53:11.331Z","msg":"Generating certificates","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","interface":"http"}
{"level":"debug","ts":"2024-07-05T07:53:11.331Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","name":"test-opensearch-cluster-config","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"ConfigMap"}
{"level":"debug","ts":"2024-07-05T07:53:11.332Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","name":"test-opensearch-cluster","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Service"}
{"level":"debug","ts":"2024-07-05T07:53:11.333Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","name":"test-opensearch-cluster-discovery","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Service"}
{"level":"debug","ts":"2024-07-05T07:53:11.333Z","msg":"resource diff","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","name":"test-opensearch-cluster-admin-password","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Secret"}
{"level":"debug","ts":"2024-07-05T07:53:11.334Z","msg":"updating resource","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","name":"test-opensearch-cluster-admin-password","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Secret"}
{"level":"debug","ts":"2024-07-05T07:53:11.338Z","msg":"resource updated","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","name":"test-opensearch-cluster-admin-password","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Secret"}
{"level":"debug","ts":"2024-07-05T07:53:11.339Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","name":"test-opensearch-cluster-master","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Service"}
{"level":"debug","ts":"2024-07-05T07:53:11.341Z","msg":"resource is in sync","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","name":"test-opensearch-cluster-data","namespace":"test-opensearch-cluster","apiVersion":"v1","kind":"Service"}
{"level":"error","ts":"2024-07-05T07:53:11.378Z","msg":"Reconciler error","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource; failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource","errorCauses":[{"error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource","errorVerbose":"failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource\ngetting resource failed\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).delete\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:651\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).ReconcileResource\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:523\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers/k8s.K8sClientImpl.ReconcileResource\n\t/workspace/pkg/reconcilers/k8s/client.go:198\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).handlePDB\n\t/workspace/pkg/reconcilers/cluster.go:410\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).reconcileNodeStatefulSet\n\t/workspace/pkg/reconcilers/cluster.go:240\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).Reconcile\n\t/workspace/pkg/reconcilers/cluster.go:111\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning\n\t/workspace/controllers/opensearchController.go:328\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).Reconcile\n\t/workspace/controllers/opensearchController.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695\nfailed to delete resource"},{"error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server 
APIs: policy/v1: the server could not find the requested resource","errorVerbose":"failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource\ngetting resource failed\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).delete\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:651\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).ReconcileResource\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:523\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers/k8s.K8sClientImpl.ReconcileResource\n\t/workspace/pkg/reconcilers/k8s/client.go:198\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).handlePDB\n\t/workspace/pkg/reconcilers/cluster.go:410\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).reconcileNodeStatefulSet\n\t/workspace/pkg/reconcilers/cluster.go:240\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).Reconcile\n\t/workspace/pkg/reconcilers/cluster.go:111\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning\n\t/workspace/controllers/opensearchController.go:328\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).Reconcile\n\t/workspace/controllers/opensearchController.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695\nfailed to delete resource"}],"stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226"}
# Job : test-opensearch-cluster-securityconfig-update
Waiting to connect to the cluster
OpenSearch Security not initialized.**************************************************************************
** This tool will be deprecated in the next major release of OpenSearch **
** https://github.com/opensearch-project/security/issues/1755           **
**************************************************************************
Security Admin v7
Will connect to test-opensearch-cluster.test-opensearch-cluster.svc.cluster.local:9200 ... done
Connected as "CN=admin,OU=test-opensearch-cluster"
OpenSearch Version: 2.14.0
Contacting opensearch cluster 'opensearch' and wait for YELLOW clusterstate ...
Clustername: test-opensearch-cluster
Clusterstate: GREEN
Number of nodes: 3
Number of data nodes: 1
.opendistro_security index does not exists, attempt to create it ... done (0-all replicas)
Populate config from /usr/share/opensearch/config/opensearch-security/
Will update '/config' with /usr/share/opensearch/config/opensearch-security/config.yml
   SUCC: Configuration for 'config' created or updated
Will update '/roles' with /usr/share/opensearch/config/opensearch-security/roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update '/rolesmapping' with /usr/share/opensearch/config/opensearch-security/roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update '/internalusers' with /usr/share/opensearch/config/opensearch-security/internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update '/actiongroups' with /usr/share/opensearch/config/opensearch-security/action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Will update '/tenants' with /usr/share/opensearch/config/opensearch-security/tenants.yml
   SUCC: Configuration for 'tenants' created or updated
Will update '/nodesdn' with /usr/share/opensearch/config/opensearch-security/nodes_dn.yml
   SUCC: Configuration for 'nodesdn' created or updated
Will update '/whitelist' with /usr/share/opensearch/config/opensearch-security/whitelist.yml
   SUCC: Configuration for 'whitelist' created or updated
Will update '/audit' with /usr/share/opensearch/config/opensearch-security/audit.yml
   SUCC: Configuration for 'audit' created or updated
Will update '/allowlist' with /usr/share/opensearch/config/opensearch-security/allowlist.yml
   SUCC: Configuration for 'allowlist' created or updated
SUCC: Expected 10 config types for node {"updated_config_types":["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"],"updated_config_size":10,"message":null} is 10 (["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"]) due to: null
SUCC: Expected 10 config types for node {"updated_config_types":["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"],"updated_config_size":10,"message":null} is 10 (["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"]) due to: null
SUCC: Expected 10 config types for node {"updated_config_types":["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"],"updated_config_size":10,"message":null} is 10 (["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"]) due to: null
Done with success
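
For anyone reproducing this, a quick way to confirm that the operator never created a Dashboards deployment at all (using the namespace from the logs above):

# The node-pool StatefulSets exist, but no deployment or pod for dashboards shows up
kubectl -n test-opensearch-cluster get deployments,pods | grep -i dashboards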

What is your host/environment?

  • kubernetes: v1.20.7
  • opensearch-operator: v2.6.0
  • opensearch & opensearch-dashboards: v2.14.0
@YeonghyeonKO YeonghyeonKO added bug Something isn't working untriaged Issues that have not yet been triaged labels Jul 5, 2024
@YeonghyeonKO YeonghyeonKO changed the title [BUG] OpenSearch Dashboards is NOT Deployed after the cluster(master 3, data2) configuration is complete [BUG] OpenSearch Dashboards is NOT Deployed after the cluster(master 3, data2) configuration finished. Jul 5, 2024

jd4883 commented Jul 18, 2024

I have observed the exact same behavior. I deployed this in a staging environment 6 months ago and the issue was not present. When preparing to bring the service to production this week, I found that I can repeatedly redeploy OpenSearch and the dashboard silently never comes up.

This is my live configuration:

apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: opensearch
  namespace: xxx
spec:
  bootstrap:
    resources: {}
  confMgmt: {}
  dashboards:
    additionalConfig:
      opensearch.ssl.verificationMode: none
      opensearch_security.multitenancy.enable_filter: 'true'
      opensearch_security.multitenancy.enabled: 'true'
      opensearch_security.multitenancy.tenants.enable_global: 'true'
      opensearch_security.multitenancy.tenants.enable_private: 'false'
      opensearch_security.multitenancy.tenants.preferred: '["Global"]'
      server.name: opensearch
    additionalVolumes:
      - name: trusted-cas
        path: /usr/share/opensearch/config/trusted_cas.pem
        restartPods: true
        secret:
          defaultMode: 420
          secretName: opensearch-trusted-cas
        subPath: trusted_cas.pem
    basePath: /logs
    enable: true
    image: 'harbor.xxx/dockerhub/opensearchproject/opensearch-dashboards:2.7.0'
    opensearchCredentialsSecret:
      name: opensearch-basic-auth
    podSecurityContext:
      fsGroup: 1000
    replicas: 1
    resources:
      limits:
        cpu: '2'
        memory: 2Gi
      requests:
        cpu: '1'
        memory: 1Gi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
      privileged: false
      readOnlyRootFilesystem: true
      runAsGroup: 1000
      runAsUser: 1000
    service:
      type: ClusterIP
    version: v2.7.0
  general:
    additionalVolumes:
      - name: trusted-cas
        path: /usr/share/opensearch/config/trusted_cas.pem
        restartPods: true
        secret:
          defaultMode: 420
          secretName: opensearch-trusted-cas
        subPath: trusted_cas.pem
    drainDataNodes: true
    httpPort: 9200
    image: 'harbor.xxx/dockerhub/opensearchproject/opensearch:2.7.0'
    monitoring:
      enable: true
      pluginUrl: >-
        https://github.com/aiven/prometheus-exporter-plugin-for-opensearch/releases/download/2.7.0.0/prometheus-exporter-2.7.0.0.zip
      scrapeInterval: 30s
    pluginsList:
      - repository-s3
    podSecurityContext:
      fsGroup: 1000
    securityContext:
      allowPrivilegeEscalation: true
      runAsGroup: 1000
      runAsUser: 1000
    serviceName: opensearch
    setVMMaxMapCount: true
    version: v2.7.0
  initHelper:
    resources: {}
  nodePools:
    - additionalConfig:
        prometheus.indices: 'true'
        prometheus.metric_name.prefix: opensearch_
        prometheus.nodes.filter: _all
      annotations:
        prometheus.io/path: /_prometheus/metrics
        prometheus.io/port: '9200'
        prometheus.io/scrape: 'true'
      component: masters
      diskSize: 50Gi
      jvm: '-Xmx1024M -Xms1024M'
      pdb:
        enable: true
        maxUnavailable: 2
        minAvailable: 1
      persistence:
        pvc:
          accessModes:
            - ReadWriteOnce
          storageClass: gp3
      replicas: 3
      resources:
        limits:
          cpu: '1'
          memory: 2Gi
        requests:
          cpu: '1'
          memory: 2Gi
      roles:
        - master
        - data
    - additionalConfig:
        prometheus.indices: 'true'
        prometheus.metric_name.prefix: opensearch_
        prometheus.nodes.filter: _all
      annotations:
        prometheus.io/path: /_prometheus/metrics
        prometheus.io/port: '9200'
        prometheus.io/scrape: 'true'
      component: data
      diskSize: 50Gi
      jvm: '-Xmx1024M -Xms1024M'
      persistence:
        pvc:
          accessModes:
            - ReadWriteOnce
          storageClass: gp3
      replicas: 1
      resources:
        limits:
          cpu: '2'
          memory: 2Gi
        requests:
          cpu: '2'
          memory: 2Gi
      roles:
        - data
        - ingest
        - ml
        - transform
  security:
    config:
      adminCredentialsSecret:
        name: opensearch-basic-auth
      adminSecret: {}
      securityConfigSecret:
        name: opensearch-config-secret
      updateJob:
        resources: {}
    tls:
      http:
        caSecret: {}
        generate: true
        secret: {}
      transport:
        caSecret: {}
        generate: true
        secret: {}
status:
  availableNodes: 4
  componentsStatus: []
  health: green
  initialized: true
  phase: RUNNING
  version: v2.7.0

The live environment looks like this:

% kubectl -n platform-logging get all
NAME                                                         READY   STATUS      RESTARTS   AGE
pod/opensearch-data-0                                        1/1     Running     0          18h
pod/opensearch-masters-0                                     1/1     Running     0          18h
pod/opensearch-masters-1                                     1/1     Running     0          18h
pod/opensearch-masters-2                                     1/1     Running     0          18h
pod/opensearch-securityconfig-update-6qvcp                   0/1     Completed   0          18h
pod/opensearch-staging-controller-manager-56c78c5cb6-8wj79   2/2     Running     0          18h

NAME                                                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                               AGE
service/opensearch                                              ClusterIP   172.20.49.152   <none>        9200/TCP,9300/TCP,9600/TCP,9650/TCP   18h
service/opensearch-data                                         ClusterIP   None            <none>        9200/TCP,9300/TCP                     18h
service/opensearch-discovery                                    ClusterIP   None            <none>        9300/TCP                              18h
service/opensearch-masters                                      ClusterIP   None            <none>        9200/TCP,9300/TCP                     18h
service/opensearch-staging-controller-manager-metrics-service   ClusterIP   172.20.42.80    <none>        8443/TCP                              18h

NAME                                                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/opensearch-staging-controller-manager   1/1     1            1           18h

NAME                                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/opensearch-staging-controller-manager-56c78c5cb6   1         1         1       18h

NAME                                         READY   AGE
statefulset.apps/opensearch-data             1/1     18h
statefulset.apps/opensearch-masters          3/3     18h

NAME                                         COMPLETIONS   DURATION   AGE
job.batch/opensearch-securityconfig-update   1/1           2m42s      18h

Similar to @YeonghyeonKO's reported issue, I never get the dashboard to deploy. Because this worked 6 months ago, I tried altering the configuration quite a bit, with no change in behavior. I did 10+ redeploys with minimal variation, always with the same end result. Six months ago, when deploying with operator version 2.4, everything worked. It is no longer possible to deploy the OpenSearch stack with a dashboard using the operator.

My deployment is also facilitated via ArgoCD and is very similar, with the exact same issue. Please let me know if I can provide any specific diagnostic data to help get this issue addressed.

What is your host/environment?
kubernetes: v1.28.0 (EKS)
opensearch-operator: v2.6.0 (I have also tried v2.4.0 and all in-between releases with the same behavior).
opensearch & opensearch-dashboards: v2.7.0


jd4883 commented Jul 18, 2024

@YeonghyeonKO I did more debugging and actually got the dashboard to come up. In my case, my pdb settings were wrong; a log line similar to one of yours gave me the hint — the "Reconciler error" entry from the operator that fails inside handlePDB with "policy/v1: the server could not find the requested resource" (the same error shown in your operator logs above).

In your case, if you explicitly set pdb to a valid configuration, or turn it off explicitly, do you still have the issue? My working pdb settings on the master nodes look like this:

pdb:
  enable: true
  minAvailable: 1

It is odd that this setting completely stops the operator from deploying the dashboards component, but it might also be your root cause.
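
If it helps, a quick way to check whether PDB handling is what blocks the reconcile (the namespaces are whatever they are in your install; the controller-manager deployment name is inferred from the pod name in the logs above):

# Were the PodDisruptionBudgets for the node pools ever created?
kubectl -n <cluster-namespace> get pdb

# Is the operator repeatedly failing in handlePDB?
kubectl -n <operator-namespace> logs deployment/opensearch-operator-controller-manager | grep 'Reconciler error'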

@getsaurabh02 getsaurabh02 moved this from 🆕 New to Backlog in Engineering Effectiveness Board Jul 18, 2024
@getsaurabh02 getsaurabh02 moved this from Backlog to 🆕 New in Engineering Effectiveness Board Jul 18, 2024

YeonghyeonKO commented Jul 22, 2024

@jd4883 Hi,
I tested deploying the opensearch-cluster (k8s CRD) while changing the versions (Docker images) of the opensearch-operator, of OpenSearch (and its Dashboards), and finally the version of the Kubernetes cluster itself.

To cut to the conclusion: migrating the Kubernetes cluster from 1.20.1 to 1.25.6 makes it work.
As you pointed out from the opensearch-operator logs, the issue lies with the PodDisruptionBudget resource: on the lower Kubernetes version the operator couldn't find the right API version (or spec) of the PDB.

Just changing the Kubernetes version fixed it, regardless of whether the pdb block below is set explicitly:

  nodePools:
      ...
      pdb:
        enable: false
        # enable: true
        # minAvailable: 1
      ...

image
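For anyone hitting the same symptom, a quick way to confirm which PodDisruptionBudget API versions the cluster actually serves is plain kubectl (nothing operator-specific): on Kubernetes 1.21+ you should see policy/v1, while 1.20 and older only serve policy/v1beta1.

# Show the API versions served for the "policy" group
kubectl api-versions | grep '^policy/'
# Show which resources (and versions) the policy group exposes
kubectl api-resources --api-group=policy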


Also, with the opensearch.ssl.verificationMode: none option in opensearchCluster.dashboards.additionalConfig, the Service for test-opensearch-cluster-dashboards is exposed externally through port 80.
image
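For reference, a minimal sketch of where that option goes (assuming a Helm values layout where dashboards is a top-level block under opensearchCluster); note that verificationMode: none turns off certificate verification between Dashboards and the OpenSearch nodes, so it is only reasonable for testing:

dashboards:
  additionalConfig:
    # Dashboards will not verify the OpenSearch node certificates (testing only)
    opensearch.ssl.verificationMode: "none"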



@jd4883 I'm just curious about your working environment.
Your OpenSearchCluster manifest lets the security.tls.transport and security.tls.http certificates be generated automatically.
(See https://opensearch-project.github.io/opensearch-k8s-operator/docs/userguide/main.html#node-httprest-api)


image

But you are trying to inject trusted-cas.pem using a Secret.

  1. Is it for exposing your localhost:port externally to allow access from outside? (This is what I want to do in the next step.)
  2. How did you issue that .pem file?

@jd4883
Copy link

jd4883 commented Jul 22, 2024

@YeonghyeonKO thanks for the detailed response. Glad you got yours behaving now too. A few things that might make this make more sense:

  • SSL termination is handled by an extension to the chart via istio, meaning I am not generating a certificate for external communication or pre-providing one that OpenSearch uses for the UI. All certificates are used internally and the generated ones work fine. Because istio uses a VirtualService in place of an Ingress, the chart does not natively account for this, so it is done as an extension rather than directly (a rough sketch of such a VirtualService follows after this list). External access works with the internal user defined in the chart. I am still working out the LDAP integration to allow wider access (this environment is still a staging one and the production counterpart has not been deployed yet).
  • The trusted_cas.pem I am injecting relates to an integration with our LDAP server, which I am still working out the details of; it uses a custom generated certificate that I need to inject into the pods so that communication stays secure. It has no direct implication for the OpenSearch configuration.
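For context, a rough sketch of what that istio extension can look like; all names, hosts, and the gateway below are illustrative placeholders rather than the actual chart values, and the Dashboards Service name and port depend on your cluster (the operator names the Service <clusterName>-dashboards, as seen with test-opensearch-cluster-dashboards above):

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: opensearch-dashboards           # hypothetical name
spec:
  hosts:
    - dashboards.example.com            # placeholder external hostname
  gateways:
    - istio-system/public-gateway       # placeholder Gateway where TLS is terminated
  http:
    - route:
        - destination:
            host: my-cluster-dashboards # hypothetical Dashboards Service name
            port:
              number: 5601              # default Dashboards port; adjust to your Service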

The deployment I have is still getting some fine tuning so it is as close to perfect as possible when it comes time to deploy it to production. I am pre-injecting all the configuration files into a secret, which I have done by extending the chart (this is an additional template file for the chart to render):

---
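{{- /* Note: $name is assumed to be defined earlier in the template (e.g. derived from .Release.Name); it is not set in this snippet. */}}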
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: {{ printf "%s-config-secret" $name | lower }}
  labels:
    name: {{ printf "%s-config-secret" $name | quote }}
    role: "secret"
    {{- range $key, $value := .Values.labels }}
    {{ $key }}: {{ $value | quote }}
    {{- end }}
data:
{{ (.Files.Glob "configs/security/*").AsSecrets | nindent 4 }}

That secret picks up all the files I have been working out optimal configurations for. Previously this was problematic because the configuration I had ready from six months back was not spinning up the dashboard; now that I have dug into the issue I have a working baseline in my branch and am figuring out how to improve it further. At this point I am not functionally blocked anywhere, but hopefully this explains why my configuration may be a bit different with regards to ingress support.

Kind regards,

Jacob

@dblock
Copy link
Member

dblock commented Jul 29, 2024

[Catch All Triage - 1, 2, 3, 4]

@dblock dblock removed the untriaged Issues that have not yet been triaged label Jul 29, 2024
@YeonghyeonKO
Copy link
Author

@jd4883 @dblock

{"level":"error","ts":"2024-07-30T07:53:11.378Z","msg":"Reconciler error","controller":"opensearchcluster","controllerGroup":"opensearch.opster.io","controllerKind":"OpenSearchCluster","OpenSearchCluster":{"name":"test-opensearch-cluster","namespace":"test-opensearch-cluster"},"namespace":"test-opensearch-cluster","name":"test-opensearch-cluster","reconcileID":"cfa91fdf-5e47-4216-b95f-53a26640cbee","error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource; failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource","errorCauses":[{"error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource","errorVerbose":"failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource\ngetting resource failed\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).delete\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:651\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).ReconcileResource\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:523\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers/k8s.K8sClientImpl.ReconcileResource\n\t/workspace/pkg/reconcilers/k8s/client.go:198\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).handlePDB\n\t/workspace/pkg/reconcilers/cluster.go:410\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).reconcileNodeStatefulSet\n\t/workspace/pkg/reconcilers/cluster.go:240\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).Reconcile\n\t/workspace/pkg/reconcilers/cluster.go:111\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning\n\t/workspace/controllers/opensearchController.go:328\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).Reconcile\n\t/workspace/controllers/opensearchController.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695\nfailed to delete resource"},{"error":"failed to delete resource: getting resource failed: failed to get API group resources: unable to retrieve the complete list of server 
APIs: policy/v1: the server could not find the requested resource","errorVerbose":"failed to get API group resources: unable to retrieve the complete list of server APIs: policy/v1: the server could not find the requested resource\ngetting resource failed\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).delete\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:651\ngithub.com/cisco-open/operator-tools/pkg/reconciler.(*GenericResourceReconciler).ReconcileResource\n\t/go/pkg/mod/github.com/cisco-open/operator-tools@v0.30.0/pkg/reconciler/resource.go:523\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers/k8s.K8sClientImpl.ReconcileResource\n\t/workspace/pkg/reconcilers/k8s/client.go:198\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).handlePDB\n\t/workspace/pkg/reconcilers/cluster.go:410\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).reconcileNodeStatefulSet\n\t/workspace/pkg/reconcilers/cluster.go:240\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*ClusterReconciler).Reconcile\n\t/workspace/pkg/reconcilers/cluster.go:111\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning\n\t/workspace/controllers/opensearchController.go:328\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).Reconcile\n\t/workspace/controllers/opensearchController.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695\nfailed to delete resource"}],"stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226"}

This error log told me that the PDB API resource could not be deployed,
so I changed the version of the Kubernetes cluster itself, from 1.20.1 (policy/v1beta1) to 1.25.6 (policy/v1).

@jd4883
Copy link

jd4883 commented Jul 31, 2024

I think what made the error harder to catch is that it does not come up frequently, and if you miss it, it's hard to spot. If there is an error mid-deployment, maybe a functional improvement to the chart would be to be more verbose about a specific error that keeps recurring, so that when you start sifting through logs to debug, the error is easier to find. That is my 2c though. Glad to know why it was a problem and how to fix it for the moment.

@YeonghyeonKO
Copy link
Author

@jd4883 I agree. When the problem lies above OpenSearch itself (in the Kubernetes cluster, for example, as in this case), it is hard to figure out why something is wrong or not working.

@getsaurabh02 getsaurabh02 moved this from 🆕 New to Planned (Next Quarter) in Engineering Effectiveness Board Aug 19, 2024
@rshade
Copy link

rshade commented Sep 12, 2024

I seem to be unable to get it working also with EKS 1.29 + OS 2.11.1, even though its status is green.

kn get opensearchclusters.opensearch.opster.io opensearch -o yaml
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  annotations:
    pulumi.com/waitFor: jsonpath={.status.health}=green
  creationTimestamp: "2024-09-12T17:31:22Z"
  finalizers:
  - Opster
  generation: 2
  name: opensearch
  namespace: pulumi-selfhosted-apps
  resourceVersion: "13797078"
  uid: f808e3b0-70c7-44bd-8c1c-566403dc838e
spec:
  bootstrap:
    resources: {}
  confMgmt:
    smartScaler: true
  dashboards:
    annotations:
      cloud.google.com/neg: '{"ingress": true}'
    basePath: /opensearch-dashboard
    enable: true
    opensearchCredentialsSecret: {}
    replicas: 1
    resources:
      limits:
        cpu: 500m
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 1Gi
    service:
      type: NodePort
    tls:
      caSecret: {}
      secret: {}
    version: 2.11.1
  general:
    httpPort: 9200
    monitoring:
      enable: true
    serviceName: opensearch
    setVMMaxMapCount: true
    vendor: opensearch
    version: 2.11.1
  initHelper:
    resources: {}
  nodePools:
  - component: masters
    diskSize: 30Gi
    pdb: {}
    replicas: 3
    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: "1"
        memory: 2Gi
    roles:
    - cluster_manager
    - master
    - ingest
    - data
  - component: nodes
    diskSize: 30Gi
    replicas: 3
    resources:
      limits:
        cpu: 500m
        memory: 2Gi
      requests:
        cpu: 500m
        memory: 2Gi
    roles:
    - data
status:
  availableNodes: 6
  componentsStatus: []
  health: green
  initialized: true
  phase: RUNNING
  version: 2.11.1

@arshashi
Copy link

arshashi commented Nov 20, 2024

I seem to be unable to get it working also with EKS 1.29 + OS 2.11.1, even though its status is green. (full manifest quoted in the previous comment)

Remove this monitoring block and redeploy; OpenSearch Dashboards will then get deployed. I think the OpenSearch Operator is not designed to be used with this configuration. This appears to be a bug:

monitoring:
  enable: true
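In other words, under spec.general in the manifest above, drop (or comment out) the monitoring block; a sketch of the trimmed section, keeping the other general fields from that manifest:

general:
  httpPort: 9200
  serviceName: opensearch
  setVMMaxMapCount: true
  vendor: opensearch
  version: 2.11.1
  # monitoring:
  #   enable: true    # removed per the suggestion above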

Labels
bug Something isn't working
Projects
Status: 📦 Backlog
Development

No branches or pull requests

5 participants