Unable to install solr-operator 0.8.1 - zookeeper operator post install job failure #728

Open
jamesla opened this issue Nov 11, 2024 · 6 comments

Comments

@jamesla

jamesla commented Nov 11, 2024

I'm just getting started with this operator and I'm getting a failure on the ZooKeeper operator's post-install job.

To reproduce:

  1. minikube start
  2. helm install solr-operator apache-solr/solr-operator --version 0.8.1

The install fails with:

Error: INSTALLATION FAILED: failed post-install: 1 error occurred:
        * timed out waiting for the condition

Or, with the --debug flag:

helm install solr-operator apache-solr/solr-operator --version 0.8.1 --debug
install.go:222: [debug] Original chart version: "0.8.1"
install.go:239: [debug] CHART PATH: /home/james/.cache/helm/repository/solr-operator-0.8.1.tgz

client.go:142: [debug] creating 3 resource(s)
wait.go:50: [debug] beginning wait for 3 resources with timeout of 1m0s
install.go:208: [debug] Clearing REST mapper cache
client.go:142: [debug] creating 12 resource(s)
client.go:486: [debug] Starting delete for "solr-operator-zookeeper-operator-post-install-upgrade" ServiceAccount
client.go:490: [debug] Ignoring delete failure for "solr-operator-zookeeper-operator-post-install-upgrade" /v1, Kind=ServiceAccount: serviceaccounts "solr-operator-zookeeper-operator-post-install-upgrade" not found
wait.go:104: [debug] beginning wait for 1 resources to be deleted with timeout of 5m0s
client.go:142: [debug] creating 1 resource(s)
client.go:486: [debug] Starting delete for "solr-operator-zookeeper-operator-post-install-upgrade" ConfigMap
client.go:490: [debug] Ignoring delete failure for "solr-operator-zookeeper-operator-post-install-upgrade" /v1, Kind=ConfigMap: configmaps "solr-operator-zookeeper-operator-post-install-upgrade" not found
wait.go:104: [debug] beginning wait for 1 resources to be deleted with timeout of 5m0s
client.go:142: [debug] creating 1 resource(s)
client.go:486: [debug] Starting delete for "solr-operator-zookeeper-operator-post-install-upgrade" Role
client.go:490: [debug] Ignoring delete failure for "solr-operator-zookeeper-operator-post-install-upgrade" rbac.authorization.k8s.io/v1, Kind=Role: roles.rbac.authorization.k8s.io "solr-operator-zookeeper-operator-post-install-upgrade" not found
wait.go:104: [debug] beginning wait for 1 resources to be deleted with timeout of 5m0s
client.go:142: [debug] creating 1 resource(s)
client.go:486: [debug] Starting delete for "solr-operator-zookeeper-operator-post-install-upgrade" RoleBinding
client.go:490: [debug] Ignoring delete failure for "solr-operator-zookeeper-operator-post-install-upgrade" rbac.authorization.k8s.io/v1, Kind=RoleBinding: rolebindings.rbac.authorization.k8s.io "solr-operator-zookeeper-operator-post-install-upgrade" not found
wait.go:104: [debug] beginning wait for 1 resources to be deleted with timeout of 5m0s
client.go:142: [debug] creating 1 resource(s)
client.go:486: [debug] Starting delete for "solr-operator-zookeeper-operator-post-install-upgrade" Job
client.go:490: [debug] Ignoring delete failure for "solr-operator-zookeeper-operator-post-install-upgrade" batch/v1, Kind=Job: jobs.batch "solr-operator-zookeeper-operator-post-install-upgrade" not found
wait.go:104: [debug] beginning wait for 1 resources to be deleted with timeout of 5m0s
client.go:142: [debug] creating 1 resource(s)
client.go:712: [debug] Watching for changes to Job solr-operator-zookeeper-operator-post-install-upgrade with timeout of 5m0s
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: ADDED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 0, jobs failed: 1, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 1, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 1, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 1, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 0, jobs failed: 1, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 0, jobs failed: 2, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 2, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 2, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 2, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 0, jobs failed: 2, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 0, jobs failed: 3, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 3, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 3, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 3, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 0, jobs failed: 3, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 0, jobs failed: 4, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 4, jobs succeeded: 0
client.go:740: [debug] Add/Modify event for solr-operator-zookeeper-operator-post-install-upgrade: MODIFIED
client.go:779: [debug] solr-operator-zookeeper-operator-post-install-upgrade: Jobs active: 1, jobs failed: 4, jobs succeeded: 0
client.go:486: [debug] Starting delete for "solr-operator-zookeeper-operator-post-install-upgrade" Job
wait.go:104: [debug] beginning wait for 1 resources to be deleted with timeout of 5m0s
Error: INSTALLATION FAILED: failed post-install: 1 error occurred:
        * timed out waiting for the condition


helm.go:84: [debug] failed post-install: 1 error occurred:
        * timed out waiting for the condition


INSTALLATION FAILED
main.newInstallCmd.func2
        helm.sh/helm/v3/cmd/helm/install.go:158
github.com/spf13/cobra.(*Command).execute
        github.com/spf13/cobra@v1.8.0/command.go:983
github.com/spf13/cobra.(*Command).ExecuteC
        github.com/spf13/cobra@v1.8.0/command.go:1115
github.com/spf13/cobra.(*Command).Execute
        github.com/spf13/cobra@v1.8.0/command.go:1039
main.main
        helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
        runtime/proc.go:271
runtime.goexit
        runtime/asm_amd64.s:1695

As you can see, the ZooKeeper post-install jobs fail. They output no logs, and nothing about the failure is recorded in the Kubernetes event log.
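
For anyone reproducing this, here is a rough sketch of how the hook Job could be inspected from a second terminal while helm install is still waiting. The debug output shows Helm deleting the Job once the hook fails, so it has to be caught while it still exists. The function name inspect_hook_job and the "default" namespace are assumptions for illustration; the Job name is the one from the debug output above.

```shell
# Hypothetical helper: inspect the post-install hook Job while `helm install`
# is still waiting on it. Job name and namespace can be overridden.
inspect_hook_job() {
  job="${1:-solr-operator-zookeeper-operator-post-install-upgrade}"
  ns="${2:-default}"  # assumption: the release was installed into "default"

  # Job status, conditions, and any events attached to the Job itself
  kubectl -n "$ns" describe job "$job"
  # Pods created by the Job (the Job controller labels them with job-name)
  kubectl -n "$ns" get pods -l "job-name=$job" -o wide
  # Logs from one of the Job's pods, if it wrote any
  kubectl -n "$ns" logs "job/$job" --all-containers --tail=50
  # Events that reference the Job by name
  kubectl -n "$ns" get events --field-selector "involvedObject.name=$job"
}
```

Calling inspect_hook_job with no arguments uses the defaults; a different Job name and namespace can be passed as the first and second arguments.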

The zookeeper-operator chart (at the specific version that this chart depends on) works fine when installed explicitly:

helm install zookeeper-operator pravega/zookeeper-operator --version 0.2.15

Should this be working?

@och42

och42 commented Nov 18, 2024

Maybe you haven't installed the Custom Resource Definitions, as described on Artifact Hub?

kubectl create -f https://solr.apache.org/operator/downloads/crds/v0.8.1/all-with-dependencies.yaml

That was the issue in my case.

@jamesla
Author

jamesla commented Nov 18, 2024

I thought that might be the case, but the Helm chart actually installs the CRDs.

I managed to get it working by installing the ZooKeeper operator separately, then installing the Solr operator and telling it that I was handling the ZooKeeper operator install myself.

I would suggest that the Helm chart for this operator may be in a broken state right now in terms of its default options.
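
The workaround described above might look roughly like the following sketch. The zookeeper-operator.install and zookeeper-operator.use values are taken to be the solr-operator chart's options for this and should be checked with helm show values apache-solr/solr-operator --version 0.8.1. The function name is hypothetical, and both Helm repos are assumed to have been added already.

```shell
# Sketch of the workaround: install the ZooKeeper operator on its own, then
# install the Solr operator without its bundled ZooKeeper operator dependency.
# Assumption: the `pravega` and `apache-solr` Helm repos are already added.
install_with_separate_zk_operator() {
  # Standalone ZooKeeper operator, at the version reported to work above
  helm install zookeeper-operator pravega/zookeeper-operator --version 0.2.15 || return 1

  # Solr operator: skip the chart dependency, but still use the ZooKeeper
  # operator that is now running (value names are an assumption, see above)
  helm install solr-operator apache-solr/solr-operator --version 0.8.1 \
    --set zookeeper-operator.install=false \
    --set zookeeper-operator.use=true
}
```

If an earlier failed solr-operator release is still present, it would presumably need a helm uninstall solr-operator before calling install_with_separate_zk_operator.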

@gerlowskija
Contributor

gerlowskija commented Dec 12, 2024

AFAICT running helm install solr-operator apache-solr/solr-operator --version 0.8.1 in a clean environment only installs the solr-operator CRDs:

$ kubectl get crds
NAME                                      CREATED AT
solrbackups.solr.apache.org               2024-12-12T16:14:57Z
solrclouds.solr.apache.org                2024-12-12T16:14:57Z
solrprometheusexporters.solr.apache.org   2024-12-12T16:14:57Z

This is because "zookeeper-operator" is technically a "chart dependency", and helm doesn't install dependency-CRDs by default. That's why a lot of the project's tutorials use the all-with-dependencies.yaml file instead - because that does include the ZK CRDs.
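
Put together, the order the tutorials rely on can be sketched like this: create the CRDs (including the ZooKeeper one) first, then install the chart. The function name is hypothetical, and the grep pattern assumes the ZooKeeper CRD name contains "zookeeper".

```shell
# Sketch: create all CRDs (Solr and ZooKeeper) before installing the chart,
# then list the CRDs to confirm that both sets are present.
install_with_crds_first() {
  # CRD bundle referenced earlier in this thread; includes the ZK CRDs
  kubectl create -f https://solr.apache.org/operator/downloads/crds/v0.8.1/all-with-dependencies.yaml || return 1

  # Same install command as in the original report
  helm install solr-operator apache-solr/solr-operator --version 0.8.1 || return 1

  # Assumption: the ZooKeeper CRD name contains "zookeeper"
  kubectl get crds | grep -E 'solr\.apache\.org|zookeeper'
}
```

On a clean cluster, install_with_crds_first should end by listing a ZooKeeper CRD in addition to the three Solr CRDs shown above.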

So while unintuitive, I think this is "expected".

Maybe we can improve the documentation to make this a little clearer, though. @jamesla, were you following a tutorial or documentation when you ran those commands? If so, maybe that could be updated to mention using all-with-dependencies.yaml?

@HoustonPutman
Contributor

Honestly, those ZK jobs often fail for me depending on where I deploy to. I usually fix it by manually deleting the jobs myself. We should see if we can disable them by default.

@gerlowskija
Contributor

Do you know what purpose the jobs are supposed to serve, @HoustonPutman?

Some good docs here around why we don't install the ZK-crd by default, and how it can be enabled.

@HoustonPutman
Contributor

Found the PR that created it: pravega/zookeeper-operator#221

Looks like it's just making sure the CRDs are installed before installing the operator? Sounds pretty useless to me. But I also don't see a way to disable it. One of the hooks can be disabled, but the post-install hook looks required...
