Inspect for cluster setup and fail with helpful error message during solo network deploy #782

Closed
Tracked by #47
jeromy-cannon opened this issue Oct 31, 2024 · 3 comments · Fixed by #799
Labels
P1: High priority issue. Required to be completed in the assigned milestone.
Requested by Stakeholder: Requested by an individual or team that uses Solo

Comments

@jeromy-cannon
Contributor

jeromy-cannon commented Oct 31, 2024

Currently, if `solo cluster setup` has not been run, running `solo network deploy` produces output like this:

> solo network deploy -i node1,node2,node3 -n solo-e2e
******************************* Solo *********************************************
Version			: 0.31.1
Kubernetes Context	: kind-solo-e2e
Kubernetes Cluster	: kind-solo-e2e
Kubernetes Namespace	: solo-e2e
**********************************************************************************
(node:51912) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
✔ Initialize
  ✔ Acquire lease - lease acquired successfully, attempt: 1/10
✔ Prepare staging directory
  ✔ Copy Gossip keys to staging
  ✔ Copy gRPC TLS keys to staging
✔ Copy node keys to secrets
  ✔ Copy TLS keys
  ✔ Node: node1
    ✔ Copy Gossip keys
  ✔ Node: node2
    ✔ Copy Gossip keys
  ✔ Node: node3
    ✔ Copy Gossip keys
✖ failed to install chart solo-deployment: Command exit with error code 1: /Users/user/.solo/bin/helm install solo-deployment solo-charts/solo-deployment
  --version v0.33.0 -n solo-e2e --create-namespace  --values /Users/user/.solo/cache/values-files/solo-local.yaml --set
  "telemetry.prometheus.svcMonitor.enabled=false" --set "defaults.root.image.repository=hashgraph/solo-containers/ubi8-init-java21" --set
  "defaults.volumeClaims.enabled=false"
◼ Check node pods are running
◼ Check proxy pods are running
◼ Check auxiliary pods are ready
*********************************** ERROR *****************************************
Error installing chart solo-deployment
***********************************************************************************

In order to find out what happened we have to check the solo log files to get a better message.

Instead, as one of the first tasks in network deploy, we should run `helm list --all-namespaces` and check for a release where NAME=solo-cluster-setup.

If this fails, we should log to the user that we require solo-cluster-setup to have been installed in the cluster prior to running the solo network deploy command.
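A minimal sketch of what that check could look like, assuming the task has the output of `helm list --all-namespaces --output json` available as a string. The release name comes from this issue. The function names, the `HelmRelease` interface, and the error wording are hypothetical and not taken from the Solo codebase.

```typescript
// Subset of the fields in each entry of `helm list --output json`.
// Only the fields used here are declared.
interface HelmRelease {
  name: string;
  namespace: string;
  status: string;
}

// Release name from this issue; all other names below are hypothetical.
const CLUSTER_SETUP_RELEASE = "solo-cluster-setup";

// Returns the cluster setup release if it appears in the JSON output,
// otherwise undefined.
function findClusterSetupRelease(helmListJson: string): HelmRelease | undefined {
  const releases = JSON.parse(helmListJson) as HelmRelease[];
  return releases.find((release) => release.name === CLUSTER_SETUP_RELEASE);
}

// Throws an error with a user-facing message when the prerequisite
// release is missing, so the deploy can stop before `helm install`.
function assertClusterSetupInstalled(helmListJson: string): void {
  if (findClusterSetupRelease(helmListJson) === undefined) {
    throw new Error(
      `Chart ${CLUSTER_SETUP_RELEASE} is not installed in the cluster. ` +
        "Run `solo cluster setup` before running `solo network deploy`."
    );
  }
}
```

Running this as an early task would surface the missing prerequisite directly, instead of the generic "Error installing chart solo-deployment" message.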

Similarly, we should update the description for solo network deploy from:

Deploy solo network

to:

Deploy solo network. Requires the `solo-cluster-setup` chart to have been installed in the cluster. If it hasn't, the following command can be run: `solo cluster setup`
jeromy-cannon added the P1 and Requested by Stakeholder labels on Oct 31, 2024
@JeffreyDallas
Contributor

In solo-deployment/values.yaml, I saw that deployPodMonitor is set to true only when the tester container is available.
We can find role bindings when running ci_test in solo-charts:

jeffrey@Jeffreys-MacBook-Pro:~/full-stack-testing$ kubectl get rolebindings --all-namespaces
NAMESPACE      NAME                                                ROLE                                                  AGE
kube-public    kubeadm:bootstrap-signer-clusterinfo                Role/kubeadm:bootstrap-signer-clusterinfo             10m
kube-public    system:controller:bootstrap-signer                  Role/system:controller:bootstrap-signer               10m
kube-system    kube-proxy                                          Role/kube-proxy                                       10m
kube-system    kubeadm:kubelet-config                              Role/kubeadm:kubelet-config                           10m
kube-system    kubeadm:nodes-kubeadm-config                        Role/kubeadm:nodes-kubeadm-config                     10m
kube-system    system::extension-apiserver-authentication-reader   Role/extension-apiserver-authentication-reader        10m
kube-system    system::leader-locking-kube-controller-manager      Role/system::leader-locking-kube-controller-manager   10m
kube-system    system::leader-locking-kube-scheduler               Role/system::leader-locking-kube-scheduler            10m
kube-system    system:controller:bootstrap-signer                  Role/system:controller:bootstrap-signer               10m
kube-system    system:controller:cloud-provider                    Role/system:controller:cloud-provider                 10m
kube-system    system:controller:token-cleaner                     Role/system:controller:token-cleaner                  10m
solo-jeffrey   minio-binding                                       Role/minio-role                                       9m34s
solo-jeffrey   pod-monitor-role-binding                            ClusterRole/pod-monitor-role                          9m43s
solo-jeffrey   solo-cluster-setup-grafana                          Role/solo-cluster-setup-grafana                       9m49s

But I could not find any role binding when running solo,
so by default the ClusterRole is not available in the solo repo.

@JeffreyDallas
Contributor

We may need another approach to check whether the cluster has been set up properly.

@jeromy-cannon
Contributor Author

you are right.

just use:

❯ helm list --all-namespaces
NAME              	NAMESPACE 	REVISION	UPDATED                             	STATUS  	CHART                    	APP VERSION
solo-cluster-setup	solo-setup	1       	2024-11-04 14:35:42.644673 +0000 UTC	deployed	solo-cluster-setup-0.34.0	0.34.0

where NAME = solo-cluster-setup
