Describe the bug
A pod in a multi-node Weaviate cluster goes into CrashLoopBackOff.
kbcli version
Kubernetes: v1.27.9
KubeBlocks: 0.9.0-beta.36
kbcli: 0.9.0-beta.27
WARNING: version difference between kbcli (0.9.0-beta.27) and kubeblocks (0.9.0-beta.36)
To Reproduce
1. Create the cluster
yaml:
apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: weaviate-cluster
  namespace: default
spec:
  # Specifies the behavior when a Cluster is deleted.
  # - `DoNotTerminate`: Prevents deletion of the Cluster. This policy ensures that all resources remain intact.
  # - `Halt`: Deletes Cluster resources like Pods and Services but retains Persistent Volume Claims (PVCs), allowing for data preservation while stopping other operations.
  # - `Delete`: Extends the `Halt` policy by also removing PVCs, leading to a thorough cleanup while removing all persistent data.
  # - `WipeOut`: An aggressive policy that deletes all Cluster resources, including volume snapshots and backups in external storage. This results in complete data removal and should be used cautiously, primarily in non-production environments to avoid irreversible data loss.
  terminationPolicy: Delete
  # Specifies a list of ClusterComponentSpec objects used to define the individual components that make up a Cluster.
  componentSpecs:
    # Specifies the name of the Component. This name is also part of the Service DNS name and must comply with the IANA service naming rule.
    - name: weaviate
      # References the name of a ComponentDefinition. The ComponentDefinition specifies the behavior and characteristics of the Component. If both `componentDefRef` and `componentDef` are provided, the `componentDef` will take precedence over `componentDefRef`.
      componentDef: weaviate
      # Specifies a group of affinity scheduling rules for the Component. It allows users to control how the Component's Pods are scheduled onto nodes in the cluster.
      affinity:
        podAntiAffinity: Preferred
        topologyKeys:
          - kubernetes.io/hostname
        tenancy: SharedNode
      # Allows the Component to be scheduled onto nodes with matching taints.
      tolerations:
        - key: kb-data
          operator: Equal
          value: 'true'
          effect: NoSchedule
      # Determines whether the metrics exporter needs to be published to the service endpoint.
      disableExporter: true
      # Specifies the name of the ServiceAccount required by the running Component.
      serviceAccountName: kb-weaviate-cluster
      # Each component supports running multiple replicas to provide high availability and persistence. This field can be used to specify the desired number of replicas.
      replicas: 2
      # Specifies the resources required by the Component. It allows defining the CPU, memory requirements and limits for the Component's containers.
      resources:
        limits:
          cpu: '0.5'
          memory: 0.5Gi
        requests:
          cpu: '0.5'
          memory: 0.5Gi
      # Specifies a list of PersistentVolumeClaim templates that define the storage requirements for the Component.
      volumeClaimTemplates:
        - name: data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
2. See error
k get pod -l app.kubernetes.io/instance=weaviate-cluster
NAME READY STATUS RESTARTS AGE
weaviate-cluster-weaviate-0 0/1 CrashLoopBackOff 19 (119s ago) 57m
k describe pod weaviate-cluster-weaviate-0
Name: weaviate-cluster-weaviate-0
Namespace: default
Priority: 0
Service Account: kb-weaviate-cluster
Node: aks-cicdamdpool-42454392-vmss000003/10.224.0.8
Start Time: Tue, 25 Jun 2024 14:03:25 +0800
Labels: app.kubernetes.io/component=weaviate
app.kubernetes.io/instance=weaviate-cluster
app.kubernetes.io/managed-by=kubeblocks
app.kubernetes.io/name=weaviate
app.kubernetes.io/version=weaviate
apps.kubeblocks.io/cluster-uid=25f24b02-88fd-4654-b3c3-eac8b3b328c3
apps.kubeblocks.io/component-name=weaviate
apps.kubeblocks.io/pod-name=weaviate-cluster-weaviate-0
componentdefinition.kubeblocks.io/name=weaviate
controller-revision-hash=69d75c66dd
workloads.kubeblocks.io/instance=weaviate-cluster-weaviate
workloads.kubeblocks.io/managed-by=InstanceSet
Annotations: apps.kubeblocks.io/component-replicas: 2
Status: Running
IP: 10.244.4.85
IPs:
IP: 10.244.4.85
Controlled By: InstanceSet/weaviate-cluster-weaviate
Containers:
weaviate:
Container ID: containerd://5ae669a72e386d14e0f3efec57f5dba748ef58f7b05a14c7279cf1fcdfb243e1
Image: docker.io/semitechnologies/weaviate:1.19.6
Image ID: docker.io/semitechnologies/weaviate@sha256:6bd9b062b8fe9a3dd33f3c0706f83f7ff28a2b4de7e3bc43971385ca838d4034
Ports: 8080/TCP, 2112/TCP, 7000/TCP, 7001/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Command:
/bin/sh
-c
replicas=$(eval echo ${KB_POD_LIST} | tr ',' '\n')
# Initialize count
replca_count=0
# Use a for loop to iterate over each space-separated word
for item in $replicas; do
replca_count=$((replca_count + 1))
done
while true; do
count=$(nslookup ${CLUSTER_JOIN} | awk '/^Address: / { print $2 }' | wc -l)
if [ "$count" -eq ${replca_count} ]; then
break
fi
echo "Waiting for all nodes to be running..."
sleep 1
done
export $(cat /weaviate-env/envs | xargs)
/bin/weaviate --host 0.0.0.0 --port "8080" --scheme http --config-file /weaviate-config/conf.yaml --read-timeout=60s --write-timeout=60s
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Tue, 25 Jun 2024 14:57:32 +0800
Finished: Tue, 25 Jun 2024 14:58:31 +0800
Ready: False
Restart Count: 19
Limits:
cpu: 500m
memory: 512Mi
Requests:
cpu: 500m
memory: 512Mi
Liveness: http-get http://:8080/v1/.well-known/live delay=900s timeout=3s period=10s #success=1 #failure=30
Readiness: http-get http://:8080/v1/.well-known/ready delay=3s timeout=3s period=10s #success=1 #failure=3
Startup: http-get http://:8080/v1/.well-known/ready delay=0s timeout=3s period=10s #success=1 #failure=3
Environment Variables from:
weaviate-cluster-weaviate-env ConfigMap Optional: false
weaviate-cluster-weaviate-rsm-env ConfigMap Optional: false
Environment:
KB_POD_NAME: weaviate-cluster-weaviate-0 (v1:metadata.name)
KB_POD_UID: (v1:metadata.uid)
KB_NAMESPACE: default (v1:metadata.namespace)
KB_SA_NAME: (v1:spec.serviceAccountName)
KB_NODENAME: (v1:spec.nodeName)
KB_HOST_IP: (v1:status.hostIP)
KB_POD_IP: (v1:status.podIP)
KB_POD_IPS: (v1:status.podIPs)
KB_HOSTIP: (v1:status.hostIP)
KB_PODIP: (v1:status.podIP)
KB_PODIPS: (v1:status.podIPs)
KB_POD_FQDN: $(KB_POD_NAME).weaviate-cluster-weaviate-headless.$(KB_NAMESPACE).svc
CLUSTER_DATA_BIND_PORT: 7001
CLUSTER_GOSSIP_BIND_PORT: 7000
GOGC: 100
PROMETHEUS_MONITORING_ENABLED: true
PROMETHEUS_MONITORING_PORT: 2112
QUERY_MAXIMUM_RESULTS: 100000
REINDEX_VECTOR_DIMENSIONS_AT_STARTUP: false
TRACK_VECTOR_DIMENSIONS: false
PERSISTENCE_DATA_PATH: /var/lib/weaviate
DEFAULT_VECTORIZER_MODULE: none
CLUSTER_HOSTNAME: $(KB_POD_NAME)
CLUSTER_JOIN: $(KB_CLUSTER_COMP_NAME)-node-discovery.$(KB_NAMESPACE).svc.cluster.local
Mounts:
/var/lib/weaviate from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-h6qcn (ro)
/weaviate-config from weaviate-config (rw)
/weaviate-env from weaviate-env (rw)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
weaviate-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: weaviate-cluster-weaviate-weaviate-config-template
Optional: false
weaviate-env:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: weaviate-cluster-weaviate-weaviate-env-template
Optional: false
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-weaviate-cluster-weaviate-0
ReadOnly: false
kube-api-access-h6qcn:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: kb-data=true:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Topology Spread Constraints: kubernetes.io/hostname:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/instance=weaviate-cluster,apps.kubeblocks.io/component-name=weaviate
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 57m default-scheduler Successfully assigned default/weaviate-cluster-weaviate-0 to aks-cicdamdpool-42454392-vmss000003
Normal SuccessfulAttachVolume 57m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-34e28f6f-6739-49fd-81ad-8ad0e2f95db7"
Normal Killing 54m (x3 over 56m) kubelet Container weaviate failed startup probe, will be restarted
Normal Started 54m (x4 over 57m) kubelet Started container weaviate
Normal Created 52m (x6 over 57m) kubelet Created container weaviate
Normal Pulled 31m (x12 over 57m) kubelet Container image "docker.io/semitechnologies/weaviate:1.19.6" already present on machine
Warning Unhealthy 17m (x48 over 57m) kubelet Startup probe failed: Get "http://10.244.4.85:8080/v1/.well-known/ready": dial tcp 10.244.4.85:8080: connect: connection refused
Warning BackOff 2m4s (x175 over 51m) kubelet Back-off restarting failed container weaviate in pod weaviate-cluster-weaviate-0_default(2518cd1c-42b6-469d-8f41-54be1e3047a2)
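The probe settings and exit code in the describe output above can be read with a little arithmetic. The following is a minimal sketch, not part of the original report: the function names `probe_window` and `signal_from_exit_code` are hypothetical helpers. It assumes the usual Kubernetes behavior that a container is restarted after `failureThreshold` consecutive startup probe failures, and the common shell convention that an exit code above 128 means termination by signal `code - 128`.

```shell
#!/bin/sh
# Hypothetical helper: approximate time (seconds) a container has to pass
# its startup probe, computed as delay + period * failure threshold.
probe_window() {
  delay=$1; period=$2; failures=$3
  echo $((delay + period * failures))
}

# Hypothetical helper: map an exit code to a signal number using the
# "128 + signal" convention; prints 0 when the code is not above 128.
signal_from_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    echo 0
  fi
}

# Values taken from the Startup line: delay=0s period=10s #failure=3
probe_window 0 10 3        # prints 30
# Exit Code from Last State
signal_from_exit_code 137  # prints 9 (SIGKILL)
```

Under these assumptions, the container has roughly 30 seconds to answer on port 8080 before the kubelet restarts it, which matches the "failed startup probe, will be restarted" events while the command is still in its wait loop.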
logs:
k logs -f weaviate-cluster-weaviate-0
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
Waiting for all nodes to be running...
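The repeated log line comes from the wait loop in the container command, which only exits once the number of addresses resolved for `CLUSTER_JOIN` equals the number of entries in `KB_POD_LIST`. Below is a small sketch of that comparison. It reuses the same `tr` and `awk` filters as the command, but the pod list and the lookup output are hypothetical sample values, since the report does not show what `nslookup` actually returned; `count_pods` and `count_addresses` are illustrative names.

```shell
#!/bin/sh
# Count comma-separated entries, mirroring how the command derives the
# replica count from KB_POD_LIST.
count_pods() {
  echo "$1" | tr ',' '\n' | grep -c .
}

# Count resolved addresses in nslookup-style output read from stdin,
# using the same awk filter as the container command.
count_addresses() {
  awk '/^Address: / { print $2 }' | wc -l | tr -d ' '
}

# Hypothetical sample values (assumptions, not captured from the cluster):
# two pods are expected, but only one address resolves.
pod_list="weaviate-cluster-weaviate-0,weaviate-cluster-weaviate-1"
lookup_output="Name: example.svc.cluster.local
Address: 10.244.4.85"

expected=$(count_pods "$pod_list")
resolved=$(printf '%s\n' "$lookup_output" | count_addresses)

if [ "$resolved" -eq "$expected" ]; then
  echo "all nodes resolvable"
else
  echo "waiting: $resolved of $expected addresses resolved"
fi
```

With these sample values the two counts differ, so the loop would keep printing the waiting message. Running the real `nslookup` against the `CLUSTER_JOIN` name from inside the pod would show which count the actual cluster is stuck on.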
haowen159 changed the title from "[BUG] multi-node weaviate cluster pod CrashLoopBackOff" to "[BUG] multi-node weaviate cmpd cluster pod CrashLoopBackOff" on Jun 25, 2024
haowen159 changed the title from "[BUG] multi-node weaviate cmpd cluster pod CrashLoopBackOff" to "[BUG] multi-node weaviate cluster pod CrashLoopBackOff using cmpd only" on Jun 25, 2024