TestLocal_nsmgr_restart.TestRunHealSuite/TestLocal_nsmgr_restart is unstable on GKE #228

Closed
denis-tingaikin opened this issue Nov 25, 2021 · 0 comments · Fixed by networkservicemesh/sdk-vpp#460
Labels: ASAP (Highest priority problem), bug (Something isn't working)

denis-tingaikin commented Nov 25, 2021

Logs

Execution attempt: 0 Output file: .tests/cloud_test/gke-1/030-TestRunHealSuite1-TestLocal_nsmgr_restart-run.log
=== RUN   TestRunHealSuite/TestLocal_nsmgr_restart
time=2021-11-25T16:14:04Z level=info msg=NAMESPACE=($(kubectl create -f https://raw.githubusercontent.com/networkservicemesh/deployments-k8s/558934f23907371225c772c723e95b1b1431faae/examples/heal/namespace.yaml)[0])
NAMESPACE=${NAMESPACE:10} TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:04Z level=info msg=NODES=($(kubectl get nodes -o go-template='{{range .items}}{{ if not .spec.taints  }}{{index .metadata.labels "kubernetes.io/hostname"}} {{end}}{{end}}')) TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:05Z level=info msg=cat > kustomization.yaml <<EOF
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: ${NAMESPACE}

bases:
- https://github.com/networkservicemesh/deployments-k8s/apps/nsc-kernel?ref=558934f23907371225c772c723e95b1b1431faae
- https://github.com/networkservicemesh/deployments-k8s/apps/nse-kernel?ref=558934f23907371225c772c723e95b1b1431faae

patchesStrategicMerge:
- patch-nsc.yaml
- patch-nse.yaml
EOF TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:05Z level=info msg=cat > patch-nsc.yaml <<EOF
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nsc-kernel
spec:
  template:
    spec:
      containers:
        - name: nsc
          env:
            - name: NSM_NETWORK_SERVICES
              value: kernel://icmp-responder/nsm-1

      nodeSelector:
        kubernetes.io/hostname: ${NODES[0]}
EOF TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:05Z level=info msg=cat > patch-nse.yaml <<EOF
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nse-kernel
spec:
  template:
    spec:
      containers:
        - name: nse
          env:
            - name: NSM_CIDR_PREFIX
              value: 172.16.1.100/31
      nodeSelector:
        kubernetes.io/hostname: ${NODES[1]}
EOF TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:05Z level=info msg=kubectl apply -k . TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:07Z level=info msg=deployment.apps/nsc-kernel created
deployment.apps/nse-kernel created TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:07Z level=info msg=kubectl wait --for=condition=ready --timeout=1m pod -l app=nsc-kernel -n ${NAMESPACE} TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:10Z level=info msg=pod/nsc-kernel-b48546f4-66vxt condition met TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:10Z level=info msg=kubectl wait --for=condition=ready --timeout=1m pod -l app=nse-kernel -n ${NAMESPACE} TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:10Z level=info msg=pod/nse-kernel-56d7bdb56f-lpgl8 condition met TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:10Z level=info msg=NSC=$(kubectl get pods -l app=nsc-kernel -n ${NAMESPACE} --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}') TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:10Z level=info msg=NSE=$(kubectl get pods -l app=nse-kernel -n ${NAMESPACE} --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}') TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:10Z level=info msg=kubectl exec ${NSC} -n ${NAMESPACE} -- ping -c 4 172.16.1.100 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:24Z level=info msg=PING 172.16.1.100 (172.16.1.100): 56 data bytes

--- 172.16.1.100 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:24Z level=info msg=command terminated with exit code 1 TestRunHealSuite/TestLocal_nsmgr_restart=stderr
time=2021-11-25T16:14:24Z level=info msg=1 TestRunHealSuite/TestLocal_nsmgr_restart=exitCode
time=2021-11-25T16:14:24Z level=info msg=kubectl exec ${NSC} -n ${NAMESPACE} -- ping -c 4 172.16.1.100 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:27Z level=info msg=PING 172.16.1.100 (172.16.1.100): 56 data bytes
64 bytes from 172.16.1.100: seq=0 ttl=64 time=1.321 ms
64 bytes from 172.16.1.100: seq=1 ttl=64 time=2.037 ms
64 bytes from 172.16.1.100: seq=2 ttl=64 time=0.566 ms
64 bytes from 172.16.1.100: seq=3 ttl=64 time=0.673 ms

--- 172.16.1.100 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.566/1.149/2.037 ms TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:27Z level=info msg=kubectl exec ${NSE} -n ${NAMESPACE} -- ping -c 4 172.16.1.101 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:31Z level=info msg=PING 172.16.1.101 (172.16.1.101): 56 data bytes
64 bytes from 172.16.1.101: seq=0 ttl=64 time=0.710 ms
64 bytes from 172.16.1.101: seq=1 ttl=64 time=0.587 ms
64 bytes from 172.16.1.101: seq=2 ttl=64 time=0.531 ms
64 bytes from 172.16.1.101: seq=3 ttl=64 time=0.579 ms

--- 172.16.1.101 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.531/0.601/0.710 ms TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:31Z level=info msg=NSMGR=$(kubectl get pods -l app=nsmgr --field-selector spec.nodeName==${NODES[0]} -n nsm-system --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}') TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:31Z level=info msg=kubectl delete pod ${NSMGR} -n nsm-system TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:33Z level=info msg=pod "nsmgr-d667c" deleted TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:33Z level=info msg=kubectl wait --for=condition=ready --timeout=1m pod -l app=nsmgr --field-selector spec.nodeName==${NODES[0]} -n nsm-system TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:40Z level=info msg=pod/nsmgr-jq56p condition met TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:40Z level=info msg=kubectl exec ${NSC} -n ${NAMESPACE} -- ping -c 4 172.16.1.100 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:44Z level=info msg=PING 172.16.1.100 (172.16.1.100): 56 data bytes
64 bytes from 172.16.1.100: seq=0 ttl=64 time=1.453 ms
64 bytes from 172.16.1.100: seq=1 ttl=64 time=0.758 ms

--- 172.16.1.100 ping statistics ---
4 packets transmitted, 2 packets received, 50% packet loss
round-trip min/avg/max = 0.758/1.105/1.453 ms TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:44Z level=info msg=kubectl exec ${NSE} -n ${NAMESPACE} -- ping -c 4 172.16.1.101 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:58Z level=info msg=PING 172.16.1.101 (172.16.1.101): 56 data bytes

--- 172.16.1.101 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:14:58Z level=info msg=command terminated with exit code 1 TestRunHealSuite/TestLocal_nsmgr_restart=stderr
time=2021-11-25T16:14:58Z level=info msg=1 TestRunHealSuite/TestLocal_nsmgr_restart=exitCode
time=2021-11-25T16:14:58Z level=info msg=kubectl exec ${NSE} -n ${NAMESPACE} -- ping -c 4 172.16.1.101 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:15:11Z level=info msg=PING 172.16.1.101 (172.16.1.101): 56 data bytes

--- 172.16.1.101 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:15:11Z level=info msg=command terminated with exit code 1 TestRunHealSuite/TestLocal_nsmgr_restart=stderr
time=2021-11-25T16:15:11Z level=info msg=1 TestRunHealSuite/TestLocal_nsmgr_restart=exitCode
time=2021-11-25T16:15:12Z level=info msg=kubectl exec ${NSE} -n ${NAMESPACE} -- ping -c 4 172.16.1.101 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:15:25Z level=info msg=PING 172.16.1.101 (172.16.1.101): 56 data bytes

--- 172.16.1.101 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:15:25Z level=info msg=command terminated with exit code 1 TestRunHealSuite/TestLocal_nsmgr_restart=stderr
time=2021-11-25T16:15:25Z level=info msg=1 TestRunHealSuite/TestLocal_nsmgr_restart=exitCode
time=2021-11-25T16:15:25Z level=info msg=kubectl exec ${NSE} -n ${NAMESPACE} -- ping -c 4 172.16.1.101 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:15:38Z level=info msg=PING 172.16.1.101 (172.16.1.101): 56 data bytes

--- 172.16.1.101 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:15:38Z level=info msg=command terminated with exit code 1 TestRunHealSuite/TestLocal_nsmgr_restart=stderr
time=2021-11-25T16:15:38Z level=info msg=1 TestRunHealSuite/TestLocal_nsmgr_restart=exitCode
time=2021-11-25T16:15:39Z level=info msg=kubectl exec ${NSE} -n ${NAMESPACE} -- ping -c 4 172.16.1.101 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:15:52Z level=info msg=PING 172.16.1.101 (172.16.1.101): 56 data bytes

--- 172.16.1.101 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss TestRunHealSuite/TestLocal_nsmgr_restart=stdout
time=2021-11-25T16:15:52Z level=info msg=command terminated with exit code 1 TestRunHealSuite/TestLocal_nsmgr_restart=stderr
time=2021-11-25T16:15:52Z level=info msg=1 TestRunHealSuite/TestLocal_nsmgr_restart=exitCode
time=2021-11-25T16:15:52Z level=error msg=command didn't succeed until timeout cmd=kubectl exec ${NSE} -n ${NAMESPACE} -- ping -c 4 172.16.1.101
suite.go:130:
Error Trace:	suite.go:130
suite.gen.go:95
Error:      	Not equal:
expected: 0
actual  : 1
Test:       	TestRunHealSuite/TestLocal_nsmgr_restart
time="2021-11-25T16:15:53Z" level=error msg="nsc-kernel-b48546f4-66vxt: An error while retrieving logs: the server rejected our request for an unknown reason (get pods nsc-kernel-b48546f4-66vxt)"
time="2021-11-25T16:15:53Z" level=error msg="admission-webhook-k8s-5947ddf6b9-gcxjt: An error while retrieving logs: the server rejected our request for an unknown reason (get pods admission-webhook-k8s-5947ddf6b9-gcxjt)"
time="2021-11-25T16:15:53Z" level=error msg="nse-kernel-56d7bdb56f-lpgl8: An error while retrieving logs: the server rejected our request for an unknown reason (get pods nse-kernel-56d7bdb56f-lpgl8)"
time="2021-11-25T16:15:53Z" level=error msg="registry-k8s-8476674f64-qh5jv: An error while retrieving logs: the server rejected our request for an unknown reason (get pods registry-k8s-8476674f64-qh5jv)"
time="2021-11-25T16:15:53Z" level=error msg="forwarder-vpp-bwnx5: An error while retrieving logs: the server rejected our request for an unknown reason (get pods forwarder-vpp-bwnx5)"
time="2021-11-25T16:15:53Z" level=error msg="nsmgr-jq56p: An error while retrieving logs: the server rejected our request for an unknown reason (get pods nsmgr-jq56p)"
time="2021-11-25T16:15:53Z" level=error msg="forwarder-vpp-5xkr2: An error while retrieving logs: the server rejected our request for an unknown reason (get pods forwarder-vpp-5xkr2)"
time="2021-11-25T16:15:53Z" level=error msg="nsmgr-xr9qh: An error while retrieving logs: the server rejected our request for an unknown reason (get pods nsmgr-xr9qh)"
time=2021-11-25T16:15:53Z level=info msg=kubectl delete ns ${NAMESPACE} TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:16:05Z level=info msg=namespace "ns-zbg9x" deleted TestRunHealSuite/TestLocal_nsmgr_restart=stdout
--- FAIL: TestRunHealSuite/TestLocal_nsmgr_restart (121.18s)

containers logs

Build

https://github.com/networkservicemesh/integration-k8s-gke/runs/4325852760?check_suite_focus=true
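
For triaging similar runs, it may help to reduce a log like the one above to one line per ping attempt. In this run, after the nsmgr pod is deleted the NSC-to-NSE ping shows 50% loss and the NSE-to-NSC ping shows 100% loss on every retry until the timeout. The following is a minimal sketch only: it assumes the log format shown above and the busybox-style ping summary line, and the names `summarize_pings`, `CMD_RE`, `STATS_RE` and `SAMPLE` are hypothetical helpers, not part of the test suite.

```python
import re

# Logged command line, e.g.
# "time=... level=info msg=kubectl exec ${NSE} -n ${NAMESPACE} -- ping -c 4 172.16.1.101 ...=stdin"
CMD_RE = re.compile(r"time=(\S+) .*msg=kubectl exec \$\{(\w+)\} .* ping -c \d+ (\S+)")
# Ping summary line as it appears in the log above.
STATS_RE = re.compile(r"(\d+) packets transmitted, (\d+) packets received, (\d+)% packet loss")


def summarize_pings(log_text):
    """Return one (timestamp, pod_var, target, loss_percent) tuple per ping attempt."""
    results = []
    pending = None  # the most recent ping command still waiting for its summary
    for line in log_text.splitlines():
        cmd = CMD_RE.search(line)
        # Only "=stdin" lines are commands being issued; the final error line
        # repeats the command after "cmd=" and must not be counted.
        if cmd and line.rstrip().endswith("=stdin"):
            pending = cmd.groups()
            continue
        stats = STATS_RE.search(line)
        if stats and pending is not None:
            results.append((*pending, int(stats.group(3))))
            pending = None
    return results


# Short excerpt in the same format as the log above (assumed representative).
SAMPLE = """\
time=2021-11-25T16:14:40Z level=info msg=kubectl exec ${NSC} -n ${NAMESPACE} -- ping -c 4 172.16.1.100 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
time=2021-11-25T16:14:44Z level=info msg=PING 172.16.1.100 (172.16.1.100): 56 data bytes
4 packets transmitted, 2 packets received, 50% packet loss
time=2021-11-25T16:14:44Z level=info msg=kubectl exec ${NSE} -n ${NAMESPACE} -- ping -c 4 172.16.1.101 TestRunHealSuite/TestLocal_nsmgr_restart=stdin
4 packets transmitted, 0 packets received, 100% packet loss TestRunHealSuite/TestLocal_nsmgr_restart=stdout
"""

if __name__ == "__main__":
    for ts, pod, target, loss in summarize_pings(SAMPLE):
        print(f"{ts} {pod} -> {target}: {loss}% loss")
```

Feeding the full run log to `summarize_pings` instead of `SAMPLE` should make the direction of the failure (NSE to NSC) visible at a glance, provided the log lines keep the shape assumed here.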
