When reconnecting with a zero backoff, if the client fails to reconnect to
ovsdb for a period of 10 seconds, the logs are flooded with failed
attempts. After 5 failed attempts, log a single notice that further
failures are suppressed.
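The suppression described above could look roughly like the sketch below. It is only an illustration, not the code in this change: `reconnectLogger`, `newReconnectLogger`, `logFunc` and `maxLoggedFailures` are hypothetical names, and the sketch assumes the counter is reset once a connection succeeds.

```go
package main

import "fmt"

// maxLoggedFailures mirrors the "5 failed attempts" threshold from the
// description. The constant name is an assumption for this sketch.
const maxLoggedFailures = 5

// logFunc is a printf-style sink; a real client would pass its own logger.
type logFunc func(format string, args ...interface{})

// reconnectLogger is a hypothetical helper (not a libovsdb type) that logs
// the first maxLoggedFailures consecutive failures and then goes quiet.
type reconnectLogger struct {
	failures int
	logf     logFunc
}

func newReconnectLogger(logf logFunc) *reconnectLogger {
	return &reconnectLogger{logf: logf}
}

// Failure records one failed attempt. It logs the attempt while the count is
// at or below the threshold, and emits a single suppression notice when the
// threshold is reached. Later failures are counted but not logged.
func (r *reconnectLogger) Failure(err error) {
	r.failures++
	if r.failures > maxLoggedFailures {
		return
	}
	r.logf("failed to reconnect (attempt %d): %v\n", r.failures, err)
	if r.failures == maxLoggedFailures {
		r.logf("further reconnect failures are suppressed\n")
	}
}

// Success resets the counter and returns how many attempts failed since the
// last successful connection, so the caller can report the total once.
func (r *reconnectLogger) Success() int {
	n := r.failures
	r.failures = 0
	return n
}

func main() {
	r := newReconnectLogger(func(format string, args ...interface{}) {
		fmt.Printf(format, args...)
	})
	// Simulate a reconnect loop with zero backoff that fails 8 times.
	for i := 0; i < 8; i++ {
		r.Failure(fmt.Errorf("connection refused"))
	}
	fmt.Printf("reconnected after %d failed attempts\n", r.Success())
}
```

Counting failures while staying silent keeps the total available for a single summary line on reconnect, so the information is not lost when the per-attempt messages are dropped.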
An example here:
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_ovn-kubernetes/812/pull-ci-openshift-ovn-kubernetes-master-e2e-aws-ovn-upgrade/1456291255993503744/artifacts/e2e-aws-ovn-upgrade/gather-extra/artifacts/pods/openshift-ovn-kubernetes_ovnkube-master-fqnpj_ovnkube-master.log
Timestamps where reconnect starts and ends:
I1104 17:45:54.792590 1 client.go:264] "msg"="trying to connect" "database"="OVN_Northbound" "endpoint"="ssl:10.0.190.62:9641"
I1104 17:46:04.230090 1 client.go:219] "msg"="successfully connected" "database"="OVN_Northbound" "endpoint"="ssl:10.0.190.62:9641"
I1104 17:46:04.230107 1 client.go:240] "msg"="reconnected - restarting monitors" "database"="OVN_Northbound"
In between there are tons of:
E1104 17:50:57.742817 1 client.go:1015] "msg"="failed to reconnect" "error"="unable to connect to any endpoints: failed to connect to ssl:10.0.168.127:9642: failed to open connection: dial tcp 10.0.168.127:9642: connect: connection refused. failed to connect to ssl:10.0.190.62:9642: endpoint is not leader. failed to connect to ssl:10.0.245.88:9642: endpoint is not leader" "database"="OVN_Southbound"
repeated every few milliseconds.