Upgrade to 4.11.0-0.okd-2023-01-14-152430 fails with OAuthServerRouteEndpointAccessibleControllerAvailable 172.30.0.10:53: server misbehaving dial tcp #1491

Closed
jeffmccune opened this issue Feb 4, 2023 · 3 comments

@jeffmccune

Describe the bug

Reprinting Cluster State:
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
ClusterID: efe54d11-2d9b-416e-8898-485fcbfd7c34
ClusterVersion: Updating to "4.11.0-0.okd-2023-01-14-152430" from "4.11.0-0.okd-2022-12-02-145640" for 3 hours: Unable to apply 4.11.0-0.okd-2023-01-14-152430: the cluster operator authentication has not yet successfully rolled out
ClusterOperators:
        clusteroperator/authentication is not available (OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.ocp.ois.run/healthz": dial tcp: lookup oauth-openshift.apps.ocp.ois.run on 172.30.0.10:53: server misbehaving) because OAuthServerConfigObservationDegraded: failed to apply IDP keycloak config: dial tcp: lookup sso.apps.ocp.ois.run on 172.30.0.10:53: server misbehaving
OAuthServerRouteEndpointAccessibleControllerDegraded: Get "https://oauth-openshift.apps.ocp.ois.run/healthz": dial tcp: lookup oauth-openshift.apps.ocp.ois.run on 172.30.0.10:53: server misbehaving
`oc get co`
❯ oc get co
NAME                                       VERSION                          AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.11.0-0.okd-2023-01-14-152430   False       False         True       171m    OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.ocp.ois.run/healthz": dial tcp: lookup oauth-openshift.apps.ocp.ois.run on 172.30.0.10:53: server misbehaving
baremetal                                  4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
cloud-controller-manager                   4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
cloud-credential                           4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
cluster-autoscaler                         4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
config-operator                            4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
console                                    4.11.0-0.okd-2023-01-14-152430   True        False         False      22h
csi-snapshot-controller                    4.11.0-0.okd-2023-01-14-152430   True        False         False      68d
dns                                        4.11.0-0.okd-2022-12-02-145640   True        False         False      73d
etcd                                       4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
image-registry                             4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
ingress                                    4.11.0-0.okd-2023-01-14-152430   True        False         False      68d
insights                                   4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
kube-apiserver                             4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
kube-controller-manager                    4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
kube-scheduler                             4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
kube-storage-version-migrator              4.11.0-0.okd-2023-01-14-152430   True        False         False      37d
machine-api                                4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
machine-approver                           4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
machine-config                             4.11.0-0.okd-2022-12-02-145640   True        False         False      22h
marketplace                                4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
monitoring                                 4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
network                                    4.11.0-0.okd-2022-12-02-145640   True        False         False      73d
node-tuning                                4.11.0-0.okd-2023-01-14-152430   True        False         False      168m
openshift-apiserver                        4.11.0-0.okd-2023-01-14-152430   True        False         False      171m
openshift-controller-manager               4.11.0-0.okd-2023-01-14-152430   True        False         False      57d
openshift-samples                          4.11.0-0.okd-2023-01-14-152430   True        False         False      172m
operator-lifecycle-manager                 4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
operator-lifecycle-manager-catalog         4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
operator-lifecycle-manager-packageserver   4.11.0-0.okd-2023-01-14-152430   True        False         False      68d
service-ca                                 4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
storage                                    4.11.0-0.okd-2023-01-14-152430   True        False         False      73d
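In the table above, `authentication` is unavailable and three operators (`dns`, `machine-config`, `network`) are still at the previous version. A minimal sketch for pulling such rows out of `oc get co` output, assuming the default column order shown above (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED); the function name `co_not_settled` is hypothetical:

```shell
# co_not_settled TARGET_VERSION
# Reads `oc get co` output on stdin and prints the name of every operator that
# is not at TARGET_VERSION, is not Available, or is Degraded.
# Assumes the default column order: NAME VERSION AVAILABLE PROGRESSING DEGRADED.
co_not_settled() {
  awk -v target="$1" '
    NR == 1 { next }   # skip the header row
    NF < 5  { next }   # skip blank or malformed lines
    $2 != target || $3 != "True" || $5 != "False" { print $1 }
  '
}
```

Usage would look like `oc get co | co_not_settled 4.11.0-0.okd-2023-01-14-152430`. Against the output above, this would list `authentication`, `dns`, `machine-config`, and `network`.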

Version

UPI

How reproducible

Log bundle

jeffmccune commented Feb 4, 2023

DNS appears to be working fine inside and outside the cluster. From a dns-default pod, the lookup against 172.30.0.10 returns NOERROR with the expected records:

oc -n openshift-dns rsh dns-default-88w6vc
sh-4.4# dig oauth-openshift.apps.ocp.ois.run @172.30.0.10

; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> oauth-openshift.apps.ocp.ois.run @172.30.0.10
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50328
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; COOKIE: dc336dd7780f0cf0 (echoed)
;; QUESTION SECTION:
;oauth-openshift.apps.ocp.ois.run. IN   A

;; ANSWER SECTION:
oauth-openshift.apps.ocp.ois.run. 263 IN CNAME  ocp.ois.run.
ocp.ois.run.            263     IN      A       65.102.23.41

;; Query time: 3 msec
;; SERVER: 172.30.0.10#53(172.30.0.10)
;; WHEN: Sat Feb 04 19:48:03 UTC 2023
;; MSG SIZE  rcvd: 157

@jeffmccune

From within the authentication operator pod, DNS isn't working. The same query against 172.30.0.10 returns SERVFAIL after about two seconds:

❯ oc -n openshift-authentication-operator rsh authentication-operator-7c6799dfc7-22vnx
sh-4.4# dig oauth-openshift.apps.ocp.ois.run @172.30.0.10

; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> oauth-openshift.apps.ocp.ois.run @172.30.0.10
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 12701
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; COOKIE: 9f57b570315bc49d (echoed)
;; QUESTION SECTION:
;oauth-openshift.apps.ocp.ois.run. IN   A

;; Query time: 2002 msec
;; SERVER: 172.30.0.10#53(172.30.0.10)
;; WHEN: Sat Feb 04 19:51:12 UTC 2023
;; MSG SIZE  rcvd: 73
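The two `dig` runs differ in the header `status` field: NOERROR from the dns-default pod, SERVFAIL from the operator pod. A small sketch for extracting that field so the same query can be compared across pods; `dig_status` is a hypothetical helper name, and it assumes dig's default header format as shown in the output above:

```shell
# dig_status: reads full `dig` output on stdin and prints the response status
# (for example NOERROR or SERVFAIL) taken from the ";; ->>HEADER<<-" line.
dig_status() {
  sed -n 's/^;; ->>HEADER<<- .*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Example (the pod name is the one from this report and will differ elsewhere):
#   oc -n openshift-authentication-operator exec authentication-operator-7c6799dfc7-22vnx -- \
#     dig oauth-openshift.apps.ocp.ois.run @172.30.0.10 | dig_status
```

Running the same pipeline against several pods gives a quick view of which ones get a failing answer from 172.30.0.10.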

@jeffmccune

I restarted the authentication-operator deployment in the openshift-authentication-operator namespace. DNS started working in the new pod and the upgrade started progressing again.

kubectl rollout restart deployment -n openshift-authentication-operator authentication-operator
oc -n openshift-authentication-operator rsh authentication-operator-85d6584895-vglss
sh-4.4# dig +short oauth-openshift.apps.ocp.ois.run @172.30.0.10
ocp.ois.run.
65.102.23.41

Upgrade started progressing again:

❯ oc get co
NAME                                       VERSION                          AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.11.0-0.okd-2023-01-14-152430   True        True          False      12s     OAuthServerDeploymentProgressing: deployment/oauth-openshift.openshift-authentication: observed generation is 20, desired generation is 21.
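The workaround above (restart the operator deployment, then check that the operator recovers) could be scripted roughly as follows. This is only a sketch of the reporter's workaround, not a root-cause fix. The helper names `run` and `restart_auth_operator`, the `DRY_RUN` variable, and the timeout values are assumptions chosen for illustration:

```shell
# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# restart_auth_operator: restart the operator deployment, wait for the new
# pod to roll out, then wait for the cluster operator to report Available.
restart_auth_operator() {
  ns=openshift-authentication-operator
  run oc -n "$ns" rollout restart deployment/authentication-operator &&
    run oc -n "$ns" rollout status deployment/authentication-operator --timeout=5m &&
    # Assumption: the Available condition is a reasonable "recovered" signal.
    run oc wait clusteroperator/authentication --for=condition=Available --timeout=15m
}
```

Calling `DRY_RUN=1 restart_auth_operator` prints the three commands without touching the cluster, which makes it easy to review them first.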
