blocked-edges/4.3.0-rc.3: Allow 4.3.0-rc.0 -> 4.3.0-rc.3
Baked in edges:

  $ oc adm release info quay.io/openshift-release-dev/ocp-release:4.3.0-rc.0-x86_64 | grep Upgrades
    Upgrades: 4.2.13
  $ oc adm release info quay.io/openshift-release-dev/ocp-release:4.3.0-rc.3-x86_64 | grep Upgrades
    Upgrades: 4.2.16, 4.3.0-rc.0, 4.3.0-rc.1, 4.3.0-rc.2

The wide 'from' regexp was appropriate for 4.3.0-rc.0, which had no
4.3 update sources.  But rc.3 does have update sources, and we want to
allow 4.3.0-rc.0 -> 4.3.0-rc.3, because it is not impacted by the
4.2->4.3 GCP update bug.  The overly-strict regexp was from 6d3db09
(Blocking edges to candidate 4.3.0-rc.3, 2020-01-23, #34).
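
As a rough illustration of the difference between the two 'from' regexps, here is a minimal sketch. It assumes a blocked-edge 'from' regexp has to match the whole source version string; that matching rule and the helper names (is_blocked, blocked_sources) are assumptions for illustration, not the graph builder's actual code. The source list is the 'Upgrades' line for rc.3 above.

```python
import re

# Update sources baked into 4.3.0-rc.3, per 'oc adm release info' above.
SOURCES = ["4.2.16", "4.3.0-rc.0", "4.3.0-rc.1", "4.3.0-rc.2"]


def is_blocked(from_regexp: str, source: str) -> bool:
    """Hypothetical check: assumes the regexp must match the full version."""
    return re.fullmatch(from_regexp, source) is not None


def blocked_sources(from_regexp: str, sources: list) -> list:
    """Return the sources whose edge into the target would be blocked."""
    return [s for s in sources if is_blocked(from_regexp, s)]


old_blocked = blocked_sources(r".*", SOURCES)        # previous, wide regexp
new_blocked = blocked_sources(r"4\.2\..*", SOURCES)  # regexp from this commit

print("blocked by '.*':", old_blocked)
print("blocked by '4\\.2\\..*':", new_blocked)
```

Under that assumption the wide regexp blocks every source, while the narrower one blocks only 4.2.16 and leaves the rc.0, rc.1, and rc.2 sources alone.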

Also expand the referenced bugs for the blocked 4.2 -> 4.3 edges:

* Update hangs with [1]:

    Working towards 4.3.0...: 13% complete

  and machine-config going Degraded=True with RequiredPoolsFailed:

    Unable to apply 4.3.0-...: timed out waiting for the condition
    during syncRequiredMachineConfigPools: pool master has not
    progressed to latest configuration: controller version mismatch
    for rendered-master-6c22... expected 23a6... has d780... retrying

  Fixed in 4.2 with MCO 31fed93 [2] and in 4.3 with MCO 25bb6ae [3].

    $ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.2.14 | grep machine-config
      machine-config-operator                       https://github.com/openshift/machine-config-operator                       d780d197a9c5848ba786982c0c4aaa7487297046
    $ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.2.16 | grep machine-config
      machine-config-operator                       https://github.com/openshift/machine-config-operator                       31fed93186c9f84708f5cdfd0227ffe4f79b31cd

  So the 4.2 fix was in 4.2.16.

    $ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.3.0-rc.0 | grep machine-config
      machine-config-operator                       https://github.com/openshift/machine-config-operator                       23a6e6fb37e73501bc3216183ef5e6ebb15efc7a
    $ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.3.0-rc.3 | grep machine-config
      machine-config-operator                       https://github.com/openshift/machine-config-operator                       25bb6aeb58135c38a667e849edf5244871be4992

  So the 4.3 fix was new in rc.3.

* Updates hang with FailedCreatePodSandBox events in the
  openshift-ingress namespace like [4]:

    pod/router-default-...: Failed create pod sandbox: rpc error: code
    = Unknown desc = failed to create pod network sandbox
    k8s_router-default-..._openshift-ingress_...(...): Multus: error
    adding pod to network "openshift-sdn": delegateAdd: error invoking
    DelegateAdd - "openshift-sdn": error in getting result from
    AddNetwork: CNI request failed with status 400: 'failed to run
    IPAM for ...: failed to run CNI IPAM ADD: failed to allocate for
    range 0: no IP addresses available in range set: <ip1>-<ip2>

  Fixed in 4.2 with MCO 9366460 [5] and in 4.3 with MCO 311a01e [6].

    $ git --no-pager log --first-parent --oneline -4 origin/release-4.2
    6e0df82c (origin/release-4.2) Merge pull request openshift#1347 from openshift-cherrypick-robot/cherry-pick-1285-to-release-4.2
    93664600 Merge pull request openshift#1362 from rphillips/fixes/1787581_4.2
    bd358bb7 Merge pull request openshift#1323 from openshift-cherrypick-robot/cherry-pick-1320-to-release-4.2
    31fed931 Merge pull request openshift#1358 from runcom/osimageurl-race-42

  So the 4.2 fix was after 4.2.16's 31fed93186.

    $ git --no-pager log --first-parent --oneline -8 origin/release-4.3
    3ad3a836 (origin/release-4.3) Merge pull request openshift#1399 from celebdor/haproxy-v4v6
    25503eee Merge pull request openshift#1353 from russellb/1211-4.3-backport
    67ab306b Merge pull request openshift#1426 from mandre/ssc43
    d74f56fe Merge pull request openshift#1410 from retroflexer/manual-cherry-pick-from-master
    207cc171 Merge pull request openshift#1406 from openshift-cherrypick-robot/cherry-pick-1396-to-release-4.3
    25bb6aeb Merge pull request openshift#1359 from runcom/osimageurl-race-43
    311a01e8 Merge pull request openshift#1361 from rphillips/fixes/1787581_4.3
    23a6e6fb Merge pull request openshift#1348 from openshift-cherrypick-robot/cherry-pick-1285-to-release-4.3

  So the 4.3 fix was between rc.0's 23a6e6fb37 and rc.3's 25bb6aeb58
  (see 'release info' calls in the previous list entry for those
  commit hashes).

* Update CI fails with [7,8]:

    Could not reach HTTP service through <ip>:80 after 2m0s

  and authentication going Degraded=True with RouteHealthDegradedFailedGet:

    RouteHealthDegraded: failed to GET route: dial tcp <ip>:443:
    connect: connection refused

  Fixed in 4.2 with SDN 677b3a8 [9] and in 4.3 with SDN 74a8aee [10].

    $ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.2.16 | grep ' node '
      node                                          https://github.com/openshift/sdn                                           770cb7bf922a721bc6c62af5490439d6174036fe
    $ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.2.14 | grep ' node '
      node                                          https://github.com/openshift/sdn                                           770cb7bf922a721bc6c62af5490439d6174036fe
    $ git --no-pager log --first-parent --oneline -4 origin/release-4.2
    098a6410 (origin/release-4.2) Merge pull request #95 from danwinship/fork-k8s-client-go-4.2
    9955a65b Merge pull request #72 from juanluisvaladas/too_many_dns_queries_42
    677b3a80 Merge pull request #90 from openshift-cherrypick-robot/cherry-pick-81-to-release-4.2
    770cb7bf Merge pull request #73 from danwinship/egressip-cleanup-4.2

  So the fix landed after 4.2.16's 770cb7bf.

    $ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.3.0-rc.0 | grep ' sdn '
      sdn                                           https://github.com/openshift/sdn                                           d4e36d5019ef0e130e0d246581508821a7322753
    $ git --no-pager log --first-parent --oneline -5 origin/release-4.3
    490a574e (origin/release-4.3) Merge pull request openshift#98 from openshift-cherrypick-robot/cherry-pick-96-to-release-4.3
    85ab1033 Merge pull request #78 from openshift-cherrypick-robot/cherry-pick-57-to-release-4.3
    d4e36d50 Merge pull request #85 from openshift-cherrypick-robot/cherry-pick-84-to-release-4.3
    dabc4ef5 Merge pull request #83 from dougbtv/backport-build-use-host-local
    74a8aee3 Merge pull request #81 from openshift-cherrypick-robot/cherry-pick-79-to-release-4.3

  So the fix landed before rc.0's d4e36d50.

* GCP update CI fails with [11]:

    Could not reach HTTP service through <ip>:80 after 2m0s

  in 4.2.16 -> 4.3.0-rc.0 [12], 4.2.16 -> 4.3.0-rc.3 [13,14,15], and
  4.2.18 -> 4.3.1 [16].  This doesn't happen every time though; at
  least one 4.2.16 -> 4.3.0-rc.3 has passed on GCP [17].  We don't
  have a root cause yet, but the final failure matches [8] discussed
  above.
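
The "was the fix in this release?" reasoning used in the list above can be sketched as follows. This is only an illustration: the contains_fix helper is hypothetical, and it only works when both the release commit and the fix are merge commits on the same first-parent history (as they are in the logs quoted above). Against a real checkout, 'git merge-base --is-ancestor' would be the more general check.

```python
def contains_fix(first_parent_log, release_commit, fix_commit):
    """Return True if fix_commit is at or before release_commit.

    first_parent_log is newest-first, like 'git log --first-parent --oneline',
    so an older commit has a larger index.
    """
    return first_parent_log.index(fix_commit) >= first_parent_log.index(release_commit)


# Abbreviated hashes copied from the 'git log' output in the list above.
MCO_42 = ["6e0df82c", "93664600", "bd358bb7", "31fed931"]
MCO_43 = ["3ad3a836", "25503eee", "67ab306b", "d74f56fe",
          "207cc171", "25bb6aeb", "311a01e8", "23a6e6fb"]
SDN_42 = ["098a6410", "9955a65b", "677b3a80", "770cb7bf"]
SDN_43 = ["490a574e", "85ab1033", "d4e36d50", "dabc4ef5", "74a8aee3"]

checks = {
    # FailedCreatePodSandBox fix (MCO)
    "4.2.16 has MCO 9366460": contains_fix(MCO_42, "31fed931", "93664600"),
    "rc.0 has MCO 311a01e": contains_fix(MCO_43, "23a6e6fb", "311a01e8"),
    "rc.3 has MCO 311a01e": contains_fix(MCO_43, "25bb6aeb", "311a01e8"),
    # RouteHealthDegraded fix (SDN)
    "4.2.16 has SDN 677b3a8": contains_fix(SDN_42, "770cb7bf", "677b3a80"),
    "rc.0 has SDN 74a8aee": contains_fix(SDN_43, "d4e36d50", "74a8aee3"),
}

for label, result in checks.items():
    print(f"{label}: {result}")
```

This agrees with the conclusions above: the MCO sandbox fix is in rc.3 but in neither 4.2.16 nor rc.0, and the SDN fix is in rc.0 but not in 4.2.16.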

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1786993
[2]: openshift/machine-config-operator#1358 (comment)
[3]: openshift/machine-config-operator#1359 (comment)
[4]: https://bugzilla.redhat.com/show_bug.cgi?id=1787635
[5]: openshift/machine-config-operator#1362 (comment)
[6]: openshift/machine-config-operator#1361 (comment)
[7]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade/214#1:build-log.txt%3A414
[8]: https://bugzilla.redhat.com/show_bug.cgi?id=1781763
[9]: openshift/sdn#90 (comment)
[10]: openshift/sdn#81 (comment)
[11]: https://bugzilla.redhat.com/show_bug.cgi?id=1785457
[12]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade/216
[13]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade/232
[14]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade/233
[15]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade/234
[16]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade/286
[17]: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade/230
wking committed Feb 6, 2020
1 parent 5ef7cc3 commit 1cf3ef3
Showing 2 changed files with 7 additions and 2 deletions.
5 changes: 4 additions & 1 deletion blocked-edges/4.3.0-rc.0.yaml
@@ -1,3 +1,6 @@
 to: 4.3.0-rc.0
 from: .*
-# 4.2 -> 4.3 updates occasionally hit RequiredPoolsFailed degradation: https://bugzilla.redhat.com/show_bug.cgi?id=1786993
+# 4.2 -> 4.3 updates occasionally hit FailedCreatePodSandBox events, fixed in rc.3, but in neither 4.2.16 nor rc.0: https://bugzilla.redhat.com/show_bug.cgi?id=1787635
+# 4.2 -> 4.3 updates occasionally hit RequiredPoolsFailed degradation, fixed in 4.2.16 and rc.3, but in neither 4.2.13 nor rc.0: https://bugzilla.redhat.com/show_bug.cgi?id=1786993
+# 4.2 -> 4.3 updates occasionally hit RouteHealthDegraded degradation, fixed in rc.0, but not in 4.2.16: https://bugzilla.redhat.com/show_bug.cgi?id=1790704
+# 4.2.* -> 4.3.0-rc.0 Sometimes workloads on GCP are unreachable during 4.2.x to 4.3.0 upgrade sometimes: https://bugzilla.redhat.com/show_bug.cgi?id=1793635
4 changes: 3 additions & 1 deletion blocked-edges/4.3.0-rc.3.yaml
@@ -1,3 +1,5 @@
 to: 4.3.0-rc.3
-from: .*
+from: 4\.2\..*
+# 4.2 -> 4.3 updates occasionally hit FailedCreatePodSandBox events, fixed in rc.3, but not in 4.2.16: https://bugzilla.redhat.com/show_bug.cgi?id=1787635
+# 4.2 -> 4.3 updates occasionally hit RouteHealthDegraded degradation, fixed in rc.0, but not in 4.2.16: https://bugzilla.redhat.com/show_bug.cgi?id=1790704
 # 4.2.* -> 4.3.0-rc.3 Sometimes workloads on GCP are unreachable during 4.2.x to 4.3.0 upgrade sometimes: https://bugzilla.redhat.com/show_bug.cgi?id=1793635
