Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IBMCloud: Use unique mutex for Client retrieval #6241

Merged

Conversation

cjschaef
Copy link
Member

A shared mutex for all ibmcloud.metadata functions, including the
public Client method, deadlocks calls. Since the Client is a public
method, we should still use a mutex, but it should be a unique
mutex to prevent deadlocking.

@rvanderp3
Copy link
Contributor

/assign @rvanderp3

@rvanderp3
Copy link
Contributor

/hold
waiting for tests

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 18, 2022
@jstuever
Copy link
Contributor

/assign

A shared mutex for all ibmcloud.metadata functions, including the
public Client method, deadlocks calls. Since the Client is a public
method, we should still use a mutex, but it should be a unique
mutex to prevent deadlocking.
@cjschaef
Copy link
Member Author

e2e-ibmcloud install failure due to
time="2022-08-18T20:33:42Z" level=error msg="failed to initialize the cluster: Cluster operators image-registry, storage are not available"
Which might be a timing issue in regard to IBM Cloud LB's delaying the CCM from proper worker node labels for the Storage operator
"msg":"Failed to initialize driver...","name":"ibm-vpc-block-csi-driver","CSIDriverName":"IBM VPC block driver","error":"Controller_Helper: Failed to initialize node metadata: error: One or few required node label(s) is/are missing
and potentially the same reason for issues with image-registry
http.response.status=503

Will retest once I push up changes recommended with mutex

@cjschaef
Copy link
Member Author

Install worked as expected

INFO[2022-08-18T23:01:30Z] Running step e2e-ibmcloud-ipi-install-install. 
INFO[2022-08-18T23:47:00Z] Step e2e-ibmcloud-ipi-install-install succeeded after 45m30s. 

Failure due to conformance tests, which is currently expected

INFO[2022-08-19T00:42:52Z] Step e2e-ibmcloud-openshift-e2e-test failed after 55m32s. 
INFO[2022-08-19T00:42:52Z] Step phase test failed after 55m32s.

@rvanderp3
Copy link
Contributor

/hold cancel
/lgtm

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Aug 19, 2022
@jeffnowicki
Copy link
Contributor

@jstuever this should be ready for your approval... thanks!

@jstuever
Copy link
Contributor

/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 19, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jstuever

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 19, 2022
@barbacbd
Copy link
Contributor

/test yaml-lint
/test verify-vendor
/test verify-codegen
/test unit
/test tf-lint
/test shellcheck
/test okd-verify-codegen
/test okd-unit
/test okd-images
/test images
/test golint

@MayXuQQ
Copy link
Contributor

MayXuQQ commented Aug 22, 2022

/test aro-unit
/test e2e-aws
/test gofmt
/test govet
/test e2e-ibmcloud

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 22, 2022

@cjschaef: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-ibmcloud 61ff6c8 link false /test e2e-ibmcloud

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit e781753 into openshift:master Aug 22, 2022
@cjschaef cjschaef deleted the ibmcloud_fix_client_mutex branch August 22, 2022 13:41
barbacbd added a commit to barbacbd/installer that referenced this pull request Aug 22, 2022
commit e781753
Merge: b08fe18 61ff6c8
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Mon Aug 22 06:50:07 2022 +0000

    Merge pull request openshift#6241 from cjschaef/ibmcloud_fix_client_mutex

    IBMCloud: Use unique mutex for Client retrieval

commit b08fe18
Merge: fc94a6c af8bace
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Sat Aug 20 04:16:30 2022 +0000

    Merge pull request openshift#6047 from jcpowermac/upi_image_remove_govc

    OCPBUGS-262: UPI image download govc rate limit failure

commit fc94a6c
Merge: 44e71ef 322d4fe
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Fri Aug 19 14:46:34 2022 +0000

    Merge pull request openshift#6234 from jkyros/allow-mco-consume-multiple-images

    Extract the image-references file from the release, pass it to `machine-config-operator`

commit 44e71ef
Merge: 3e65614 6941459
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Fri Aug 19 05:32:41 2022 +0000

    Merge pull request openshift#6217 from clnperez/vpc_ccon_sharing

    powervs: allow VPC, Cloud connection, and NW re-use

commit 3e65614
Merge: a8482db 6591508
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Thu Aug 18 23:02:46 2022 +0000

    Merge pull request openshift#6208 from cjschaef/bz_2109800

    Bug 2109800: IBMCloud: Allow metrics traffic

commit 61ff6c8
Author: Christopher J Schaefer <cjschaef@us.ibm.com>
Date:   Thu Aug 18 12:42:18 2022 -0500

    IBMCloud: Use unique mutex for Client retrieval

    A shared mutex for all ibmcloud.metadata functions, including the
    public Client method, deadlocks calls. Since the Client is a public
    method, we should still use a mutex, but it should be a unique
    mutex to prevent deadlocking.

commit a8482db
Merge: e1959c2 486b5e5
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Wed Aug 17 22:41:05 2022 +0000

    Merge pull request openshift#6183 from cjschaef/ibmcloud_byon_enablement

    IBMCloud: BYON Enablement

commit e1959c2
Merge: 3caf5ba 3ebd708
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Wed Aug 17 01:39:22 2022 +0000

    Merge pull request openshift#6147 from bkhadars/resource_validation

    IBMCLOUD Power VS: Updated Max and Min machinepool resource limits

commit 3caf5ba
Merge: 91c2221 04b3f70
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Tue Aug 16 19:55:33 2022 +0000

    Merge pull request openshift#6199 from AnnaZivkovic/consume_user_specified_tags_on_control_plane

    GCP: Added user specified tags on control plane instances

commit 322d4fe
Author: John Kyros <79665180+jkyros@users.noreply.github.com>
Date:   Mon Jul 18 11:05:46 2022 -0500

    Extract release's image-references file for MCO

    Previously all the MCO needed was the machine-os-content image, which
    became osImageURL, but now that we have new format images, extension
    containers, and probably a pair (one new format image, one extension
    container) per RHCOS major version (rhel-coreos-8, rhel-coreos-9, etc)
    we need to consume more than one image out of the payload.

    As a result, the MCO needs the image-references file so that it can read all of the
    OS images contained within and select the one it prefers.

    This dumps the image-references file out of the payload so the MCO can
    consume it, and also adds an argument to the `machine-config-operator`
    call to consume it.

commit 91c2221
Merge: b297102 85f2832
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Tue Aug 16 16:08:47 2022 +0000

    Merge pull request openshift#5657 from patrickdillon/azurestack-techpreview

    Azure Stack UPI Docs: Remove Feature Gate CRs

commit b297102
Merge: 3db9150 cbc80e9
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Mon Aug 15 16:56:29 2022 +0000

    Merge pull request openshift#6212 from RyanJenkins99/412_bootimage_bump

    Bug 2115790: bump RHCOS 4.12 bootimage metadata

commit 3db9150
Merge: 6e8c9f9 a6aea18
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Sun Aug 14 02:31:02 2022 +0000

    Merge pull request openshift#6210 from clnperez/ibmcloud-tf-bump-1.44.2

    terraform: ibmcloud: bump provider version

commit 6e8c9f9
Merge: 977b2bb 1647e35
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Sat Aug 13 07:49:10 2022 +0000

    Merge pull request openshift#6077 from r4f4/golang-1.18

    Revendor with golang 1.18

commit 977b2bb
Merge: 8e925f2 c55b118
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Fri Aug 12 17:03:23 2022 +0000

    Merge pull request openshift#6163 from r4f4/upi-go-118

    Update installer images to use golang 1.18

commit 8e925f2
Merge: fdd3902 3051e7d
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Fri Aug 12 00:52:27 2022 +0000

    Merge pull request openshift#6166 from jstuever/cors2036

    GCP: Add networkProjectID parameter to install-config.

commit 486b5e5
Author: Christopher J Schaefer <cjschaef@us.ibm.com>
Date:   Mon Mar 14 16:11:52 2022 -0500

    IBMCloud: BYON Enablement

    Setup enablement for Bring Your Own Network support on IBM Cloud
    using IPI.

commit 6941459
Author: Christy Norman <christy@linux.vnet.ibm.com>
Date:   Wed Aug 10 16:08:43 2022 -0700

    terraform: ibmcloud: bump provider version

    this is being added as a seperate PR, and whehen it merges, i'll remove this commit. this is included so that the tests run

    Signed-off-by: Christy Norman <christy@linux.vnet.ibm.com>

commit e38e6cd
Author: Christy Norman <christy@linux.vnet.ibm.com>
Date:   Wed Aug 10 16:07:19 2022 -0700

    powervs: allow vpc and nw re-use

    because Power VS has some networking limitations, we have been restricted to one cluster/zone for IPI on Power VS. this PR allows re-using an existing VPC, Cloud connection (the GRE tunnel wrapper that allows us to communicate with IBM Cloud components from the Power VS DCs), and a Power Network. It should be noted that the intended use case is that the user pre-creates the items. If resources from a previous cluster are re-used, and destroy is called on that cluster, the destroy would fail before completion, and the Cloud connection would be deleted for the other clusters. We'll document what is supported in official documentation.

    Signed-off-by: Christy Norman <christy@linux.vnet.ibm.com>

commit fdd3902
Merge: 26611ff f648e75
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Thu Aug 11 17:57:44 2022 +0000

    Merge pull request openshift#6211 from clnperez/storage-tier-def

    powervs: default to tier1 storage

commit 26611ff
Merge: 2bc06e1 bea338c
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Thu Aug 11 07:53:14 2022 +0000

    Merge pull request openshift#6008 from stbenjam/yq

    Download `yq` in upi installer containers

commit 04b3f70
Author: AnnaZivkovic <github@squid.systems>
Date:   Wed Aug 3 17:46:37 2022 -0700

    GCP: Added user specified tags on control plane instances

    This is the second step to enabling an IPI cluster using Shared VPC where the service account does not have sufficient permissions to create firewall rules in the Host Project while maintaining cluster-specific firewall rules.
    Depends on openshift#6185

commit a6aea18
Author: Christy Norman <christy@linux.vnet.ibm.com>
Date:   Wed Aug 10 10:20:53 2022 -0700

    vendor dir

    the new files under vendor

    Signed-off-by: Christy Norman <christy@linux.vnet.ibm.com>

commit 5845fe9
Author: Christy Norman <christy@linux.vnet.ibm.com>
Date:   Wed Aug 10 10:20:21 2022 -0700

    terraform: ibmcloud: bump provider version

    to get network_name from ibm_pi_dhcps for the Power VS provider

    Signed-off-by: Christy Norman <christy@linux.vnet.ibm.com>

commit cbc80e9
Author: Ryan Jenkins <ryanjenkins1299@gmail.com>
Date:   Wed Aug 10 15:55:05 2022 +0100

    Bug 2115790: bump RHCOS 4.12 bootimage metadata

    These changes will update the RHCOS 4.12 boot image metadata in the
    installer which includes the fixes for the following BZs:

    BZ 2115528 This will bump bootimage to include latest rpm-ostree

    Changes generated with:

    ```
    plume cosa2stream --target data/data/coreos/rhcos.json --distro rhcos --url https://rhcos.mirror.openshift.com/art/storage/releases --no-signatures x86_64=412.86.202208101039-0 aarch64=412.86.202208101040-0 s390x=412.86.202208090843-0 ppc64le=412.86.202208090152-0
    ```

commit 2bc06e1
Merge: 1b2e5da 2cf4300
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Wed Aug 10 13:18:28 2022 +0000

    Merge pull request openshift#5834 from shiftstack/remove-loadbalancer-settings

    Remove LoadBalancer settings from cloud provider config

commit f648e75
Author: Christy Norman <christy@linux.vnet.ibm.com>
Date:   Tue Aug 9 17:37:55 2022 -0700

    powervs: default to tier1 storage

    this should speed up our deploys a bit

    Signed-off-by: Christy Norman <christy@linux.vnet.ibm.com>

commit 1b2e5da
Merge: 9b77bfc 8694be3
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Tue Aug 9 23:22:04 2022 +0000

    Merge pull request openshift#6202 from sadasu/update-platform-approvers

    Add SPLAT members to platform reviewer/approver aliases

commit 9b77bfc
Merge: 3febdea fbff435
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Tue Aug 9 22:01:02 2022 +0000

    Merge pull request openshift#6178 from dongchen126/fix-region-list

    Alibaba: Use a static region list

commit 2cf4300
Author: Maysa Macedo <maysa.macedo95@gmail.com>
Date:   Fri Apr 22 13:12:12 2022 +0200

    Remove LoadBalancer settings from cloud provider config

    Since the cluster-cloud-controller-manager-operator is
    enforcing all the required configurations, there is no
    need to specify those anymore in the installer. This
    commit removes all the uneeded configurations.

commit 3051e7d
Author: Jeremiah Stuever <jstuever@redhat.com>
Date:   Mon Jul 25 14:45:07 2022 -0700

    GCP: Add networkProjectID parameter to install-config.

    This change adds the networkProjecID parameter to the GCP platform in
    the install-config. When set, it also validates the project exists and
    that the network and subnetworks belong to this project.

commit 8694be3
Author: Sandhya Dasu <sdasu@redhat.com>
Date:   Thu Aug 4 16:55:37 2022 -0400

    Add SPLAT members to platform reviewer/approver aliases

commit 6591508
Author: Christopher J Schaefer <cjschaef@us.ibm.com>
Date:   Tue Aug 2 15:47:17 2022 -0500

    Bug 2109800: IBMCloud: Allow metrics traffic

    Allow metrics traffic to flow through the cluster, including the
    worker nodes, to resolve issues where the KCM and KS metrics are
    not accessible.

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2109800

commit 1647e35
Author: Rafael Fonseca <r4f4rfs@gmail.com>
Date:   Tue Aug 2 15:26:47 2022 +0200

    Revendor with golang 1.18

commit 8d6187c
Author: Rafael Fonseca <r4f4rfs@gmail.com>
Date:   Thu Jul 7 15:21:48 2022 +0200

    Use golang 1.18

commit 3ebd708
Author: Basheer Khadarsabgar <basheer.8765@gmail.com>
Date:   Thu Jul 21 17:58:29 2022 +0530

    Updated Max and Min machinepool resource limits of Power VS

commit fbff435
Author: sunhui <wb-sh373163@alibaba-inc.com>
Date:   Thu Jul 28 11:37:12 2022 +0800

    Alibaba: use a static region list

    The list of regions obtained through the API may include new regions that
    are not supported, so explicitly define a list of supported regions

    Signed-off-by: sunhui <wb-sh373163@alibaba-inc.com>

commit c55b118
Author: Rafael Fonseca <r4f4rfs@gmail.com>
Date:   Mon Jul 25 21:53:55 2022 +0200

    Update installer images to use golang 1.18

commit af8bace
Author: Joseph Callen <jcallen@redhat.com>
Date:   Fri Jun 24 10:53:22 2022 -0400

    UPI image download govc rate limit failure

    Removing the downloading of `govc` to
    workaround rate limit errors
    and replacing with govc docker image

commit bea338c
Author: Stephen Benjamin <stephen@redhat.com>
Date:   Tue Jun 28 06:51:03 2022 -0400

    Restore python yq and write go yq to different path

commit be6b440
Author: Stephen Benjamin <stephen@redhat.com>
Date:   Wed Jun 15 08:22:34 2022 -0400

    Download `yq` in upi installer containers

    Various CI steps use the upi-installer container for it's access to the
    aws cli tools among other things.  However, most of those steps also
    curl `yq` directly from GitHub.  We can save ourselves some headaches
    when GitHub is down by just embedding the binary in the image already.

    Note, this also removes the python yq which I am not sure if it's used,
    but I can't find a place where it is.

commit 85f2832
Author: patrickdillon <padillon@redhat.com>
Date:   Wed Feb 23 21:25:38 2022 -0500

    Azure Stack UPI Docs: Remove Feature Gate CRs

    Updates the instructions to have users remove any credential requests
    for feature-gated resources. Creating such credentials will cause
    installs to fail as the bootstrap node continuously fails to create
    a resource in a namespace that does not exist.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants