
Change to GHCR image path #348

Merged
4 commits merged on Jul 25, 2021

Conversation

martinkennelly
Member

Signed-off-by: Martin Kennelly martin.kennelly@intel.com

@zshi-redhat I can make further updates to this by creating a new tagged release, so we can update the rest of the deployment artifacts. Currently, we only have a version package for 'latest'.

@martinkennelly
Member Author

@zshi-redhat I saved a new draft release, v3.3.2, to this repository; if you agree with this patch, feel free to publish it. Once it is published, I will update the rest of the links in this PR.

@martinkennelly
Member Author

@adrianchiris Could you take a look at the draft release I created and, if you approve, publish it? The reason I want a new release is to trigger the GitHub workflow that creates a new tagged release package. I will then update this PR to point to the newly created GHCR image.

@adrianchiris
Contributor

Done!

@martinkennelly
Member Author

@adrianchiris I needed to add additional architectures to make this PR consistent. I also added creation of a 'stable' tag, which lets us avoid constantly updating the k8s deployments & docs while still allowing consumers to stick to releases rather than latest. See the commit message for more details.

Do you have HW to test the ARM image?

@martinkennelly martinkennelly changed the title WIP: Change to GHCR image path Change to GHCR image path May 6, 2021
@zshi-redhat
Collaborator

@martinkennelly Just back from PTO; what else needs to be done here? It looks like the v3.3.2 release has been published.

@martinkennelly
Member Author

martinkennelly commented May 7, 2021

@zshi-redhat I think we are fine for review now.
Unfortunately, I will have to create another patch release once this is merged in order to generate all the new images. I think this is OK.

@adrianchiris
Contributor

I also added creation of a 'stable' tag, which lets us avoid constantly updating the k8s deployments & docs while still allowing consumers to stick to releases rather than latest. See the commit message for more details.

There is a risk here: if for some reason we need to update the YAML for a new release, then users running an older template with the stable tag would unexpectedly fail.

Do you have HW to test the ARM image?

I have hardware to check that the ARM image runs, but not with SR-IOV or k8s.

@adrianchiris
Contributor

@martinkennelly I have built the image from Dockerfile.arm64 and ran the DP:

root@cloud-dev-03-arm:~# docker run -v /root/config.json:/config.json sriov-dp --config-file=/config.json
I0602 13:22:47.518342       1 manager.go:54] Using Deprecated Device Plugin Registry Path
I0602 13:22:47.518483       1 main.go:44] resource manager reading configs
I0602 13:22:47.518692       1 manager.go:86] raw ResourceList: {
        "resourceList": [{
            "resourceName": "mlnx_rdma",
            "resourcePrefix": "mellanox.com",
            "selectors": {
                "vendors": ["15b3"],
                "isRdma": false
            }
        }
    ]
}
I0602 13:22:47.518803       1 factory.go:168] net device selector for resource mlnx_rdma is &{DeviceSelectors:{Vendors:[15b3] Devices:[] Drivers:[] PciAddresses:[]} PfNames:[] RootDevices:[] LinkTypes:[] DDPProfiles:[] IsRdma:false NeedVhostNet:false}
I0602 13:22:47.518850       1 manager.go:106] unmarshalled ResourceList: [{ResourcePrefix:mellanox.com ResourceName:mlnx_rdma DeviceType:netDevice Selectors:0x4000112cc0 SelectorObj:0x400033a820}]
I0602 13:22:47.518927       1 manager.go:193] validating resource name "mellanox.com/mlnx_rdma"
I0602 13:22:47.518940       1 main.go:60] Discovering host devices
I0602 13:22:47.592734       1 netDeviceProvider.go:78] netdevice AddTargetDevices(): device found: 0000:03:00.0 02              Mellanox Technolo...    MT42822 BlueField-2 integrated Connec...
I0602 13:22:47.592864       1 netDeviceProvider.go:78] netdevice AddTargetDevices(): device found: 0000:03:00.1 02              Mellanox Technolo...    MT42822 BlueField-2 integrated Connec...
I0602 13:22:47.592946       1 main.go:66] Initializing resource servers
I0602 13:22:47.592971       1 manager.go:112] number of config: 1
I0602 13:22:47.592984       1 manager.go:115] 
I0602 13:22:47.592991       1 manager.go:116] Creating new ResourcePool: mlnx_rdma
I0602 13:22:47.593002       1 manager.go:117] DeviceType: netDevice
W0602 13:22:47.594339       1 pciNetDevice.go:81] unable to get PF name "no interface name found for device 0000:03:00.0"
W0602 13:22:47.595341       1 pciNetDevice.go:81] unable to get PF name "no interface name found for device 0000:03:00.1"
I0602 13:22:47.595387       1 factory.go:108] device added: [pciAddr: 0000:03:00.0, vendor: 15b3, device: a2d6, driver: mlx5_core]
I0602 13:22:47.595398       1 factory.go:108] device added: [pciAddr: 0000:03:00.1, vendor: 15b3, device: a2d6, driver: mlx5_core]
I0602 13:22:47.595426       1 manager.go:145] New resource server is created for mlnx_rdma ResourcePool
I0602 13:22:47.595444       1 main.go:72] Starting all servers...

@zshi-redhat
Collaborator

I also added creation of a 'stable' tag, which lets us avoid constantly updating the k8s deployments & docs while still allowing consumers to stick to releases rather than latest. See the commit message for more details.

There is a risk here: if for some reason we need to update the YAML for a new release, then users running an older template with the stable tag would unexpectedly fail.

How do we expect users to run the stable deployment?

Option A:

  1. checkout a branch with specific tag
  2. create the deployment with daemonset yaml template (contains image url corresponding to the tag)

Option B:

  1. stay in the current (not latest) checkout
  2. create the deployment using the stable tagged image

For A, we need to push an image URL with the release tag and update the daemonset YAML for each tagged release.
For B, the issue is that a new daemonset YAML may diverge from the current checkout, and the new stable image won't be pulled automatically (the image pull policy is IfNotPresent).
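
To make the pull-policy trade-off concrete, here is a minimal sketch of the daemonset fields in question; the GHCR path, container name, and tag are assumptions for illustration, not the exact values in this PR:

# Illustrative daemonset excerpt; image path and tag are assumed for the example.
spec:
  template:
    spec:
      containers:
        - name: sriov-device-plugin
          image: ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:stable-amd64
          # With IfNotPresent, a node that already has a "stable" image cached
          # keeps running it even after the tag is re-pointed to a new release;
          # "Always" would be needed to pick up a moving tag automatically.
          imagePullPolicy: IfNotPresent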

@martinkennelly
Member Author

I also added creation of a 'stable' tag, which lets us avoid constantly updating the k8s deployments & docs while still allowing consumers to stick to releases rather than latest. See the commit message for more details.

There is a risk here: if for some reason we need to update the YAML for a new release, then users running an older template with the stable tag would unexpectedly fail.

I think I understand your concern. I'll give you an example to show I understand.

If we have a "stable" tag and a consumer consumes this tag for SR-IOV DP v3.x.x, tests it, and it works fine, but later on we change a bunch of things in v4 of this software, then that consumer's deployment could possibly break.

So what we need is a per-major-version stable tag? I.e. a "v3" tag instead of "stable", which would consume all updates for v3. I think this is a better idea than having a "stable" tag covering all future tagged releases. Thanks @adrianchiris

As Zenghui said, we would also need to change the image pull policy to guarantee consumers get the latest "v3".

@zshi-redhat WDYT? I think it's more conservative to have a MAJOR version number tag that consumes all updates for a particular major version of this project.

@zshi-redhat
Collaborator

I also added creation of a 'stable' tag, which lets us avoid constantly updating the k8s deployments & docs while still allowing consumers to stick to releases rather than latest. See the commit message for more details.

There is a risk here: if for some reason we need to update the YAML for a new release, then users running an older template with the stable tag would unexpectedly fail.

I think I understand your concern. I'll give you an example to show I understand.

If we have a "stable" tag and a consumer consumes this tag for SR-IOV DP v3.x.x, tests it, and it works fine, but later on we change a bunch of things in v4 of this software, then that consumer's deployment could possibly break.

So what we need is a per-major-version stable tag? I.e. a "v3" tag instead of "stable", which would consume all updates for v3. I think this is a better idea than having a "stable" tag covering all future tagged releases. Thanks @adrianchiris

As Zenghui said, we would also need to change the image pull policy to guarantee consumers get the latest "v3".

@zshi-redhat WDYT? I think it's more conservative to have a MAJOR version number tag that consumes all updates for a particular major version of this project.

Agree, so we will have the following:

master image: latest x86
master arch image: latest + arch

tagged image: specific tag (e.g. v3.1) x86
tagged arch image: specific tag on arch (e.g. v3.1-amd64)
stable tagged arch image: latest major tag on arch (e.g. stable-amd64 == v3.3.2-amd64)
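
As an illustrative aside, that tag matrix could be produced by listing multiple tags in the build-push step; the GHCR path, action version, and tag values below are assumptions for the sketch, not the exact workflow in this PR:

# Sketch of the tag list for a tagged amd64 release build; values are illustrative.
- name: Build and push sriov-network-device-plugin
  uses: docker/build-push-action@v2
  with:
    push: true
    tags: |
      ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:v3.3.2-amd64
      ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:stable-amd64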

@ipatrykx
Contributor

ipatrykx commented Jul 6, 2021

@martinkennelly I somewhat like the idea of a 'Version X' stable tag. However, I think I understand it a bit differently than @zshi-redhat. Could you please clarify? To me it would look like this (following @zshi-redhat's example here):

master image: latest x86
master arch image: latest + arch

tagged image: specific tag (e.g. v3.1) x86
tagged arch image: specific tag on arch (e.g. v3.1-amd64)
stable tagged arch image: latest major tag on arch (e.g. stable-v3-amd64 == v3.3.2-amd64 AND stable-v4-amd64 == v4.0.2-amd64)

Am I correct? If so, this raises a question: how long would you like to support a previous version after a new version is released (in our example, how long would you like to support v3 once v4 is released)? By 'support' I mean building the images, backporting bugfixes, etc.

username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Build and push sriov-network-device-plugin
Contributor

Naive question: what will actually cause the image to be built for ARM64?
Previously, in Travis, you would have asked for an arm64/ppc VM and just built your image there.

With GH Actions you get an x86_64 VM and run docker build.
While buildx supports multi-platform builds, you need to explicitly request it, right?

Here is what I found: https://github.com/docker/build-push-action/blob/master/docs/advanced/multi-platform.md
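
For reference, the pattern from that doc looks roughly like the sketch below, assuming the standard docker setup-qemu, setup-buildx, and build-push actions; it is not the exact workflow in this PR:

# Sketch of a multi-platform build on an x86_64 runner, per the linked doc.
- name: Set up QEMU
  uses: docker/setup-qemu-action@v1
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v1
- name: Build and push
  uses: docker/build-push-action@v2
  with:
    push: true
    # Target platforms must be requested explicitly; QEMU emulation
    # handles the non-native architectures.
    platforms: linux/amd64,linux/arm64,linux/ppc64le
    tags: ghcr.io/k8snetworkplumbingwg/sriov-network-device-plugin:latest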

Member Author

@adrianchiris You're right. It wasn't correct originally; it is fixed now.

@adrianchiris
Contributor

Should we perhaps aim to split this PR into:

  1. A PR for building multi-platform images via GH Actions on push to master and on tag creation (as is done today for x86_64).
  2. An issue to discuss how we want to manage future releases and image versions in our YAMLs. Currently we work on the master branch only, creating tags for releases, with no backports, and update the YAMLs occasionally (which I imagine we should do on a regular basis).

@martinkennelly martinkennelly changed the title Change to GHCR image path WIP: Change to GHCR image path Jul 22, 2021
@martinkennelly martinkennelly changed the title WIP: Change to GHCR image path Change to GHCR image path Jul 23, 2021
@martinkennelly
Member Author

martinkennelly commented Jul 23, 2021

@adrianchiris I reverted the "stable" tag changes to allow for more discussion on what we want to support in the future and to avoid delaying this PR. I will open an issue to continue the discussion once this PR has concluded.

See commits for delta from your original review.

I tested this change on my fork and it produced the following artifacts: https://github.com/martinkennelly/sriov-network-device-plugin/pkgs/container/sriov-network-device-plugin/versions

martinkennelly and others added 4 commits July 23, 2021 17:19
Add workflows for ARM & ppc64le image creation.
Create a new "stable" image tag for each arch, allowing
anyone consuming our software to stick to a static image
tag.
Set k8s artifact images to reference GHCR.

Update README and K8 artifacts to consume GHCR image.

Signed-off-by: Martin Kennelly <martin.kennelly@intel.com>
Utilise emulation to build POWER8/ARM images on
an x86 host.

Signed-off-by: Martin Kennelly <mkennell@pm.me>
Signed-off-by: Martin Kennelly <mkennell@redhat.com>
Signed-off-by: Martin Kennelly <mkennell@redhat.com>
@martinkennelly
Member Author

rebased against master

@adrianchiris
Contributor

LGTM, thanks Martin !

@adrianchiris adrianchiris merged commit 8277524 into k8snetworkplumbingwg:master Jul 25, 2021
@Eoghan1232 Eoghan1232 mentioned this pull request Jul 26, 2021