kernel: Set arguments based on CPU architecture #796

Merged
merged 2 commits into from
Oct 31, 2024

zeeke
Member

@zeeke zeeke commented Oct 25, 2024

Kernel arguments like intel_iommu=on do not make sense
on AMD or ARM systems, and some users might complain about
their presence, though they are likely to be harmless.

Also, on ARM systems the iommu.passthrough parameter is the
one to use [1].

Improve GHWLib to bridge CPU information from the library.
Add CpuInfoProviderInterface and inject it into the GenericPlugin
to implement the per-CPU-vendor logic.

Update github.com/jaypipes/ghw to include

[1] https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L2343
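
The per-vendor selection described above could look roughly like the sketch below. It is an illustration, not the code merged in this PR: the `CPUVendor` type, the `GetCPUVendor` method, the `staticCPUInfo` fake, and the exact argument values (`iommu=pt`, `iommu.passthrough=1`) are assumptions; only `intel_iommu=on`, `iommu.passthrough`, and the idea of injecting a CPU info provider into the plugin come from the description.

```go
package main

import "fmt"

// CPUVendor is a hypothetical enum; the real PR may model this differently.
type CPUVendor int

const (
	VendorUnknown CPUVendor = iota
	VendorIntel
	VendorAMD
	VendorARM
)

// CPUInfoProvider stands in for the PR's CpuInfoProviderInterface.
// The method name and signature are assumptions made for this sketch.
type CPUInfoProvider interface {
	GetCPUVendor() (CPUVendor, error)
}

// staticCPUInfo is a fake provider, useful for tests and for this example.
type staticCPUInfo struct{ vendor CPUVendor }

func (s staticCPUInfo) GetCPUVendor() (CPUVendor, error) { return s.vendor, nil }

// iommuKernelArgs returns the IOMMU-related kernel arguments for the vendor
// reported by the provider. The argument values are assumed, not taken from
// the PR diff.
func iommuKernelArgs(p CPUInfoProvider) ([]string, error) {
	vendor, err := p.GetCPUVendor()
	if err != nil {
		return nil, fmt.Errorf("detecting CPU vendor: %w", err)
	}
	switch vendor {
	case VendorIntel:
		// intel_iommu=on is only meaningful on Intel hardware.
		return []string{"intel_iommu=on", "iommu=pt"}, nil
	case VendorAMD:
		return []string{"iommu=pt"}, nil
	case VendorARM:
		// ARM uses the iommu.passthrough parameter instead.
		return []string{"iommu.passthrough=1"}, nil
	default:
		return nil, fmt.Errorf("unsupported CPU vendor %d", vendor)
	}
}

func main() {
	for _, v := range []CPUVendor{VendorIntel, VendorAMD, VendorARM} {
		args, err := iommuKernelArgs(staticCPUInfo{vendor: v})
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(args)
	}
}
```

Injecting the provider as an interface keeps the plugin testable: a unit test can pass a fake such as `staticCPUInfo` instead of probing the real hardware.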

Thanks for your PR,
To run vendor CIs, maintainers can use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for the NVIDIA vendor.

To skip the vendor CIs, maintainers can use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for the NVIDIA vendor.

Best regards.

@zeeke zeeke force-pushed the us/OCPBUGS-43654 branch 5 times, most recently from 8287e4c to 5a7e2c2 Compare October 25, 2024 13:02
@coveralls

coveralls commented Oct 25, 2024

Pull Request Test Coverage Report for Build 11591346372

Details

  • 44 of 85 (51.76%) changed or added relevant lines in 9 files are covered.
  • 10 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.07%) to 45.465%

Changes Missing Coverage                    Covered Lines  Changed/Added Lines  %
------------------------                    -------------  -------------------  ------
pkg/platforms/openstack/openstack.go        0              2                    0.0%
pkg/plugins/generic/generic_plugin.go       18             21                   85.71%
pkg/host/internal/lib/ghw/ghw.go            0              4                    0.0%
pkg/host/internal/lib/ghw/mock/mock_ghw.go  7              12                   58.33%
pkg/host/mock/mock_host.go                  0              11                   0.0%
pkg/host/internal/cpu/cpu.go                5              21                   23.81%

Files with Coverage Reduction               New Missed Lines  %
-----------------------------               ----------------  ------
controllers/drain_controller.go             4                 67.1%
pkg/host/internal/lib/ghw/mock/mock_ghw.go  6                 63.33%

Totals
  Change from base Build 11589482683: -0.07%
  Covered Lines: 6792
  Relevant Lines: 14939

💛 - Coveralls

Collaborator

@adrianchiris adrianchiris left a comment

lgtm !

@adrianchiris
Collaborator

adrianchiris commented Oct 28, 2024

@zeeke any idea why the k8s CI is failing? I see it creates a test pod that gets stuck in the Pending state, but I don't see it in the downloaded artifacts.

@SchSeba
Collaborator

SchSeba commented Oct 28, 2024

Yep, I was trying to look into the same issue and can't find the artifacts either.

@SchSeba
Collaborator

SchSeba commented Oct 29, 2024

Hi @adrianchiris, I checked this one manually, and this is the problem:

Events:
  Type     Reason                  Age   From               Message
  ----     ------                  ----  ----               -------
  Normal   Scheduled               29s   default-scheduler  Successfully assigned sriov-conformance-testing/testpod-dkm8x to opr-k8s2-2-worker-0.opr-k8s2-2.lab
  Warning  FailedCreatePodSandBox  28s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_testpod-dkm8x_sriov-conformance-testing_ec9d3883-5506-4dbd-9209-650714e4f5dc_0(1f7b373205c116ac65ae9dea567b42dcf158ea95086ac250d3fcbeb861ee16cf): error adding pod sriov-conformance-testing_testpod-dkm8x to CNI network "multus-cni-network": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"1f7b373205c116ac65ae9dea567b42dcf158ea95086ac250d3fcbeb861ee16cf" Netns:"/var/run/netns/0493246d-fcb0-43f4-b5ad-a1dc1b0a8607" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=sriov-conformance-testing;K8S_POD_NAME=testpod-dkm8x;K8S_POD_INFRA_CONTAINER_ID=1f7b373205c116ac65ae9dea567b42dcf158ea95086ac250d3fcbeb861ee16cf;K8S_POD_UID=ec9d3883-5506-4dbd-9209-650714e4f5dc" Path:"" ERRORED: error configuring pod [sriov-conformance-testing/testpod-dkm8x] networking: [sriov-conformance-testing/testpod-dkm8x/ec9d3883-5506-4dbd-9209-650714e4f5dc:cbr0]: error adding container to network "cbr0": plugin type="flannel" failed (add): loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: no such file or directory
': StdinData: {"capabilities":{"portMappings":true},"clusterNetwork":"/host/etc/cni/net.d/10-flannel.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","type":"multus-shim"}
  Warning  FailedCreatePodSandBox  16s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_testpod-dkm8x_sriov-conformance-testing_ec9d3883-5506-4dbd-9209-650714e4f5dc_0(504afd72b732253ad2392e43d8333d7fee0c9c1f434011afdba9a9fd072e7327): error adding pod sriov-conformance-testing_testpod-dkm8x to CNI network "multus-cni-network": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"504afd72b732253ad2392e43d8333d7fee0c9c1f434011afdba9a9fd072e7327" Netns:"/var/run/netns/9d4a2323-d27e-4891-827b-96b0f7361f06" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=sriov-conformance-testing;K8S_POD_NAME=testpod-dkm8x;K8S_POD_INFRA_CONTAINER_ID=504afd72b732253ad2392e43d8333d7fee0c9c1f434011afdba9a9fd072e7327;K8S_POD_UID=ec9d3883-5506-4dbd-9209-650714e4f5dc" Path:"" ERRORED: error configuring pod [sriov-conformance-testing/testpod-dkm8x] networking: [sriov-conformance-testing/testpod-dkm8x/ec9d3883-5506-4dbd-9209-650714e4f5dc:cbr0]: error adding container to network "cbr0": plugin type="flannel" failed (add): loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: no such file or directory
': StdinData: {"capabilities":{"portMappings":true},"clusterNetwork":"/host/etc/cni/net.d/10-flannel.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","type":"multus-shim"}
  Warning  FailedCreatePodSandBox  2s  kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_testpod-dkm8x_sriov-conformance-testing_ec9d3883-5506-4dbd-9209-650714e4f5dc_0(0f06a33555520c3c1acebe92c69969c11791201ad1d33b47e4be723a8228a1a9): error adding pod sriov-conformance-testing_testpod-dkm8x to CNI network "multus-cni-network": plugin type="multus-shim" name="multus-cni-network" failed (add): CmdAdd (shim): CNI request failed with status 400: 'ContainerID:"0f06a33555520c3c1acebe92c69969c11791201ad1d33b47e4be723a8228a1a9" Netns:"/var/run/netns/7eee1a81-5b22-4b85-ac1d-352d32ed88e6" IfName:"eth0" Args:"IgnoreUnknown=1;K8S_POD_NAMESPACE=sriov-conformance-testing;K8S_POD_NAME=testpod-dkm8x;K8S_POD_INFRA_CONTAINER_ID=0f06a33555520c3c1acebe92c69969c11791201ad1d33b47e4be723a8228a1a9;K8S_POD_UID=ec9d3883-5506-4dbd-9209-650714e4f5dc" Path:"" ERRORED: error configuring pod [sriov-conformance-testing/testpod-dkm8x] networking: [sriov-conformance-testing/testpod-dkm8x/ec9d3883-5506-4dbd-9209-650714e4f5dc:cbr0]: error adding container to network "cbr0": plugin type="flannel" failed (add): loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: no such file or directory
': StdinData: {"capabilities":{"portMappings":true},"clusterNetwork":"/host/etc/cni/net.d/10-flannel.conflist","cniVersion":"0.3.1","logLevel":"verbose","logToStderr":true,"name":"multus-cni-network","type":"multus-shim"}

This is not related to the PR, for sure.
I am trying to fix the CI issue.

zeeke added 2 commits October 30, 2024 11:28
Kernel arguments like `intel_iommu=on` do not make sense
on AMD or ARM systems, and some users might complain about
their presence, though they are likely to be harmless.

Also, on ARM systems the `iommu.passthrough` parameter is the
one to use [1].

Improve `GHWLib` to bridge CPU information from the library.
Add `CpuInfoProviderInterface` and inject it into the GenericPlugin
to implement the per-CPU-vendor logic.

[1] https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/kernel-parameters.txt#L2343

Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
To include
- jaypipes/ghw#387

Signed-off-by: Andrea Panattoni <apanatto@redhat.com>
Collaborator

@SchSeba SchSeba left a comment

LGTM

nice work!

@SchSeba SchSeba merged commit 2b02ba1 into k8snetworkplumbingwg:master Oct 31, 2024
13 of 14 checks passed