Skip config if policy not applied yet #287

Merged (1 commit) on Jul 14, 2022

Conversation

rollandf
Contributor

@rollandf rollandf commented Apr 26, 2022

There is a stage while the SriovNetworkNodeState is initializing
where the spec is empty because the SriovNetworkNodePolicyReconciler
has not yet applied the policies.

This can trigger an unnecessary action by the plugins, which will try
to apply the empty spec, for example by resetting the NIC.

With this change, the config daemon does not run the plugins if the
generation is 1 and Spec.Interfaces is empty.
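
As a rough illustration of that condition (a minimal sketch, not the code in this PR): `NodeState`, `InterfaceSpec`, and `shouldSkipPlugins` are hypothetical stand-ins for the operator's real types and daemon logic. Only the "generation is 1 and Spec.Interfaces is empty" check comes from the description above.

```go
package main

import "fmt"

// InterfaceSpec is a hypothetical, trimmed-down stand-in for one entry of
// Spec.Interfaces; the field names are assumptions for illustration only.
type InterfaceSpec struct {
	PciAddress string
	NumVfs     int
}

// NodeState is a hypothetical stand-in for SriovNetworkNodeState that keeps
// only the two fields the check needs.
type NodeState struct {
	Generation int64
	Interfaces []InterfaceSpec // stands in for Spec.Interfaces
}

// shouldSkipPlugins reports whether the plugins should be skipped: the object
// is still at its first generation and no policy has reached the spec yet.
func shouldSkipPlugins(s NodeState) bool {
	return s.Generation == 1 && len(s.Interfaces) == 0
}

func main() {
	states := []NodeState{
		{Generation: 1}, // freshly created, spec not yet populated
		{Generation: 2}, // spec was written at least once, even if empty
		{Generation: 1, Interfaces: []InterfaceSpec{{PciAddress: "0000:01:00.0", NumVfs: 4}}},
	}
	for _, s := range states {
		fmt.Printf("generation=%d interfaces=%d skip=%t\n",
			s.Generation, len(s.Interfaces), shouldSkipPlugins(s))
	}
}
```

Note that an empty spec at a later generation is not skipped in this sketch, since at that point the empty spec was written deliberately.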

Solves issue #283

For e2e tests, the wait timeout for reaching the initial Sync state has been
increased.
This change is needed because the config daemon now does not act on an "empty"
spec until the SriovNetworkNodePolicyReconciler has iterated over the Interfaces.
The reconcile loop interval is 5 minutes, so the test timeout had to be increased.

Signed-off-by: Fred Rolland <frolland@nvidia.com>

@github-actions

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
Best regards.

Collaborator

@e0ne e0ne left a comment

LGTM

@adrianchiris
Collaborator

/test-all

2 similar comments
@abdallahyas
Contributor

/test-all

@e0ne
Collaborator

e0ne commented Apr 28, 2022

/test-all

pkg/daemon/daemon.go (outdated, resolved)
@adrianchiris
Collaborator

one minor nit, otherwise LGTM

@adrianchiris
Collaborator

/test-all

@abdallahyas
Contributor

abdallahyas commented May 8, 2022

@adrianchiris @rollandf I am not sure if it is related, but in the last CI failure the sriovnetworknodestate was missing the syncStatus field (see sriovnetworknodestate).
We also see


I0508 11:08:50.751581    1394 writer.go:163] setNodeStateStatus(): syncStatus: , lastSyncError: 
I0508 11:08:56.009989    1394 daemon.go:311] get item: 1
I0508 11:08:56.010049    1394 daemon.go:401] nodeStateSyncHandler(): new generation is 1
I0508 11:08:56.023458    1394 daemon.go:424] nodeStateSyncHandler(): Interface policy spec not yet set
I0508 11:08:56.023491    1394 daemon.go:346] Successfully synced
I0508 11:08:56.023509    1394 daemon.go:309] worker queue size: 0
I0508 11:08:56.116046    1394 daemon.go:284] Run(): period refresh
I0508 11:08:56.121430    1394 daemon.go:985] tryCreateSwitchdevUdevRule()
I0508 11:08:56.121583    1394 daemon.go:1043] tryCreateNMUdevRule()
I0508 11:09:11.010630    1394 daemon.go:311] get item: 1
I0508 11:09:11.010672    1394 daemon.go:401] nodeStateSyncHandler(): new generation is 1
I0508 11:09:11.019890    1394 daemon.go:424] nodeStateSyncHandler(): Interface policy spec not yet set
I0508 11:09:11.019917    1394 daemon.go:346] Successfully synced
I0508 11:09:11.019928    1394 daemon.go:309] worker queue size: 0

in the config daemon logs, even though the VFs were configured (see kind worker interfaces).

@abdallahyas
Contributor

/test-all

1 similar comment
@abdallahyas
Contributor

/test-all

@abdallahyas
Contributor

/test-all

@rollandf
Contributor Author

rollandf commented May 9, 2022

/test-all

1 similar comment
@abdallahyas
Contributor

/test-all

@@ -32,7 +32,7 @@ var _ = Describe("Operator", func() {
 		Eventually(func() *cluster.EnabledNodes {
 			sriovInfos, _ = cluster.DiscoverSriov(clients, testNamespace)
 			return sriovInfos
-		}, timeout, interval).ShouldNot(BeNil())
+		}, Timeout*15, RetryInterval*20).ShouldNot(BeNil())
Collaborator

With this change, the sriovnetworknodestate is reconciled by the config daemon only after the controller sets the spec (empty or otherwise).

The timeout is increased to give the controller time to resync the nodestate with the actual spec and to let the config daemon reconcile.

see
#305
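
The timeout reasoning can be sketched with a small polling helper. This is a minimal sketch, not the Gomega `Eventually` call used in the test: `pollUntil`, `shortTimeout`, `longTimeout`, and `tick` are hypothetical names, and the durations are scaled down to milliseconds so the example runs quickly. The point it illustrates: if the condition only becomes true after the controller's next resync (5 minutes, per the PR description), the polling timeout has to exceed that interval.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical, scaled-down durations used only by this sketch.
const (
	tick         = time.Millisecond
	shortTimeout = 5 * time.Millisecond // shorter than the simulated resync
	longTimeout  = time.Second          // comfortably longer than the resync
)

var errTimeout = errors.New("timed out waiting for condition")

// pollUntil calls cond every interval until it returns true or the timeout
// has elapsed. The condition is always checked at least once.
func pollUntil(cond func() bool, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulate a state that becomes ready only after the next resync.
	resync := 50 * time.Millisecond
	start := time.Now()
	ready := func() bool { return time.Since(start) >= resync }

	fmt.Println("short timeout:", pollUntil(ready, shortTimeout, tick))
	fmt.Println("long timeout: ", pollUntil(ready, longTimeout, tick))
}
```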

Collaborator

@rollandf question : do we need this now that we are updating sync status immediately to success on generation 1 ?

Collaborator

@adrianchiris adrianchiris left a comment

LGTM

@adrianchiris
Collaborator

@SchSeba @Eoghan1232 can we merge this ?

@adrianchiris
Collaborator

/test-all

@adrianchiris
Collaborator

/test-all

@adrianchiris
Collaborator

/test-all

Collaborator

@SchSeba SchSeba left a comment

just a small nit beside that LGTM

@@ -278,6 +278,11 @@ func (r *SriovNetworkNodePolicyReconciler) syncSriovNetworkNodeState(np *sriovne
 			return fmt.Errorf("failed to get SriovNetworkNodeState: %v", err)
 		}
 	} else {
+		if len(found.Status.Interfaces) == 0 {
+			logger.Info("SriovNetworkNodeState Status Interfaces are empty. Skip update of policies in spec")
Collaborator

nit: can you please add here "name", ns.Name

Contributor Author

Done
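
A minimal sketch of the controller-side guard from this hunk, with the object name added to the log message as requested in the nit. `nodeState`, `syncSpec`, and the string-slice fields are hypothetical simplifications; the real code works on the `SriovNetworkNodeState` object and uses a structured logger rather than `fmt.Printf`.

```go
package main

import "fmt"

// nodeState is a hypothetical, trimmed-down stand-in for
// SriovNetworkNodeState; interfaces are reduced to plain names.
type nodeState struct {
	Name             string
	StatusInterfaces []string // stands in for Status.Interfaces
	SpecInterfaces   []string // stands in for Spec.Interfaces
}

// syncSpec copies the rendered policy interfaces into the spec, unless the
// status does not list any interfaces yet. It returns true when the spec
// was updated.
func syncSpec(ns *nodeState, rendered []string) bool {
	if len(ns.StatusInterfaces) == 0 {
		// Log the object name as a key/value pair next to the message.
		fmt.Printf("SriovNetworkNodeState Status Interfaces are empty. "+
			"Skip update of policies in spec name=%q\n", ns.Name)
		return false
	}
	// Copy so later changes to rendered do not alias the spec.
	ns.SpecInterfaces = append([]string(nil), rendered...)
	return true
}

func main() {
	ns := &nodeState{Name: "worker-0"}
	fmt.Println("updated:", syncSpec(ns, []string{"ens1f0"}))

	// Once the status lists interfaces, the spec is updated.
	ns.StatusInterfaces = []string{"ens1f0"}
	fmt.Println("updated:", syncSpec(ns, []string{"ens1f0"}))
}
```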

@rollandf
Contributor Author

/test-all

1 similar comment
@e0ne
Collaborator

e0ne commented Jul 12, 2022

/test-all

@rollandf
Contributor Author

@SchSeba Can you please take another look?

@SchSeba
Collaborator

SchSeba commented Jul 13, 2022

/lgtm

@github-actions github-actions bot added the lgtm label Jul 13, 2022
Collaborator

@adrianchiris adrianchiris left a comment

final question about latest changes and i think we can merge

@adrianchiris
Collaborator

/test-all

Collaborator

@adrianchiris adrianchiris left a comment

This has been a long one :) LGTM
