Introduce DPU Network Operator #890

Merged

Conversation


@pliurh pliurh commented Sep 7, 2021

Signed-off-by: Peng Liu

@openshift-ci openshift-ci bot requested review from danwinship and dougbtv September 7, 2021 13:30
@pliurh pliurh force-pushed the dpu-ovnkube-operator branch from bb7d3fe to e021d60 Compare September 7, 2021 13:55

The DPU OVNKube Operator will be responsible for enable the switchdev mode on
BF2 cards. We assume all the BF2 cards will be put into a custom
MachineConfigPool,
Contributor

comma should be a period, or else there's something missing at the end?

Who is responsible for the custom MachineConfigPool? When are they supposed to create it?

Contributor Author

Who is responsible for the custom MachineConfigPool? When are they supposed to create it?

That is a good question. My assumption is that the cluster-admin will label the nodes and create the MCP. WDYT?
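(For illustration only: assuming the cluster-admin flow described above, the custom MCP could look like the sketch below. The pool name, role value, and node label are hypothetical, not defined by this enhancement.)

```yaml
# Hypothetical sketch of the custom MachineConfigPool the cluster-admin
# would create; the "dpu" name and node-role label are illustrative.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: dpu
spec:
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [worker, dpu]
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/dpu: ""
```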


We will create a namespaced CRD OVNKubeConfig.smartnic.openshift.io for the
DPU OVNKube Operator. One custom resource of this CRD shall be created for
the tenant cluster. The operator will take this custom resource as input to
Contributor

it's created "for the tenant cluster", but it's created in the infra cluster, which again probably needs to be clarified.

How is the admin expected to create this resource? We should probably provide at least a script or something to grab the necessary information from the tenant cluster and create the resource in the infra cluster...

Contributor Author

@pliurh pliurh Sep 14, 2021

I don't quite get the question. This operator only needs the kubeconfig of the tenant cluster, then it can access the apiserver to grab all the info. Do you mean we need a script to generate the kubeconfig file for the tenant cluster?

My assumption is that the whole process would look like this:

  1. After infra cluster installation, the admin labels the DPUs on which they want to enable hardware offload.
  2. The admin creates a custom MCP that contains all the selected DPUs.
  3. The admin configures the CNO to deprovision the ovn-kube components from the DPUs.
  4. The admin installs the DPU operator and creates an OVNKubeConfig CR with the poolName but leaves kubeConfigFile blank. The operator then does the host network configuration.
  5. Install the tenant cluster. Scale up the x86 workers with DPU.
  6. The CNO of the tenant cluster puts those x86 hosts into smart-nic host mode.
  7. The admin configures VFs on x86 node with the Sriov Network Operator.
  8. The admin stores the kubeconfig file in the infra cluster then adds kubeConfigFile to the existing OVNKubeConfig CR. The operator installs the ovn-kube components on the selected DPUs.
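(To make steps 4 and 8 concrete, a hypothetical OVNKubeConfig CR is sketched below. Only the OVNKubeConfig kind, the smartnic.openshift.io group, and the poolName/kubeConfigFile fields come from this proposal; the API version, namespace, and the idea of referencing a kubeconfig Secret by name are assumptions.)

```yaml
# Hypothetical CR for illustration; see the note above for which
# fields are assumptions rather than part of the proposal.
apiVersion: smartnic.openshift.io/v1
kind: OVNKubeConfig
metadata:
  name: ovnkubeconfig-sample
  namespace: dpu-network-operator   # assumed namespace
spec:
  # Step 4: set poolName to the custom MCP; leave kubeConfigFile blank.
  poolName: dpu
  # Step 8: once the tenant kubeconfig is stored in the infra cluster,
  # reference it here to trigger ovn-kube deployment on the DPUs.
  kubeConfigFile: tenant-cluster-kubeconfig   # assumed to name a Secret
```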

@pliurh pliurh force-pushed the dpu-ovnkube-operator branch 2 times, most recently from f918873 to 0eaa1c7 Compare September 14, 2021 07:09
- enable the switchdev mode on DPUs.
- add PF representor to the br-ex bridge, so that the x86 host can connect to
the cluster network of the tenant cluster.
- Define the ovn-kube upgrade mechanism in such a deployment.
Contributor

I don't see the ovn-kube upgrade mechanism discussed in this doc (we talked about the ovnkube operator upgrade in the lifecycle management section).
How can we ensure the version of ovnkube-node in the infra cluster is compatible with the version of the ovn-kube components in the tenant cluster during cluster upgrade, or during ovnkube operator installation?

Contributor

For example, should this OVNKube operator fetch the image information from the tenant cluster as well, and use that image to spin up ovnkube-node in the infra cluster?

Contributor Author

Can we assume the two clusters will always be on the same version and upgrade together? Then we can let the operator use the images currently used in the infra cluster.

Contributor

I think we need to define the version compatibility between the infra and tenant clusters, i.e. what is supported and what is not. For example, we may say that a 4.x tenant is compatible with a 4.x+1 or 4.x+n infra.

This would allow new features to be introduced without updating the tenant cluster. For example, an update of the RHCOS image in the infra cluster (openvswitch, drivers, etc.) may bring in new features; an operator designed to run on the infra cluster would also benefit from it.

Thinking out loud: if we were to move ovn-central to the infra cluster, it would further decouple the infra and tenant in terms of cluster or ovnkube versions, which would allow more features to be added without updating the tenant cluster.

## Design Details

The DPU OVNKube Operator will fetch the necessary information from the
tenant cluster, then create the following objects in the infra cluster:
Contributor

Shall we add that the OVNKube operator will watch runtime ovn-k8s changes in the tenant cluster and reflect them accordingly in the infra cluster? For example, whenever a ConfigMap or Secret is changed in the tenant cluster.

Contributor Author

Yeah, it's a good point. AFAIK, those ovnkube objects cannot be changed after cluster installation, because the CNO doesn't allow it. But the certificates do expire, so we need to watch for that change in the tenant cluster.

Contributor

@zshi-redhat zshi-redhat Sep 16, 2021

One example I had in mind was IPsec enablement: it can be done after initial cluster installation, and the ovnkube Operator will need to render the IPsec daemonset in the infra cluster.


## Design Details

The DPU OVNKube Operator will fetch the necessary information from the
Contributor

Does this Operator need to handle kube API compatibility between infra and tenant clusters?

Contributor Author

Is that necessary? I assume both clusters will be on the same version.

Contributor

@zshi-redhat zshi-redhat Sep 16, 2021

If we allow the tenant and infra clusters to be on different versions (which I think we should, if not in the initial delivery), then yes.

@pliurh pliurh force-pushed the dpu-ovnkube-operator branch from 0eaa1c7 to 16af993 Compare September 15, 2021 06:59
## Motivation

Normally, the Cluster Network Operator watches the Network custom resource,
then provisions the ovn-kube components to the OpenShift cluster.
Contributor

Normally CNO manages the ovn-kube components. If this new operator is taking over that role for the infra cluster, do we need to document somewhere that CNO needs to be changed to not manage OVN-Kube on infra clusters and what's the CNO trigger for that?

Contributor Author

This operator is not taking over that role for the infra cluster; it manages the ovnkube that serves the tenant cluster. The CNO will still manage the ovnkube of the infra cluster on non-BF2 nodes.

@zshi-redhat Shall we mention these CNO changes in #732?

Contributor

I think we need a patch in CNO similar to openshift/cluster-network-operator#1138, to prevent the CNO-managed ovnkube from being scheduled on DPUs in the infra cluster. For example, we could introduce the label network.operator.openshift.io/dpu, which would be used by the CNO in the infra cluster.
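(Sketch only: assuming the label above is adopted, a DPU node in the infra cluster would carry it, and the CNO-rendered ovnkube-node daemonset could then exclude such nodes via node affinity. The node name is hypothetical.)

```yaml
# Illustrative Node fragment; only the label key comes from the comment above.
apiVersion: v1
kind: Node
metadata:
  name: worker-bf2-0   # hypothetical node name
  labels:
    network.operator.openshift.io/dpu: ""
```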

Contributor

@pliurh yes, #732 needs to be updated.

@danwinship
Contributor

can you move this into the enhancements/network/dpu/ directory (which should exist now in master)?

you may need to add some more dummy sections to make the latest markdownlint job happy too...

@pliurh pliurh force-pushed the dpu-ovnkube-operator branch from d052542 to 52d1b2e Compare November 4, 2021 03:11
Signed-off-by: Peng Liu <Peng Liu>
@pliurh pliurh force-pushed the dpu-ovnkube-operator branch from 52d1b2e to 6caff80 Compare November 4, 2021 03:28
@pliurh pliurh changed the title Introduce DPU OVNKube Operator Introduce DPU Network Operator Nov 4, 2021
@pliurh
Contributor Author

pliurh commented Nov 4, 2021

can you move this into the enhancements/network/dpu/ directory (which should exist now in master)?

you may need to add some more dummy sections to make the latest markdownlint job happy too...

Done

@danwinship
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 4, 2021
@danwinship
Contributor

/approve

@openshift-ci
Contributor

openshift-ci bot commented Nov 8, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 8, 2021
@openshift-merge-robot openshift-merge-robot merged commit cf1d902 into openshift:master Nov 8, 2021