external control plane #309
Conversation
Thanks for your PR,
To skip the vendors CIs use one of:
pkg/utils/cluster.go
Outdated
infra := &configv1.Infrastructure{}
err := c.Get(context.TODO(), types.NamespacedName{Name: infraResourceName}, infra)
if err != nil {
	return "", err
Please add context information to the error:
return "", err
return "", fmt.Errorf("can't get Infrastructure [%s]: %w", infraResourceName, err)
updated.
Force-pushed 519ff98 to 65be425
@zshi-redhat can you explain a bit more about the use case?
@adrianchiris The use case is for managed Kubernetes services (similar to AWS EKS or Google GKE), where the control plane components (apiserver, etcd, kube-scheduler, etc.) are hosted by the cloud service provider in the provider-managed Kubernetes cluster, and only the workload components (worker nodes) are visible to the end user. In such a case, the end user can still deploy the sriov operator on the worker nodes by accessing the externally managed k8s apiserver. The DPU two-cluster design can take advantage of this as well, where the infra and tenant control planes are hosted in a separate k8s cluster (a management cluster) to reduce the footprint of master nodes. dpu-network-operator runs in the management cluster, which has access to both infra and tenant APIs.
Force-pushed 65be425 to 7e0a91d
Force-pushed 7e0a91d to 6fcedfa
Force-pushed 6fcedfa to 569a819
            operator: Exists
        - matchExpressions:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
      tolerations:
{{ if .ExternalControlPlane }}
      - key: "node-role.kubernetes.io/worker"
worker taint? is that a thing?
removed.
@@ -32,13 +32,21 @@ spec:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
{{ if .ExternalControlPlane }}
In the case of an external control plane, maybe drop the entire 'requiredDuringScheduling...' block, or alternatively just add another matchExpression, as these are ORed. Replacing one with another is not good IMO.
dropped requiredDuringScheduling in External mode.
@@ -35,13 +35,21 @@ spec:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
{{ if .ExternalControlPlane }}
same comment as above
fixed.
            operator: Exists
        - matchExpressions:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
      tolerations:
{{ if .ExternalControlPlane }}
      - key: "node-role.kubernetes.io/worker"
same question about worker taint
removed "node-role.kubernetes.io/worker"
@@ -47,6 +47,10 @@ spec:
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: K8S_POD_NAME
- you already have POD_NAME defined below
- you can use spec.nodeName to expose the node name directly and avoid the logic added to get the node name from the pod
removed K8S_POD_NAME and added NODE_NAME.
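For reference, the downward API can expose `spec.nodeName` as an environment variable with no extra lookup logic; a minimal container `env` fragment sketching the pattern (the variable name NODE_NAME matches the thread, the rest is a generic downward-API stanza):

```yaml
env:
- name: NODE_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
```

This mirrors the existing `metadata.namespace` fieldRef already present in the manifest, so the daemon reads its node name directly instead of resolving it from the pod object.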
pkg/utils/cluster.go
Outdated
if err != nil {
	return false, err
}
switch role {
no need for switch here imo
if role == worker { return true, nil }
fixed.
pkg/utils/cluster.go
Outdated
}

nodeList := &corev1.NodeList{}
err = c.List(context.TODO(), nodeList)
you can get the node by name, no? no need to list and iterate here
fixed.
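The reviewer's point is that a keyed Get is cheaper and clearer than listing every node and scanning for a match. A stdlib-only sketch with a toy in-memory client (the `Node` and `fakeClient` types are illustrative stand-ins for corev1.Node and the controller-runtime client, not real API types):

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a toy stand-in for corev1.Node.
type Node struct {
	Name   string
	Labels map[string]string
}

// fakeClient mimics the two access patterns under review.
type fakeClient struct{ nodes map[string]Node }

// Get fetches one object by key, like client.Get with a NamespacedName.
func (c *fakeClient) Get(name string) (Node, error) {
	n, ok := c.nodes[name]
	if !ok {
		return Node{}, errors.New("node not found: " + name)
	}
	return n, nil
}

// List returns everything, like client.List; callers then have to iterate.
func (c *fakeClient) List() []Node {
	out := make([]Node, 0, len(c.nodes))
	for _, n := range c.nodes {
		out = append(out, n)
	}
	return out
}

func main() {
	c := &fakeClient{nodes: map[string]Node{
		"worker-0": {Name: "worker-0"},
		"worker-1": {Name: "worker-1"},
	}}
	// Preferred: direct Get by name, no list-and-iterate pass.
	n, err := c.Get("worker-1")
	fmt.Println(n.Name, err == nil)
}
```

Beyond readability, the direct Get avoids pulling the whole node list over the wire and works the same against a cache-backed client.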
pkg/utils/cluster.go
Outdated
infraResourceName = "cluster"
workerNodeLabelKey = "node-role.kubernetes.io/worker"
masterNodeLabelKey = "node-role.kubernetes.io/master"
what about node-role.kubernetes.io/control-plane? In k8s 1.24, kubeadm replaced master with control-plane (the master label is deprecated).
added support for both.
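Supporting both labels amounts to treating a node as control-plane if either key is present. A stdlib-only sketch (the label keys are the ones discussed in the thread; the helper name and use of a plain label map are illustrative):

```go
package main

import "fmt"

const (
	masterNodeLabelKey       = "node-role.kubernetes.io/master"
	controlPlaneNodeLabelKey = "node-role.kubernetes.io/control-plane"
)

// isControlPlaneNode reports whether a node carries either the legacy
// master label or the newer control-plane label (kubeadm >= 1.24).
// Node-role labels are presence-only, so only the key matters.
func isControlPlaneNode(labels map[string]string) bool {
	_, isMaster := labels[masterNodeLabelKey]
	_, isControlPlane := labels[controlPlaneNodeLabelKey]
	return isMaster || isControlPlane
}

func main() {
	legacy := map[string]string{masterNodeLabelKey: ""}
	modern := map[string]string{controlPlaneNodeLabelKey: ""}
	worker := map[string]string{"node-role.kubernetes.io/worker": ""}
	fmt.Println(isControlPlaneNode(legacy), isControlPlaneNode(modern), isControlPlaneNode(worker))
}
```

Checking both keys keeps the operator working across clusters created before and after the kubeadm label transition.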
Force-pushed 569a819 to 39b8086
Force-pushed 39b8086 to b4f74fc
Can you put your first comment in the commit message? Or something more elaborate?
Force-pushed b4f74fc to 51e05b3
added the comment in the commit message.
/lgtm
Hi @zshi-redhat, would you please be able to rebase this PR?
Force-pushed 51e05b3 to c045a1f
rebased, thanks for reminding!
Pull Request Test Coverage Report for Build 3011830441
💛 - Coveralls
@zshi-redhat We could label the workers in the guest cluster with the "master" label (such that the role would be master,worker). In that case, this change would not be needed. For the sake of my understanding, is that correct? I still think it's nice to allow running on workers, so we will still want to take this PR in.
@bn222 This may not be desired, from my understanding, since the node pool is supposed to contain only worker nodes; I'm not sure whether there would be any side effects on other pods. By the way, do we still need this PR? Or is it going to be integrated into Tyler's?
Force-pushed c045a1f to a034880
@adrianchiris PTAL.
a couple of nits, otherwise LGTM
Force-pushed a034880 to 8136a82
Force-pushed 8136a82 to 34e7ec1
The use case is for managed kubernetes services (similar to aws EKS or google GKE), where the kubernetes control plane components (apiserver, etcd, kube-scheduler etc) are hosted by cloud service provider in the provider managed kubernetes cluster and only workload components (worker nodes) are visible to end user. In such case, end user can still deploy sriov operator on the worker node by accessing k8s apiserver hosted externally.

Signed-off-by: Zenghui Shi <zshi@redhat.com>
Force-pushed 34e7ec1 to 711302e
/test-all
LGTM, let's wait for the e2e tests to pass.
Enable the sriov operator in a cluster without master nodes (the k8s apiserver and other control plane components are managed externally).