Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: remove unneccessary ns validation #2066

Closed
wants to merge 1 commit into from

Conversation

zhixian82
Copy link
Contributor

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zhixian82
Once this PR has been reviewed and has the lgtm label, please assign andreyvelich for approval by writing /assign @andreyvelich in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gaocegege
Copy link
Member

/cc @andreyvelich @tenzen-y

@@ -79,25 +78,6 @@ func (v *ExperimentValidator) Handle(ctx context.Context, req admission.Request)
}
}

// After metrics collector sidecar injection in Job level done, delete validation for namespace labels
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what sidecar injection in Job level means. This comment seemed to add in #828 3 years ago.
After that, @andreyvelich refactored the getKatibJob function (maybe, for job level injection?) in #1303.

@andreyvelich @johnugeorge Do you remember what this comment means?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think, this comment is still valid.

In Job level, we look at type of Job(TFJob, PytorchJob etc) and decide the job pod which needs metrics collector injection. In Pod level, we check pod level labels and decide if we need to inject metrics collector or not. Right now, we are doing pod level where inject metric collector in the pods that contain PrimaryPodLabels or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for clarifying! @johnugeorge

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tenzen-y I guess the idea behind Job level mutation was described here: https://github.com/kubeflow/katib/blob/master/docs/proposals/metrics-collector.md#mutating-webhook.
But since we dynamically support any kind of Kubernetes resources, we mutate only in the Pod level.

Copy link
Member

@tenzen-y tenzen-y Dec 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. I was thinking we cannot inject sidecars in Job level since we support any K8s resources, as you say.
Maybe, it would be good to remove the comment to avoid confusion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnugeorge @andreyvelich @tenzen-y Thanks for the explanation! I will close this pr

@johnugeorge
Copy link
Member

The reason behind namespace selector was that we need to identify namespaces that can run katib workloads. In order to reduce the performance effect(due to webhooks) for the whole cluster which can have non Kubeflow/Katib workloads as well, we were specifically looking for namespaces that has katib.kubeflow.org/metrics-collector-injection label. This change will have performance issues in a shared cluster

@@ -63,9 +63,6 @@ webhooks:
name: katib-controller
namespace: kubeflow
path: /mutate-pod
namespaceSelector:
matchLabels:
katib.kubeflow.org/metrics-collector-injection: enabled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't remove this namespaceSelector.
Otherwise, Katib Mutation Webhook will be executing in every Kubernetes namespace.

@zhixian82 zhixian82 closed this Dec 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants