-
Notifications
You must be signed in to change notification settings - Fork 137
Enable istio injection in Kubeflow namespace #131
Enable istio injection in Kubeflow namespace #131
Conversation
/assign @kkasravi |
@Bobgy This change looks good but before we can merge this change, I think we need to update kubeflow/manifests and disable ISTIO side car injection for most services to preserve the current behavior. |
@jlewi I didn't manage to get enough time to do this today. |
@jlewi Thanks! I will build a binary by myself first. /hold for kubeflow manifest change |
FYI, I built the binary directly, it's working well. Thanks! |
@jlewi We tried using Line 126 in 10c8271
kfctl , it errors
it seems it is still built for linux. How do we build one for Mac? /cc @gaoning777 |
Awesome. I guess it should be working from #162, then. |
4f1be7c
to
f9bf3d9
Compare
The last blocker: kubeflow/manifests#716 UPDATE: I found the fix, submitted new PR: kubeflow/manifests#804 |
Looks like a bunch of tests are failing. |
I read the error message saying there is an error running kustomize apply for 0.7-branch, because webhook doesn't respond. That's expected because with the flag turned on webhook is inaccessible by default. Only when using master branch, would the deploy work. |
It looks like two workflows failed. The first failure is the upgrade test. The build deploy step failed. Here are the logs The error is
The test is trying to apply the 0.7.0 config.
This looks like it might be related to the GKE 1.14 bug with workload identity. This should be fixed in the 0.7.1 manifests but I'm guessing not the 0.7.0 configs. @richardsliu what is the best way to fix the upgrade tests? @Bobgy Could you please file a P0 bug about the upgrade test failures and assign to @richardsliu Looking at our periodic test grid. It looks like the build_deploy step is passing. So you might need to rebase to pick up changes to fix the test. |
Thanks for the investigation, I will continue efforts on this after vacation. |
/retest |
@Bobgy do you know what the test failures are? Is it still TFJob? Did you try patching the test to not run with ISTIO side car injection? |
@jlewi, thanks for following up!
Yes, tfjob-smoke-test is still timing out without any explicit error messages.
I tried patching the test in b3ddd39, but now I understand I wasn't patching the right yaml. I should be patching TFJob, but I haven't found it yet. |
OK, now I understand this line: https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/kubeflow_kfctl/131/kubeflow-kfctl-presubmit/1239771165149368320/#1:build-log.txt%3A749 is actually calling test in https://github.com/kubeflow/tf-operator/blob/master/py/kubeflow/tf_operator/simple_tfjob_tests.py but I still don't know which version is it pulling, is it tf operator's master? Should I send a PR in tf operator repo to add that annotation? |
The google groups need to be sync'd manually (there's an open issue to fix that). So you may or may not have been added to the group and that could explain why you still don't have access (kubeflow/internal-acls#22). If you need to fix the test you will need to submit a PR to the repo that fixes the test. I believe its pulling the test from tf-operator at HEAD
|
Thanks for the help! I will try to make a PR for tf-operator test |
@Bobgy I just checked and it looks like the groups hadn't been sync'd since you submitted your PR to join ci-viewer team. I sync'd them and you should now be a member. |
/retest |
@jlewi Thanks! I verified I can see |
Fantastic work! Thanks for the persistence. /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jlewi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@jlewi Thanks for your help! /unhold |
This reverts commit 6fc3f29.
* Enable istio injection in Kubeflow namespace * Test disabling sidecar in e2e test workflow * Update kfctl_e2e_workflow.py * Update kfctl_e2e_workflow.py * Update kfctl_e2e_workflow.py * Update kfctl_e2e_workflow.py
This reverts commit 6fc3f29.
Co-authored-by: Kebechet <noreply+kebechet@redhat.com>
Co-authored-by: Kebechet <noreply+kebechet@redhat.com>
/cc @jlewi
Can you help me review?
Resolves kubeflow/kubeflow#3866
This PR is imitating #123, which is similar
This change is