-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate to v1.6.0-rc.1 #378
Migrate to v1.6.0-rc.1 #378
Conversation
Move `RequestAuthentication` policy creation to `iap-enabler` for git…
@@ -0,0 +1,64 @@ | |||
swagger: "2.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Consider adding a comment or README to explain where is this file coming from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a generic template. It came from the legacy cloud-endpoints-controller, which is no longer maintained. I am afraid that additional reference might confuse people. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true that it might be risky to refer to non-maintained repository. However, we are still using such legacy template within our solution. We should still document where this template is coming from, so other maintainer knows that where to find it and the current state is repository not-maintained
. In situations like this, more documentation is better than less documentation.
Would you like to take the following actions?
- Create a README file within
kubeflow/common/iap-ingress
. - Explain where
swagger_template.yaml
andsetup_cloudendpoints.sh
are originating from - Explain that the
cloud-endpoints-controller
repository is no longer maintained. - We can also list potential improvement where we can move away from any legacy logic. I found a few materials below. Feel free to decide whether to include them:
openapi.yaml sample: https://github.com/GoogleCloudPlatform/endpoints-samples/blob/master/k8s/openapi.yaml
Single app example using IAP, GKE, Cloud endpoint: https://github.com/salrashid123/iap_endpoints_app
IAP ingress controller (might be deprecated as well): https://github.com/danisla/iapingress-controller/tree/master
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Please, review.
## iap-enabler | ||
|
||
[IAP uses](https://cloud.google.com/iap/docs/signed-headers-howto) JSON Web Tokens ([JWT](https://jwt.io/introduction)) to make sure that a request to kubeflow is authorized. This protects kubeflow from IAP being accidentally disabled, misconfigured firewalls, and access from within the project. This *Deployment* applies a RequestAuthentication (**ingress-jwt**) to the kubeflow cluster based on the [policy.yaml template](./base/policy.yaml). | ||
|
||
## backend-updater | ||
|
||
HTTPS Load Balancing requires a [health check](https://cloud.google.com/load-balancing/docs/health-check-concepts) to determine if backend (**istio-ingressgateway**) responds to traffic. This *StatefulSet* updates the **iap-backendconfig** with the appropriate backend port and backend path for the corresponding health check. | ||
|
||
## cloud-endpoints-enabler | ||
|
||
This *Deployment* is introduced to replace cloud-endpoints-controller. It [establishes a cloud endpoint](https://cloud.google.com/endpoints/docs/openapi/get-started-kubernetes-engine-espv2) using OpenAPI specification. It uses [swagger_template.yaml](./base/swagger_template.yaml) to build an appropriate OpenAPI spec. This template was used in the original [cloud-endpoint-controller](https://github.com/danisla/cloud-endpoints-controller) (deprecated) in Kubeflow 1.5.1 and earlier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good writeup! As an additional information, you can share the link to kubeflow/common/iap-ingress/base/config-map.yaml where people can view and update iap-enabler/backend-updater/cloud-endpoints-enabler.
/lgtm It is an awesome implementation in this PR! Great work Ablai! |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gkcalat, zijianjoy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
gcloud endpoints services add-iam-policy-binding ${ENDPOINT_NAME} \ | ||
--member serviceAccount:${SERVICE_ACCOUNTNAME} \ | ||
--role roles/servicemanagement.serviceController | ||
gcloud projects add-iam-policy-binding ${PROJECT} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need roles/cloudtrace.agent
?
If this is a new role for the admin SA at the project level, I think we need to move it to kf-admin-policy.yaml
:
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
name: KUBEFLOW-NAME-admin-cloudtraceagent # kpt-set: ${name}-admin-cloudtraceagent
spec:
member: serviceAccount:KUBEFLOW-NAME-admin@PROJECT.iam.gserviceaccount.com # kpt-set: serviceAccount:${name}-admin@${gcloud.core.project}.iam.gserviceaccount.com
role: roles/cloudtrace.agent
resourceRef:
apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
kind: Project
external: projects/PROJECT # kpt-set: projects/${gcloud.core.project}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to enable Cloud Trace for troubleshooting, which we might actually disable for now as it doesn't seem to be a necessary feature. Your suggestion on moving it to the YAML file SGTM though. Thank you for your feedback!
sed "s|JWT_AUDIENCE|${JWT_AUDIENCE}|;s|ENDPOINT_NAME|${ENDPOINT_NAME}|;s|INGRESS_TARGET_IP|${INGRESS_TARGET_IP}|" /var/envoy-config/swagger_template.yaml > openapi.yaml | ||
|
||
# Deploy and enable the endpoint | ||
gcloud endpoints services deploy openapi.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my setup, the admin SA does not have enough permissions to execute this.
I had to grant it roles/serviceusage.serviceUsageAdmin
(see: https://cloud.google.com/service-usage/docs/access-control#predefined_roles)
Perhaps we need a new entry in kf-admin-policy.yaml
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @fabito.
Thank you for your feedback!
The purpose of cloud-endpoints-enabler is to create and activate a cloud endpoint during deployment. It's supposed to be deleted at the end of make apply
run. The behavior you observed is not intended, as we recommend running make apply
and choose necessary components in config.yaml instead of deploying each component separately.
As per permissions, I was not able to reproduce the error you mentioned. Could you create a separate issue with details about your GKE cluster?
Current approach clearly has room for improvement. Your contributions are very welcome!
gcloud config list | ||
gcloud auth list | ||
|
||
set_endpoint () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we exit early if it is already setup ?
set_endpoint () { | |
set_endpoint () { | |
gcloud endpoints services describe ${ENDPOINT_NAME} | |
if [ $? == 0 ] ; then | |
echo "${ENDPOINT_NAME} cloud endpoint already setup" | |
return 0 | |
fi |
What about scenarios where we want to update it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of cloud-endpoints-enabler is to create and activate a cloud endpoint during deployment. It's supposed to be deleted at the end of make apply
run. We do not support update scenario for now. @zijianjoy mentioned above an idea about having boolean flags to decide when to delete cloud-endpoints-enabler, iap-enabler, and backend-updater workloads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, got it.
We don't use the makefiles in our setup. We keep the manifests/kustomizations in git and use Fluxcd to update our fleet.
I need to think about how we can better handle the *enabler workloads...
Thanks for the clarification.
I apologize for the large PR. Many items needed to be changed in order to test it on GKE 1.22 and 1.21.