Migrate to v1.6.0-rc.1 #378

gkcalat · 2022-08-04T17:41:19Z

I apologize for the large PR. Many items needed to be changed in order to test it on GKE 1.22 and 1.21.

Update CHANGELOG (Kubeflow post-v1.5 Work items #360)
Fix ASM/istio ingress gateway issue (502 error on loadbalancer due to incorrect path on backend health check #371)
Migrate deprecated API calls (Support Kubernetes 1.22 #349)
Remove deprecated KFServing (Remove deprecated KFServing in v 1.6 #375)
Remove deprecated cloud-endpoints-controller (Remove deprecated cloud-endpoints-controller #377)
Includes earlier upgrades of cert-manager to v1.5.0 (Upgrade cert-manager to 1.5.0 #372) and knative to v1.2 (Upgrade knative to v 1.2 #373)
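
For the deprecated API migration (#349), one quick way to spot manifests that still reference the old API versions is a recursive search. The following is a minimal sketch, assuming the manifests are plain YAML files under a local directory. The function name `find_deprecated_apis` is hypothetical, and the list of API groups is taken from the commit titles in this PR, so it may not be exhaustive.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list YAML lines that still reference the v1beta1 API
# versions this PR migrates away from (see the "Migrate from ..." commits).
# $1: directory to scan (defaults to the current directory).
find_deprecated_apis() {
    local dir="${1:-.}"
    find "${dir}" -type f \( -name '*.yaml' -o -name '*.yml' \) \
        -exec grep -nHE \
        'apiVersion:[[:space:]]*(authorization|networking|rbac\.authorization|apiextensions)\.k8s\.io/v1beta1' \
        {} +
}
```

Running `find_deprecated_apis kubeflow/` prints `file:line:match` entries for anything still on a `v1beta1` version of those groups; empty output means no such references were found.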

Tested on GKE 1.21 + Manifests v1.6.0-rc.1 + KFP 2.0.0-alpha.3
Tested on GKE 1.22 + Manifests v1.6.0-rc.1 + KFP 2.0.0-alpha.3


zijianjoy · 2022-08-05T08:11:12Z

kubeflow/common/iap-ingress/base/swagger_template.yaml

@@ -0,0 +1,64 @@
+swagger: "2.0"


NIT: Consider adding a comment or README to explain where this file comes from.

This is a generic template. It came from the legacy cloud-endpoints-controller, which is no longer maintained. I am afraid that an additional reference might confuse people. Let me know what you think.

It is true that referring to an unmaintained repository might be risky. However, we still use this legacy template in our solution. We should document where the template comes from, so that other maintainers know where to find it and know that the source repository is no longer maintained. In situations like this, more documentation is better than less.

Would you like to take the following actions?

Create a README file within kubeflow/common/iap-ingress.

Explain where swagger_template.yaml and setup_cloudendpoints.sh are originating from

Explain that the cloud-endpoints-controller repository is no longer maintained.

We can also list potential improvements that would let us move away from the legacy logic. I found a few materials below. Feel free to decide whether to include them:

openapi.yaml sample: https://github.com/GoogleCloudPlatform/endpoints-samples/blob/master/k8s/openapi.yaml
Single app example using IAP, GKE, Cloud endpoint: https://github.com/salrashid123/iap_endpoints_app
IAP ingress controller (might be deprecated as well): https://github.com/danisla/iapingress-controller/tree/master

Done. Please review.


zijianjoy · 2022-08-16T21:07:08Z

kubeflow/common/iap-ingress/README.md

+## iap-enabler
+
+[IAP uses](https://cloud.google.com/iap/docs/signed-headers-howto) JSON Web Tokens ([JWT](https://jwt.io/introduction)) to make sure that a request to kubeflow is authorized. This protects kubeflow from IAP being accidentally disabled, misconfigured firewalls, and access from within the project. This *Deployment* applies a RequestAuthentication (**ingress-jwt**) to the kubeflow cluster based on the [policy.yaml template](./base/policy.yaml).
+
+## backend-updater
+
+HTTPS Load Balancing requires a [health check](https://cloud.google.com/load-balancing/docs/health-check-concepts) to determine if backend (**istio-ingressgateway**) responds to traffic. This *StatefulSet* updates the **iap-backendconfig** with the appropriate backend port and backend path for the corresponding health check.
+
+## cloud-endpoints-enabler
+
+This *Deployment* is introduced to replace cloud-endpoints-controller. It [establishes a cloud endpoint](https://cloud.google.com/endpoints/docs/openapi/get-started-kubernetes-engine-espv2) using OpenAPI specification. It uses [swagger_template.yaml](./base/swagger_template.yaml) to build an appropriate OpenAPI spec. This template was used in the original [cloud-endpoint-controller](https://github.com/danisla/cloud-endpoints-controller) (deprecated) in Kubeflow 1.5.1 and earlier.


Very good writeup! As additional information, you could link to kubeflow/common/iap-ingress/base/config-map.yaml, where people can view and update iap-enabler, backend-updater, and cloud-endpoints-enabler.

zijianjoy · 2022-08-16T21:08:50Z

/lgtm
/approve

It is an awesome implementation in this PR! Great work Ablai!

google-oss-prow · 2022-08-16T21:09:00Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gkcalat, zijianjoy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [zijianjoy]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

fabito · 2022-08-17T00:56:32Z

kubeflow/common/iap-ingress/base/config-map.yaml

+      gcloud endpoints services add-iam-policy-binding ${ENDPOINT_NAME} \
+          --member serviceAccount:${SERVICE_ACCOUNTNAME} \
+          --role roles/servicemanagement.serviceController
+      gcloud projects add-iam-policy-binding ${PROJECT} \


Why do we need roles/cloudtrace.agent?
If this is a new role for the admin SA at the project level, I think we need to move it to kf-admin-policy.yaml:

    apiVersion: iam.cnrm.cloud.google.com/v1beta1
    kind: IAMPolicyMember
    metadata:
      name: KUBEFLOW-NAME-admin-cloudtraceagent # kpt-set: ${name}-admin-cloudtraceagent
    spec:
      member: serviceAccount:KUBEFLOW-NAME-admin@PROJECT.iam.gserviceaccount.com # kpt-set: serviceAccount:${name}-admin@${gcloud.core.project}.iam.gserviceaccount.com
      role: roles/cloudtrace.agent
      resourceRef:
        apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
        kind: Project
        external: projects/PROJECT # kpt-set: projects/${gcloud.core.project}

This is to enable Cloud Trace for troubleshooting, which we might actually disable for now as it doesn't seem to be a necessary feature. Your suggestion on moving it to the YAML file SGTM though. Thank you for your feedback!

fabito · 2022-08-17T00:59:04Z

kubeflow/common/iap-ingress/base/config-map.yaml

+      sed "s|JWT_AUDIENCE|${JWT_AUDIENCE}|;s|ENDPOINT_NAME|${ENDPOINT_NAME}|;s|INGRESS_TARGET_IP|${INGRESS_TARGET_IP}|" /var/envoy-config/swagger_template.yaml > openapi.yaml
+
+      # Deploy and enable the endpoint
+      gcloud endpoints services deploy openapi.yaml


In my setup, the admin SA does not have enough permissions to execute this.
I had to grant it roles/serviceusage.serviceUsageAdmin (see: https://cloud.google.com/service-usage/docs/access-control#predefined_roles).

Perhaps we need a new entry in kf-admin-policy.yaml?

After solving the permission issue above, a new endpoint is deployed every 30 seconds.

Is this expected?
Shouldn't we add a check and only deploy if necessary?

Hi @fabito.
Thank you for your feedback!

The purpose of cloud-endpoints-enabler is to create and activate a cloud endpoint during deployment. It is supposed to be deleted at the end of the make apply run. The behavior you observed is not intended, as we recommend running make apply and choosing the necessary components in config.yaml instead of deploying each component separately.

As for permissions, I was not able to reproduce the error you mentioned. Could you create a separate issue with details about your GKE cluster?

The current approach clearly has room for improvement. Your contributions are very welcome!
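
One possible shape for the "only deploy if necessary" check raised in this thread is to compare the freshly rendered spec with a copy saved after the last successful deploy. This is a sketch under stated assumptions, not what the PR implements: the function names, the `openapi.deployed.yaml` marker file, and the `DEPLOY_CMD` override are all hypothetical. Only the `sed` substitutions and the `gcloud endpoints services deploy` call come from config-map.yaml.

```shell
#!/usr/bin/env bash
# Render the OpenAPI spec with the same substitutions as the sed command
# in config-map.yaml.
render_openapi() {
    local template="$1" output="$2"
    sed "s|JWT_AUDIENCE|${JWT_AUDIENCE}|;s|ENDPOINT_NAME|${ENDPOINT_NAME}|;s|INGRESS_TARGET_IP|${INGRESS_TARGET_IP}|" \
        "${template}" > "${output}"
}

# Hypothetical guard: return 0 when there is no record of a previous deploy
# or when the rendered spec differs from that record.
spec_changed() {
    local rendered="$1" last_deployed="$2"
    [ ! -f "${last_deployed}" ] || ! cmp -s "${rendered}" "${last_deployed}"
}

# Sketch of one loop iteration. DEPLOY_CMD is an assumption: it defaults to
# the gcloud call used in the PR and can be overridden for a dry run.
deploy_if_changed() {
    local template="$1" workdir="$2"
    render_openapi "${template}" "${workdir}/openapi.yaml" || return 1
    if spec_changed "${workdir}/openapi.yaml" "${workdir}/openapi.deployed.yaml"; then
        ${DEPLOY_CMD:-gcloud endpoints services deploy} "${workdir}/openapi.yaml" || return 1
        # Record what was deployed so the next iteration can skip it.
        cp "${workdir}/openapi.yaml" "${workdir}/openapi.deployed.yaml"
    else
        echo "openapi.yaml unchanged, skipping deploy"
    fi
}
```

The marker file lives only as long as the pod's filesystem, so a restarted pod would deploy once more. That is harmless, but it means this guard reduces repeated deploys rather than strictly preventing them.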

fabito · 2022-08-17T04:46:20Z

kubeflow/common/iap-ingress/base/config-map.yaml

+    gcloud config list
+    gcloud auth list
+
+    set_endpoint () {


Should we exit early if it is already set up?

Suggested change

-    set_endpoint () {
+    set_endpoint () {
+        gcloud endpoints services describe ${ENDPOINT_NAME}
+        if [ $? == 0 ] ; then
+            echo "${ENDPOINT_NAME} cloud endpoint already setup"
+            return 0
+        fi

What about scenarios where we want to update it?

The purpose of cloud-endpoints-enabler is to create and activate a cloud endpoint during deployment. It is supposed to be deleted at the end of the make apply run. We do not support the update scenario for now. @zijianjoy mentioned above an idea about having boolean flags to decide when to delete the cloud-endpoints-enabler, iap-enabler, and backend-updater workloads.

Ok, got it.

We don't use the makefiles in our setup. We keep the manifests/kustomizations in git and use Fluxcd to update our fleet.
I need to think about how we can better handle the *enabler workloads...

Thanks for the clarification.
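
For setups that do not run the Makefile, the boolean-flag idea mentioned in this thread could look roughly like the sketch below. Everything here is an assumption for illustration: the flag names, the `cleanup_commands` function, and the namespace argument are hypothetical and not part of the PR. The workload kinds (Deployment for iap-enabler and cloud-endpoints-enabler, StatefulSet for backend-updater) follow the iap-ingress README. The function only prints the `kubectl delete` commands so they can be reviewed before anything is removed.

```shell
#!/usr/bin/env bash
# Hypothetical flags (not part of the PR): set to "true" to delete the
# corresponding setup workload once it has done its job.
: "${DELETE_IAP_ENABLER:=false}"
: "${DELETE_BACKEND_UPDATER:=false}"
: "${DELETE_CLOUD_ENDPOINTS_ENABLER:=false}"

# Print the kubectl commands selected by the flags instead of running them.
# $1: namespace that holds the workloads (an assumption; pass your own).
cleanup_commands() {
    local ns="$1"
    if [ "${DELETE_IAP_ENABLER}" = "true" ]; then
        echo "kubectl delete deployment iap-enabler -n ${ns} --ignore-not-found"
    fi
    if [ "${DELETE_BACKEND_UPDATER}" = "true" ]; then
        echo "kubectl delete statefulset backend-updater -n ${ns} --ignore-not-found"
    fi
    if [ "${DELETE_CLOUD_ENDPOINTS_ENABLER}" = "true" ]; then
        echo "kubectl delete deployment cloud-endpoints-enabler -n ${ns} --ignore-not-found"
    fi
}
```

With a GitOps tool, deleting a workload that is still declared in git would usually be reverted on the next sync, so in that setting the same flags would more likely control whether the workloads are included in the kustomization at all.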

gkcalat and others added 24 commits July 6, 2022 10:47

Merge pull request #1 from kubeflow/master

b09010d

Move `RequestAuthentication` policy creation to `iap-enabler` for git…

Merge branch 'kubeflow:master' into master

2118483

Merge branch 'kubeflow:master' into master

8ebfe96

Merge branch 'kubeflow:master' into master

5c4e4e5

Update changelog.md. Closes GoogleCloudPlatform#360. Closes GoogleClo…

80469f5

…udPlatform#365.

Remove deprecated KFServing component. Closes GoogleCloudPlatform#375

0f2662f

Upgrade knative serving to v1.2.5, net-istio to 1.2 (GoogleCloudPlatf…

ee585c8

…orm#365)

Add comment about serving-crds.yaml

4221c30

Update the backend-updater workload to fix GoogleCloudPlatform#371

fc3bc45

Prevent recreation of iap-enabler and backend-updater

76dde6f

Migrate from authorization.k8s.io/v1beta1

5e08f5d

Migrate from networking.k8s.io/v1beta1

fc735cf

Migrate from rbac.authorization.k8s.io/v1beta1

3444e76

Update config-connector

b0f60ab

Clean up comments after removing KFServing

3b05197

Migrate from apiextensions.k8s.io/v1beta1

994a11e

Update README for config-controller

26cf798

Change pathType in ingress, fix typos

fa8491e

Bump upstream tags

4e5e319

Update CHANGELOG

49877d3

Migrate from cloud-endpoints-controller

d491a3b

Move cloud endpoint to deployments

c3e640c

Deprecate cloud-endpoints-controller

9411955

Update changelog. Closes GoogleCloudPlatform#377.

cbe4366

gkcalat requested a review from zijianjoy August 4, 2022 17:41

google-oss-prow bot added the size/XXL label Aug 4, 2022

Merge branch 'master' into updateChangelog

5407621

gkcalat changed the title ~~Migrate~~ Migrate to v1.6.0-rc.1 Aug 5, 2022

zijianjoy reviewed Aug 5, 2022


Move services activation to the website instructions

c9d293b

Add readme for iap-ingress component

f430ef7

zijianjoy reviewed Aug 16, 2022


google-oss-prow bot assigned zijianjoy Aug 16, 2022

google-oss-prow bot added the lgtm label Aug 16, 2022

google-oss-prow bot added the approved label Aug 16, 2022

google-oss-prow bot merged commit 3cdefe5 into GoogleCloudPlatform:master Aug 16, 2022

fabito reviewed Aug 17, 2022


