Management cluster should not use CNRM in namespaced mode #13

Open
jlewi opened this issue May 15, 2020 · 3 comments

jlewi commented May 15, 2020

We are currently recommending installing CNRM in namespaced mode.

I'm not sure that's what we should recommend.

  • Namespaced mode is inconvenient when managing multiple projects because we end up having to
    create multiple deployments of the CNRM system.

  • Using ACM to install CNRM doesn't appear to install it in namespaced mode.

@issue-label-bot

Issue Label Bot is not confident enough to auto-label this issue.
See dashboard for more details.

@issue-label-bot

Issue-Label Bot is automatically applying the labels:

Label            Probability
platform/gcp     0.82
kind/bug         0.62
area/front-end   0.54


jlewi pushed a commit to jlewi/manifests that referenced this issue Jul 29, 2020
* The management blueprint should have its own Kptfile
  * Prior to this PR there was only a Kptfile at gcp/
  * This doesn't work because for the management cluster we
    only pull the package gcp/v2/management

* Related to GoogleCloudPlatform/kubeflow-distribution#102
* Related to GoogleCloudPlatform/kubeflow-distribution#93

* For CNRM, switch to workload identity and stop using namespace mode; GoogleCloudPlatform/kubeflow-distribution#13

  * Using namespace mode is just extra complexity because we have to install
    a separate copy of the CNRM controller for every project.
    * The only real reason to do that is if you want to use different
      GCP service accounts to manage different projects. Typically that's
      not what we do.
    * With workload identity we have 1 namespace per project, but they
      all use the same GCP SA, so the GCP SA can just be authorized to
      access multiple projects or a folder as needed.

* Update the resources to the v1beta1 spec for use with AnthosCLI

  * It looks like anthoscli requires a NodePool resource
  * With the v1beta1 specs we need to add the annotation gke.cluster.io = "bootstrap://" so that anthoscli is able to properly group the resources.

* Move cnrm-install iam and services into kustomize packages
  * This way we can hydrate them like we do other manifests

* Fix the setters and substitutions for CNRM to make them unique per name
  * This way we could potentially have multiple management clusters per project,
    which, if nothing else, will be useful for testing.
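
The trade-off described in the commit message above can be sketched as a small toy model. This is only an illustration, not the actual CNRM installation logic: the `Plan` type, the function names, and the resource names (`cnrm-controller-<project>`, `cnrm-system`, the project IDs) are all hypothetical. It counts what has to be installed in each mode for the same set of managed projects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Toy description of what has to exist on the management cluster."""
    controllers: tuple           # one entry per CNRM controller deployment
    namespaces: tuple            # one namespace per managed project
    gcp_service_accounts: tuple  # GCP SAs that need IAM bindings


def namespace_mode(projects):
    """Assumed model: one controller copy and one GCP SA per project."""
    return Plan(
        controllers=tuple(f"cnrm-controller-{p}" for p in projects),
        namespaces=tuple(projects),
        gcp_service_accounts=tuple(f"cnrm-{p}" for p in projects),
    )


def workload_identity_mode(projects, shared_sa="cnrm-system"):
    """Assumed model: one controller and one shared GCP SA for all projects."""
    return Plan(
        controllers=("cnrm-controller",),
        namespaces=tuple(projects),  # still 1 namespace per project
        gcp_service_accounts=(shared_sa,),
    )


projects = ["project-a", "project-b", "project-c"]  # hypothetical project IDs
ns_plan = namespace_mode(projects)
wi_plan = workload_identity_mode(projects)

# Controller deployments needed in each mode.
print(len(ns_plan.controllers), len(wi_plan.controllers))  # prints "3 1"
```

In this model the number of namespaces is the same in both modes; only the number of controller copies and GCP service accounts grows with the number of projects in namespace mode.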
jlewi pushed a commit to jlewi/gcp-blueprints that referenced this issue Jul 29, 2020
…CNRM

* management/instance needs a Kptfile to work with the latest versions of kpt

* Per GoogleCloudPlatform#13 we don't want to run CNRM in namespace mode because this is burdensome;
  instead we use workload identity mode, i.e. the same GCP SA to administer
  multiple projects.

Related to GoogleCloudPlatform#13 - Use workload identity mode
Related to GoogleCloudPlatform#102 Fix blueprint

* Remove cluster and nodepool patches from instance; we aren't actually patching anything.
jlewi pushed a commit to jlewi/website that referenced this issue Jul 29, 2020
* Instructions should reference the Makefile
* We will now install CNRM in workload identity mode not namespace mode
  per GoogleCloudPlatform/kubeflow-distribution#13
k8s-ci-robot pushed a commit to kubeflow/website that referenced this issue Jul 29, 2020
* Instructions should reference the Makefile
* We will now install CNRM in workload identity mode not namespace mode
  per GoogleCloudPlatform/kubeflow-distribution#13
k8s-ci-robot pushed a commit to kubeflow/manifests that referenced this issue Jul 30, 2020
…RM. (#1432)

* Fix management blueprint kptfile and stop using namespace mode for CNRM.

* The management blueprint should have its own Kptfile
  * Prior to this PR there was only a Kptfile at gcp/
  * This doesn't work because for the management cluster we
    only pull the package gcp/v2/management

* Related to GoogleCloudPlatform/kubeflow-distribution#102
* Related to GoogleCloudPlatform/kubeflow-distribution#93

* For CNRM, switch to workload identity and stop using namespace mode; GoogleCloudPlatform/kubeflow-distribution#13

  * Using namespace mode is just extra complexity because we have to install
    a separate copy of the CNRM controller for every project.
    * The only real reason to do that is if you want to use different
      GCP service accounts to manage different projects. Typically that's
      not what we do.
    * With workload identity we have 1 namespace per project, but they
      all use the same GCP SA, so the GCP SA can just be authorized to
      access multiple projects or a folder as needed.

* Update the resources to the v1beta1 spec for use with AnthosCLI

  * It looks like anthoscli requires a NodePool resource
  * With the v1beta1 specs we need to add the annotation gke.cluster.io = "bootstrap://" so that anthoscli is able to properly group the resources.

* Move cnrm-install iam and services into kustomize packages
  * This way we can hydrate them like we do other manifests

* Fix the setters and substitutions for CNRM to make them unique per name
  * This way we could potentially have multiple management clusters per project,
    which, if nothing else, will be useful for testing.

* Add workload identity pool to the management cluster.

* Management nodepool should set workloadMetadataConfig so that we run the workload identity servers.

* Fix.
k8s-ci-robot pushed a commit that referenced this issue Jul 30, 2020
…CNRM (#105)

* management/instance needs a Kptfile to work with the latest versions of kpt

* Per #13 we don't want to run CNRM in namespace mode because this is burdensome;
  instead we use workload identity mode, i.e. the same GCP SA to administer
  multiple projects.

Related to #13 - Use workload identity mode
Related to #102 Fix blueprint

* Remove cluster and nodepool patches from instance; we aren't actually patching anything.
jlewi pushed a commit to jlewi/manifests that referenced this issue Jul 31, 2020
k8s-ci-robot pushed a commit to kubeflow/manifests that referenced this issue Jul 31, 2020
…stop using namespace #1437: Fix cloudresourcemanager service; missing ApiVersion. Cherry pick of #1432 #1437 on v1.1-branch. #1432: Fix management blueprint kptfile and stop using namespace #1437: Fix cloudresourcemanager service; missing ApiVersion. (#1439)

* Fix management blueprint kptfile and stop using namespace mode for CNRM.

* The management blueprint should have its own Kptfile
  * Prior to this PR there was only a Kptfile at gcp/
  * This doesn't work because for the management cluster we
    only pull the package gcp/v2/management

* Related to GoogleCloudPlatform/kubeflow-distribution#102
* Related to GoogleCloudPlatform/kubeflow-distribution#93

* For CNRM, switch to workload identity and stop using namespace mode; GoogleCloudPlatform/kubeflow-distribution#13

  * Using namespace mode is just extra complexity because we have to install
    a separate copy of the CNRM controller for every project.
    * The only real reason to do that is if you want to use different
      GCP service accounts to manage different projects. Typically that's
      not what we do.
    * With workload identity we have 1 namespace per project, but they
      all use the same GCP SA, so the GCP SA can just be authorized to
      access multiple projects or a folder as needed.

* Update the resources to the v1beta1 spec for use with AnthosCLI

  * It looks like anthoscli requires a NodePool resource
  * With the v1beta1 specs we need to add the annotation gke.cluster.io = "bootstrap://" so that anthoscli is able to properly group the resources.

* Move cnrm-install iam and services into kustomize packages
  * This way we can hydrate them like we do other manifests

* Fix the setters and substitutions for CNRM to make them unique per name
  * This way we could potentially have multiple management clusters per project,
    which, if nothing else, will be useful for testing.

* Add workload identity pool to the management cluster.

* Management nodepool should set workloadMetadataConfig so that we run the workload identity servers.

* Fix.

* Fix cloudresourcemanager service; missing ApiVersion.

Related to: GoogleCloudPlatform/kubeflow-distribution#102

jlewi commented Aug 2, 2020

We should cherry pick the fixes onto the 1.1 branch and then we can close this.

jlewi pushed a commit to jlewi/gcp-blueprints that referenced this issue Aug 12, 2020
k8s-ci-robot pushed a commit that referenced this issue Aug 13, 2020
…use workload identity #109: Update instructions for using ACM. #113: ACM: notebook controller needs to use istio ingress Cherry pick of #105 #109 #113 on v1.1-branch. #105: Management blueprint; add kptfile and use workload identity #109: Update instructions for using ACM. #113: ACM: notebook controller needs to use istio ingress (#122)

* Management blueprint; add kptfile and use workload identity mode for CNRM

* management/instance needs a Kptfile to work with the latest versions of kpt

* Per #13 we don't want to run CNRM in namespace mode because this is burdensome;
  instead we use workload identity mode, i.e. the same GCP SA to administer
  multiple projects.

Related to #13 - Use workload identity mode
Related to #102 Fix blueprint

* Remove cluster and nodepool patches from instance; we aren't actually patching anything.

* Update instructions for using ACM.

* Use a kpt function to remove the namespace from non-namespace-scoped
  objects

* Use yq to attach backend config to the ingress.

* Remove the IAP enabler pod; this is a partial workaround for #14

  * The IAP enabler pod will try to update the Istio security policy,
    which will conflict with ACM. So we disable it for now even though
    that means we have to manually update the health check.

* Switch to using a structured repo with ACM (#29)

  * Add a script to rewrite the YAML files in the appropriate structure

  * If we don't use a structured repository we end up with problems because
    resources in different namespaces but with the same name will be written
    to the same file.

* Add a hack to create the kube-system namespace as part of the ACM deployment.

  * Now that we are using structured repositories we need to have
    a namespace directory with a namespace.yaml for kube-system
    in order to install resources in that namespace.

Related to #4 - use ACM to deploy Kubeflow

* ACM: notebook controller needs to use istio ingress istio-system/ingressgateway

* Related to #111
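
The file collision mentioned in the structured-repo note above (resources in different namespaces but with the same name being written to the same file) can be illustrated with a minimal sketch. The path schemes, function names, and sample resources below are hypothetical and much simpler than what the actual rewrite script or ACM expects; the only point is that including the namespace in the path removes the collision.

```python
from collections import defaultdict


def flat_path(resource):
    """Hypothetical flat layout: the file name ignores the namespace."""
    return f"{resource['kind'].lower()}_{resource['name']}.yaml"


def structured_path(resource):
    """Hypothetical structured layout: one directory per namespace."""
    return f"namespaces/{resource['namespace']}/{flat_path(resource)}"


def collisions(resources, path_fn):
    """Return the paths that more than one resource would be written to."""
    by_path = defaultdict(list)
    for r in resources:
        by_path[path_fn(r)].append(r)
    return {p: rs for p, rs in by_path.items() if len(rs) > 1}


# Two made-up resources with the same kind and name in different namespaces.
resources = [
    {"kind": "ConfigMap", "name": "config", "namespace": "kubeflow"},
    {"kind": "ConfigMap", "name": "config", "namespace": "istio-system"},
]

print(sorted(collisions(resources, flat_path)))        # ['configmap_config.yaml']
print(sorted(collisions(resources, structured_path)))  # []
```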