Create a management cluster for GCP blueprints #644
Comments
Issue-Label Bot is automatically applying the labels:
Please mark this comment with 👍 or 👎 to give our bot feedback! |
I created the cluster.
We should check in the configs before closing this issue.
* The management cluster is a cluster running Config Connector, which can be used to create GCP resources.
* This PR checks in the config for cluster kf-ci-management.
* We also set up a namespace to administer resources in project kubeflow-ci-deployment.

Fix kubeflow#644
There were a couple of bugs in the permission setup for CNRM that will be fixed in a subsequent PR. One of the problems we ran into was that the service account we used with CNRM lives in project kubeflow-ci-deployment and ended up getting GC'd. To fix this, we can switch to using the kubeflow-testing service account. Also, per #650, I put the project "kubeflow-ci-deployment" into a subfolder. We can use the folder to grant "kubeflow-testing" permission on project "kubeflow-ci-deployment" so that we don't have to worry about the permission being GC'd.
* Create a simple script to deploy Kubeflow using the GCP blueprint. This is basically just a wrapper around make commands.
* This is the first step in setting up auto deployments of the GCP blueprint for CI purposes.
* Fix some bugs in the management cluster that popped up while testing the blueprint.
* Fix the CNRM install for the kubeflow-ci-deployment namespace. CNRM wasn't properly configured to administer that namespace: the appropriate role bindings weren't being created in the correct namespaces, and the statefulset was using the host project and not the managed project.
* See kubeflow#644 for reference on the management cluster setup.
* We should use the kubeflow-testing@kubeflow-ci service account and not a service account owned by project kubeflow-ci-deployment, as the latter is being GC'd by our cleanup CI scripts, which breaks the management cluster.
* Also, per kubeflow#644, permissions are now set at the folder level to prevent the permissions from being GC'd.
* We want to install ACM on the kf-ci-management cluster in project kubeflow-ci so that we can start using GitOps to manage CI infrastructure.
* Related to kubeflow#644
* Remove status from the cleanup ci job. This breaks ACM sync.
* Add a cluster selector to ACM so that we only install the auto-deploy namespace on the appropriate cluster.
* Add an annotation to all auto-deploy tasks so we only synchronize them to the appropriate cluster.
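The cluster selector and annotation mentioned above could be sketched roughly as follows. This is a minimal illustration, not the configs that were checked in: it assumes the legacy ACM `ClusterSelector` mechanism, in which a `Cluster` object carries labels, a `ClusterSelector` matches them, and a resource opts in through the `configmanagement.gke.io/cluster-selector` annotation. The cluster name, label, and selector name are hypothetical; only the `auto-deploy` namespace name comes from the thread.

```python
import copy
import json

# Annotation used by the legacy ACM ClusterSelector mechanism (assumed here).
SELECTOR_ANNOTATION = "configmanagement.gke.io/cluster-selector"


def cluster(name, labels):
    """Cluster object that attaches labels to a registered cluster."""
    return {
        "apiVersion": "clusterregistry.k8s.io/v1alpha1",
        "kind": "Cluster",
        "metadata": {"name": name, "labels": labels},
    }


def cluster_selector(name, match_labels):
    """ClusterSelector that matches clusters carrying the given labels."""
    return {
        "apiVersion": "configmanagement.gke.io/v1",
        "kind": "ClusterSelector",
        "metadata": {"name": name},
        "spec": {"selector": {"matchLabels": match_labels}},
    }


def with_selector(manifest, selector_name):
    """Return a copy of the manifest annotated so it only syncs to matching clusters."""
    out = copy.deepcopy(manifest)
    annotations = out.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[SELECTOR_ANNOTATION] = selector_name
    return out


if __name__ == "__main__":
    # Hypothetical names: the thread does not say which cluster or labels were used.
    labels = {"auto-deploy": "true"}
    namespace = {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": "auto-deploy"}}
    objects = [
        cluster("example-auto-deploy-cluster", labels),
        cluster_selector("selector-auto-deploy", labels),
        with_selector(namespace, "selector-auto-deploy"),
    ]
    for obj in objects:
        print(json.dumps(obj, indent=2))
```

The same `with_selector` helper would apply to each auto-deploy task manifest, so that clusters without the matching label skip those resources during sync.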
Related to: GoogleCloudPlatform/kubeflow-distribution#13
I don't think we want to run CNRM in namespace mode, as it makes it difficult for the management cluster to administer multiple projects. Uninstall the namespace-specific components.
Create a new service account to administer CI projects.
Made ci-projects-manager@kubeflow-ci.iam.gserviceaccount.com an owner of folder ci-projects.
Grant workload identity.
Delete the old version of CNRM.
Install CNRM 1.9 in workload identity mode.
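The commands for these steps did not survive in the thread. As a rough sketch only, the three service account steps (create the account, make it an owner of the folder, grant workload identity) might correspond to gcloud invocations like the ones assembled below. The service account and host project come from the thread. The numeric folder ID is a placeholder, and the Kubernetes service account `cnrm-controller-manager` in namespace `cnrm-system` is an assumption based on Config Connector's cluster-mode defaults, not something the thread confirms. The script only prints the commands; it does not run them.

```python
import shlex

# Values taken from the thread.
GSA = "ci-projects-manager@kubeflow-ci.iam.gserviceaccount.com"
HOST_PROJECT = "kubeflow-ci"

# Assumptions, not from the thread.
FOLDER_ID = "FOLDER_ID"               # placeholder for the numeric ID of folder ci-projects
KSA_NAMESPACE = "cnrm-system"         # assumed Config Connector namespace
KSA_NAME = "cnrm-controller-manager"  # assumed Config Connector Kubernetes service account


def build_commands(gsa=GSA, host_project=HOST_PROJECT, folder_id=FOLDER_ID):
    """Return the gcloud invocations as argument lists, in the order they would run."""
    account_id = gsa.split("@", 1)[0]
    wi_member = f"serviceAccount:{host_project}.svc.id.goog[{KSA_NAMESPACE}/{KSA_NAME}]"
    return [
        # 1. Create the service account in the host project.
        ["gcloud", "iam", "service-accounts", "create", account_id,
         f"--project={host_project}"],
        # 2. Make the service account an owner of the folder holding the CI projects.
        ["gcloud", "resource-manager", "folders", "add-iam-policy-binding", folder_id,
         f"--member=serviceAccount:{gsa}", "--role=roles/owner"],
        # 3. Let the in-cluster service account impersonate it via workload identity.
        ["gcloud", "iam", "service-accounts", "add-iam-policy-binding", gsa,
         f"--project={host_project}", f"--member={wi_member}",
         "--role=roles/iam.workloadIdentityUser"],
    ]


if __name__ == "__main__":
    for cmd in build_commands():
        print(shlex.join(cmd))
```

Granting the role on the folder rather than on each project is what keeps the binding out of reach of the per-project cleanup scripts discussed earlier in the thread.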
* Install ACM on the Kubeflow CI management cluster.
  * We want to install ACM on the kf-ci-management cluster in project kubeflow-ci so that we can start using GitOps to manage CI infrastructure.
  * Related to #644
  * Remove status from the cleanup ci job. This breaks ACM sync.
  * Add a cluster selector to ACM so that we only install the auto-deploy namespace on the appropriate cluster.
  * Add an annotation to all auto-deploy tasks so we only synchronize them to the appropriate cluster.
* The configsync directory for the management cluster should be located in the management directory.
* Create namespace issue-label-bot-dev in the kf-ci-management cluster. This namespace will be used to administer the issue-label-bot-dev project.
* Fix annotations.
* Add Tekton to our ACM repo.
  * We will eventually want to use ACM to manage our Tekton installs.
  * On our CI management cluster we currently don't need Tekton. However, nomos is giving us sync errors because the Tekton CRDs don't exist.
* The CI management cluster shouldn't use KCC in namespace mode.
  * Using namespace mode is annoying when creating new projects because we need to set up a new service account for every project.
  * It is much simpler to create a single service account with permission to administer a folder.
We should create a management cluster to run and deploy GCP blueprints as well as other test infrastructure.