feat: refactor KFP managed storage a separate package. Fixes #275 #272

Merged

Contributor

@Bobgy Bobgy commented May 12, 2021

Fixes #275

Contributor Author

Bobgy commented May 12, 2021

/assign @zijianjoy
I haven't troubleshot the setup yet; I'm creating the PR for early review.

Makefile Outdated
Comment on lines 3 to 5
kpt cfg set -R kubeflow name KUBEFLOW-NAME

kpt cfg set -R management name MANAGEMENT-NAME
Collaborator

Good separation! We might also want to split the name setter into kubeflow-name and management-name to avoid conflicts. People usually deploy the Kubeflow cluster right after finishing the management cluster deployment, so the shared name environment variable would get overwritten.

Contributor Author

Agreed, I also wanted to rename them when I looked at this.

Contributor Author

I'll have to do that in a separate PR though, because it will cause many changes.

Collaborator

Yes, we should do it in a separate PR. It is not a blocking issue for release.

Contributor Author

I tried, but it's a little hard. I'm also starting to feel that setting a name for the kubeflow/management folder is a good enough abstraction. WDYT?

@@ -62,7 +62,6 @@ Provide actual value for the following variables in `env.sh`:
```
KF_NAME=<kubeflow-cluster-name>
KF_PROJECT=<gcp-project-id>
KF_DIR=<current-kubeflow-directory-path>
```
Collaborator

This is needed for the deployment documentation. Should we remove it from the documentation too?

Contributor Author

KF_DIR is a bash-only env var, so I think we can keep it only in the documentation. It's kind of problematic in env.sh, because we might want to run env.sh from different working directories.

Collaborator

I see, then we should probably move this line into the documentation and ask the user to run it manually.
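
For illustration, one way to avoid depending on the caller's working directory is to resolve the directory from the script's own location. This is only a minimal sketch under that assumption, not what the PR does; `script_dir` is a hypothetical helper name, and the PR itself moves `KF_DIR` out of `env.sh` rather than computing it.

```shell
# Hypothetical helper (not part of this PR): print the absolute directory that
# contains the given script path, regardless of the caller's working directory.
script_dir() {
  (cd "$(dirname -- "$1")" && pwd)
}

# Under bash, BASH_SOURCE holds the path of the file being sourced or run;
# other shells fall back to $0, which is only right when the script is executed.
KF_DIR="$(script_dir "${BASH_SOURCE:-$0}")"
echo "KF_DIR=${KF_DIR}"
```

With this approach, sourcing the file from any directory would yield the same `KF_DIR`, at the cost of tying the variable to where the script lives rather than to a user-chosen path.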

kubeflow/common/managed-storage/Makefile Outdated
kubeflow/common/managed-storage/Makefile
kubeflow/env.sh Outdated
# * is unique within your project.
# * alphanumeric characters and hyphen "-" only.
# * starts with a letter.
# * ends with an alphanumeric character.
Collaborator

Does it have a length requirement too?

Contributor Author

Yes, but I don't remember the limit off the top of my head. We can add it later (not a release blocker).

Collaborator

If I remember correctly, it is no more than 25 characters. I think that is a calculated limit, because this name is used as a prefix for longer variables. But I don't know the actual formula that yields 25.
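
The naming rules listed in the env.sh comments could be checked mechanically before deployment. The following is a minimal sketch under stated assumptions: `validate_kf_name` is a hypothetical helper that does not exist in the PR, and the 25-character default is only the limit recalled in this thread, not a confirmed value. Uniqueness within the project is not checked.

```shell
# Hypothetical helper (not part of this PR): check a name against the rules
# quoted from env.sh. The default cap of 25 is the reviewer's recollection,
# not a verified limit; pass a second argument to override it.
validate_kf_name() {
  name="$1"
  max_len="${2:-25}"
  # Length check first; an empty name passes here but fails the pattern below.
  [ "${#name}" -le "$max_len" ] || return 1
  # Starts with a letter, ends with an alphanumeric character, and contains
  # only alphanumeric characters and hyphens in between.
  printf '%s\n' "$name" | grep -Eq '^[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$'
}
```

A script could then call `validate_kf_name "${KF_NAME}" || exit 1` before running any `kpt cfg set` commands, so that an invalid name fails fast instead of partway through resource creation.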

kubeflow/env.sh
kubeflow/kpt-set.sh
Comment on lines 51 to 58
CLOUDSQL_NAME="${KF_NAME}-kfp"
BUCKET_NAME="${KF_PROJECT}-kfp"
# common/managed-storage deploys specified CloudSQL and Cloud Storage bucket.
kpt cfg set common/managed-storage cloudsql-name "${CLOUDSQL_NAME}"
kpt cfg set common/managed-storage bucket-name "${BUCKET_NAME}"
# apps/pipelines uses specified CloudSQL and Cloud Storage bucket.
kpt cfg set apps/pipelines cloudsql-name "${CLOUDSQL_NAME}"
kpt cfg set apps/pipelines bucket-name "${BUCKET_NAME}"
Collaborator

NIT: Note that we currently paste the whole content of env.sh and kpt-set.sh into https://www.kubeflow.org/docs/distributions/gke/deploy/deploy-cli/#environment-variables.

I can foresee that it will become harder and harder to keep the shell script, README, and kubeflow.org documentation in sync as this file grows. I recommend having the documentation treat this file as the source of truth, so they don't drift out of sync.

Contributor Author

I think that's a reasonable choice

Collaborator

Thank you Yuan for confirming! I will update the documentation accordingly.

Contributor Author

Bobgy commented May 13, 2021

/assign @zijianjoy
I revamped the PR according to the comments and verified that the deployment now works e2e.

@Bobgy Bobgy changed the title from "feat: refactor KFP managed storage a separate package" to "feat: refactor KFP managed storage a separate package. Fixes #275" on May 13, 2021
exit 1
fi

echo "Deleting all GCP resources will cause destruction of all services and data on this cluster. Confirm? [y/N]";
Collaborator

@zijianjoy zijianjoy May 13, 2021

NIT: "... will cause destruction of all services and data except the Cloud SQL instance and GCS bucket."
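
For illustration, a `[y/N]` prompt like the one quoted above is usually paired with a check that treats anything other than an explicit yes as a refusal. The sketch below assumes that behaviour; `confirm` is a hypothetical helper name, and the PR's actual script may read and test the reply differently.

```shell
# Hypothetical helper (not part of this PR): print a prompt and succeed only
# when the reply is an explicit yes. An empty reply, end of input, or any
# other text counts as "No", matching the capital N in [y/N].
confirm() {
  printf '%s [y/N] ' "$1"
  read -r reply || reply=""
  case "$reply" in
    y|Y|yes|Yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}
```

A delete script could guard the destructive step with `confirm "Deleting all GCP resources ... Confirm?" || exit 1`, so that pressing Enter by accident aborts rather than proceeds.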

Collaborator

@zijianjoy zijianjoy left a comment

/lgtm
/approve

Thank you Yuan for the awesome work and detailed documentation! There are only a few NITs, so I'm merging this PR; we can follow up on the comments separately.

@google-oss-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Bobgy, zijianjoy

@google-oss-robot google-oss-robot merged commit d983318 into GoogleCloudPlatform:master May 13, 2021
@Bobgy Bobgy deleted the kfp-separate-storage branch May 14, 2021 06:25
Development

Successfully merging this pull request may close these issues.

Separate cloud-sql, bucket from KFP, document its behavior, update Makefile.
3 participants