-
Notifications
You must be signed in to change notification settings - Fork 606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GCP support to Cortex #1655
Conversation
# Conflicts: # cli/cmd/gcp.go # go.mod # pkg/operator/config/config.go
This reverts commit 31c07ba.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had some comments, but overall, LGTM! Can't approve the PR because I'm the author of this.
MetadataRoot string `json:"metadata_root"` | ||
ProjectID string `json:"project_id"` | ||
ProjectKey string `json:"project_key"` | ||
LocalModelCaches []*LocalModelCache `json:"local_model_cache"` // local only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may be able to get rid of LocalModelCaches
too from the API struct. This is only required internally when deploying an API and not on the serving container. Just like it is the case with CuratedModelResoures
. Not exactly in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@deliahu at a second look, it seems like LocalModelCaches
is still necessary due to one reason: When re-deploying an API of whose model's contents have changed, without the LocalModelCaches
present in the API spec, the API wouldn't get updated. The function responsible for this is areAPIsEqual
in the local/api.go
file.
Assuming the above situation doesn't exist, one small setback which I think falls outside the scope of this PR is that we'd need to start using the model IDs from the running containers to determine if a model can be removed from the Cortex cache or not - right now, we are relying on the API spec to tell us that. I think keeping this in the API spec is better because a container is ephemeral, whereas the API spec will be there after a reboot as well - just as the model will still be there.
Because of the first reason listed in this comment, I think the presence of LocalModelCaches
in the API spec is actually justified.
S3Path bool `json:"s3_path"` | ||
Versions []int64 `json:"versions"` | ||
S3Path bool `json:"s3_path"` | ||
GCSPath bool `json:"gcs_path"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GCS notation is potentially confusing. In some places, we go with gs
and in others with gcs
. What is the difference?
A: gs is the name of the bucket type on GCP (i.e. gs://models/onnx/
) and GCS is the name of the bucket storage system on GCP. We should probably name these as gs
to keep it consistent with the S3 alternative and to eliminate the potential confusion.
Not a showstopper for this PR.
@@ -1,5 +1,3 @@ | |||
#!/bin/bash | |||
|
|||
# Copyright 2020 Cortex Labs, Inc. | |||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reason for addition: when importing the cortex package outside the container (in a dev setup), this __init__.py
seems to be required. The rename that git is reporting is inaccurate.
@@ -788,7 +676,7 @@ def find_ondisk_model_info(lock_dir: str, model_name: str) -> Tuple[List[str], L | |||
|
|||
class TFSModelLoader(mp.Process): | |||
""" | |||
Monitors the S3 path(s)/dir and continuously updates the models on TFS. | |||
Monitors the cloud path(s)/dir (S3 or GS) and continuously updates the models on TFS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same confusion about GS vs GCS. Can be fixed later on.
Closes #114, closes #1600, closes #1602, closes #1616, closes #1624, closes #1530, closes #1633, closes #1646
checklist:
make test
andmake lint