Support GCP #114

Closed
deliahu opened this issue May 29, 2019 · 8 comments · Fixed by #1655
Labels
enhancement (New feature or request)

Comments

@deliahu
Member

deliahu commented May 29, 2019

Notes

  • Update some docs (e.g. how to install GPUs)
@deliahu added the enhancement and v0.5 labels May 29, 2019
@deliahu added this to the v0.5 milestone May 29, 2019
@deliahu removed this from the v0.5 milestone Jun 5, 2019
@deliahu removed the v0.5 label Jun 5, 2019
@deliahu added the v0.6 label Jun 25, 2019
@deliahu removed the v0.6 label Jun 27, 2019
@amutz

amutz commented Nov 19, 2019

When GCP support is added, is there any intention of supporting deploying models to Google Cloud Run?

It is their hosted Knative offering: dead-simple serverless containers. We use GCR to evaluate predictions in production and are extremely happy with it.

@ospillinger
Member

Hey Andrew, thanks for your feedback! I agree that Knative and GCR are super relevant, and we just spent some time evaluating them. That being said, we built Cortex on top of Kubernetes to simplify supporting all cloud providers, so our current plan is to run on GKE, but that could definitely change. Our technology decisions are primarily focused on optimizing developer experience and minimizing the cost of running inference at scale, so we try to avoid managed services that cost more as usage increases. We also try to abstract the underlying infrastructure services to allow users to focus on ML, so ideally whether we run on GKE or GCR shouldn't matter too much. Are there any features in GCR that you think Cortex is missing?

@amutz

amutz commented Nov 23, 2019

With Cortex on GKE will I still have to reason about capacity? Selecting number of nodes for peak load? Will I be able to scale to zero nodes when there is no load?

@ospillinger
Member

Cortex doesn't run on GKE yet; I'm sorry if I misled you. The latest version of Cortex can autoscale based on load, up to the maximum number of nodes that you set. Our next release will include the ability to autoscale the inference nodes to 0 if no deployments are running, though there will always be a Cortex operator (server) node unless you spin down the entire cluster. The operator's resource requests are small, so an inexpensive instance is sufficient. When we add GCP support, all of this functionality will be ported over.
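For readers unfamiliar with how these bounds are expressed: the behavior described above (a fixed cap on nodes, with scale-to-zero for inference nodes) maps onto a cluster configuration like the sketch below. The field names and values here are illustrative assumptions, not confirmed by this thread; check the Cortex docs for your version.

```yaml
# cluster.yaml (hypothetical sketch of autoscaling bounds)
cluster_name: cortex
region: us-west-2

# Inference node group: autoscales with load between these bounds.
# min_instances: 0 lets the inference pool scale to zero when idle;
# the operator node is separate and stays up until the cluster is torn down.
instance_type: m5.large
min_instances: 0
max_instances: 5
```

With bounds like these, you size `max_instances` for peak load once and let the autoscaler handle everything in between, rather than reasoning about capacity per deployment.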

@sam-writer

Is it possible to know which tickets need to be done to support GKE? Or does that work still need to be identified?

@ospillinger
Member

Hey @sam-qordoba, we have some initial thoughts, but we have not fully scoped it out yet. I'd be happy to jump on a call to discuss if you'd like, feel free to email me at omer@cortexlabs.com

@gautamchitnis

I am interested in helping Cortex support GCP, as I am currently planning to use GCP for my product deployment. Do let me know how I can contribute.

@ospillinger
Member

Hi @gautamchitnis, thank you for offering to contribute! We don't yet know exactly when we'll have time to work on GCP support. If you're interested in contributing in other ways, please feel free to email me at omer@cortexlabs.com and we can set up some time to chat.


7 participants