Staging Hub deployment for Pangeo #599
Comments
I've deployed a hub... sort of. k8s isn't able to mount the NFS server and I'm not sure if it's because I missed a step or because of the private cluster #597 (comment) |
I believe this is no longer blocked. Now we need to have a team discussion about whether the NFS strategy used in the PR is the right strategy to use in general. I've updated this issue to mark it as such. Check out @sgibson91's main question here: |
A staging hub exists at https://staging.pangeo.2i2c.cloud/, but spawning of the user server fails, which means the NFS setup still needs some tweaking. Not sure whether that needs to happen in #597 or #613. |
congrats @sgibson91 :-) 🚀 could we define a hand-off plan for this issue while you're away? I tried updating the top comment so it's clear what the next steps are...what's the information that could make it easiest for somebody else to finish up the NFS stuff? |
The first thing that needs to be done is fixing the spawn failure: #597 (comment). There's some discussion going on here about behaviour, but I think that needs a decision before it can be implemented: #597 (comment). We should also figure out if that work needs to happen in #597 or #613. If it can go in #597, then I think #613 could be merged. Or maybe at this point it's just better to open up a new PR and start afresh anyway. |
Update: in the sprint planning meeting today, we discussed that, now that NFS is ready to go (#50), we should be ready to review this PR and merge it in, and then ask Pangeo folks to take a look at the hub and make sure it looks good. In a future step, we will finish up #598 and deploy it, but that's not necessary for the initial deployment. |
This actually fails (it didn't used to!)
Trying to
I think our timebox for using Google Filestore expired, so I'm going to abandon in-cluster NFS and go that way. |
Ran into issues with in-cluster NFS, so 2i2c-org#599 (comment) Fixes 2i2c-org#599
Thanks so much for all the hard work here! After Yuvi's ping on Slack, I just tried logging in. I clicked login and got redirected to authorize a new github app (iam-login-something). Once redirected back to
Our previous cluster was configured to allow all users from the group https://github.com/orgs/pangeo-data/teams/us-central1-b-gcp to be able to log in. It would be great to use that same group here. Let me know how I can help. |
I think we need to add @rabernat here And then he can add other admins etc just until we get the GitHub teams auth working |
Ah ok, thanks for clarifying. No worries. |
@rabernat try now |
Ok so I will continue to post feedback on this issue, as suggested by Chris. Item 1: There are no choices of machine type on startup. Compare this to the Profile List on https://us-central1-b.gcp.pangeo.io/. This is important because some users (like my class) just need a small machine while others (like researchers) need lots of memory. |
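For reference, a machine-type menu like the one requested in Item 1 is usually expressed as a KubeSpawner profile list. The sketch below is only an illustration of that shape: the three sizes, their memory and CPU numbers, and the helper name `build_profile_list` are assumptions, not the values used on either Pangeo hub.

```python
def build_profile_list():
    """Return a KubeSpawner-style profile list (sizes are hypothetical)."""
    sizes = [
        # (display name, memory in GB, CPU cores, is default) -- assumed values
        ("Small", 4, 1, True),
        ("Medium", 16, 4, False),
        ("Large", 64, 16, False),
    ]
    profiles = []
    for name, mem_gb, cpus, is_default in sizes:
        profiles.append(
            {
                "display_name": f"{name} ({mem_gb} GB RAM, {cpus} CPU)",
                "description": f"Up to {mem_gb} GB of memory and {cpus} CPU cores",
                "default": is_default,
                # Keys below are KubeSpawner traits applied when the profile is chosen
                "kubespawner_override": {
                    "mem_limit": f"{mem_gb}G",
                    "mem_guarantee": f"{mem_gb}G",
                    "cpu_limit": float(cpus),
                },
            }
        )
    return profiles


# In a jupyterhub_config.py this would typically be assigned as:
#   c.KubeSpawner.profile_list = build_profile_list()
```

With something like this in place, a class could default to the small option while researchers pick a larger one at spawn time.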
Item 2: My home directory is not there. It would be great if we could migrate over the home directories from the old cluster. Since both clusters are using GC Filestore, perhaps this is trivial: just mount the same volume on the new cluster. But since it lives in a different project, maybe that doesn't work. |
Item 3: Hub is not configured for requester-pays access to cloud data. I discovered this by running the first few cells of this notebook. Specifically, this code:
    from intake import open_catalog
    cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/ocean.yaml")
    ds = cat["sea_surface_height"].to_dask()
raises
|
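As a rough illustration of what requester-pays access involves on the client side, the sketch below builds the options that `gcsfs.GCSFileSystem` accepts for billing requests to a project. It assumes gcsfs is installed and that the user server already has Google Cloud credentials. The helper names and the project ID are hypothetical, and the hub-side configuration that Item 3 asks for is not shown.

```python
def requester_pays_options(billing_project):
    """Build gcsfs options that bill bucket requests to billing_project."""
    if not billing_project:
        raise ValueError("a billing project is required for requester-pays access")
    return {"project": billing_project, "requester_pays": True}


def open_requester_pays_fs(billing_project):
    """Create a gcsfs filesystem for requester-pays buckets.

    Requires gcsfs and valid Google Cloud credentials at call time.
    """
    import gcsfs  # imported lazily so the helper above works without gcsfs

    return gcsfs.GCSFileSystem(**requester_pays_options(billing_project))


# Example (hypothetical project ID):
#   fs = open_requester_pays_fs("my-billing-project")
```

The same options dictionary can usually be passed as `storage_options` to libraries that open data through fsspec.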
I'll try to capture some of @rabernat's suggestions in subsequent issues so that we don't lose track of them. Note that when I try to log in I'm running into a "scale-up" error (I selected the smallest machine type): |
Another note - if I go to |
@choldgraf ah, the second smallest one works for me. Let's isolate and tweak the sizes until they all work. Can we use #652 to track and close this? |
@yuvipanda sounds good - I think that once #651 is merged we can consider this one closed (actually it should close automatically), and can then focus on specific improvements to the staging hub in separate issues |
Description
We should deploy a staging hub for Pangeo that has the same infrastructure setup on less-costly infrastructure. This may also generate some other tasks that we need to accomplish in order to get the base infrastructure running.
Benefit
This will help us iterate more quickly and get feedback from the Pangeo team. It will also be a place where we can stage changes in the future without affecting prod, since Pangeo is a more complex and dynamic setup than most of our community hubs.
Tasks to complete
Updates