Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bump daskhub 4.4.2 --> 4.5.4 #875

Merged
merged 7 commits into from
Nov 24, 2020
Merged

Conversation

scottyhq
Copy link
Member

addresses #859

This would bump the following charts:

dask-gateway 0.8 —>0.9
jupyterhub 0.9.1 —> 0.10.2
@TomAugspurger would be great to know if there are major changes to be aware of here. I don't see release notes here https://github.com/dask/dask-gateway.

On the jupyterhub side there a lot of changes (@consideRatio)
https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/master/CHANGELOG.md

Note: if we don't want to do this version bump, perhaps we should stick to building images with dask-gateway 0.8 pinned so that we can continue upgrading other python packages?

cc @tjcrone

@TomAugspurger
Copy link
Member

I'm OK with updating, though we'll want to synchronize this update with updates to the image to ensure we have dask-gateway 0.9 everywhere.

@scottyhq
Copy link
Member Author

we'll want to synchronize this update with updates to the image to ensure we have dask-gateway 0.9 everywhere.

Hey Tom, the issue is that we've already bumped to gateway 0.9 in the images with pangeo-data/pangeo-docker-images#154 (so this is blocking updates to other packages)

@scottyhq
Copy link
Member Author

Also linking to some good CI update recommendations on the binderhubs pangeo-data/pangeo-binder#178 (comment)

@TomAugspurger
Copy link
Member

TomAugspurger commented Nov 16, 2020 via email

@scottyhq
Copy link
Member Author

We could perhaps comment out the OOI deployment for now if we think it’ll cause issues, but wouldn’t want to leave it like that for too long.

I'd prefer to get everyone on board with this before proceeding. There is no way currently to set the helm chart versions for each hub individually, and commenting out the CI would be annoying in case OOI does want to make other changes. So @tjcrone are you OK bumping this? At best, everything goes without a hitch, at worst there is some helm upgrade issue such as things we've encountered in the past that require manually deleting some resources etc.

@scottyhq
Copy link
Member Author

also tested new images on binderhubs and things seem to be working well pangeo-data/pangeo-binder#167 (comment)

@scottyhq
Copy link
Member Author

@tjcrone @rabernat - what's the status of the ooi hub currently? I see the ooi image was bumped, but i'd really like to merge this so that we can use updated packages on AWS. Options are:

  1. comment out OOI temporarily on CI so that we can test this upgrade on AWS and GCP
  2. merge and sort out issues (if any) for all hubs on staging

@TomAugspurger
Copy link
Member

Just a heads up that this likely would cause issues on the OOI deployment. JupyterHub 0.10 (which is included in this daskhub) enables a security feature that makes Dask Gateway unable to talk to the Hub. This security feature is apparently ignored on GCP where I initially tested it, but is honored on AKS. I'm not sure about Amazon.

The fix is at dask/helm-chart#142 (comment) (disable that check for now, as it was in the older JupyterHub). I'll push a commit later.

@tjcrone
Copy link
Contributor

tjcrone commented Nov 21, 2020

Sorry I don't know how I missed this. I would be happy to help with this but I need the OOI deployment for a workshop on Monday. @TomAugspurger, how difficult do you think it will be to get OOI onboard with this? Thanks.

@consideRatio
Copy link
Member

@TomAugspurger all k8s clusters that enforce NetworkPolicy resources will have this issue until dask-gateway adds a label on a pod that is configured to communicate with the JupyterHub Helm charts hub pod.

When a GKE cluster is created, there is a checkbox about installing Calico, that will make that GKE cluster enforce NetworkPolicy limitations on allowed network traffic. AKS likely had such enforcement by default, and on Amazon it may depend on how you setup you k8s cluster.

@tjcrone
Copy link
Contributor

tjcrone commented Nov 21, 2020

Thanks for this information @consideRatio. @TomAugspurger and @scottyhq, I'm around all weekend and I'm happy to help with this as needed. If we cannot get OOI working this weekend we would need to roll back for my workshop Monday morning. Thanks.

@scottyhq
Copy link
Member Author

If we cannot get OOI working this weekend we would need to roll back for my workshop Monday morning. Thanks.

I suggest waiting until after the Monday workshop. Happy to troubleshoot AWS-related things next week.

@tjcrone
Copy link
Contributor

tjcrone commented Nov 22, 2020

Okay sounds good @scottyhq. I should be around this week and will keep an eye on this thread. Happy to help as needed and very sorry for my delayed response. Crazy few months.

@TomAugspurger
Copy link
Member

Thanks. So the plan is to merge this Monday or perhaps Tuesday.

The network security policy is disabled for all the hubs, since it's in the shared values.yaml.

@tjcrone
Copy link
Contributor

tjcrone commented Nov 23, 2020

My workshop is finished. It went great! Thanks for everyone's help. We are ready to go on this merge when you are.

@scottyhq
Copy link
Member Author

Thanks for the update @tjcrone , glad it went well!

I'm going to go ahead and merge this into staging and we'll see how it pans out.

@scottyhq scottyhq merged commit bed9bf2 into pangeo-data:staging Nov 24, 2020
@tjcrone
Copy link
Contributor

tjcrone commented Nov 25, 2020

I'm a little confused because both prod and staging were even at b90effc yesterday. But it looks like we merge committed to bed9bf2, which brought in a bunch of old stuff like my postBuild, which shouldn't be needed anymore. I'm guessing that a rebase merge would have been the right move here for this particular PR. We may need to rewrite again, but maybe there is a plan here that I'm not getting?

@scottyhq
Copy link
Member Author

ugh. yeah, i don't know. i couldn't merge due to a conflict, so I used the web editor to fix it and then did a standard 'merge commit' into staging...

@tjcrone
Copy link
Contributor

tjcrone commented Nov 25, 2020

Okay well it's Git so anything can be fixed! No worries. I would say that most of the time a PR to upstream with conflicts needs to be considered carefully, and probably restarted, because there is probably something not right. I would rather fix a conflict on my fork rather than try to do it on the PR, and that means rebasing the commits on your fork and trying the PR again. But a rebase merge here might have worked as well I don't know.

@tjcrone
Copy link
Contributor

tjcrone commented Nov 25, 2020

I use a git log alias that really helps me understand the branching/merging situation in local/remote/upstream. Maybe you would be interested in trying it?
Screen Shot 2020-11-24 at 8 00 09 PM

@tjcrone
Copy link
Contributor

tjcrone commented Nov 25, 2020

The alias is in my dotfiles: https://github.com/tjcrone/dotfiles/blob/master/.gitconfig. I don't remember where I got it but I probably cannot live without it now. Anyway, I'm off for the evening I am happy to help fix this tomorrow if you like. Please let me know. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants