Support native JupyterHub OAuthenticator in 2i2c-managed hubs #706

Merged
merged 22 commits into from
Sep 29, 2021
ff9cf07
Enable reading in extra, secret config when not using auth0 in deployer
sgibson91 Sep 23, 2021
3093a35
Enable additionalProperties for auth0 key in chart schema
sgibson91 Sep 23, 2021
d45b1d7
Add a helm chart schema for secrets/config/hubs
sgibson91 Sep 23, 2021
92887fc
Add some docs on setting up auth using GitHubOAuthenticator in Jupyte…
sgibson91 Sep 23, 2021
fe7cd40
Fix typo in docs
sgibson91 Sep 23, 2021
99e7f82
Add a `required: false` field for `auth0.connection`
sgibson91 Sep 23, 2021
f98a63c
Merge branch 'jupyterhub-oauth-github' of github.com:sgibson91/pilot-…
sgibson91 Sep 23, 2021
4f53a4c
Unconditionally read in secret config if it exists
sgibson91 Sep 23, 2021
ffb4b95
Unset hardcoded authenticator_class config, conditionally set it in d…
sgibson91 Sep 23, 2021
c2587f5
Update docs to reflect most recent fixes
sgibson91 Sep 23, 2021
7d1c6d7
Remove whitelist, update admonition on switching providers
sgibson91 Sep 23, 2021
abaa223
Add some more doc fixes
sgibson91 Sep 23, 2021
81c40be
Add links
sgibson91 Sep 27, 2021
93a792a
Update docs/howto/configure/auth-management.md
sgibson91 Sep 27, 2021
969869a
Add example of /hub/oauth_callback url
sgibson91 Sep 27, 2021
e3f88f0
Switch admonition blocks for notes
sgibson91 Sep 27, 2021
2646ecf
Update deployer/hub.py
sgibson91 Sep 27, 2021
ba6735e
Merge branch 'master' into jupyterhub-oauth-github
sgibson91 Sep 27, 2021
fe4ab01
Remove stray `users` key from code snippet in docs
sgibson91 Sep 28, 2021
c937694
Give more explicit fallback when reading secret config
sgibson91 Sep 29, 2021
5ca78c9
Clarify a note in docs
sgibson91 Sep 29, 2021
b1d6aac
Update dynamically set authenticator_class to be CustomOAuthenticator…
sgibson91 Sep 29, 2021
3 changes: 2 additions & 1 deletion config/hubs/schema.yaml
@@ -158,7 +158,7 @@ properties:
- basehub
- daskhub
auth0:
additionalProperties: false
additionalProperties: true
type: object
description: |
All hubs use Auth0 for authentication, and we dynamically fetch the credentials
@@ -176,6 +176,7 @@ properties:
Authentication method users of the hub can use to log in to the hub.
We support a subset of the [connectors](https://auth0.com/docs/identityproviders)
that auth0 supports
required: false
application_name:
type: string
description: |
20 changes: 19 additions & 1 deletion deployer/hub.py
@@ -348,6 +348,7 @@ def get_generated_config(self, auth_provider: KeyProvider, secret_key):
# FIXME: We're hardcoding Auth0OAuthenticator here
# We should *not*. We need dictionary merging in code, so
# these can all exist fine.
generated_config['jupyterhub']['hub']['config']['JupyterHub']['authenticator_class'] = 'CustomOAuthenticator'
generated_config['jupyterhub']['hub']['config']['Auth0OAuthenticator'] = auth_provider.get_client_creds(client, self.spec['auth0']['connection'])

return self.apply_hub_template_fixes(generated_config, secret_key)
@@ -427,13 +428,29 @@ def deploy(self, auth_provider, secret_key, skip_hub_health_test=False):
subprocess.check_call(["helm", "dep", "up", "daskhub"])
os.chdir("..")

# Check if this cluster has any secret config. If yes, read it in.
secret_config_path = Path(os.getcwd()) / "secrets/config/hubs" / f'{self.cluster.spec["name"]}.cluster.yaml'

if os.path.exists(secret_config_path):
    with decrypt_file(secret_config_path) as decrypted_file_path:
        with open(decrypted_file_path) as f:
            secret_config = yaml.load(f)

    hubs = secret_config["hubs"]
    secret_hub_config = next((hub for i, hub in enumerate(hubs) if hubs[i]["name"] == self.spec["name"]), {"config": {}})
    secret_hub_config = secret_hub_config["config"]
else:
    secret_hub_config = {}

generated_values = self.get_generated_config(auth_provider, secret_key)

with tempfile.NamedTemporaryFile(mode='w') as values_file, tempfile.NamedTemporaryFile(mode='w') as generated_values_file:
with tempfile.NamedTemporaryFile(mode='w') as values_file, tempfile.NamedTemporaryFile(mode='w') as generated_values_file, tempfile.NamedTemporaryFile(mode='w') as secret_values_file:
json.dump(self.spec['config'], values_file)
json.dump(generated_values, generated_values_file)
json.dump(secret_hub_config, secret_values_file)
values_file.flush()
generated_values_file.flush()
secret_values_file.flush()

cmd = [
'helm', 'upgrade', '--install', '--create-namespace', '--wait',
@@ -444,6 +461,7 @@ def deploy(self, auth_provider, secret_key, skip_hub_health_test=False):
# we should put the config from config/hubs last.
'-f', generated_values_file.name,
'-f', values_file.name,
'-f', secret_values_file.name,
]

print(f"Running {' '.join(cmd)}")
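
The secret-config lookup and the layering of the three values files in this hunk can be sketched in plain Python. This is only an illustration, not the deployer's code: `find_secret_hub_config` and `deep_merge` are hypothetical helpers, the sample values are made up, and the merge only approximates how Helm combines several `-f` files (later files take precedence, nested maps are merged, other values are replaced).

```python
from copy import deepcopy


def find_secret_hub_config(secret_config, hub_name):
    """Return the `config` block for `hub_name`, or {} if no entry matches."""
    for hub in secret_config.get("hubs", []):
        if hub.get("name") == hub_name:
            return hub.get("config", {})
    return {}


def deep_merge(base, override):
    """Merge `override` into a copy of `base`; values from `override` win."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale, not combined.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-ins for the three files passed to helm, in order.
generated_values = {"jupyterhub": {"hub": {"config": {}}}}
hub_values = {
    "jupyterhub": {"hub": {"config": {
        "JupyterHub": {"authenticator_class": "github"},
        "GitHubOAuthenticator": {"allowed_organizations": ["2i2c-org"]},
    }}}
}
secret_config = {
    "hubs": [{
        "name": "staging",
        "config": {"jupyterhub": {"hub": {"config": {
            "GitHubOAuthenticator": {
                "client_id": "CLIENT_ID",
                "client_secret": "CLIENT_SECRET",
            },
        }}}},
    }]
}

secret_values = find_secret_hub_config(secret_config, "staging")

# Apply the layers in the same order as the `-f` flags: generated, hub, secret.
final_values = {}
for layer in (generated_values, hub_values, secret_values):
    final_values = deep_merge(final_values, layer)

print(final_values["jupyterhub"]["hub"]["config"]["GitHubOAuthenticator"])
```

Because the secret layer is applied last, the client credentials are added alongside the non-secret `GitHubOAuthenticator` settings rather than replacing them.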
121 changes: 113 additions & 8 deletions docs/howto/configure/auth-management.md
@@ -1,6 +1,8 @@
# Manage authentication

[auth0](https://auth0.com) provides authentication for all hubs here. It can
## Auth0

[auth0](https://auth0.com) provides authentication for the majority of 2i2c hubs. It can
be configured with many different [connections](https://auth0.com/docs/identityproviders)
that users can authenticate with - such as Google, GitHub, etc.

@@ -38,19 +40,122 @@ So we want to manage authentication by:
config:
jupyterhub:
auth:
# will be renamed allowedlist in future JupyterHub
whitelist:
users:
allowed_users:
# WARNING: THESE USER LISTS MUST MATCH (for now)
- user1@gmail.com
- user2@gmail.com
admin:
users:
admin_users:
# WARNING: THESE USER LISTS MUST MATCH (for now)
- user1@gmail.com
- user2@gmail.com
```

```{admonition} Switching auth
Switching authentication for a pre-existing hub will simply create new usernames. Any pre-existing users will no longer be able to access their accounts (although administrators will be able to do so). If you have pre-existing users and want to switch the hub authentication, rename the users to the new auth pattern (e.g. convert github handles to emails).
```
Switching authentication providers (e.g. from GitHub to Google) for a pre-existing hub will simply create new usernames. Any pre-existing users will no longer be able to access their accounts (although administrators will be able to do so). If you have pre-existing users and want to switch the hub authentication, rename the users to the new auth pattern (e.g. convert github handles to emails).
```

## Native JupyterHub OAuthenticator for GitHub Orgs and Teams

```{note}
This setup is currently only supported for communities that **require** authentication via a GitHub organisation or team.

We may update this policy in the future.
```

For communities that require authenticating users against [a GitHub organisation or team](https://docs.github.com/en/organizations), we instead use the [native JupyterHub OAuthenticator](https://github.com/jupyterhub/oauthenticator).
Presently, this involves a few more manual steps than the `auth0` setup described above.

1. **Create a GitHub OAuth App.**
This can be achieved by following [GitHub's documentation](https://docs.github.com/en/developers/apps/building-oauth-apps/creating-an-oauth-app).
- Use the "Switch account" button at the top of your settings page to make sure you have `2i2c-org` selected.
That way, the app will be owned by the `2i2c-org` GitHub org, rather than your personal GitHub account.
- When naming the application, please follow the convention `<CLUSTER_NAME>-<HUB_NAME>` for consistency, e.g. `2i2c-staging` is the OAuth app for the staging hub running on the 2i2c cluster.
- The Homepage URL should match that in the `domain` field of the appropriate `*.cluster.yaml` file in the `pilot-hubs` repo.
- The authorisation callback URL is the Homepage URL with `/hub/oauth_callback` appended. For example, `staging.pilot.2i2c.cloud/hub/oauth_callback`.
- Once you have created the OAuth app, make a note of the client ID, generate a client secret, and hold on to both values for a later step.

2. **Create or update the appropriate secret config file under `secrets/config/hubs/*.cluster.yaml`.**
You should add the following config to this file, pasting in the client ID and secret you generated in step 1.

```yaml
hubs:
  - name: HUB_NAME
    config:
      jupyterhub:
        hub:
          config:
            GitHubOAuthenticator:
              client_id: CLIENT_ID
              client_secret: CLIENT_SECRET
```

````{note}
Add the `basehub` key between `config` and `jupyterhub` for `daskhub` deployments.
For example:

```yaml
hubs:
  - name: HUB_NAME
    config:
      basehub:
        jupyterhub:
          ...
```
````

```{note}
Make sure this is encrypted with `sops` before committing it to the repository!

`sops -i -e secrets/config/hubs/*.cluster.yaml`
```

3. **Edit the non-secret config under `config/hubs`.**
You should make sure the matching hub config takes one of the following forms.

To authenticate against a GitHub organisation:

```yaml
hubs:
  - name: HUB_NAME
    auth0:
      enabled: false
    ... # Other config
    config:
      jupyterhub:
        hub:
          config:
            JupyterHub:
              authenticator_class: github
            GitHubOAuthenticator:
              oauth_callback_url: https://{{ HUB_DOMAIN }}/hub/oauth_callback
              allowed_organizations:
                - 2i2c-org
                - ORG_NAME
              scope:
                - read:user
```

To authenticate against a GitHub Team:

```yaml
hubs:
  - name: HUB_NAME
    auth0:
      enabled: false
    ... # Other config
    config:
      jupyterhub:
        hub:
          config:
            JupyterHub:
              authenticator_class: github
            GitHubOAuthenticator:
              oauth_callback_url: https://{{ HUB_DOMAIN }}/hub/oauth_callback
              allowed_organizations:
                - 2i2c-org:tech-team
                - ORG_NAME:TEAM_NAME
              scope:
                - read:org
```

4. Run the deployer as normal to apply the config.
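
As a rough illustration of step 3, the snippet below assembles the non-secret authenticator settings in Python. `github_auth_config` is a hypothetical helper and not part of the deployer. It assumes, following the two YAML examples above, that entries of the form `ORG:TEAM` need the `read:org` scope while plain organisation names only need `read:user`.

```python
def github_auth_config(domain, allowed):
    """Build the non-secret GitHub auth config for one hub (illustrative only)."""
    # Assumption: any `ORG:TEAM` entry means team membership must be readable.
    uses_teams = any(":" in entry for entry in allowed)
    return {
        "JupyterHub": {"authenticator_class": "github"},
        "GitHubOAuthenticator": {
            # The callback URL is the hub's domain plus /hub/oauth_callback.
            "oauth_callback_url": f"https://{domain}/hub/oauth_callback",
            "allowed_organizations": list(allowed),
            "scope": ["read:org" if uses_teams else "read:user"],
        },
    }


org_config = github_auth_config("staging.pilot.2i2c.cloud", ["2i2c-org"])
team_config = github_auth_config("staging.pilot.2i2c.cloud", ["2i2c-org:tech-team"])

print(org_config["GitHubOAuthenticator"]["oauth_callback_url"])
print(team_config["GitHubOAuthenticator"]["scope"])
```

The client ID and secret are deliberately absent here: they belong in the encrypted file from step 2.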
5 changes: 0 additions & 5 deletions hub-templates/basehub/values.yaml
@@ -211,9 +211,6 @@ jupyterhub:
image:
name: quay.io/2i2c/pilot-hub
tag: '0.0.1-n1159.h5b045cd'
config:
JupyterHub:
authenticator_class: oauthenticator.auth0.Auth0OAuthenticator
nodeSelector:
hub.jupyter.org/node-purpose: core
networkPolicy:
@@ -317,8 +314,6 @@ jupyterhub:
resp['name'] = resp['name'].split('|')[-1]
return resp

c.JupyterHub.authenticator_class = CustomOAuthenticator

Contributor

I do not get this removal... don't you need to keep this in auth0 enabled clusters?

Member Author

So it might be that this line

https://github.com/sgibson91/pilot-hubs/blob/fe4ab0101b6900997b9c1c0fc90e62b1b57e1720/deployer/hub.py#L351

should be the following instead

generated_config['jupyterhub']['hub']['config']['JupyterHub']['authenticator_class'] = 'CustomOAuthenticator'

I wasn't sure and was a bit confused, hence why I asked for my work to be checked in this discussion thread #706 (comment)

Contributor

I am not sure the alternative you wrote in the previous comment will work in your pangeo (non-auth0) case.
What about doing all the stuff under 06-custom-authenticator conditional if auth0 is configured? As it is done here with the scratch_bucket?

Member

So it might be that this line

https://github.com/sgibson91/pilot-hubs/blob/fe4ab0101b6900997b9c1c0fc90e62b1b57e1720/deployer/hub.py#L351

should be the following instead

generated_config['jupyterhub']['hub']['config']['JupyterHub']['authenticator_class'] = 'CustomOAuthenticator'

@sgibson91, I believe so too!

At some point, I did this chart about how different configs are put together. I'm not sure if it's super accurate now, but for me, at least it provides a visual starting point, though I'm still a bit confused about them (hence, my forever-taking PR with the staff admin users 🙄 )

One thing that's not accurate in that chart (I think) is that although the values.yaml template is the first one that gets loaded, the extra_config in values.yaml is actually the last one that gets loaded by the hub. At least that's what I understand from the first few lines of the hub logs:

Loading extra config: 01-working-dir
Loading extra config: 02-prometheus
Loading extra config: 03-no-setuid
Loading extra config: 04-custom-theme
Loading extra config: 05-custom-admin
Loading extra config: 06-custom-authenticator
Loading extra config: 07-cloud-storage-bucket

I am not sure the alternative you wrote in the previous comment will work in your pangeo (non-auth0) case.

@damianavila, it shouldn't be set for non-auth0 cases.

Member Author

Thank you @GeorgianaElena! I will update that line and see what people think. Also very nice chart! 😉


@damianavila for clarity's sake, we now need to dynamically set the value of JupyterHub.authenticator_class based on which authentication method we are using, hence why these values are being removed from the chart. So when using auth0, authenticator_class will be set in the deployer in this block:

https://github.com/sgibson91/pilot-hubs/blob/fe4ab0101b6900997b9c1c0fc90e62b1b57e1720/deployer/hub.py#L336-L352

And when using the native GitHubOAuthenticator, we set the class in the *.cluster.yaml hub config, as demonstrated in the accompanying docs, e.g. Step 3 here: https://github.com/sgibson91/pilot-hubs/blob/jupyterhub-oauth-github/docs/howto/configure/auth-management.md#native-jupyterhub-oauthenticator-for-github-orgs-and-teams


What about doing all the stuff under 06-custom-authenticator conditional if auth0 is configured? As it is done here with the scratch_bucket?

I am not convinced that method will work. If I understand the code correctly, the highest level config key that get_config reads in is everything under jupyterhub and the auth0.enabled key is much higher in the hierarchy of keys, adjacent to hubs, so I'm not sure the z2jh module can even see it?

Image to clarify what I mean by the hierarchy: [image: config]

Member Author

I updated the authenticator_class for auth0 hubs in b1d6aac

Contributor

Hey, thanks @GeorgianaElena and @sgibson91 for the discussion and the clarifications!
I now understand why I did not catch it... I totally missed this line. This is why I proposed the alternative "conditional approach" (without any testing, of course 😉 ).
In fact, I was expecting that auth0 "enabled" property to have some representation in the schema (I guess that is another discussion to have...), and since I did not find it there, and because I did not check some lines above 🤦 , I assumed there was not a conditional approach.
I concur with the last commit you pushed @sgibson91.

Contributor

so I'm not sure the z2jh module can even see it?

Yep, you would need to have somehow an auth under custom as well for being able to catch it... but no need to test that now 😜 .

07-cloud-storage-bucket: |
from z2jh import get_config
cloud_resources = get_config('custom.cloudResources')
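
To illustrate the point raised in the review thread, here is a minimal stand-in for a dotted-path lookup such as `get_config('custom.cloudResources')`. The `lookup` function and the sample `hub_entry` are hypothetical. The sketch assumes, as the thread suggests, that the hub only sees the values under `jupyterhub`, so a deployer-level key such as `auth0.enabled` is not reachable unless it is copied somewhere under `custom`.

```python
def lookup(values, dotted_key, default=None):
    """Walk a nested dict using a dotted key; return `default` if any part is missing."""
    node = values
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


# Hypothetical hub entry as the deployer sees it in a *.cluster.yaml file.
hub_entry = {
    "name": "staging",
    "auth0": {"enabled": False},
    "config": {
        "jupyterhub": {
            "custom": {"cloudResources": {"provider": "gcp"}},
        },
    },
}

# Assumption: only this subtree is visible to code running inside the hub.
visible_to_hub = hub_entry["config"]["jupyterhub"]

print(lookup(visible_to_hub, "custom.cloudResources.provider"))
print(lookup(visible_to_hub, "auth0.enabled", default="not visible"))
```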
24 changes: 24 additions & 0 deletions secrets/config/hubs/schema.yaml
@@ -0,0 +1,24 @@
$schema: 'http://json-schema.org/draft-07/schema#'
type: object
additionalProperties: false
properties:
hubs:
type: array
description: |
Each item here is additional config for a hub deployed to this cluster.
required:
- name
- config
items:
- type: object
additionalProperties: false
properties:
name:
type: string
description: |
Name of the hub. This will be used to determine
the namespace the hub is deployed to
config:
type: object
description: |
YAML configuration containing secrets that is passed through to helm.
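
A quick way to sanity-check a decrypted secret config against the shape this schema describes is sketched below. It is a hand-written check using only the standard library, not a JSON Schema validator. `check_secret_config` is a hypothetical helper, and it assumes every hub entry should carry exactly the `name` and `config` keys.

```python
def check_secret_config(secret_config):
    """Return a list of problems; an empty list means the shape looks right."""
    if not isinstance(secret_config, dict):
        return ["top level must be a mapping"]

    problems = []
    extra = set(secret_config) - {"hubs"}
    if extra:
        problems.append(f"unexpected top-level keys: {sorted(extra)}")

    hubs = secret_config.get("hubs", [])
    if not isinstance(hubs, list):
        problems.append("`hubs` must be a list")
        return problems

    for index, hub in enumerate(hubs):
        if not isinstance(hub, dict):
            problems.append(f"hubs[{index}] must be a mapping")
            continue
        for key in ("name", "config"):
            if key not in hub:
                problems.append(f"hubs[{index}] is missing `{key}`")
        unexpected = set(hub) - {"name", "config"}
        if unexpected:
            problems.append(f"hubs[{index}] has unexpected keys: {sorted(unexpected)}")
        if "name" in hub and not isinstance(hub["name"], str):
            problems.append(f"hubs[{index}].name must be a string")
        if "config" in hub and not isinstance(hub["config"], dict):
            problems.append(f"hubs[{index}].config must be a mapping")
    return problems


good = {"hubs": [{"name": "staging", "config": {"jupyterhub": {}}}]}
bad = {"hubs": [{"name": "staging"}], "extra": 1}

print(check_secret_config(good))
print(check_secret_config(bad))
```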