Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RBAC for NFS shared group creation/loading #2563

Closed

Conversation

viniciusdc
Copy link
Contributor

@viniciusdc viniciusdc commented Jul 11, 2024

Reference Issues or PRs

closes #2431

This adds the ability to manage group shared directory creation based on the availability of the shared-directory resource/scope in any associated jupyterhub client role bonded to a group.

All default groups analyst, developer, admin, for backward compatibility, will now have the following role assigned to them:

create-shared-directory-role:
   resource: shared-directory
   scopes: write:shared

which is then parsed along the jupyterhub authenticator, and later grabbed by the render_profiles function to determine which group will be mounted into the user spawned profile.

  • Add a new jupyterhub client role, named create-shared-directory-role, and assigns it to default groups
  • add a new variable and key to the keycloak-client terraform module to allow the passage of role attributes
  • add new _fetch_and_map_roles and user_group_role_mapping attributes to the KeyCloakOAuthenticator class to help reduce some duplicated code and create a way for the spawner object to access the keycloak group/roles information during runtime

What does this implement/fix?

Put a x in the boxes that apply

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds a feature)
  • Breaking change (fix or feature that would cause existing features not to work as expected)
  • Documentation Update
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Build related changes
  • Other (please describe):

Testing

  • Did you test the pull request locally?
  • Did you add new tests?

Any other comments?

TODO: Lacks testing for now

@viniciusdc
Copy link
Contributor Author

The new role can be seen below, and a manually added one with a different name to showcase that this can be easily expanded, and the role's name is not mandatory.
Captura de tela de 2024-07-11 16 34 01

A Jupiter instance for a user whose access to analyst has been revoked and a new test group with the above custom role is now included. -- showcases that the mounting based on the presence of the role/resource is being correctly parsed.
Captura de tela de 2024-07-11 16 33 28

@viniciusdc viniciusdc requested a review from aktech July 11, 2024 19:48
@viniciusdc
Copy link
Contributor Author

The edge case where a group had the above role and was later the role was removed from it: the shared directory for the same should not be removed as it could have user data and that might get lost and cause unexpected behaviour.

This edge case is not handled currently in the code. Still, the data for a non-mounted group should persist indefinitely unless manually removed with root permissions from the FS -- consequently. we kind satisfy this requirement

name = "grafana"
name = var.client_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes #2428, thank you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@krassowski I will open a small PR to address that later, something was going wrong with this branch when testing, so I removed that one from now

Copy link
Member

@aktech aktech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick pass, need to take a deeper look, when the deployment succeeds.

I was unable to deploy on an existing local cluster, with the following error

Loading /usr/local/etc/jupyterhub/secret/values.yaml
No config at /usr/local/etc/jupyterhub/existing-secret/values.yaml
Loading extra config: 01-theme.py
Loading extra config: 02-spawner.py
Loading extra config: 03-profiles.py
Loading extra config: 04-auth.py
[W 2024-07-22 13:47:12.438 JupyterHub app:1697] JupyterHub.extra_handlers is deprecated in JupyterHub 3.1. Please use JupyterHub services to register additional HTTP endpoints.
[I 2024-07-22 13:47:12.438 JupyterHub app:3286] Running JupyterHub version 5.0.0
[I 2024-07-22 13:47:12.438 JupyterHub app:3316] Using Authenticator: builtins.KeyCloakOAuthenticator
[I 2024-07-22 13:47:12.438 JupyterHub app:3316] Using Spawner: jhub_apps.spawner.spawner_creation.JHubSpawner
[I 2024-07-22 13:47:12.438 JupyterHub app:3316] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-5.0.0
[I 2024-07-22 13:47:12.482 JupyterHub <string>:119] Loading managed roles
[I 2024-07-22 13:47:12.517 JupyterHub <string>:162] Mapping roles with groups and users..
[I 2024-07-22 13:47:12.565 JupyterHub <string>:162] Mapping roles with groups and users..
[I 2024-07-22 13:47:12.587 JupyterHub <string>:154] Loaded 8 roles from keycloak..
    
[W 2024-07-22 13:47:12.594 JupyterHub roles:185] Role uma_authorization will have no scopes
[W 2024-07-22 13:47:12.594 JupyterHub roles:185] Role default-roles-nebari will have no scopes
[W 2024-07-22 13:47:12.594 JupyterHub roles:185] Role offline_access will have no scopes
[W 2024-07-22 13:47:12.594 JupyterHub roles:185] Role jupyterhub_admin will have no scopes
[W 2024-07-22 13:47:12.595 JupyterHub roles:185] Role dask_gateway_admin will have no scopes
[W 2024-07-22 13:47:12.595 JupyterHub roles:185] Role create-shared-directory-role will have no scopes
[W 2024-07-22 13:47:12.595 JupyterHub roles:185] Role dask_gateway_developer will have no scopes
[W 2024-07-22 13:47:12.595 JupyterHub roles:185] Role jupyterhub_developer will have no scopes
[I 2024-07-22 13:47:12.614 JupyterHub app:2872] Creating service jupyterhub-idle-culler without oauth.
[I 2024-07-22 13:47:12.617 JupyterHub app:2872] Creating service dask-gateway without oauth.
[I 2024-07-22 13:47:12.619 JupyterHub app:2872] Creating service monitoring without oauth.
[I 2024-07-22 13:47:12.620 JupyterHub app:2872] Creating service Argo without oauth.
[I 2024-07-22 13:47:12.622 JupyterHub app:2872] Creating service Users without oauth.
[I 2024-07-22 13:47:12.624 JupyterHub app:2872] Creating service Environments without oauth.
[I 2024-07-22 13:47:12.626 JupyterHub app:2872] Creating service Monitoring without oauth.
[I 2024-07-22 13:47:12.633 JupyterHub app:2834] Creating service japps with oauth_client_id=service-japps
[I 2024-07-22 13:47:12.679 JupyterHub provider:663] Updating oauth client service-japps
[I 2024-07-22 13:47:12.699 JupyterHub <string>:119] Loading managed roles
[I 2024-07-22 13:47:12.714 JupyterHub <string>:162] Mapping roles with groups and users..
[I 2024-07-22 13:47:12.737 JupyterHub <string>:162] Mapping roles with groups and users..
[I 2024-07-22 13:47:12.754 JupyterHub <string>:154] Loaded 8 roles from keycloak..
    
[I 2024-07-22 13:47:12.773 JupyterHub app:2514] Deleted 5 stale group role assignments previously added by an authenticator
/opt/conda/lib/python3.9/site-packages/jupyterhub/app.py:2520: SAWarning: DELETE statement on table 'group_role_map' expected to delete 5 row(s); 2 were matched.  Please set confirm_deleted_rows=False within the mapper configuration to prevent this warning.
  db.commit()
[E 2024-07-22 13:47:12.775 JupyterHub app:3852]
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.9/site-packages/jupyterhub/app.py", line 3849, in launch_instance_async
        await self.initialize(argv)
      File "/opt/conda/lib/python3.9/site-packages/jupyterhub/app.py", line 3337, in initialize
        await self.init_role_assignment()
      File "/opt/conda/lib/python3.9/site-packages/jupyterhub/app.py", line 2520, in init_role_assignment
        db.commit()
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1451, in commit
        self._transaction.commit(_to_root=self.future)
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 829, in commit
        self._prepare_impl()
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
        self.session.flush()
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3444, in flush
        self._flush(objects)
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3584, in _flush
        transaction.rollback(_capture_exception=True)
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
        compat.raise_(
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
        raise exception
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3544, in _flush
        flush_context.execute()
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
        rec.execute(self)
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 579, in execute
        self.dependency_processor.process_saves(uow, states)
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1182, in process_saves
        self._run_crud(
      File "/opt/conda/lib/python3.9/site-packages/sqlalchemy/orm/dependency.py", line 1207, in _run_crud
        raise exc.StaleDataError(
    sqlalchemy.orm.exc.StaleDataError: DELETE statement on table 'group_role_map' expected to delete 2 row(s); Only 0 were matched.                                                                                                                                   


for role in data:
for group in role["groups"]:
group = str(group).replace("/", "")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

group names can consist on multiple /, e.g.

Screenshot 2024-07-22 at 2 39 26 pm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, a subgroup name can also be exposed with dashes; thanks for the reminder. I will slice the first entry of the string instead.

@viniciusdc
Copy link
Contributor Author

I was unable to deploy on an existing local cluster, with the following error

I tested from a local deployment only... I will test a cloud one as well (I will test a fresh deploy then redeploy based on this branch) to try to replicate the issue you had

@aktech
Copy link
Member

aktech commented Jul 22, 2024

I tested from a local deployment only... I will test a cloud one as well (I will test a fresh deploy then redeploy based on this branch) to try to replicate the issue you had

I think local deployment should be fine. I would suggest to deploy locally from the develop branch and then update that deployment via your branch.

@viniciusdc
Copy link
Contributor Author

viniciusdc commented Jul 26, 2024

I've got the same error when deploying this branch over a fresh deploy from develop. Based on the error message, it looks like a problem occurs when migrating the tables. I understand what is happening in the DB but I find weird that just expanding the role mapping would cause that

@viniciusdc
Copy link
Contributor Author

Okay, the changes I made to auth.py were the reason this was falling the DB commit. I am not sure where exactly yet, but I assume something might've called the internal twice, which leads to a deletion in the SQLite of a resource that has been removed but not fully committed.

@viniciusdc
Copy link
Contributor Author

Uhm... I was able to circunvent the issue above, by removing the changes made to both the profile and auth files of jupyterhub, and just keeping the new roles, but then conda-store is now incapable of keeping its migration state in its DB...


Traceback (most recent call last):
  File "/opt/conda/envs/conda-store-server/bin/conda-store-server", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/traitlets/config/application.py", line 1074, in launch_instance
    app.initialize(argv)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/traitlets/config/application.py", line 118, in inner
    return method(app, *args, **kwargs)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_store_server/server/app.py", line 202, in initialize
    dbutil.upgrade(self.conda_store.database_url)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_store_server/dbutil.py", line 96, in upgrade
    command.upgrade(config=alembic_cfg, revision=revision)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/alembic/command.py", line 403, in upgrade
    script.run_env()
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/alembic/script/base.py", line 583, in run_env
    util.load_python_file(self.dir, "env.py")
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 95, in load_python_file
    module = load_module_py(module_id, path)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/alembic/util/pyfiles.py", line 113, in load_module_py
    spec.loader.exec_module(module)  # type: ignore
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_store_server/alembic/env.py", line 86, in <module>
    run_migrations_online()
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/conda_store_server/alembic/env.py", line 80, in run_migrations_online
    context.run_migrations()
  File "<string>", line 8, in run_migrations
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/alembic/runtime/environment.py", line 948, in run_migrations
    self.get_context().run_migrations(**kw)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/alembic/runtime/migration.py", line 615, in run_migrations
    for step in self._migrations_fn(heads, self):
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/alembic/command.py", line 392, in upgrade
    return script._upgrade_revs(revision, rev)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/alembic/script/base.py", line 447, in _upgrade_revs
    with self._catch_revision_errors(
  File "/opt/conda/envs/conda-store-server/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/opt/conda/envs/conda-store-server/lib/python3.10/site-packages/alembic/script/base.py", line 283, in _catch_revision_errors
    raise util.CommandError(resolution) from re
alembic.util.exc.CommandError: Can't locate revision identified by 'bf065abf375b'
Stream closed EOF for dev/nebari-conda-store-server-cf7bbd9cd-hmrxq (conda-store-server)

@viniciusdc
Copy link
Contributor Author

I will open a new branch directly from develop again, and try to re-introduce the changes one-by-one, as I think some very specific thing is getting in the way now, and I am not being able to split the states from within this PR.

@viniciusdc
Copy link
Contributor Author

The problem disapeared after I did the above, I will compare the changes just out of curiosity, but will reopen this PR using the new branch that's also working as commented above.

@viniciusdc
Copy link
Contributor Author

closing on behalf of #2593

@viniciusdc viniciusdc closed this Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
3 participants