Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework OpenID Connect integration #4272

Merged
merged 23 commits into from
Jun 18, 2024

Conversation

sergei-maertens
Copy link
Member

@sergei-maertens sergei-maertens commented May 6, 2024

Closes #4246 partly

Changes

  • Upgrade to mozilla-django-oidc-db 0.18.0
  • Upgrade to latest django-digid-eherkenning with OIDC support
  • Merged all the callback routes into a single one (opt-in, behind a feature flag)
  • Added upgrade note for changed callback URIs
  • Rewrote integration tests, using VCR.py and a minimal set of mocks

Checklist

Check off the items that are completed or not relevant.

  • Impact on features

    • Checked copying a form
    • Checked import/export of a form
    • Config checks in the configuration overview admin page
    • Problem detection in the admin email digest is handled
  • Release management

    • I have labelled the PR as "needs-backport" accordingly
  • I have updated the translations assets (you do NOT need to provide translations)

    • Ran ./bin/makemessages_js.sh
    • Ran ./bin/compilemessages_js.sh
  • Commit hygiene

    • Commit messages refer to the relevant Github issue
    • Commit messages explain the "why" of change, not the how

@sergei-maertens sergei-maertens force-pushed the feature/4246-refactor-return-flow branch 3 times, most recently from 4d839da to 104d722 Compare May 13, 2024 14:59
@sergei-maertens sergei-maertens force-pushed the feature/4246-refactor-return-flow branch 6 times, most recently from 7a7abf4 to d0253a4 Compare May 24, 2024 16:39
CHANGELOG.rst Outdated Show resolved Hide resolved
@sergei-maertens sergei-maertens force-pushed the feature/4246-refactor-return-flow branch from d0253a4 to d9a31ec Compare May 24, 2024 16:48
@sergei-maertens sergei-maertens force-pushed the feature/4246-refactor-return-flow branch from d9a31ec to 6b5d1ff Compare June 3, 2024 13:03
@sergei-maertens sergei-maertens force-pushed the feature/4246-refactor-return-flow branch 4 times, most recently from eafb265 to f4a392f Compare June 13, 2024 14:42
Copy link

codecov bot commented Jun 13, 2024

Codecov Report

Attention: Patch coverage is 98.97611% with 3 lines in your changes missing coverage. Please review.

Project coverage is 96.49%. Comparing base (5a67646) to head (163109c).
Report is 687 commits behind head on master.

Files with missing lines Patch % Lines
...ntication/contrib/digid_eherkenning_oidc/plugin.py 95.38% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4272      +/-   ##
==========================================
+ Coverage   96.28%   96.49%   +0.21%     
==========================================
  Files         731      719      -12     
  Lines       23846    23688     -158     
  Branches     2811     2799      -12     
==========================================
- Hits        22960    22858     -102     
+ Misses        613      562      -51     
+ Partials      273      268       -5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sergei-maertens sergei-maertens force-pushed the feature/4246-refactor-return-flow branch 4 times, most recently from 628ae3b to 8822ecc Compare June 17, 2024 12:44
@sergei-maertens sergei-maertens changed the title Refactor django-digid-oidc callback/return flow Rework OpenID Connect integration Jun 17, 2024
@sergei-maertens sergei-maertens marked this pull request as ready for review June 17, 2024 12:50
Added tag so that we can easily target or exclude
migration tests when running tests, to save time
during test-runs.

In another commit, the CI configuration can be
updated to run the migration tests as a separate
job to speed up test runs.
The previous configuration was geared towards running Keycloak locally
to test out full integration (manually, with runserver). This was not
really used by anyone, so the configuration got outdated.

Instead, the setup is now done so that the config is spun up every time
we re-record VCR cassettes. The Client ID and Secret are included in
the VCR.py tests and we record the full network traffic for OIDC-based
tests during tests to replay them on CI.

The README documents how to dump the 'fixture' again after making
changes in Keycloak's UI for when requirements change.
Added the digid-eherkenning[oidc] optional dependency group, which
includes the configuration models for DigiD/eHerkenning. This code was
extract from src/digid_eherkenning_oidc_generics and is (almost) a
drop-in replacement for Open Forms.

Also upgraded the mozilla-django-oidc-db package to make use of the
configuration/settings rework done there, so that we can cut down on
amount of code needed in our views/plugins.
All relevant code is moved into either mozilla-django-oidc-db or
django-digid-eherkenning, which have been upgraded already in the
previous commit.
Rather than the just-deleted local package.
* Added a factory for OIDCConfig mocking specifically for the keycloak
  setup in our docker-compose
* Added a login helper that performs the username/password login to
  keycloak for full end-to-end login flow tests. This is taken and
  adapted from our mozilla-django-oidc-db package.
The plugin has been reworked to make use of the new mechanics for
configuration customization in mozilla-django-oidc-db. The tests are
restructured to make use of VCR.py instead of manual mocks, giving
better confidence that things are actually working.

This should make the code and tests a lot more maintainable.
Rewrote the tests for OIDC integration in the admin, using VCR.py and
adapted to the changes made in the oidc library.
* Added an alternative way to handle the threading in the prefill module
  to not break test isolation. Rather than mocking specific logging
  code, the threading aspect itself is removed so that any queries
  run in the same transaction as the test itself.
* Updated the test itself to make use of the keyloak utils and VCR.py
  setup to remove a bunch of mocking.
The digid_eherkenning.oidc package defines the relevant models and
configuration to handle DigiD/eHerkenning via OIDC - the code in there
was moved over from Open Forms and cleaned up. Additional configuration
is added there to be able to capture the full authentication context -
this follows the specification written for the authentication context
data model.

We subclass those config models (as proxy models) to change the Python
behaviour without touching the DB tables:

* set up mechanism to use the legacy callback URLs, since these are
  registered with the identity provider and if we were to just change
  our URLs, that will cause outages. Users can opt into the new
  behaviour - a better construction via feature flags will be added
  later
* Add properties for the 'mandate claims' to extract more information
  about 'machtigen' from the configured claim fields. This property
  will be consumed in our custom authentication backend.
* Register a custom admin form so that the invididual configurations
  can only be disabled if no (Open Forms) forms make use of the
  respective authentication plugins.
The upstream library callback view now always sends the request to
an authentication backend, and the authentication backend itself must
decide whether to create a django user, or return a dummy user instead
(or plain reject the authentication flow if relevant).

The plugin backend is now added to the top of the backends to try since
the is-candidate check is inexpensive and highly situational to our
own proxy-models, allowing other OIDC backends that create (staff)
users to 'fall-through' as usual.

The backend ensures that no real django user is created and is set up
to extract all the necessary claims from the OIDC flow to be able to
'authenticate' the user and store the relevant information in the
session, so that (in a future PR) we can record this information in
our authentication info model(s).

It looks up configuration aspects from the respective config model
used in the authentication flow (the mechanism is provided by
mozilla-django-oidc-db), or if no such property exists on the model,
in the Django settings.
These tests were specified for each plugin, but the general mechanism
is the same and belongs at the authenticaton-module level.
…db mechanism

The plugin is completely restructured to do the same with a lot less
code and intermediate redirects. This should now remove all potential
ways for end-users to modify the redirect chain(s) by tampering with
URLs.

* Intermediate views are removed, instead the plugin now directly
  redirects to the OpenID Connect provider
* The plugin uses the init view construct from mozilla-django-oidc-db,
  which stores the requested configuration model in the session
* The callback view is now a single callback view that handles all 4
  potential config models, using the upstream library mechanism to
  'route' it after the real callback entry point. This massively
  simplifies the URL config and makes it possible to send all different
  configurations through the same endpoint while customizing the
  behaviour. Users need to opt-in to this, for backwards compatibility
  reasons.
* The data extraction mechanism for each config/plugin is updated so
  that we can capture additional data for the authentication context.
  This will be processed in another PR.
This adds the single-callback-url configuration for the OIDC-based
plugins and deprecates the plugin-specific callback endpoints.

Ideally, we'll have a single callback endpoint for the admin OIDC,
org-oidc and digid-eherkenning OIDC, but that will have to be done in
a separate PR, ideally in combination with the feature flag mechanism
so that new installations get the new behaviour while existing instances
continue using the legacy behaviour by default to prevent breaking
changes.
The hooks here are for the co-sign v1 flow, which are not used in v2 or
even the (planned) v3 flows. They can be removed in Open Forms 3.0.
* Updated the tests to use VCR.py rather than manually mocking aspects
  of the OIDC flow
* Moved the tests into the main plugin package rather than splitting
  them out for each plugin in a sub-package. The test structure is
  identical, so they are now individual test cases in a module
  instead.
* The tests now perform real logins against a Keycloak instance,
  making them easier to follow/debug and more reliable that they test
  the actual, desired behaviour.
* The OIDC cache is no more, it's just the plain solo cache now
* Updated the docker-compose to opt-in to new, unified OIDC callback
  endpoint(s).
Forgot to do this a long time ago.
Documented the default legacy behaviour of OIDC callback endpoints, and
update the configuration docs to make people favour the new settings.

A future PR will use a feature flag approach combined with detection
of new/existing instances to apply the most sensible configuration, so
that we don't need people to opt-in to the envvars.
@sergei-maertens sergei-maertens force-pushed the feature/4246-refactor-return-flow branch from 1292983 to 17d49da Compare June 17, 2024 14:20
Copy link
Contributor

@stevenbal stevenbal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly questions, looks good overall 👍



@register("org-oidc")
@register(PLUGIN_IDENTIFIER)
class OIDCAuthentication(BasePlugin):
"""
Authentication plugin using the global mozilla-django-oidc-db (as used for the admin)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this plugin is used for currently, is this an auth plugin that allows staff users to log in to the form?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Precisely, for two possible reasons:

  • internal staff-only forms where they need to authenticate with their work account
  • forms that can be filled out by staff for someone else, when they call or have a physical appointment. Staff users can then enter the BSN or KVK if desired.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, didn't know about those options


# XXX: sort out callback endpoints
# deprecated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be removed in a later PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - in Open Forms 3.0. We're tracking stuff to remove in #3283 - but also tend to add a DeprecationWarning marker so we can find the stuff we forgot to mention in the ticket by CTRL + F

Comment on lines +112 to +114
# "authorizee_branch_number": self.branch_number_claim,
# "service_id": self.mandate_service_id_claim,
# "service_uuid": self.mandate_service_uuid_claim,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this for a future PR?

Copy link
Member Author

@sergei-maertens sergei-maertens Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed - I had them there in non-commented version first, but that caused problems because not all claims are expected to always be present (in particular, authorizee_branch_number).

It's part of #4246

USE_LEGACY_DIGID_EH_OIDC_ENDPOINTS = config(
"USE_LEGACY_DIGID_EH_OIDC_ENDPOINTS",
default=True,
)
USE_LEGACY_ORG_OIDC_ENDPOINTS = config(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that these feature flags being True only adds DeprecationWarnings? Can't we also exclude the actual urlpatterns if they are False?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's intentional - three reasons:

  • in Django, you want to defer settings module evaluation, so accessing settings at the module-level is a no-no (mozilla-django-oidc's urls.py setup is terrible for that), because...
  • using @override_settings(...) in tests only patches the settings, but it doesn't reload the URL conf, unless you specify a different ROOT_URLS, and at that point we're no longer testing our actual URL conf but some hypothetical thing
  • having both the new and legacy endpoints makes it easier to transition to the new situation - modifying the Redirect URIs int he OIDC Provider then doesn't require you to re-deploy the application with the updated envvars at the same time

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI - they also return different URL names to reverse, it's not just a deprecation warning

Comment on lines +3 to +4
The integration tests for the OIDC flavour of DigiD/eHerkenning uses VCR.py. To (re-) record the
cassettes, perform the following steps (from the root of the repo):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to set any VCR envvars to make this happen? Or is this not needed because you delete the existing cassettes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleting them is the easiest approach. There are envvars, but I find they don't behave as expected. Most notably, they will just append new requests to existing cassettes rather than replace them.

Deleting the files and then re-recording results effectively in "updates", and the benefit is that you get nice diffs of what has changed (should typically only be Date headers, etc)

@stevenbal
Copy link
Contributor

Hmmm, when testing if the admin OIDC login still works, I seem to get an infinite redirect. Does this happen for you too?

2024-06-18 10:01:58,574 INFO django.server  "GET /oidc/callback/?state=sOrlYkNtqVnoPNTpq43JstzilkgnjHiV&session_state=09881689-0ecd-4b61-93ca-27216dbf9d34&iss=http%3A%2F%2Flocalhost%3A8080%2Frealms%2Ftest&code=7e72081f-df1c-4881-91fe-7f44911c564d.09881689-0ecd-4b61-93ca-27216dbf9d34.adf4ad83-4550-4619-9231-73bd8d700f45 HTTP/1.1" 302 0
2024-06-18 10:01:58,629 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:58,739 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:58,838 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:58,962 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,064 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,169 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,269 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,369 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,468 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,564 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,660 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,757 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,854 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,955 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,077 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,175 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,281 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,385 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,491 INFO django.server  "GET /admin/ HTTP/1.1" 302 0

@stevenbal stevenbal self-requested a review June 18, 2024 08:03
Copy link
Contributor

@stevenbal stevenbal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to mark this as changes requested, I want to test EH/DigiD before I approve

This backed became obsolete, so it shouldn't be in this setting.
The system check didn't report it in dev mode because the default
there is to disable 2FA entirely.
@sergei-maertens
Copy link
Member Author

Hmmm, when testing if the admin OIDC login still works, I seem to get an infinite redirect. Does this happen for you too?

2024-06-18 10:01:58,574 INFO django.server  "GET /oidc/callback/?state=sOrlYkNtqVnoPNTpq43JstzilkgnjHiV&session_state=09881689-0ecd-4b61-93ca-27216dbf9d34&iss=http%3A%2F%2Flocalhost%3A8080%2Frealms%2Ftest&code=7e72081f-df1c-4881-91fe-7f44911c564d.09881689-0ecd-4b61-93ca-27216dbf9d34.adf4ad83-4550-4619-9231-73bd8d700f45 HTTP/1.1" 302 0
2024-06-18 10:01:58,629 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:58,739 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:58,838 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:58,962 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,064 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,169 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,269 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,369 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,468 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,564 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,660 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,757 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,854 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:01:59,955 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,077 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,175 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,281 INFO django.server  "GET /admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,385 INFO django.server  "GET /admin/login/?next=/admin/ HTTP/1.1" 302 0
2024-06-18 10:02:00,491 INFO django.server  "GET /admin/ HTTP/1.1" 302 0

Investigated - I get this behaviour when the OIDC user is not made staff and I created #4401 for follow-up.

@stevenbal stevenbal self-requested a review June 18, 2024 10:14
@sergei-maertens sergei-maertens merged commit 33dbac9 into master Jun 18, 2024
31 checks passed
@sergei-maertens sergei-maertens deleted the feature/4246-refactor-return-flow branch June 18, 2024 10:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Capture authentication context data
2 participants