Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for WAYFless URLs #172

Merged
merged 16 commits into from
Mar 24, 2022
Merged

Add support for WAYFless URLs #172

merged 16 commits into from
Mar 24, 2022

Conversation

vbessonov
Copy link
Contributor

@vbessonov vbessonov commented Jan 30, 2022

Caveats

The WAYFless URL functionality has not been fully tested. See caveats comment below in this PR.

Description

This PR adds support for WAYFless URLs.

There is a new called setting SAML WAYFless URL Template used for configuring acquisition link templates as described below:
image

Now it's available only for OPDS collections.
In order to add this config setting to other collection, you need:

  1. Add SAMLWAYFlessConfiguration to the collection's configuration settings list:
  • add SAMLWAYFlessConfiguration class as a parent of their configuration class:
class MyCollectionSettings(SAMLWAYFlessConfiguration):
  pass


class MyCollection:
  SETTINGS = ... + MyCollectionSettings.to_settings()
  • OR add SAMLWAYFlessConfiguration's properties to the collection's SETTINGS list directly:
class MyCollection:
  SETTINGS = ... + SAMLWAYFlessConfiguration.to_settings()
  1. Add mapping to CirculationAPI.default_fulfilment_post_processors_map:
 @property
    def default_fulfilment_post_processors_map(
        self,
    ) -> Dict[str, Type[CirculationFulfilmentPostProcessor]]:
        """Return a default mapping of protocols to fulfilment post-processors.

        :return: Mapping of protocols to fulfilment post-processors.
        """
        from core.opds_import import OPDSImporter

        from .saml.wayfless import SAMLWAYFlessAcquisitionLinkProcessor

        return {
            OPDSImporter.NAME: SAMLWAYFlessAcquisitionLinkProcessor,
        }

This will inject SAMLWAYFlessAcquisitionLinkProcessor to the fulfilment processing chain and will be calling it to do post-processing of FulfilmentInfo objects fulfilled by collection integrations.

Motivation and Context

The phrase "Where Are You From?" (WAYF) is often used to characterise identity provider discovery.

Generally speaking, a discovery service is a solution to the identity provider discovery problem, a longstanding problem in the federated identity management space when there are multiple identity providers available each corresponding to a specific organisation.

Discovery service provides a browser-based interface where a user selects his or her home organization (i.e., identity provider). A service provider uses this information to initiate SAML Web Browser SSO.

Please refer to examples below to understand how typical IdP discovery service pages look like:
image
imaged5c1400d11f3/Untitled.png)

To learn how a discovery service works, the SWITCH federation has an excellent series of demos that describe and illustrate how a discovery service integrates into a typical SAML flow.

A WAYFless URL, is specific to an institution with associated users and to a web-based service or resource. It enables a user from an institution to gain federated access to the service or resource in a way that bypasses the "Where Are You From?" (WAYF) page or Discovery Service step in SAML based authentication and access protocols.

Circulation Manager does not support WAYFLess URLs since it has it own unique built-in discovery service:

  • Circulation Manager mentions all the configured by admins IdPs in the authentication document
  • Mobile apps allow patrons to choose an IdP they want to authenticate against:

image

However, many prominent Ebook Content Vendors such as ProQuest, Springer, Taylor and Francis, FULCRUM, etc. use WAYFLess URLs.

How Has This Been Tested?

I tested the script using the SAML testbed :

  1. Comment out es, db, cm, cm-test services in docker-compose.yml.

  2. Run the SAML testbed:

docker-compose up -d
  1. Set up the local Circulation Manager as it's described in the SAML testbed's README file.

  2. Set SAML WAYFless URL Template configuration setting of opds.hilbertteam.net collection to https://fsso.springer.com/saml/login?idp={idp}&targetUrl={targetUrl}:
    image

  3. Run circulation-test:

pyenv install 2.7.17
pyenv virtualenv 2.7.17 circulation-test
pyenv activate circulation-test
pip install -U pip
pip install -r requirements.txt
cd ..
FLASK_APP = circulation-test/circulation_test/__init__.py
FLASK_ENV = development
FLASK_DEBUG = 0
export CM_ACS=http://localhost:6500/saml_callback
export CM_AUTHENTICATION_DOCUMENT_URL=http://localhost:6500/authentication_document
export CM_LOANS_URL=http://localhost:6500/SAML/loans
export CM_GROUPS_URL=http://localhost:6500/SAML/groups

flask run

NOTE: Replace http://localhost:6500 with the local CM's URL.

  1. Authenticate using different credentials from a list of built-in users.

  2. In circulation-test first borrow A Dictionary in Hindi and English and then download it:
    image

  3. Notice that CM redirected to the Springer website because it was set up as SAML WAYFless URL Template:
    image

Checklist

  • I have updated the documentation accordingly.
  • All new and existing tests passed.

Copy link
Contributor

@tdilauro tdilauro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vbessonov I found a potential source of the test failures. I'd especially like to confirm your intentions for the return from SAMLWAYFlessAcquisitionLinkProcessor.fulfil in some cases.

"""
return get_one(db, ExternalIntegration, id=self._external_integration_id)

def fulfil(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method name is misspelled here and should probably be "fulfill" (ending in two ls).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing it. I found quite a lot of similar typos in the code and tried to fix all of them.


fulfillment.content_link = acquisition_link

return fulfillment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless both acquisition_link_template and saml_credential end up set, no fulfillment will be returned by this method, so the caller will back get None. Was that the intention?

I suspect that this should be dedented two levels to the main level of this function, so that the fulfillment -- modified or not -- is always returned. The failing tests -- and all of the api tests pass with this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are the following cases:

  1. No acquisition_link_template. In this case CM must return the existing FulfillmentInfo object.
  2. acquisition_link_template exists but there is no saml_credential. In this case I think we need to raise an error: we expect the patron to be authenticated using SAML but there is no Credential object in the database. It probably means that there is some configuration error and either SAML Auth Provider or collection's acquisition_link_template are not configured properly. Returning the existing FulfillmentInfo object will not work because content provider expects a WAYFless URL.
  3. acquisition_link_template, saml_credential are present but there is no saml_subject.idp. In this case we need to raise an error because we cannot form the WAYFless URL without IdP's entityID. Returning the existing FulfillmentInfo object will again not work because the content provider expects a WAYFless URL.

@io7m
Copy link
Contributor

io7m commented Feb 28, 2022

I just pushed a commit that reorganizes at least some of the system into a mixin-based system. It's not expected to be usable in this form yet, and there are test failures. Thought I'd get the work so far pushed so that I can go back to working on LCP.

@io7m io7m force-pushed the feature/wayfless-urls branch 2 times, most recently from f937734 to 1577c87 Compare March 2, 2022 14:27
@io7m
Copy link
Contributor

io7m commented Mar 2, 2022

This appears to be ready to review.

@tdilauro
Copy link
Contributor

tdilauro commented Mar 2, 2022

@io7m Merging #180 added some conflicts. ☹️

vbessonov and others added 11 commits March 2, 2022 16:52
This starts the move away from an inheritance based configuration
model to an aggregation model where configurations are comprised
of mixins. This retrofits the recently added FormatPriorities
configuration in order to demonstrate how the model works with multiple
distinct configuration types.

Currently, there are a few test failures in the OPDS importer
tests. This is because the SAML code now doesn't use the correct API
to fetch configuration information.

Affects: https://www.notion.so/lyrasis/WAYFless-URL-Support-7c201b7af25d4fddb61e70cacc47660c
@io7m
Copy link
Contributor

io7m commented Mar 3, 2022

This is even more ready for review than before.

@RishiDiwanTT
Copy link
Contributor

I can see no issues with the code 👍

@tdilauro
Copy link
Contributor

I am going to merge this PR with a couple of caveats:

  • This PR contains changes to configuration settings support that are needed elsewhere, so there is pressure to merge this code. This part of the PR appears to work properly.
  • I am not 100% sure that the WAYFless URL portion of this PR works. I was unable to complete testing with the SAML testbed, which is not working properly (confirmed by @RishiDiwanTT) and needs an overhaul. However, whether it works or not, it does not appear to negatively impact integrations that do not need WAYFless support, so the need to get the configuration benefits of this PR outweigh concerns about whether this part works. We can fix this later, if needed.
  • There are errors in the testing instructions in this PR's "How Has This Been Tested?" section. We will need to update these instructions as we endeavor to complete testing outside the bounds of this PR.

@tdilauro tdilauro merged commit 376278c into main Mar 24, 2022
@tdilauro tdilauro deleted the feature/wayfless-urls branch March 24, 2022 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants