Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Additional feature] NameID formats and custom implementations #32

Closed
peppelinux opened this issue Mar 23, 2019 · 17 comments
Closed

[Additional feature] NameID formats and custom implementations #32

peppelinux opened this issue Mar 23, 2019 · 17 comments

Comments

@peppelinux
Copy link
Contributor

Hi, I would start from this good discussion I found here:
https://stackoverflow.com/questions/11693297/what-are-the-different-nameid-format-used-for

We should deal with a way to manage PersistentID based on a local (database) storage.
I have to check some other pysaml2 internals before doing some implementation but, before all, here we are for discussion

Here also a reference about how Shibboleth does this:
https://wiki.shibboleth.net/confluence/display/IDP30/NameIDGenerationConfiguration

@peppelinux peppelinux changed the title [Additional feature] NameID formats and custom implementation [Additional feature] NameID formats and custom implementations Mar 23, 2019
@askvortsov1
Copy link
Contributor

I don't quite understand what you want to add here. Yes, there are different NameID formats. They are supported by the ability to specify a nameid field on the AUTH_USER_MODEL for each SP. Please see 0e3f7f8 for implementation. The only modification I'd like to make is supporting the use of a method as NameID, which I intend to push to master shortly.

@peppelinux
Copy link
Contributor Author

peppelinux commented Mar 23, 2019

Excuse me if I a put external references, from what you wrote I know that you got it.
Often I use this thread to have some note to share, before starting development.
I agree about the method that return the NameID for each user, in addition I'd like to implement the following. I think that:

  • A PersistentID like "joe" is a quite common username. If a SP can do SSO to two different IDP (it's possibile through WAYF/Disco) this username in SP could collide with another that in another IDP could have the same value (but is not the same user!). Some SP needs persistentID to deal with internal authorization profiles and this is very dangerous if it should happens. Because of this in Shibboleth IDP (the real leading SAML server, believe me), we found the following PersitentID format:
    name for the source of the identifier!name for the intended audience of the identifier!opaque identifier for the principal that for example could be this: https://aai-logon.switch.ch/idp/shibboleth!https://attribute-viewer.aai.switch.ch/shibboleth!a6c2c4d4-08b9-4ca7-8ff9-43d83e6e1d35. This cannot collide between any SP and the purpose of this nameID is to have a string that joins IDP!SP!(opaque)username. This could be quite good for me and needs a method in the Base processor by default. I think that this could be overridable in each SP definitions if you agree, for example an SP could have its custom Processor and this come with a custom get_nameid_peristent(self, *kwargs) method.

  • TransientID could be an hashed renewable uid.

  • OpaqueID return unique but not human readable value, like an hashed username value

  • Unspecified could be everithing with all the conseguences that could derive from this

  • EmailID is very simple... peppe@email.ru

I think also that we can override the nameid_format per each SP ONLY if that nameid_format is in the IDP's nameid_format list. We can have many methods as many nameid formats, inside the Base processor, to get the username's value in such nameid formats. At this moment we have only get_user_id, this could be a leading method as interface to default nameid_method for that SP that specifies them.

So we can introduche these to our Basee Processors:

  • get_nameid_persistent() -> we need to decide the uncollidable format, shibboleth suggests us the previous one;
  • get_nameid_email()
  • get_nameid_opaque()
  • get_nameid_persistent_opaque() -> could takes output from previous one and combine them
  • get_nameid_transient() -> opaque, reusable and anonymous identifier
  • get_nameid_unspecified() -> this could be anything, so it could simply return get_user_id()!

I'm sure that we have a good roadmap starting from this!
http://docs.oasis-open.org/security/saml/v2.0/saml-core-2.0-os.pdf

@askvortsov1
Copy link
Contributor

I'm still a bit confused here. Correct me if I'm wrong, but doesn't one SAMLResponse contain only one NameID? If so, why does it matter which type of NameID it is? The website using djangosaml2idp will simply decide which NameID to return for which SP, and that's that. Am I misunderstanding how this works?

@peppelinux
Copy link
Contributor Author

peppelinux commented Mar 23, 2019

@askvortsov1 I understand you very well, because I'm a django-guy too and I know your feel!

But in SAML federation context we use OID to know the format and a deep knowledge about each attributes (OID, x.500, LDAP...They are only standards). Regarding the username, the nameID format just tell us how is valued it (it's just a format that explain higher level of usage with specific scope). I hope that the previous examples was quite simple to understand and implement. I think that for a typical django user like us, this could sound overkill but it is needed to bring djangosaml2idp in the federation context and make to this software its real value.

I feel that it's the faster and the more flexible ever, few hours of development is needed to put this bunch of standards inside, nothing else. I could do this in commit, I do not want to stress you with all this enterprise crap :)

@askvortsov1
Copy link
Contributor

@peppelinux I'm more than happy to add features to this project. I understand that there are a few different formats for NameIDs. However, each SP uses only one format, right? And with the current system, developers can choose a different method (allowing for a different NameID format) per SP, meaning that the current system fully supports what we're trying to do?

The reason I'm not sure about directly implementing this is that we would have to decide how to represent each of the formats. We cannot guarantee that a given User model will have an email field, a first name field, etc. By allowing users to specify a nameID field/method for each SP, we allow developers using djangosaml2idp to support any/all formats, without imposing a solution upon them.

@peppelinux
Copy link
Contributor Author

peppelinux commented Mar 24, 2019

Do not worry, I'll implement those base methods to handle the minim requirements and offer a fully featured interface for this.

I need it, so I'll code it in a non invasive way, do not worry.

@askvortsov1
Copy link
Contributor

Ok, sounds good.

@peppelinux
Copy link
Contributor Author

I found that uuid.uuid4().urn would be the better solution for a persistentID that cannot be reassigned to anyone in the future

@peppelinux
Copy link
Contributor Author

peppelinux commented Mar 26, 2019

@askvortsov1
Copy link
Contributor

Sounds good!

@peppelinux
Copy link
Contributor Author

peppelinux commented Mar 26, 2019

For example:
https://github.com/askvortsov1/djangosaml2idp/blob/master/djangosaml2idp/views.py#L148

Transient, Persistent or Email or other NameID should be encoded in different way, we should not return the username only as the actual implementation. Persistent should be TargetedID, because it will remain persistent on that target, the IDP. See NameIdentifier on the previous links. Tomorrow I'll get on it

@peppelinux
Copy link
Contributor Author

First implementation here:
peppelinux@82bff05

@mhindery
Copy link
Contributor

This is not something I'm using, but if you are, you'll know best what you need :) Do let us know when you're ready with it. Would it be possible to create a separate MR though for separate features? One huge MR with everything in it makes it quite hard to follow and it blocks some features to be merged when there's discussion about other unrelated features.

@peppelinux
Copy link
Contributor Author

I think that all of us are brilliant developers, we could share ideas and design concepts, to be later implemented in more personal views and approaches if you agree. I realize that we could share only a part of these so I'm quite sure that I'll do a personal project mentioning this project and all of you as authors. I'd also like that our projects could Be linked by ideas and knowledge sharing

@mhindery
Copy link
Contributor

Merged and released with version 0.6.0

@mauritsvanrees
Copy link

I ran into a problem with the persistent format today. The problem is solved, but I thought I would share it in case others run into this. I hope this comment is a reasonable place for that.

No action needed, no change requested, just a tale from the trenches. :-)

I had a Django 1.11 site with older djangosaml2idp 0.3.3. This was using name id format unspecified, which in this case meant an email address. Running fine for a couple of years.
I updated to Django 3 and djangosaml2idp 0.6.3. One of the two SPs that we need to support started giving trouble. With some more logging added, I saw the error:

   SP requested a name_id_format
   (urn:oasis:names:tc:SAML:2.0:nameid-format:persistent)
   that is not supported in the IDP
   (urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified)

My guess is that the SP had always requested this, but the previous versions did not check this, and did not throw an error.
I updated the IDP config to support both formats. All seemed well, I could login again.

But things started going wrong. I could login, but most others could not. My user name as seen in the SP was no longer my email address (or at least the part before the @-sign), but was "name for the source of the identifier!name for the intended audience of the identifier!opaque identifier for the principal" as returned by get_nameid_persistent. So it was wildly different than before. Someone else who could login as well, reported missing permissions on the SP site, obviously because the completely new id had no permissions yet.
Also, the SP replaced a few characters and stripped it to 32, so everyone who tried to login got the same user name, something like "https___my.example.com_idp_metada". Not helpful. :-)

This particular SP actually adds a unique customer number in front of all usernames that they get, so the part of get_nameid_persistent that tries to make it unique per SP is not in fact needed here. Two different IDPs returning 'Joe' would get internal user id 'customer1-joe' and 'customer2-joe'. But that is specific for this SP.

I fixed it with a patch:

from djangosaml2idp.processors import NameIdBuilder
NameIdBuilder.get_nameid_persistent = NameIdBuilder.get_nameid_transient

That is not something to change in djangosaml2idp itself, but for me it helped.
I hope someone else is helped with this workaround.

@askvortsov1
Copy link
Contributor

Another possible way of going about this is that you can use a custom processor for some SPs. You could also make the switch there

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants