-
Notifications
You must be signed in to change notification settings - Fork 492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shibboleth: multiple email addresses are stored as a single invalid address separated by a semicolon #2512
Comments
This is the part that concerns me the most. The application shouldn't allow an invalid email address to be persisted. The database entity I can confirm that I'm seeing the same issue @bencomp reported. As I was saying at http://irclog.iq.harvard.edu/dataverse/2015-09-14 the quick fix in #1608 for names was to sort the multiple givenNames and just pick the first one but I agree that letting the user make a selection would be nicer. Here's a screenshot of the error: |
@michbarsinai since you know AuthenticatedUser best can you please take a look at 72ca330 at let me know if there is any problem with making AuthenticatedUser consistent with BuiltinUser in terms of using the same If you uncomment the |
I played around with some ideas in a branch at 050249a that I'd like @scolapasta (and ideally @michbarsinai and @sekmiller) to review. Details are in the commit message and javadoc. @eaquigley and @mcrosas we should probably think about how to handle multiple email addresses and multiple names coming in from Shibboleth. I added a related comment in the 2015-04-30 Dataverse User Accounts and Auth Meeting doc. Above @bencomp has suggested building a GUI/workflow for allowing users to select their preferred email address and name when multiple are asserted by Shibboleth. The commit above is more of a minimum viable fix. Consistent with what we do for multiple first names being asserted, in the proposed commit, I sort them and pick the first one. It sounds like @bencomp isn't especially concerned about seeing multiple email addresses in production but https://www.incommon.org/federation/attributesummary.html does indicate that we should prepare for the "mail" attribute to be multivalued. |
Whether The Shib "driver" should basically take any data coming from a Shib server and translate it into an
|
@michbarsinai another way to think about this: what if I remove the Also, while as of this writing (version 4.2) Dataverse will copy the email from Shibboleth directly into the AuthenticatedUser table on each login, overwriting previous values (as specified for 4.0), we want to change how this works in a future version. @mcrosas and I talked about how we'd like Shibboleth to provide the initial population of the email address, we'd like Shibboleth users change their email address (as well as other fields such as username) at will within the Dataverse app. To support all this, we'll need some more database tables, obviously. And while we're talking about email addresses, in #2170 there's an open question: "Perhaps we wouldn't need to confirm email addresses from institutions." @bencomp or anyone running Shibboleth, if you have an opinion on that, please comment on #2170. |
Note that while As the email field in |
Yes, as part of 050249a (ideas, not merged) I refactored EMailValidator so I can call it from the Shib code before it gets persisted to the AuthenticatedUser table. (It also allows to write unit tests of just the email validation logic.) In fact, if we don't get a valid email address from Shibboleth, we should tell the user "The SAML assertion contained an invalid email address" like I put in the commit (but we can make the language friendlier) and perhaps prompt them to enter a valid email address (hmm, which should be validated, since it's user-entered, per #2170). I'm a fan of belt and suspenders. Or a backstop, if you like. Yes, let's make the Shib code validate the email address but let's also put the |
I'm fine with adding an annotation there, as long as it's the second line of defence :-) |
@bencomp I want to make it clear to you and others interested in this issue that for "phase 1" effort underway in #2939 I am not planning on building any sort of GUI nor add an additional database table to handle any of the workflows you've suggested. Sorry. We're trying to keep this effort small and focused. My plan is to prevent multiple email addresses from being stored as a single address by enforcing the same Bean Validation rules we use elsewhere. Also, as you've noted, I'll probably use the same technique as #1608 where I simply sort the (unexpected) multiple values and pick the first one. Why Identity Providers think they're adding value by punting the choice to the Service Provider is beyond me. I'd much rather the Identity Provider hand me a single sane value for givenName or email or whatever. @mheppler @eaquigley if you want to start working on mockups of what some future GUI would look like to let users choose between "Eleni" and "Elleni" when |
@bencomp bah! You're right! Sorry! Doing too many things at once. Probably best to capture the idea in a new GitHub issue then. Please feel free to create one if you're so inclined! |
"the idea" being (in user story form) "a user with multiple email addresses can choose one", right? I'm not really inclined to open this right now :) |
@bencomp yes. No worries. There's a UI component to this so @eaquigley or @mheppler would be in a better position to define the requirement and open the new issue. To be clear, this is outside the scope of phase 1. It would go under "Issues that are definitely out of scope" at #2939. |
I'm working on this issue now and I just wanted to add a side note that the email address رونیکا.محمدخان@example.com is not considered valid but it comes from https://randomuser.me which I use for testing Shibboleth (documented as "RANDOM" in the Dev Guide). In practice, we don't see right-to-left email addresses like this. |
Just for the record, I've never seen a Hebrew email address either.
|
- Put email addresses throught the same "find single value" logic originally developed in #1608 for multiple first and last names. - Add `@ValidateEmail` to the "email" field on AuthenticatedUser to match BuiltinUser. - Add null check added to EmailValidator to make it testable. - Add `INVALID_EMAIL` and `MISSING_REQUIRED_ATTR` modes for Shib testing in dev. - Remove red warning when TestShib doesn't provide "mail" attribute. - Catch authSvc.createAuthenticatedUser exceptions and handle errors better. - Reformat code (getPrettyFacesHomePageString seems ok).
In ffd1d6e I implemented a fix:
To test this, "Multiple email addresses were asserted by the Identity Provider (eric1@mailinator.com;eric2@mailinator.com ). These were sorted and the first was chosen: eric1@mailinator.com" This is very much a dev mode thing (documented at http://guides.dataverse.org/en/4.2.4/developers/dev-environment.html#shibboleth ) so servers should be set to normal with I believe this issue is ready for QA. I showed it to @eaquigley (we decided to not show a message that the system is picking an email address for you from multiple values). |
I'm not sure if we should be worried about this or not. |
I mentioned this email validation issue at yesterday's standup and we agreed to create a separate issue for it, which I just did: #2998 |
I'm passing this issue to QA but it's going to be tricky to test. In dev I use "localhost" per http://guides.dataverse.org/en/4.2.4/developers/dev-environment.html#shibboleth but on a real server running Shibboleth you get shibsp::ConfigurationException if you put Dataverse in dev mode using the Since @bencomp originally reported this issue I reached out to him at http://irclog.iq.harvard.edu/dataverse/2016-03-18#i_32926 to see if he can grab the war file at https://build.hmdc.harvard.edu:8443/job/shibtest.dataverse.org-build-2939-shib/1/edu.harvard.iq$dataverse/artifact/edu.harvard.iq/dataverse/4.3/dataverse-4.3.war and try it out (you need to upgrade to 4.3 first: https://github.com/IQSS/dataverse/releases/tag/v4.3 ). Of course anyone who is reading this is welcome to test out that war file as well! It would be nice if we controlled a Shibboleth Identity Provider (IdP) so we could more easily do the QA ourselves but I don't know how to set one up. :( The bottom line for this issue is that we are using the same logic (and code path) as the fix for #1608 where we sort the multiple values received and pick the first one. The fix is in pull request #3025. |
@kcondon as I mentioned one way to test this is to put the server in "dev mode" by commenting out the the Apache Shib stuff like this:
Then, to test this particular bug: curl http://localhost:8080/api/admin/settings/:DebugShibAccountType -X PUT -d TWO_EMAILS When you're done: curl http://localhost:8080/api/admin/settings/:DebugShibAccountType -X DELETE And put Apache back, of course. |
Tested by stubbing out a IdP response and it correctly chose the first email of a sorted list and logged that fact. Closing. |
When I log in via Shibboleth and my Identity Provider (IdP) provides multiple email addresses separated by semicolon, the concatenation is treated as the email address. The login process works fine, but when I create a dataverse and click "Save", the pre-filled email address doesn't pass validation.
In the database I see
Jordan.Belfort@harvard-example.edu;jordan@harvard-example.edu
in the email field.I expect that one of the email addresses is chosen and stored. This choice could be the result of asking the user during account creation (after the first login), or perhaps on every login the first email is compared to the known address and updated when it changed (depending on the semantics of the ordering). This appears to be similar to #1608.
The text was updated successfully, but these errors were encountered: