Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle Id members with multiple associated TJ users and email address changes better #71

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

michael-gratton
Copy link
Contributor

@michael-gratton michael-gratton commented Nov 21, 2024

  • Handle Id members with multiple associated TJ users better
  • Support changing email addresses

Rebases #55 off of current main, breaks it up into individual commits, updates some of the work, adds unit tests.

@michael-gratton michael-gratton changed the title Handle Id members with multiple associated TJ users better Handle Id members with multiple associated TJ users and other edge cases better Nov 21, 2024
@michael-gratton michael-gratton force-pushed the email-edge-case-handling-v2 branch from eca4e91 to 3bcdc49 Compare November 21, 2024 06:03
@michael-gratton michael-gratton force-pushed the email-edge-case-handling-v2 branch 2 times, most recently from c41cc42 to 1bb3762 Compare November 22, 2024 01:20
@michael-gratton michael-gratton force-pushed the email-edge-case-handling-v2 branch from 1bb3762 to 2510f2b Compare November 22, 2024 01:27
@michael-gratton
Copy link
Contributor Author

@ivanb777 @ella-song-1013 Would really appreciate some feedback on these changes. It's worth looking at the commits individually - they may even want to be broken up in to two different PRs, but I think I want to deploy them both at the same time for the moment.

In particular, for the first commit, it would be good to get some feedback on the validity and scope of the uses cases listed in IdentityTijuana::MemberSync::find_primary_user_for_member in the first commit. Do they make sense? Does the code reflect the use cases? Are there other important ones that I haven't listed?

@michael-gratton michael-gratton changed the title Handle Id members with multiple associated TJ users and other edge cases better Handle Id members with multiple associated TJ users and email address changes better Nov 22, 2024
@michael-gratton
Copy link
Contributor Author

On NB, there's some follow-up work to be done here as well - when a member does have multiple TJ members linked, the non-primary users should all be unsub'ed, and maybe some of the other XXX comments in this MR.

Copy link
Contributor

@ivanb777 ivanb777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my main observation here is stated thru inline comments. In short, I feel that the approach of having a one to one sync both ways when an Id member has multiple links to TJ users is not valid.

I agree with establishing a primary, but pushing it even further we should implement it as an attribute on the ExternalMemberId.

In summary the sync process of the Id member with multiple links, should be able to accept and determine which attribute changes should be imported from all external links, and also how it should export it's state to all external links.

raise "Id member has no linked TJ users"
end

# Some scenarios to think about before changing this:
Copy link
Contributor

@ivanb777 ivanb777 Nov 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(PS this was written prior to the comment below, so some of this is questioned in the above. Still the wording of scenarios could be improved, there are some repetitions.)

I understand the approach of finding the primary TJ user, but I find these comments a bit incomplete. They state email discrepancies scenarios, but it's not really clear from each why a particular matching should be employed. It would be good to add some sort of statement how each of these are relevant for the business logic of finding the primary user.

For example,

      # (3) A member has two users associated with it, one with the
      # correct email and one with a typo. The typo'ed user gets
      # updated because they sign another petition with the same typo
      # and a new postcode.

It's unclear, which user was created first, and whether the typo'ed user got cleaned/recreated by Id as part of sync process.
Which approach is valid here and why? Isn't the change of postcode valid change to be synced, regardless of which linked user was modified?

Copy link
Contributor Author

@michael-gratton michael-gratton Nov 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's why the comment is "some scenarios to think about" rather than "here is a complete list of scenarios".

We need to get the former representative of things we care about, but also the latter is either in the worst case infinitely long (but I don't think so) or in the best case too big and too complex to fit into a comment in the source code.

So instead of trying to do the latter, these are some provocations to get people to think a bit about the problem space before changing the code.

It's unclear, which user was created first, and whether the typo'ed user got cleaned/recreated by Id as part of sync process.

So what's your understanding of the n different permutations of that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question wasn't about permutations, yes there are a number of them, the root cause being that TJ is allowing these records to be created shifting the responsibility to Id to bring them in sync.

I guess, I carried into this question the thought that by determining primary link in this call, we are eliminating other users from being syncd, thus the question I had was

Isn't the change of postcode valid change to be synced, regardless of which linked user was modified?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the change of postcode valid change to be synced, regardless of which linked user was modified

The answer is of course yes, but doing so correctly would require a three-way merge with both the Id member and the primary TJ user, and see my comments elsewhere about wanting to get something better in that what we currently have sooner rather than later.

# 1. The user with a normalised (but not cleaned), matching email, if any
# 2. The first subscribed user, if any
# 3. The first remaining user
def self.find_primary_user_for_member(member)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there a number of issues here.
First, this fix points out that we need to have a concept of primary (link?) for Id's external members, which is established in the very early stages of the process, rather than being continually best-guessed.

Furthermore, the syncing process logic cannot be the same both ways.
Id's member record is the source of truth, but as the changes to multiple TJ users that point to it are possible and valid, they should be synced to the Id's member record regardless of not being a primary link. Eg. a linked TJ user signs the petition with typo'ed email providing an updated postcode, the new postcode is an important change and should flow into the Id's record, rather than just linking the typo'ed user.

On the other hand, and this regards also the concept of primary, I don't fully understand why we are not propagating the Id's record state (being a source of truth) to all linked TJ users (email attr being the exception).

Finally, the primary link, in the current state, should actually be the record with the cleaned email address. There reasons are:

  • this is the only record that a member can personally can change via web portal
  • this is the record that is used when cutting lists / sending mailouts
  • this link should explicitly be marked as primary (connecting to the initial thought)
    • the reason for this is that if the email address is changed on Id's member record, then the email address on the primary should also be updated

Just for reference here – currently we have 5695 with 2, 92 with 3, and 2 with 4 TJ users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand why we are not propagating the Id's record state (being a source of truth) to all linked TJ users (email attr being the exception).

Because I want to deploy this before I go on leave.

@michael-gratton
Copy link
Contributor Author

@ivanb777 I guess I was looking for specific feedback about the concrete changes in this specific PR. There is clearly more work to be done, as has been noted multiple places here and elsewhere.

Note the name of this PR is "Handle ... changes better", not "Solve all ... problems". I think we need to move to a different approach to do that. ;)

@ivanb777
Copy link
Contributor

Note the name of this PR is "Handle ... changes better", not "Solve all ... problems".

Fair point. My only concern that remains is that with this change we will be limiting updating Id member only with changes from the primary TJ user, and omitting valid and potentially important changes from other linked users.
However, looking at the timestamps of updates for members with multiple users, this likely can part of solving all problems :)

Copy link
Contributor

@ivanb777 ivanb777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Mike, just a few minor inline suggestions. Really great to extend the test coverage for the MemberSync calls 🎸🎸🎸.
This LGTM for merging – did you want for this one to go ahead of #73?
If so, I'll rebase it from this one + move the tests along with the ones you created here.

@michael-gratton michael-gratton force-pushed the email-edge-case-handling-v2 branch 3 times, most recently from f17e592 to 237a582 Compare November 29, 2024 01:19
@michael-gratton michael-gratton force-pushed the email-edge-case-handling-v2 branch 2 times, most recently from 175d10b to df8230c Compare January 23, 2025 04:55
If an Id member has (or will have as a result of the call) more than
one TJ user associated with it (via MemberExternalId) then the right
one has to be chosen and used, and the others ignored.
Allow email addresses to be changed, but check for and raise an
obvious exception if there will be a conflict when doing so.
@michael-gratton michael-gratton force-pushed the email-edge-case-handling-v2 branch from df8230c to ee8dfdc Compare January 23, 2025 04:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants