-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for subscriber lists that match on reverse links #1841
Conversation
Thanks Jess, I'm just trying to have a skim and found there's some nice refactorings (i.e. https://github.com/alphagov/email-alert-api/pull/1841/files#diff-f32413b4c1acadce01b4a94c0cea1fc55ed9fb340d93ff2c75be9945c88bdf92R23) but did find myself a bit concerned/confused about special behaviour for document_collection document type and had some questions So it sounds like, we're not going to have a normal subscription to a document collection based on content_id, there will instead be a special one? That feels a bit of a smell because it doesn't feel like Email Alert API should know about document collections, since it doesn't have precedent for document type concerns I'm wondering: a) do we need this? we'll lack this functionality on other curation pages so it'll be a bit of a snowflake that behaves surprising compared to other pages and makes the subscriber matching harder to understand. What would happen if we just treated it as a normal page? With the latter point, I wonder if we can distill down what the connection is. We have a situation where we want to tell you a page x is updated because you're subscribed to page y updates? What's a generic way to explain that relation so it could be used in more scenarios than document collection. If we could distill that we could probably then have Email Alert API be generic and either Email Alert Service or Publishing API handle the part of transitioning a document_collection link into that generic relation. |
Thanks Kev!
The subscription (ie the matching) will be special, but the subscriber list will be normal. Is that what you meant? Ie the subscriber list will be based on the content id of the document collection page, and any major changes to that document collection will match those subscriber lists. But when a
I think if we don't it might be a blocker for rolling out email alerts on document collections.
I've been thinking about this all week as we now have an additional request: "Can we match document collection subscriber lists to content changes that are tagged to a specific taxonomy topic?" The thing I was finding tricky is how to display (on the various confirmation pages/emails) the name of the page the user subscribed to, whilst under the hood they're subscribed to a different list. Adding the title to the params when we find or create a list would lead to duplicate lists in email alert api - ie same links/tags/content-id, just a different title 🤔 I thought it was maybe less bad to have custom matching code in just this one application, rather than make changes in multiple repos to support a new way of linking lists. But if we're happy to consider changes further afield I'll think on, and ping you for further brain power if that's OK! cc @chao-xian |
@kevindew though it's a good point to consider if we could make this generically reusable, and indeed email-alert-api shouldn't know about collections, it's for a relatively small number of them for just one department. There's no proven need that any department wants this, nor do we know if we need to expand it to other doc types. Would it be ok to go with this as an intermediate step, to move us towards getting rid of Specialist Topics? Because we will need to revisit Taxonomy as well and that may well be where we uncover a different way to do this. And by then we could also look into making this more generic. As Jess has said, she's spent a lot of time on this and energy on this already. We've also already spoken to @leenagupte a bit. We were nudged by Richard to get your thoughts on whether there's value in taking a Publishing things sending the data over approach, and it does sound like it's the cleaner option. But, it's a lot of work for very little. |
Just to clarify a point from Kelv as there are lots of "can we do x?" questions pinging around about emails at the mo
In addition to the above, we have a new request which is how we can support automatic taxonomy topic subscriptions for a small subset of document collection pages. If supporting this new request would be made easier by making this specific piece of work more generic then that feels worth doing. But I share Kelv's desire to make some progress by at least getting the first part shipped. I'm going to continue working/thinking on this today |
Sure, yeah I meant special behaviour for this type of link. So yes sounds like it's creating a special case.
Are we sure this new functionality is needed? I'm imagining we're only doing it to retain feature parity with a different feature, topics and by doing this we're turning document collections into a new special thing. I think this isn't an ideal route as I don't think your team want to be adding new features to Email Alert API, it's a distraction from your actual objective of retiring the specialist topics, which means it's likely that new extra features won't be ideal. So I imagine it'd be much cleaner to just not extra features and accept this as a limitation of using a document collection.
I think both your points can be addressed to together. I think that if you pursue this change as a necessity for retiring specialist topics then you're at a crossroads here of deciding whether you're going down the route of applying an extra special case or adding a new feature to Email Alert API. It does sound with the way you're talking about one department, maybe short term that you want the special case and not a feature, which can be ok - it helps achieve a short term goal naturally but does drift this project away from it's architectural principles and introduces a precedent that I think we'd want to stay away from - but probably needs to call a spade a spade and flag this as a hack in the codebase with appropriate comments that suggest it's a short term approach that shouldn't be repeated with explanations. If the team that owns this app is ok with that, then that's ok 👍 Going down the not so short term path, I imagine there's a potential new feature for Email Alert API that could exist that can notify based on dependent documents, which probably can be facilitated by Email Alert Service or Publishing API, which could support this use case. But there does seem to be work in understanding what that feature is and the domain naming of it. Personally my view is it would be best to avoid the extra email sending and defer creating the new feature until there is sufficient feedback that justify the wider digging. |
I did have a thought this morning on how the desired feature might be able to work without needing to make changes to Email Alert API. Take it with a pinch of salt because I could be wrong. What if the SubscriberList was associated with content_id and a link of document collections So looked something like:
Based on a document collection of: https://www.integration.publishing.service.gov.uk/government/collections/govtech-catalyst-information Which I think would mean this list would match based on the content_id, or if any documents have a tag of document_collections with that content_id |
@kevindew Thanks Kevin! This idea might work for both document collection emails, plus this new request to support a mapped taxonomy topic subscription. Would your preference be to only add document collections and taxons to the links payload of document_collections? Or are you thinking we should add links to all content id subscriber lists? |
There'd be two sides of this, the frontend app that creates the list would need to create a subscriber list with a content_id and a document_collections link Then on the other side we need the ContentChange models in Email Alert API to have a tag of document_collections. This should already exist. I gave it a quick try:
Note: |
@kevindew Did you test that on integration? I think the doc collections id array is there because my branch of email alert service is still on integration: alphagov/email-alert-service#577. So that's the part of the puzzle which means content changes tagged to a document collection can be matched. But I thought what you were saying is we create a subscriber list for the document collection based on its content id, but in addition, that subscriber list would also have links. Allowing us to match on both content id and links. Which is not something we currently support - content_id lists don't have links or tags. So we would need to update email alert service to add links to content id based lists. Or am I misunderstanding? |
I did test on integration. Interesting.
Ah yes I see, I didn't realise it didn't utilise expanded links, so it doesn't deal with this because of it being a reverse link, the change makes sense.
Does Email Alert API reject a request that sets those? In console it looks ok:
|
No it doesn't, I agree. But when we send the params to Email Alert API to create a content id based list, we don't send any links. We only send the content id. So we will need to add links into the params for that find or create to make sure that the subscriber list can be matched against content changes by content id AND by links. But what I'm wondering is are you thinking that we only add links for document collection subscriptions, and then, only the document_collection link itself? Or are you suggesting we add links into the subscriber list for all document types. |
Gotcha, yes I think a change in a place like this is correct. We're creating a different behaving subscriber list so it would make sense to use different code to do that. It also alleviates a potential suprise that creating a subscriber list with a certain type of content_id performs differently.
👍
No just for document_collections, as I understand these are the only ones you want to behave differently. So for clarity, any other page:
a document_collection:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks okay to me, at least.
Thanks for reworking this. Leena pointed me this way to ask me to review. I am struggling a bit to understand it so have a few questions, I wonder if it may need some test refactoring to make it all a bit clearer. I think at the core of it there is a new complexity, normally we consider fields on the subscriber list as additives to tag/link So normally if we have document_type combined with links or tags we expect it to filter by document type and the links or tags Whereas for content_id you’re wanting an or condition, either it matches content_id or it matches the links/tags? I guess this opens a question for what would happen if document type and content_id was set? Though the past code looks like it handles this as an or. If I’m understanding right (and I’m not sure I am) I think we probably have one less option than you’ve got here:
I’m not sure why we’d need the fifth one, but I guess your code has needed it as you set |
Thanks Kevin. Apologies for late reply.
Yep that's correct. The only alternative I could think of was letting email alert api know about document_collections, ie the original implementation that you advised to move away from. If we want to support reverse linked lists, these subscriber lists have a content id and links. So we can match on content id of the parent document, OR the links. If we match on both then we will never match on the reverse linked documents (via links), as those content changes will have a different content id to the subscriber list.
Yeah that's a valid point. If a subscriber list was created with both content id and document type, then it would match on content changes that shared the content id. It would ignore the document type of the content change. From a functionality pov, that seems correct to me. I'm subscribed to a content id. I want changes when that piece of content is updated. I could add a test to make that more transparent, or do you think that's fundamentally problematic?
The fifth case could be renamed I agree that the specs are horrible, the use of shared examples make for very unreadable and brittle tests. |
Sure, I think the best thing to do is just try make the difference in behaviour explicit in the code.
Sounds ok, just something to try make explicit and documented through tests.
I'm sorry I can't understand why we need |
27511a3
to
d3bb0b6
Compare
Thanks Kevin, I appreciate the reviews! I've reworked the specs and reorganised the code changes 😅 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay @hannako I missed your earlier message.
Commits look great, it was easy to follow through a review and nicely documented. Great job ⭐
Email Alert API will soon support subscriber lists that have both a content id and links. See: alphagov/email-alert-api#1841 alphagov/email-alert-frontend#1438
Email Alert API will soon support subscriber lists that have both a content id and links. See: alphagov/email-alert-api#1841 alphagov/email-alert-frontend#1438
Email Alert API will soon support subscriber lists that have both a content id and links. See: alphagov/email-alert-api#1841 alphagov/email-alert-frontend#1438
Email Alert API will soon support subscriber lists that have both a content id and links. See: alphagov/email-alert-api#1841 alphagov/email-alert-frontend#1438
The subscriber list query function is respsonsible for matching content changes to subscriber lists. Later in this PR we will be adding new rules for what constitutes a match. Refactoring the specs first to make it easier to understand the existing behaviour.
When we rolled out email subscriptions based on content-id we followed the existing matching rules. eg If both links and document_types are present, both must match. If both tags and document_types are present, both must match. If both content id and document_types are present, both must match. This commit adds a test to highlight this behaviour. Currently this behaviour is trivial because we do not store document types when we create content-id based subscriber lists.
Old behaviour: If content id matches AND document type matches, return the list. New behaviour: If content id matches OR document type matches, return the list. This change will not have any repurcussions for existing subscribers or new subscribers of single pages of content. Because there are no subscriber lists with both content id and document type set [1], and we do not support the creation of lists with both content id and document type [2]. But we will soon be supporting the creation of subscriber lists with both content id and links [3]. We want those lists to follow this new pattern ie: If content id matches OR links match, return the list. To achieve this, we need to remove content ID from the base scope, otherwise the function that matches on links (#lists_matched_on_links) requires a content-id match. [1] On production: SubscriberList.where.not(content_id: nil).pluck(:document_type).uniq => [""] SubscriberList.where.not(content_id: nil).pluck(:email_document_supertype).uniq => [""] SubscriberList.where.not(content_id: nil).pluck(:government_document_supertype).uniq => [""] [2] https://github.com/alphagov/email-alert-frontend/blob/main/app/services/subscriber_list_params/generate_single_page_list_params_service.rb#L10-L16 [3] alphagov/email-alert-frontend#1438
This function matches subscriber list params from email alert frontend to existing subscriber lists. We have removed content_id from the FindWithoutLinksAndTagsAndContentId in previous commit, which is also used here. So these changes are required to retain matching when list params contain content id without links.
When a subscriber list has both links AND content ID, only require a match on one OR the other.
This PR adds support for reverse linked subscriber lists, ie a subscriber list with both a content id, and a reverse link to itself.
Why
Soon we need to support single page notifications for document collections. When a user signs up to alerts on a document collection, research has shown an expectation of being notified when changes are made to documents within that collection. This relationship is stored as a reverse link.
Related work:
See 4c47061 for more information.
For ref this is the PR that originally implemented single content notifications
Trello https://trello.com/c/tJvdhfdd/1455-support-topic-like-subscriptions-for-document-collections-email-alert-api-m