Ability to merge hierarchies and data under duplicate contacts #373
Comments
related: medic/cht-core#6751 |
The branch |
Just noting that when two duplicate places are merged, this script should also somehow manage any potential duplicate users which are linked to the places. |
I've started work to create a new
Changes to User Docs
The Options I see:
Regardless of what we choose, any action requires a write to the Options I see:
My proposal is 2 & 1 from above: Any time a document has
Primary Contacts
When two places are merged, what should happen with the duplicate primary contact? Imagine "Pete's Area" with primary contact Pete being merged with "Pete II's Area" with primary contact Pete II. After the merge, there will be two contacts
I'm thinking to make deleting the primary contact an opt-in feature via command-line flag
Feedback, alternatives, thoughts? |
A 4th option would be for the command to fail if there are users with this facility id. This could be combined with a user prompt to decide what to do about it. I think we can rule out option 1 - silently breaking users is not cool. From a user perspective, option 3 would be what I would expect to happen when I request to merge one place with another. However that will immediately expose more PHI to users so there's a risk there. Why do you prefer option 2? Is the user likely to be obsolete in this case so that should be the default option, requiring manual intervention to recover them? From an implementation perspective I'm leaning more towards doing more in API because that'll make the logic available and consistent across uses, for example, if someone wants to write a custom script, or if a tool such as the User Management Tool wants to incorporate this feature. |
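The 4th option above (fail when users are still linked to the place being removed) could be sketched as a pre-check that runs before any write. This is only an illustration under assumptions: user docs are treated as plain dicts holding a single `facility_id` string, and `MergeBlockedError`, `find_linked_users`, and `assert_no_linked_users` are hypothetical names, not part of cht-conf or the cht-core API.

```python
class MergeBlockedError(Exception):
    """Raised when a merge would leave users pointing at a removed place
    (hypothetical error type, for illustration only)."""


def find_linked_users(user_docs, place_id):
    # Assumption: each user doc stores its place as a single "facility_id" string.
    return [u for u in user_docs if u.get("facility_id") == place_id]


def assert_no_linked_users(user_docs, removed_place_id):
    """Abort the merge if any user is still linked to the place being removed."""
    linked = find_linked_users(user_docs, removed_place_id)
    if linked:
        names = ", ".join(sorted(u["name"] for u in linked))
        raise MergeBlockedError(
            f"Place {removed_place_id} still has users: {names}"
        )


# Made-up sample data based on the "Pete's Area" example in this thread.
users = [
    {"name": "pete", "facility_id": "petes-area"},
    {"name": "pete2", "facility_id": "pete-ii-area"},
]
```

A caller could catch the error and prompt the admin for a decision, which is the combination suggested above.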
I'm basically writing this feature for the User Management Tool with the idea that a new UI would allow admins to merge duplicate places. There is currently an eCHIS-wide effort to clean up dead accounts and unused parts of the hierarchy. Nairobi county alone had 350+ unused accounts, many of which are duplicates. So this would get used. From the perspective of UMT, I believe the ideal is to have an interface very similar to move-contacts so the same technology stack used for moving contacts can be re-used to handle duplicates -- just different data being passed to cht-conf.
I prefer option 2 because I think you should only be able to merge contacts of the same type. If there is a user at the place being merged, then there is likely a user for all places of that type (e.g. CHP area). Similarly, if there is no user at the place being merged, then there is likely no user for any place of that type (e.g. household). This may not be universally the case, but I think it's quite a good starting point, especially for the projects using UMT, which creates exactly one user for every place created. Prompting the user to decide doesn't align well with a solution for UMT.
I think option 3 can result in multiple user accounts at the same facility. I think there are a few positive outcomes from this (mostly that the user never loses their session), but definitely some negative outcomes which I think are worse (an increased possibility that two different users now create data under the same place, which becomes a nightmare to untangle). Option 3 also seems inelegant to implement while keeping the
Perhaps you can elaborate on how putting more stuff into API would be beneficial to users or the User Management Tool? What specifically do you have in mind? If you want to do more via API for this specific user-management issue, I'd direct you to medic/cht-core#9139. |
In general I prefer complex logic to be in API rather than cht-conf for a few reasons. Firstly, and mainly, because API is versioned with cht-core, future changes to user structures made in cht-core don't break compatibility with cht-conf, user management, or other cases. It's also more reliable because it's hosted very close to CouchDB, so it is less likely to be impacted by network issues (UMT is close too, but cht-conf is often run far from CouchDB). Finally, it makes the feature available for more use cases for the benefit of the wider community. To answer your question directly, the users of the UMT would benefit because the action would be stable across cht-core versions and more reliable due to fewer HTTP requests. The obvious downside is having to upgrade cht-core to get the v1 API before you can call it, which is significant, but because it's a one-off cost I expect it to pay off in the long run to have the code versioned in API with a guarantee of backwards compatibility.
I think it could if the user were prompted before submission, and that selection could be submitted with the API request, eg: "what do you want to do with user X?", right?
I'm putting myself in the position of the user, and trying to decide what behaviour they would expect to happen if you explained the situation to them. I think a significant number would expect 2 to happen, but a significant number would expect 3 to happen. Furthermore both options have a serious impact on the users affected and are difficult to resolve remotely. Therefore I think the user needs to be involved in the decision. |
Cool. An implementation in API feels much more costly and I don't personally feel I have time to take that on. (I imagine since these can take hours to run, and days to run safely, it probably needs to run in some sort of worker thread via a queue... and the queue probably should be queryable, resumable, cancelable, etc?). You'd probably want to blaze a trail for the infrastructure by having
Do you have a plan (and resources?) to build this in cht-core such that I should hold off on this cht-conf ticket? An achievable middle ground might be to expose the data needed by the move-contacts script via an API to make these scripts more robust (reports created by a contact, descendants of a place, ancestors of a place, etc). These endpoints would move toward an ecosystem and community empowered by an API.
I can't find anything in the cht-core repo tracking that work or the bigger work of moving the whole action into core. If you do have plans, can you share them, since that would mean we are duplicating effort with the current UMT investments and roadmap.
If moving this into API is not a reason to hold off, is it reasonable to move forward with a
For a
Sound reasonable?
PS I'm planning to follow up with one more cht-conf action for UMT, which is to delete an entire tree of the hierarchy (all contacts, users and reports below a place). |
There are no resources planned to implement this from the Product team, I was just giving my 2c as to the implementation that I think would pay off in the long term. I'll leave it up to you where/how you want to implement this. |
This adds a new action, cht merge-contacts. The change proposes to:
Move the code for move-contacts into a library lib/hierarchy-operations with interfaces move and merge.
Parameterize the move-contacts code, changing the logic for input validation, deleting the top-level contact, changing how lineages are updated, and adding report reassignment. All else remains unaltered.
Handles reports and contacts only. Unclear what other doc types I should be worried about.
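The lineage update mentioned above might look roughly like the following. This is a minimal sketch, assuming each contact doc carries a nested `parent` chain of `{_id, parent}` objects and that both places share the same ancestors; `replace_in_lineage` is a hypothetical helper and not the actual code in lib/hierarchy-operations.

```python
import copy


def replace_in_lineage(doc, removed_id, kept_id):
    """Return a copy of doc whose parent chain points at kept_id
    wherever it previously pointed at removed_id."""
    updated = copy.deepcopy(doc)
    node = updated.get("parent")
    while node:
        if node.get("_id") == removed_id:
            node["_id"] = kept_id
        node = node.get("parent")
    return updated


# Made-up descendant of the place being removed.
household = {
    "_id": "household-1",
    "type": "clinic",
    "parent": {"_id": "pete-ii-area", "parent": {"_id": "district-1"}},
}
moved = replace_in_lineage(household, "pete-ii-area", "petes-area")
```

If the two places sit under different parents, the ancestors above the replaced node would need rewriting as well; that case is left out of this sketch.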
Released in 4.2.0 |
Is your feature request related to a problem? Please describe.
In https://github.com/medic/config-moh-mali/issues/181 the moh-mali supervisor app has 215 duplicate places, each with reports and data under them. It is fairly easy to identify the duplicates, but merging them is not straightforward. It would be great to have a tool that could handle this scenario, merging all information under one contact and eliminating the duplicates.
Describe the solution you'd like
Potentially a new "merge-contacts" action or similar? Maybe a --merge-duplicates mode for move-contacts?
When merging two contacts, the main contact should get assigned all reports which are currently assigned to the duplicate contact.
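The requested reassignment of reports from the duplicate contact to the main contact could be sketched as below. Everything here is an assumption for illustration: reports are plain dicts, the subject of a report is taken to live in one of the `SUBJECT_FIELDS` under `fields`, and `reassign_reports` is a hypothetical name rather than anything in cht-conf.

```python
# Assumption: these are the fields in which a report names its subject.
SUBJECT_FIELDS = ("patient_id", "patient_uuid", "place_id", "place_uuid")


def reassign_reports(reports, duplicate_id, main_id):
    """Point every report about duplicate_id at main_id instead.

    Mutates the matching reports in place and returns only those changed,
    so the caller knows which docs need to be written back.
    """
    changed = []
    for report in reports:
        fields = report.get("fields", {})
        touched = False
        for key in SUBJECT_FIELDS:
            if fields.get(key) == duplicate_id:
                fields[key] = main_id
                touched = True
        if touched:
            changed.append(report)
    return changed


# Made-up sample reports.
reports = [
    {"_id": "r1", "fields": {"patient_uuid": "duplicate"}},
    {"_id": "r2", "fields": {"patient_uuid": "someone-else"}},
    {"_id": "r3", "fields": {"place_id": "duplicate"}},
]
changed = reassign_reports(reports, "duplicate", "main")
```

Returning only the changed docs keeps the number of writes proportional to the reports that actually referenced the duplicate.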