Implement reconciliation service to keep azure-blob container in sync with RDBMS #3143

Closed
punktilious opened this issue Dec 24, 2021 · 4 comments
Labels: enhancement (New feature or request), P2 (Priority 2 - Should Have)

Comments

@punktilious
Collaborator

Is your feature request related to a problem? Please describe.
With payload offloading, the call to store the payload is made before the RDBMS transaction commits. If the transaction is rolled back, the server attempts to delete any payloads that were stored as part of the transaction. However, that delete call can itself fail (for example, because of a network partition), in which case resource records will exist in the offload store but not in the RDBMS.
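
The failure mode can be sketched as follows. This is a minimal, hypothetical Python model rather than the server's actual code: `OffloadStore`, `store_resource`, the `reachable` flag, and the in-memory dict and set standing in for the blob container and the RDBMS are all assumptions made for illustration.

```python
class OffloadStore:
    """Hypothetical in-memory stand-in for the payload offload store."""

    def __init__(self):
        self.blobs = {}
        self.reachable = True  # set to False to simulate a failing delete

    def put(self, key, payload):
        self.blobs[key] = payload

    def delete(self, key):
        if not self.reachable:
            raise ConnectionError("offload store unreachable")
        self.blobs.pop(key, None)


def store_resource(store, db_rows, key, payload, commit_ok):
    """Store the payload first, then 'commit' the RDBMS row.

    If the commit fails, try to delete the payload again. If that delete
    also fails, the payload is left orphaned in the offload store.
    """
    store.put(key, payload)        # happens before the commit
    if commit_ok:
        db_rows.add(key)           # the transaction committed
        return True
    try:
        store.delete(key)          # best-effort cleanup on rollback
    except ConnectionError:
        pass                       # cleanup failed: the orphan remains
    return False


store, db_rows = OffloadStore(), set()
store_resource(store, db_rows, "Patient/1/1", b"{}", commit_ok=True)
store_resource(store, db_rows, "Patient/2/1", b"{}", commit_ok=False)  # cleaned up

store.reachable = False            # simulated partition during rollback
store_resource(store, db_rows, "Patient/3/1", b"{}", commit_ok=False)

orphans = set(store.blobs) - db_rows
print(orphans)  # {'Patient/3/1'}
```

Only the third call leaves a payload behind, because both the commit and the compensating delete fail.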

Describe the solution you'd like

We need a reconciliation process which runs in the background on a periodic basis. The process can scan the offload data store and check that the RDBMS contains a matching record. Records which exist in the offload data store but not the RDBMS can be removed.

Describe alternatives you've considered
None.

Acceptance Criteria

  1. GIVEN a resource exists in the offload data store
    AND the same resource does not exist in the RDBMS
    WHEN the reconciliation process has completed a pass over the data
    THEN the resource should no longer exist in the offload data store
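
As a rough illustration of the proposed process and of the acceptance criterion above, a single reconciliation pass might look like the sketch below. It is written in Python for brevity and is not the project's implementation: `reconcile`, its callable parameters, and the dict and set used as stand-ins for the offload store and the RDBMS are hypothetical.

```python
def reconcile(offload_keys, exists_in_rdbms, delete_payload):
    """Run one pass over the offload store and remove orphaned payloads.

    offload_keys    -- iterable of payload keys found in the offload store
    exists_in_rdbms -- callable(key) -> bool
    delete_payload  -- callable(key); may raise OSError if the store is down
    Returns (removed, failed) lists of keys.
    """
    removed, failed = [], []
    for key in list(offload_keys):  # snapshot, since the store is mutated below
        if exists_in_rdbms(key):
            continue
        try:
            delete_payload(key)
            removed.append(key)
        except OSError:
            failed.append(key)      # leave it for the next pass
    return removed, failed


# GIVEN a resource exists in the offload store AND not in the RDBMS
offload = {"Patient/1/1": b"{}", "Patient/3/1": b"{}"}
rdbms = {"Patient/1/1"}

# WHEN the reconciliation process has completed a pass over the data
removed, failed = reconcile(offload, rdbms.__contains__, offload.__delitem__)

# THEN the resource no longer exists in the offload store
print(sorted(offload))  # ['Patient/1/1']
```

Because payloads are stored before the RDBMS transaction commits, a real implementation would presumably also need to skip recently written payloads, since a payload belonging to a transaction that is still in flight looks the same as an orphan.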

Additional context
Discussion to be had on the best runtime option for this process. Options:

  1. A thread running in the background when the IBM FHIR Server starts.
  2. A custom operation which can be invoked by a simple job.
  3. A standalone Java program which can be packaged as a container application.

My preference is 2, with a possible extension to fhir-bucket which would support it being called in a loop until all the work was done.
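
One way to picture option 2 being "called in a loop until all the work was done" is a bounded operation that returns a continuation position, driven by a simple client loop. The Python sketch below is only an assumption about how that could work: `reconcile_batch`, `run_until_done`, the `batch_size` parameter, and the index-based continuation are invented for illustration and are not part of fhir-bucket or the server.

```python
def reconcile_batch(keys, known, start, batch_size):
    """Hypothetical bounded operation: check at most batch_size keys.

    Returns (orphans_in_batch, next_start); next_start is None once the
    scan has reached the end of the key list.
    """
    batch = keys[start:start + batch_size]
    orphans = [k for k in batch if k not in known]
    next_start = start + len(batch)
    return orphans, (next_start if next_start < len(keys) else None)


def run_until_done(keys, known, batch_size=2):
    """Driver loop, as a simple job or client might run it."""
    orphans, start, calls = [], 0, 0
    while start is not None:
        found, start = reconcile_batch(keys, known, start, batch_size)
        orphans.extend(found)
        calls += 1
    return orphans, calls


keys = ["a", "b", "c", "d", "e"]   # keys present in the offload store
known = {"a", "c", "d"}            # keys present in the RDBMS
print(run_until_done(keys, known))  # (['b', 'e'], 3)
```

The sketch only reports orphans instead of deleting them, so that the index-based continuation stays valid between calls; a real store would supply its own continuation marker. Keeping each call bounded means a single invocation cannot run for an unbounded time, which is the main appeal of driving the operation from a loop.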

@punktilious punktilious added the enhancement New feature or request label Dec 24, 2021
@punktilious
Collaborator Author

Each payload persistence implementation will need to implement this feature separately. For the Cassandra payload persistence implementation, a prototype reconciliation implementation is provided in the fhir-persistence-cassandra-app CLI.

@lmsurpre lmsurpre added the P2 Priority 2 - Should Have label Jan 26, 2022
@lmsurpre
Member

lmsurpre commented Feb 7, 2022

Let's make this one specific to the Azure blob implementation. I will create a separate ticket for the S3 / minio flavor of this.

@lmsurpre lmsurpre changed the title from "Implement reconciliation service to keep RDBMS-Offload consistency" to "Implement reconciliation service to keep azure-blob container in sync with RDBMS" on Mar 2, 2022
@lmsurpre
Member

lmsurpre commented Mar 21, 2022

QA trick for getting into the situation where offloading succeeds but the transaction then fails:

  • revoke INSERT/UPDATE permission on the x_RESOURCES table

That should cause us to do a delete of the offloaded blob during rollback processing.

But how do we make that rollback processing fail, so that we need to run the reconciliation service?
Answer: point the reconciliation service at an Azure blob offload container, using a config that points to an entirely different db. Then NONE of the offloaded payload keys will match.

@d0roppe
Collaborator

d0roppe commented Mar 28, 2022

The reconciliation service seems to be working fine; closing the issue.

@d0roppe d0roppe closed this as completed Mar 28, 2022
3 participants