support fetching resources with same parent table #144 #146

Merged
bashir2 merged 7 commits into google:master on Apr 14, 2021

Collaborator

@mozzy11 mozzy11 commented Mar 22, 2021

Description of what I changed

Fixes #144

E2E test

TESTED:

mvn compile exec:java -pl batch \
-Dexec.args="--openmrsServerUrl=http://localhost:8081/openmrs \
--openmrsUserName=admin --openmrsPassword=Admin123 \
--fhirSinkPath=http://localhost:8080/fhir \
--sinkUserName=hapi --sinkPassword=hapi123 \
--searchList=Patient,Person --batchSize=20 \
--jdbcModeEnabled=true --jdbcUrl=jdbc:mysql://localhost:3306/openmrs \
--dbUser=root --dbPassword=Admin123 --jdbcMaxPoolSize=50 \
--jdbcDriverClass=com.mysql.cj.jdbc.Driver"

Checklist: I completed these to help reviewers :)

  • My IDE is configured to follow the code style of this project.

    No? Unsure? -> configure your IDE, format the code and add the changes with git add . && git commit --amend

  • I am familiar with Google Style Guides for the language I have coded in.

    No? Please take some time and review Java and Python style guides. Note, when in conflict, OpenMRS style guide overrules.

  • I have added tests to cover my changes. (If you refactored existing code that was well tested you do not have to add tests)

    No? -> write tests and add them to this commit git add . && git commit --amend

  • I ran mvn clean package right before creating this pull request and added all formatting changes to my commit.

  • All new and existing tests passed.

    No? -> figure out why and add the fix to your commit. It is your responsibility to make sure your code works.

  • My pull request is based on the latest changes of the master branch.

    No? Unsure? -> execute command git pull --rebase upstream master

Collaborator

@bashir2 bashir2 left a comment

Thanks @mozzy11; this looks good. I just have two suggestions.

@@ -180,8 +180,8 @@ public Integer fetchMaxId(String tableName) throws SQLException {
for (String search : searchList) {
if (linkTemplate.containsKey("fhir") && linkTemplate.get("fhir") != null) {
String[] resourceName = linkTemplate.get("fhir").split("/");
if (resourceName.length >= 1 && resourceName[1].equals(search)) {
reverseMap.put(entry.getValue().getParentTable(), resourceName[1]);
if (resourceName.length >= 1 && resourceName[1].equals(search) && entry.getValue().isEnabled()) {
Collaborator

Suggestion: drop isEnabled condition here and add a check before reverseMap.put to throw an exception if reverseMap already has an entry with the same key. This way we can fail fast for mis-configured resource mapping files.
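
A minimal sketch of the fail-fast check suggested here, assuming a plain `Map<String, String>` keyed by parent table as in the diff above. The class name `ReverseMapCheck`, the helper `putOrFail`, and the choice of `IllegalArgumentException` are hypothetical and not taken from the PR:

```java
import java.util.HashMap;
import java.util.Map;

class ReverseMapCheck {

  // Adds tableName -> resourceName, throwing if the table already has an entry.
  static void putOrFail(Map<String, String> reverseMap, String tableName, String resourceName) {
    if (reverseMap.containsKey(tableName)) {
      throw new IllegalArgumentException(
          "Table " + tableName + " is already mapped to " + reverseMap.get(tableName));
    }
    reverseMap.put(tableName, resourceName);
  }

  public static void main(String[] args) {
    Map<String, String> reverseMap = new HashMap<>();
    putOrFail(reverseMap, "person", "Person");
    try {
      // Second entry for the same parent table: rejected instead of silently overwritten.
      putOrFail(reverseMap, "person", "Patient");
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

A check like this would also reject a search list in which two resources legitimately share one parent table, so it only fits if the mapping is meant to be one-to-one.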

Collaborator Author

@mozzy11 mozzy11 Mar 28, 2021

done

@@ -91,7 +92,14 @@ public MethodOutcome uploadResource(Resource resource) {
interceptors = Collections.singleton(new BasicAuthInterceptor(sinkUsername, sinkPassword));
}

return uploadBundle(sinkUrl, bundle, interceptors);
Collection<MethodOutcome> responses;
try {
Collaborator Author

@mozzy11 mozzy11 Mar 28, 2021

@bashir2, the reason for doing this:
the pipeline was failing with the error ResourceVersionConflictException: multiple client threads trying to create resources with similar self-assigned IDs. On rerunning it, it could work.

I think two threads try to create a Person and a Patient resource with the same UUID, and HAPI throws the error.
So what I did here is catch the exception and re-upload the bundle, so that one bundle (i.e., Person) is uploaded first and then the Patient, without conflicting.
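
The catch-and-re-upload idea can be sketched in isolation as below. This is only an illustration under stated assumptions: `ConflictException` is a hypothetical stand-in for the server's conflict exception, and `uploadWithOneRetry` is a made-up helper, not the code in this PR.

```java
import java.util.function.Supplier;

class RetryUploadSketch {

  // Hypothetical stand-in for the conflict exception thrown by the sink.
  static class ConflictException extends RuntimeException {
    ConflictException(String message) {
      super(message);
    }
  }

  // Runs the upload once; if a conflict is reported, runs it one more time.
  static <T> T uploadWithOneRetry(Supplier<T> upload) {
    try {
      return upload.get();
    } catch (ConflictException e) {
      // A second conflict is not caught and propagates to the caller.
      return upload.get();
    }
  }

  public static void main(String[] args) {
    int[] attempts = {0};
    String result = uploadWithOneRetry(() -> {
      if (attempts[0]++ == 0) {
        throw new ConflictException("simulated conflict on first attempt");
      }
      return "uploaded";
    });
    System.out.println(result + ", attempts: " + attempts[0]);
  }
}
```

A retry like this hides the conflict rather than explaining it, which is why the root cause is still worth tracking down.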

Collaborator

So as we discussed on Tuesday, I think we should get to the bottom of this. Assuming that your sink FHIR store is a HAPI FHIR server (JPA), I don't think creating two different resource types (e.g., Patient and Person) with the same ID should cause a problem.

codecov bot commented Mar 28, 2021

Codecov Report

Merging #146 (cc06c5f) into master (ae598d7) will increase coverage by 0.39%.
The diff coverage is 47.05%.

@@             Coverage Diff              @@
##             master     #146      +/-   ##
============================================
+ Coverage     42.37%   42.76%   +0.39%     
- Complexity       96       97       +1     
============================================
  Files            21       21              
  Lines           767      774       +7     
  Branches         65       67       +2     
============================================
+ Hits            325      331       +6     
- Misses          420      421       +1     
  Partials         22       22              
Impacted Files                                          Coverage Δ                  Complexity Δ
...h/src/main/java/org/openmrs/analytics/FhirEtl.java   0.00% <0.00%> (ø)           0.00 <0.00> (ø)
...main/java/org/openmrs/analytics/JdbcFetchUtil.java   59.03% <100.00%> (+3.19%)   11.00 <0.00> (+1.00)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ae598d7...cc06c5f. Read the comment docs.

Collaborator

@bashir2 bashir2 left a comment

Thanks @mozzy11 for the changes but I think a new bug has been introduced now.

@@ -204,8 +204,7 @@
"title": "Visit",
"parentTable": "visit",
"linkTemplates": {
"rest": "/ws/rest/v1/visit/{uuid}?v=full",
"fhir": "/Encounter/{uuid}"
Collaborator

I think you have removed this line because of the exception at JdbcFetchUtil.java line 188, is that correct? The problem is that the mapping here is valid and we want to keep it. Think about the other use of this config: when a change in the visit table is captured by Debezium, we want to fetch the corresponding Encounter resource, hence we need to keep this.
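
To illustrate that other use, here is a minimal sketch of turning the `fhir` link template into a resource type and a concrete path for a changed row. The class `LinkTemplateSketch`, its methods, and the sample UUID are hypothetical; only the template string comes from the config hunk above.

```java
class LinkTemplateSketch {

  // Substitutes the row's UUID into a template such as "/Encounter/{uuid}".
  static String resolve(String template, String uuid) {
    return template.replace("{uuid}", uuid);
  }

  // Extracts the FHIR resource type from the template.
  // "/Encounter/{uuid}".split("/") yields ["", "Encounter", "{uuid}"], so the type is at index 1.
  static String resourceType(String template) {
    String[] parts = template.split("/");
    if (parts.length < 2) {
      throw new IllegalArgumentException("Unexpected link template: " + template);
    }
    return parts[1];
  }

  public static void main(String[] args) {
    String template = "/Encounter/{uuid}"; // the visit table's fhir template
    System.out.println(resourceType(template));
    System.out.println(resolve(template, "sample-uuid"));
  }
}
```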

Collaborator Author

Yes, sure; corrected it.

if (!reverseMap.containsKey(resourceName[1])) {
reverseMap.put(resourceName[1], entry.getValue().getParentTable());
} else {
log.error("Some tables are mapped to the same Resources");
Collaborator

As I have commented in the dbz_event_to_fhir_config.json below, this is actually not an error but a valid scenario. For example visit and encounter tables are both mapped to the Encounter FHIR resource and this is valid (and needed).

Here is my suggestion with two examples for Patient and Encounter:

  • Parse the config and for each FHIR resource, find the list of parent tables that are relevant for that resource, e.g., the map would be: Patient->person and Encounter->encounter,visit
  • In the main pipeline (that uses this reverse-map), for each FHIR resource, fetch all IDs from all tables in its list. So for Patient all IDs of person and for Encounter all IDs of both encounter and visit.
  • For the list of IDs found above, ask the FHIR module to give those resources, e.g., for all encounter and visit IDs request Encounter/ID.
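
The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the config is reduced to hypothetical `{parentTable, fhirLinkTemplate}` pairs, the database is replaced by an in-memory `idsByTable` map, and all class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ResourceToTablesSketch {

  // Step 1: group parent tables by FHIR resource. Each row is {parentTable, fhirLinkTemplate}.
  static Map<String, List<String>> tablesByResource(String[][] config) {
    Map<String, List<String>> result = new LinkedHashMap<>();
    for (String[] row : config) {
      String[] parts = row[1].split("/");
      if (parts.length < 2) {
        continue; // template is not of the form "/Resource/{uuid}"
      }
      List<String> tables = result.computeIfAbsent(parts[1], k -> new ArrayList<>());
      if (!tables.contains(row[0])) {
        tables.add(row[0]);
      }
    }
    return result;
  }

  // Steps 2 and 3: collect IDs from every table of a resource and build "Resource/ID" requests.
  static List<String> requestsFor(String resource, Map<String, List<String>> tablesByResource,
      Map<String, List<String>> idsByTable) {
    List<String> requests = new ArrayList<>();
    for (String table : tablesByResource.getOrDefault(resource, Collections.emptyList())) {
      for (String id : idsByTable.getOrDefault(table, Collections.emptyList())) {
        requests.add(resource + "/" + id);
      }
    }
    return requests;
  }

  public static void main(String[] args) {
    String[][] config = {
      {"person", "/Patient/{uuid}"},
      {"encounter", "/Encounter/{uuid}"},
      {"visit", "/Encounter/{uuid}"}
    };
    Map<String, List<String>> byResource = tablesByResource(config);

    // In-memory stand-in for the IDs that would be read from each table.
    Map<String, List<String>> idsByTable = new LinkedHashMap<>();
    idsByTable.put("encounter", Arrays.asList("e1", "e2"));
    idsByTable.put("visit", Arrays.asList("v1"));

    System.out.println(byResource);
    System.out.println(requestsFor("Encounter", byResource, idsByTable));
  }
}
```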

If this is not clear or you see any problems, please let me know. Otherwise let's merge both these reverse-map methods and create one like above.

This is important to be fixed (since otherwise it is a functionality bug) but if fixing this and also addressing my previous comment about fetching IDs only once from each table is difficult, please feel free not to address that issue in this PR and instead file a separate issue for it. So you may end up with Person->person and Patient->person maps and get the person IDs twice from the DB (which can be avoided but that is fine to leave for future).

Collaborator Author

done

return reverseMap;
}
}

public Map<String, ArrayList<String>> deduplicateReverseMap(Map<String, String> reverseMap) {
Collaborator

I don't think we need two separate public methods for creating the reverse map. Please expose only one of these and use it directly in FhirEtl.

assertEquals(4, reverseMap.size());
assertEquals(reverseMap.get("Patient"), "person");
assertEquals(reverseMap.get("Person"), "person");
assertEquals(reverseMap.get("Encounter"), "encounter");
Collaborator

So specifically here, Encounter should be mapped to both encounter and visit.

Collaborator Author

corrected in the new commit

Collaborator Author

mozzy11 commented Apr 9, 2021

@bashir2, regarding the HAPI FHIR issue: initially I was running against a new version of HAPI with customized configs.
I tried with the HAPI FHIR server attached to the pipeline, but instead I was getting an error like
Resource Patient/efa95c17-0940-41ea-ac5c-f6bbe26305e5 not found, specified in path: Person.link.target
and on rerunning the JDBC fetch mode, it could work fine. I think this is also due to some configs.

I think we should file another ticket for that specifically.

Otherwise, I think the new commit addresses the issue of fetching resources with the same parent table, and also avoids re-fetching the max ID for resources with the same parent table.

Collaborator Author

mozzy11 commented Apr 12, 2021

cc @bashir2 , @kimaina updated PR

Collaborator

@bashir2 bashir2 left a comment

Thanks @mozzy11; just some minor comments. Also it seems that the Travis tests for this PR failed with a docker throttling error. Let's see again if it succeeds next time that you push (the restart fails now too).

for (Map.Entry<String, EventConfiguration> entry : tableToFhirMap.entrySet()) {
Map<String, String> linkTemplate = entry.getValue().getLinkTemplates();
for (String search : searchList) {
if (linkTemplate.containsKey("fhir") && linkTemplate.get("fhir") != null) {
String[] resourceName = linkTemplate.get("fhir").split("/");
ArrayList<String> resources = new ArrayList<String>();
Collaborator

Please move this to line 191 (i.e., the else section).

}
reverseMap.put(entry.getValue().getParentTable(), resources);
Collaborator

Please move this to line 191 as well (i.e., the else section). In the other case that the map already contains the parent table name, we don't need to create a new list and/or add it to the map. We just need to update the current list in the map.
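
A minimal sketch of the structure being asked for, assuming the reverse map goes from parent table to a list of resource names as in the hunk above. The class and method names are hypothetical, and the duplicate check inside the first branch is an added assumption rather than something requested in this review:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReverseMapUpdateSketch {

  // Records that `resource` is backed by `parentTable`.
  static void addResource(Map<String, List<String>> reverseMap, String parentTable, String resource) {
    if (reverseMap.containsKey(parentTable)) {
      // Table already present: update the list that is already in the map.
      List<String> existing = reverseMap.get(parentTable);
      if (!existing.contains(resource)) {
        existing.add(resource);
      }
    } else {
      // First time this table is seen: only here is a new list created and put into the map.
      List<String> resources = new ArrayList<>();
      resources.add(resource);
      reverseMap.put(parentTable, resources);
    }
  }

  public static void main(String[] args) {
    Map<String, List<String>> reverseMap = new HashMap<>();
    addResource(reverseMap, "person", "Patient");
    addResource(reverseMap, "person", "Person");
    System.out.println(reverseMap);
  }
}
```

The same effect can be had with `Map.computeIfAbsent`, which creates the list only when the key is missing; the explicit if/else is shown here because it mirrors the branches discussed in the comment.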

Collaborator Author

mozzy11 commented Apr 13, 2021

Thanks @bashir2, the unit tests pass successfully here. It's the e2e tests that fail, with BATCH MODE TEST: WAITING FOR OPENMRS SERVER TO START. I think that error could be caused by a Travis memory issue. Ideally it's not related to my changes, as the other Travis build succeeds.

Collaborator Author

mozzy11 commented Apr 13, 2021

cc @bashir2, I have done the cleanup.

Collaborator

@bashir2 bashir2 left a comment

Thanks @mozzy11 this looks good. I just added some cosmetic changes.

@bashir2 bashir2 merged commit a65f482 into google:master Apr 14, 2021

Successfully merging this pull request may close these issues.

JDBC mode can not fetch Person and Patient resources in the same run
2 participants