include ICEES as a BTE resource #213

Closed
andrewsu opened this issue Jun 24, 2021 · 25 comments · Fixed by biothings/api-respone-transform.js#15

@andrewsu
Member

ICEES has dropped the requirement for an API key, and I believe the ICEES KP has expanded beyond the limited asthma data in previous iterations. We should look at incorporating it as a knowledge source for BTE.

@karafecho, can you give us an example query and result that we should be able to get?

Might also be related to #153, which would allow users to override default inclusion/exclusion settings for APIs

@karafecho

Sure thing, Andrew.

ICEES wiki page
ICEES DILI Swagger interface
Example one-hop query:
curl -XPOST https://icees.renci.org:16341/query -H "Content-Type: application/json" -d '{"message": {"query_graph": {"nodes": {"n0": {"name": "drug-induced liver injury", "ids": ["MONDO:0005359"]}, "n1": {"categories": ["biolink:DiseaseOrPhenotypicFeature"], "name": "Disease Or Phenotypic Feature"}}, "edges": {"e0": {"subject": "n0", "object": "n1", "predicates": ["biolink:correlated_with"]}}}}}'

@andrewsu
Member Author

From @karafecho (thank you Kara!) via NCATSTranslator/minihackathons#68 (comment)

@andrewsu: I just checked the SmartAPI registry, and indeed all three ICEES instances appear to be properly registered:

ICEES COVID Instance API: https://smart-api.info/registry?q=65292eac9a88e3a895be21f19b554767
ICEES Asthma Instance API: http://smart-api.info/registry?q=0864c0912390d0876c3c34a00acb5c3b
ICEES DILI Instance API: http://smart-api.info/registry?q=9dd890397a7b8d98fbe247d56cac2b8f

Please let me or Hao know if you run into any issues.

@andrewsu
Member Author

Here is an example query that takes advantage of the ICEES DILI Instance:

{
    "message": {
        "query_graph": {
            "edges": {
                "e0": {
                    "object": "n1",
                    "predicates": [
                        "biolink:correlated_with"
                    ],
                    "subject": "n0"
                }
            },
            "nodes": {
                "n0": {
                    "ids": [
                        "MONDO:0005359"
                    ],
                    "name": "drug-induced liver injury"
                },
                "n1": {
                    "categories": [
                        "biolink:DiseaseOrPhenotypicFeature"
                    ],
                    "name": "Disease Or Phenotypic Feature"
                }
            }
        }
    }
}

In theory, adding the API names above to /src/routes/v1/config.js (as in the diff below) should be enough to get them incorporated into BTE, since these all natively handle TRAPI queries.

diff --git a/src/routes/v1/config.js b/src/routes/v1/config.js
index b63853a..ec5e269 100644
--- a/src/routes/v1/config.js
+++ b/src/routes/v1/config.js
@@ -44,4 +44,9 @@ exports.API_LIST = [
     'Automat Pharos',
     'Automat Chembio',
     'Automat Foodb',
+
+    'ICEES COVID Instance API',
+    'ICEES Asthma Instance API',
+    'ICEES DILI Instance API',
+    'Columbia Open Health Data (COHD)'
 ];

But Chunlei made these changes on dev (https://dev.api.bte.ncats.io/v1/query/) and the query above still returns zero results. However, POSTing the same query directly to the ICEES DILI Instance (https://icees.renci.org:16341/query?reasoner=true&verbose=false) gives 62 results:

curl --silent --location --request POST "https://icees.renci.org:16341/query?reasoner=true&verbose=false" --header "Content-Type: application/json" --data-raw "{
    \"message\": {
        \"query_graph\": {
            \"edges\": {
                \"e0\": {
                    \"object\": \"n1\",
                    \"predicates\": [
                        \"biolink:correlated_with\"
                    ],
                    \"subject\": \"n0\"
                }
            },
            \"nodes\": {
                \"n0\": {
                    \"ids\": [
                        \"MONDO:0005359\"
                    ],
                    \"name\": \"drug-induced liver injury\"
                },
                \"n1\": {
                    \"categories\": [
                        \"biolink:DiseaseOrPhenotypicFeature\"
                    ],
                    \"name\": \"Disease Or Phenotypic Feature\"
                }
            }
        }
    }
}" | jq '.message.results | length'
62
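
The same check can be scripted. Below is a minimal Python sketch, assuming the endpoint and query parameters from the curl command above. The helper names (`one_hop_query`, `count_results`, `post_query`) are made up for illustration, and `post_query` needs network access to the ICEES instance.

```python
import json
import urllib.request

# Endpoint copied from the curl command in this comment.
ICEES_DILI_URL = "https://icees.renci.org:16341/query?reasoner=true&verbose=false"


def one_hop_query(curie, object_category, predicate="biolink:correlated_with"):
    """Build a one-hop TRAPI query graph: curie -[predicate]-> object_category."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"ids": [curie]},
                    "n1": {"categories": [object_category]},
                },
                "edges": {
                    "e0": {"subject": "n0", "object": "n1", "predicates": [predicate]}
                },
            }
        }
    }


def count_results(response):
    """Same idea as jq '.message.results | length'; 0 if results is missing or null."""
    return len((response.get("message") or {}).get("results") or [])


def post_query(url, query, timeout=60):
    """POST a TRAPI query as JSON and return the decoded response (needs network)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.load(reply)
```

With network access, `count_results(post_query(ICEES_DILI_URL, one_hop_query("MONDO:0005359", "biolink:DiseaseOrPhenotypicFeature")))` would be the scripted counterpart of the curl + jq pipeline.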

In theory, I think, BTE should just proxy TRAPI queries to APIs in the metakg that accept TRAPI, but that doesn't seem to be working here.

@andrewsu
Member Author

Also relevant from @newgene:
http://localhost:3000/metakg?api=ICEES%20DILI%20Instance%20API returns empty
vs.
http://localhost:3000/metakg?api=MyDisease.info%20API returns all MyDisease edges
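
For anyone repeating this comparison with other API names, here is a small Python sketch that builds the same metakg URLs. It assumes a local BTE instance on port 3000, as in the URLs above; `metakg_url` is a hypothetical helper, not part of BTE.

```python
from urllib.parse import quote


def metakg_url(api_name, base="http://localhost:3000"):
    """Build the /metakg URL for one API name, percent-encoding spaces as %20."""
    return f"{base}/metakg?api={quote(api_name)}"


for name in ("ICEES DILI Instance API", "MyDisease.info API"):
    print(metakg_url(name))
```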

@karafecho

Looping in @xu-hao...

@colleenXu
Collaborator

colleenXu commented Aug 10, 2021

Note to @ericz1803:

Besides getting the "output IDs" from the ICEES response...

ideally we would also correctly parse/ingest the ICEES edge attributes so that BTE keeps them in its own edge attributes. This is related to #209 (and to situation A of the provenance ticket, #208 (comment)).

@ericz1803
Contributor

Summary:
smartapi-kg uses the predicates.json file to resolve TRAPI ops. Initially, the v1/query endpoint was using an old version of this file, which is why adding the names to the config file didn't work (those entries don't exist in the old file). Using the newer, updated version of predicates.json required some changes to smartapi-kg, since the field it reads used to contain only array values and now contains a mix of strings and arrays.
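
To illustrate the kind of change needed for a field that may hold either a string or an array, here is a minimal Python sketch. smartapi-kg itself is not written in Python and the actual field names are not shown in this thread, so `as_list` is purely a hypothetical illustration of the normalization step.

```python
def as_list(value):
    """Normalize a field that may be a single string, a list, or missing."""
    if value is None:
        return []
    if isinstance(value, str):
        # A bare string becomes a one-element list instead of being
        # iterated character by character.
        return [value]
    return list(value)


print(as_list("biolink:correlated_with"))
print(as_list(["biolink:correlated_with", "biolink:related_to"]))
```

Code that previously looped over the field directly can loop over `as_list(field)` and behave the same for both shapes.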

The current problem:
The original query (with MONDO:0005359) still returns 0 results. However, the query below, which uses NCIT:C84427 instead, gives 124 results (not sure why that is double the expected 62, but this is relatively minor). The main problem with the first query seems to be that its edges are MONDO:0005359 -> ?, while the results returned by ICEES have edges that are NCIT:C84427 -> ?, and the biomedical-id-resolver isn't able to connect those two equivalent IDs.
(From @colleenXu: https://mydisease.info/v1/disease/MONDO:0005359?fields=mondo.xrefs,disease_ontology.xrefs doesn't contain the NCIT ID we need)

{
    "message": {
        "query_graph": {
            "edges": {
                "e0": {
                    "object": "n1",
                    "predicates": [
                        "biolink:correlated_with"
                    ],
                    "subject": "n0"
                }
            },
            "nodes": {
                "n0": {
                    "ids": [
                        "NCIT:C84427"
                    ],
                    "name": "drug-induced liver injury"
                },
                "n1": {
                    "categories": [
                        "biolink:DiseaseOrPhenotypicFeature"
                    ],
                    "name": "Disease Or Phenotypic Feature"
                }
            }
        }
    }
}
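
The mismatch can be pictured with a small Python sketch. It assumes the resolver output is reduced to a mapping from the query ID to a set of equivalent IDs; that shape and the name `matches_query_id` are hypothetical, not the biomedical-id-resolver API.

```python
def matches_query_id(query_id, result_id, equivalents):
    """True if result_id is the query ID or one of its known equivalent IDs."""
    return result_id == query_id or result_id in equivalents.get(query_id, set())


# Resolver output without the NCIT cross-reference (the situation described above).
missing_xref = {"MONDO:0005359": set()}
# Resolver output once the two IDs are treated as equivalent.
with_xref = {"MONDO:0005359": {"NCIT:C84427"}}

# ICEES returns edges whose subject is NCIT:C84427.
print(matches_query_id("MONDO:0005359", "NCIT:C84427", missing_xref))
print(matches_query_id("MONDO:0005359", "NCIT:C84427", with_xref))
```

In the first case every ICEES edge is discarded because its subject never matches the query node, which would be consistent with the zero results seen here.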

Test these current changes:
https://github.com/ericz1803/bte-trapi-workspace/tree/icees

@colleenXu
Collaborator

@karafecho Could you provide example TRAPI queries for ICEES COVID19?

Also, I believe the query below works as an example TRAPI query for ICEES Asthma, but please confirm or let us know of another query that you prefer.

{
    "message": {
        "query_graph": {
            "edges": {
                "e00": {
                    "subject": "n00",
                    "object": "n01",
                    "predicates": ["biolink:correlated_with"]
                }
            },
            "nodes": {
                "n00": {
                    "ids": ["MONDO:0004766"],
                    "categories": ["biolink:Disease"]
                },
                "n01": {
                    "categories": ["biolink:Disease"]
                }
            }
        }
    }
}

@karafecho

karafecho commented Aug 19, 2021

@colleenXu : Thanks for working on this.

The query above looks good to me, although depending on the question, you might want to look at biolink:DiseaseOrPhenotypicFeature or biolink:ChemicalEntity or biolink:ChemicalExposure instead of (or in addition to) biolink:Disease.

The structure of the above query should also work for the most recent ICEES COVID instance, which exposes data on patients confirmed to be COVID+ (cases) or COVID- (matched controls). That instance can be found here: https://covid.icees.renci.org/jan2021/apidocs#/. The input CURIE should work, but you might want to use something a bit more relevant to COVID. Here are a few suggestions:

Remdesivir
"NCIT:C152185",
"UMLSCUI:CL553517",
"PUBCHEM:121304016",
"RxNorm:2284718",
"RxNorm:2284959",
"RxNorm:2284957",
"RxNorm:2395500",
"RxNorm:2284958",
"RxNorm:2367757",
"RxNorm:2284960",
"RxNorm:2395499",
"RxNorm:2367759",
"RxNorm:2395503",
"RxNorm:2395502",
"RxNorm:2395505",
"RxNorm:2367758"

Prednisone
"CAS:68-59-7",
"CAS:53-03-2",
"MESH:D011241",
"CHEMBL:CHEMBL635",
"PUBCHEM:5865",
"RXCUI:763179",
"RXCUI:763181",
"RXCUI:795858",
"RXCUI:763185",
"RXCUI:763183",
"RXCUI:206837",
"RXCUI:206954",
"RXCUI:206988",
"RXCUI:206997",
"RXCUI:207048",
"RXCUI:1303131",
"RXCUI:1303134",
"RXCUI:1303137",
"RXCUI:198146",
"RXCUI:198148",
"RXCUI:205301",
"RXCUI:312615",
"RXCUI:312617",
"RXCUI:312687",
"RXCUI:1303135",
"RXCUI:198144",
"RXCUI:198145",
"RXCUI:1303125",
"RXCUI:1303132",
"UMLSCUI:C1811498",
"UMLSCUI:C0690124",
"UMLSCUI:C2343331",
"UMLSCUI:C1814982",
"UMLSCUI:C1814983",
"UMLSCUI:C0708023",
"UMLSCUI:C0708166",
"UMLSCUI:C0708220",
"UMLSCUI:C0708232",
"UMLSCUI:C0708301",
"UMLSCUI:C3475339",
"UMLSCUI:C3475342",
"UMLSCUI:C3475345",
"UMLSCUI:C0690123",
"UMLSCUI:C0690128",
"UMLSCUI:C0705898",
"UMLSCUI:C0979757",
"UMLSCUI:C0989249",
"UMLSCUI:C0982851",
"UMLSCUI:C3475343",
"UMLSCUI:C0690120",
"UMLSCUI:C0690121",
"UMLSCUI:C3475333",
"UMLSCUI:C3475340",
"SCTID:116602009",
"SCTID:325456002",
"SCTID:373989007",
"SCTID:373994007",
"SCTID:374058000",
"SCTID:374072009",
"SCTID:418349006",
"SCTID:10312003",
"SCTID:768296006",
"SCTID:722491009"

InvasiveVentilation
"LOINC:LA28889-6",
"LOINC:LP263712-4",
"LOINC:86851-3",
"LOINC:LA30356-2"

Hope this helps.

@colleenXu
Collaborator

colleenXu commented Aug 20, 2021

@karafecho Thank you for your feedback. I have several questions related to ICEES COVID19 API:

  1. The SmartAPI registration says the server link is https://covid.icees.renci.org/query. Is this correct?
  2. If I wanted to do a query from COVID19 to related diseases or drugs, what ID for COVID19 would I use? I tried a few but the API didn't seem to recognize them.
  3. For the suggestions above (Remdesivir, Prednisone, InvasiveVentilation), do you have suggested semantic types for these "nodes"? I'm guessing ChemicalEntity or ActivityAndBehavior (based on the meta_knowledge_graph endpoint)...
  4. Also, I'm currently getting this error message when trying to query the ICEES COVID19 API...
    "reasoner_id": "ICEES",
    "tool_version": "6.0.0",
    "datetime": "2021-08-08/20/21 03:56:33",
    "message_code": "Error",
    "code_description": "Traceback (most recent call last):\n  File \"/usr/local/lib/python3.8/site-packages/redis/connection.py\", line 559, in connect\n    sock = self._connect()\n  File \"/usr/local/lib/python3.8/site-packages/redis/connection.py\", line 584, in _connect\n    for res in socket.getaddrinfo(self.host, self.port, self.socket_type,\n  File \"/usr/local/lib/python3.8/socket.py\", line 918, in getaddrinfo\n    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):\nsocket.gaierror: [Errno -2] Name or service not known\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/./icees_api/features/knowledgegraph.py\", line 576, in one_hop\n    ataf = select_associations_to_all_features(\n  File \"/./icees_api/features/sql.py\", line 813, in select_associations_to_all_features\n    return select_feature_association(\n  File \"/./icees_api/features/sql.py\", line 766, in select_feature_association\n    ret = select_feature_matrix(\n  File \"/./icees_api/features/sql.py\", line 577, in select_feature_matrix\n    result = count_unique(conn, table_name, ka, kb)\n  File \"/./icees_api/features/sql.py\", line 478, in wrapper\n    cached_result = r.get(key_)\n  File \"/usr/local/lib/python3.8/site-packages/redis/client.py\", line 1606, in get\n    return self.execute_command('GET', name)\n  File \"/usr/local/lib/python3.8/site-packages/redis/client.py\", line 898, in execute_command\n    conn = self.connection or pool.get_connection(command_name, **options)\n  File \"/usr/local/lib/python3.8/site-packages/redis/connection.py\", line 1192, in get_connection\n    connection.connect()\n  File \"/usr/local/lib/python3.8/site-packages/redis/connection.py\", line 563, in connect\n    raise ConnectionError(self._error_message(e))\nredis.exceptions.ConnectionError: Error -2 connecting to redis:6379. Name or service not known.\n"

@karafecho

@colleenXu : Thanks for your questions.

WRT (1), the endpoint for the newest instance (JAN2021 data) is https://covid.icees.renci.org/jan2021/query. I didn't realize that this wasn't registered with SmartAPI. I'll fix that.
WRT (2), we do not currently expose a variable that indicates whether a patient is confirmed COVID+ or COVID-. The reason for this is complicated and multi-faceted, but the issue will be addressed in a new deployment (ETA is end of month).
WRT (3), for Remdesivir and Prednisone we map to biolink:ChemicalEntity, biolink:Drug; for InvasiveVentilation we map to biolink:ClinicalIntervention.
WRT (4), Hao Xu looked into this: redis issue, since fixed.

One other issue that I should alert you to is that many of the feature variables that we intended to expose are either not being exposed or are showing empty cells. We are fixing this with the next release. We probably should have held off on the initial deployment, but we were racing to meet a milestone deadline, so...(you know how it goes)...

Hope this helps.

@colleenXu
Collaborator

@karafecho I think the server urls in the file https://covid.icees.renci.org/jan2021/openapi.json are incorrect (this was used in the new SmartAPI registration of ICEES COVID19 API). Rather than seeing 1 url with https://covid.icees.renci.org/jan2021/, I see "/jan2021" and "https://covid.icees.renci.org" as two separate urls.


I suggest modifying the file and refreshing/updating the SmartAPI registration. It may also be a good idea to remove the older SmartAPI registration.
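
A quick way to check the file after it is modified: the Python sketch below flags `servers` entries that are not absolute URLs or that lack the expected path. It assumes the standard OpenAPI 3 `servers: [{"url": ...}]` layout; `incomplete_server_urls` is a hypothetical helper.

```python
from urllib.parse import urlparse


def incomplete_server_urls(openapi_doc, expected_path_prefix):
    """Return server URLs that lack a scheme/host or the expected path prefix."""
    problems = []
    for server in openapi_doc.get("servers", []):
        url = server.get("url", "")
        parts = urlparse(url)
        is_absolute = bool(parts.scheme and parts.netloc)
        if not is_absolute or not parts.path.startswith(expected_path_prefix):
            problems.append(url)
    return problems


# The two separate entries described above, versus a single combined URL.
reported = {"servers": [{"url": "/jan2021"}, {"url": "https://covid.icees.renci.org"}]}
combined = {"servers": [{"url": "https://covid.icees.renci.org/jan2021/"}]}

print(incomplete_server_urls(reported, "/jan2021"))
print(incomplete_server_urls(combined, "/jan2021"))
```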

@karafecho

karafecho commented Aug 23, 2021

I'm seeing that, too. I think this might be related to this issue.

Also, I think our SmartAPI registrations should be updated automatically.

@colleenXu
Collaborator

@karafecho Thank you for your help. We have a few questions about the ICEES DILI and ICEES COVID APIs...

ICEES DILI:

  1. We've noticed IDs with the prefix "SCITD" that we cannot resolve with an ID resolver. Are these SNOMED IDs? If so, we believe the Translator prefix for these IDs is SNOMEDCT
  2. We've noticed confusing nodes where the name and semantic type given in ICEES DILI don't seem to correspond to the ID given after ID resolution...Is there a data issue with mapping features to IDs?
  • For example, we've seen a node with the name "SexDILI", ID "MESH:C110500", and semantic type "PhenotypicFeature" - this sounds like a feature for Phenotypic Sex of Patients.
  • However, the MESH ID actually seems to refer to a dietary supplement called Gasex - and the SRI's ID resolver identifies this ID as the ChemicalEntity Gasex.
  • We're then unsure of how to proceed, since querying related PhenotypicFeatures returns an ID that actually seems to be a ChemicalEntity...
  3. We are seeing self-edges, where ICEES reports that DILI is correlated with itself. Is this a valid association to return?

ICEES COVID:

  1. We have not been able to successfully query from COVID to related Diseases. Could you provide an example query that starts with COVID?

@karafecho

@colleenXu: Thanks, again, for your comments. Greatly appreciated!

WRT ICEES DILI, we recently refactored our API config files. As part of this effort, we replaced all illegal prefixes with legal ones. However, we have not yet deployed the API, so the current instance contains some illegal prefixes. Your second point regarding the weird identifiers relates to the approach that we used to map identifiers, which in some cases, pulled in substrings for variables. For instance, the MESH ID for "Gasex" was picked up because it contains the substring "sex". Note that the MESH ID for "Sex" should be mapped to "SexDILI", too. This is also something that should be fixed with the next deployment (ETA end of month). Your third point regarding self-edges is a function of the KG, which relates all entities to each other via a Chi Square-derived P value. So, yes, it is a valid association.

WRT ICEES COVID, if I'm understanding you correctly, then this issue relates to the one I commented on previously. Specifically, due to a variety of reasons mostly out of our control, we were unable to capture the variable that flags patients as confirmed COVID+ or confirmed COVID-. We recognize that this is a major issue, but we moved forward with the deployment to meet a milestone deadline. (FWIW, the cohort is designed as a 1:1 case:control study, with roughly 100,000 patients each in the JAN2021 instance.) We have since corrected this issue, which will be reflected in the next deployment (ETA end of month). In the meantime, you might want to focus on the other variables I suggested.

Hope this helps...
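
The "Gasex" / "sex" mix-up described above is the usual failure mode of substring matching. The Python sketch below is only an illustration of that failure mode and of a whole-word alternative; it is not the ICEES mapping code, and both function names are hypothetical.

```python
import re


def substring_match(variable_term, label):
    """Case-insensitive substring test: matches 'sex' inside 'Gasex'."""
    return variable_term.lower() in label.lower()


def whole_word_match(variable_term, label):
    """Case-insensitive match that requires word boundaries around the term."""
    pattern = r"\b" + re.escape(variable_term) + r"\b"
    return re.search(pattern, label, flags=re.IGNORECASE) is not None


for label in ("Gasex", "Sex"):
    print(label, substring_match("sex", label), whole_word_match("sex", label))
```

Whole-word matching would keep the label "Sex" while rejecting "Gasex"; whether that is strict enough for the real variable names is a separate question.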

@colleenXu
Collaborator

colleenXu commented Aug 25, 2021

@karafecho Thank you, I think this clarifies things a lot! I'm noting a few other data-related issues below that are non-critical - feel free to look when you have the time, and chime in if you have thoughts/news on them.

  • OMIM IDs (with an ID starting with "MTHU"). These are resolvable with SRI services, but it seems like perhaps other ID namespaces would be preferred?
  • ID prefixes: UMLSCUI -> UMLS, IC10 -> ICD10 or ICD0?
  • typo: edge attribute_type_id contigency:matrices -> contingency:matrices
  • Some IDs aren't resolvable with SRI's service. For example, it doesn't find the equivalent IDs / semantic type of LOINC IDs for PhenotypicFeature...
  • would some nodes/features that are currently PhenotypicFeatures better fit another semantic type (although I totally get mapping them to the more-commonly-used semantic types)? For example, Phenotypic Sex, Race, Alcohol Use of patients may fit as a ClinicalAttribute or something under that hierarchy?
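
Until the prefixes are fixed upstream, a consumer could rewrite them on its side. This is a minimal Python sketch: the mapping contains only prefix changes suggested in this thread (UMLSCUI -> UMLS, and SCITD -> SNOMEDCT, which was itself stated as a belief), so treat it as an assumption rather than a confirmed list. `normalize_curie` is a hypothetical helper.

```python
# Assumed prefix rewrites, taken from suggestions in this thread (not confirmed).
PREFIX_FIXES = {
    "UMLSCUI": "UMLS",
    "SCITD": "SNOMEDCT",
}


def normalize_curie(curie, fixes=PREFIX_FIXES):
    """Rewrite the prefix of a CURIE if a fix is known; otherwise return it unchanged."""
    prefix, sep, local_id = curie.partition(":")
    if not sep:
        # Not a CURIE (no colon), leave untouched.
        return curie
    return f"{fixes.get(prefix, prefix)}:{local_id}"


print(normalize_curie("UMLSCUI:C1811498"))
print(normalize_curie("MONDO:0005359"))
```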

@karafecho

karafecho commented Aug 25, 2021

No problem, Colleen!

WRT your first and fourth bullets, we've used Athena to pull in identifier systems that are not currently supported by SRI. The idea is to see if there's enough support to justify an expansion of SRI services.

WRT your second bullet, these fixes have been made and will be reflected in the next API deployments.

WRT your third bullet, we fixed the typo. Thanks for pointing that out!

WRT your fifth bullet, you are correct that we mapped to 'accurate but also commonly used' semantic types. :-)

@karafecho

@colleenXu : I am testing direct three-hop ARA queries as part of Workflow B, the idea being that this will facilitate (1) debugging and (2) answer evaluation for scientific impact while we wait for ARA deployment of TRAPI 1.2, which will support asynchronous queries and (hopefully) resolve the timeout issues that are preventing us from running the queries via the ARS.

I don't think any of the issues above should block a direct query of BTE using the TRAPI message below, so I'm hoping that you can test this with both biolink:correlated_with and biolink:has_real_world_evidence_of_association_with. Would this be feasible?

Note that I am also using these tests to compare output across ARAs, in terms of ChemicalEntity(ies) that more than one ARA suggests. Thus far, I have results from both queries for ARAX and for one query from ARAGORN. The results are looking pretty interesting thus far, so I'm pretty excited.

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                     "ids": ["MESH:D056487"],
                     "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n1": {
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"]
                },
                "n2": {
                    "categories": ["biolink:Gene"]
                },
                "n3": {
                    "categories": ["biolink:ChemicalEntity"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:correlated_with"]
                },
                "e02": {
                    "subject": "n2",
                    "object": "n1",
                    "predicates": ["biolink:gene_associated_with_condition"]
                },
                "e03": {
                    "subject": "n2",
                    "object": "n3",
                    "predicates": ["biolink:related_to"]
                }
            }
        }
    }
}
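
Before sending a multi-hop query like this to different ARAs, it can help to sanity-check the graph structure. The Python sketch below checks that every edge refers to a defined node and lists the nodes pinned to specific IDs. Both helper names are hypothetical, and this is not a full TRAPI validator.

```python
def dangling_edge_refs(query_graph):
    """Return (edge_id, role, node_id) for edge ends that point at undefined nodes."""
    nodes = query_graph.get("nodes", {})
    problems = []
    for edge_id, edge in query_graph.get("edges", {}).items():
        for role in ("subject", "object"):
            if edge.get(role) not in nodes:
                problems.append((edge_id, role, edge.get(role)))
    return problems


def pinned_nodes(query_graph):
    """Return the sorted keys of nodes that carry explicit ids."""
    return sorted(k for k, n in query_graph.get("nodes", {}).items() if n.get("ids"))


# Same structure as the three-hop query above.
three_hop = {
    "nodes": {
        "n0": {"ids": ["MESH:D056487"], "categories": ["biolink:DiseaseOrPhenotypicFeature"]},
        "n1": {"categories": ["biolink:DiseaseOrPhenotypicFeature"]},
        "n2": {"categories": ["biolink:Gene"]},
        "n3": {"categories": ["biolink:ChemicalEntity"]},
    },
    "edges": {
        "e01": {"subject": "n0", "object": "n1", "predicates": ["biolink:correlated_with"]},
        "e02": {"subject": "n2", "object": "n1", "predicates": ["biolink:gene_associated_with_condition"]},
        "e03": {"subject": "n2", "object": "n3", "predicates": ["biolink:related_to"]},
    },
}

print(dangling_edge_refs(three_hop), pinned_nodes(three_hop))
```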

@colleenXu
Collaborator

Note that after all PRs from Eric are merged, ICEES Asthma should also work. I noticed this by running the sample query above for ICEES Asthma in Eric's workspace (fork).

We'll wait until after ICEES COVID-19 updates to know what to do there (ingest one or both APIs? Try out queries / integration into BTE).

@karafecho

karafecho commented Aug 28, 2021

Thanks, Colleen. To clarify, the Workflow B query(ies) that I suggested are expected to run after Eric merges the PRs, correct? The queries should return results from ICEES DILI, not ICEES asthma.

@colleenXu
Collaborator

Note to BTE team:

Once the new code is merged (the SRI-ID-resolver + new query handling + results assembly)...
These are the queries to use to check if BTE is correctly integrated with ICEES DILI and ICEES Asthma APIs. I have gotten them to work when querying my local setup.

Query that should have "icees:dili" in the response (PhenotypicFeatures linked to DILI):

{
    "message": {
        "query_graph": {
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1"
                }
            },
            "nodes": {
                "n0": {
                    "ids": ["MONDO:0005359"],
                    "name": "drug-induced liver injury"
                },
                "n1": {
                    "categories": ["biolink:DiseaseOrPhenotypicFeature"],
                    "name": "Disease Or Phenotypic Feature"
                }
            }
        }
    }
}

Query that should have "icees:asthma" in the response (Diseases linked to Asthma):

{
    "message": {
        "query_graph": {
            "edges": {
                "e00": {
                    "subject": "n00",
                    "object": "n01",
                    "predicates": ["biolink:correlated_with"]
                }
            },
            "nodes": {
                "n00": {
                    "ids": ["MONDO:0004766"],
                    "categories": ["biolink:Disease"]
                },
                "n01": {
                    "categories": ["biolink:Disease"]
                }
            }
        }
    }
}
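
To automate the "should have icees:dili / icees:asthma in the response" check, here is a minimal Python sketch. This thread does not say where in the BTE response those strings appear, so the sketch simply searches the whole decoded JSON document; `contains_string` is a hypothetical helper.

```python
def contains_string(obj, needle):
    """Recursively search a decoded JSON value for any string containing needle."""
    if isinstance(obj, str):
        return needle in obj
    if isinstance(obj, dict):
        return any(
            contains_string(key, needle) or contains_string(value, needle)
            for key, value in obj.items()
        )
    if isinstance(obj, list):
        return any(contains_string(item, needle) for item in obj)
    return False


# Toy response for illustration only; a real BTE response is structured differently.
toy_response = {"message": {"knowledge_graph": {"edges": {"x": {"attributes": [{"value": "icees:dili"}]}}}}}
print(contains_string(toy_response, "icees:dili"))
```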

@colleenXu
Collaborator

colleenXu commented Sep 10, 2021

@karafecho, here is some feedback on the earlier comment about the Workflow/demo query:

  • Using the dev SRI ID resolver that BTE will use, the MESH ID given above does not have an equivalent NCIT ID (which is what the ICEES DILI API uses)...so ICEES DILI is then not correctly called. --> Replacing the ID used with NCIT:C84427 or MONDO:0005359 should work in getting BTE to call ICEES DILI...
  • After replacing the ID used, this query takes too long to run / is too large for BTE to reasonably handle. I am currently exploring alternative structures...

Notes for the BTE team:

  • Running this query involves using the new code base (SRI ID resolver + new query handling + results assembly).
  • this is considered a variation on a 3-hop Predict query, with predicate restrictions on the first and second hops (starred): DiseaseOrPhenotypicFeature ID -> DiseaseOrPhenotypicFeature <- Gene -> ChemicalEntity

For the BTE team, this is what I have tried:

  1. Directly querying from NCIT DILI to ChemicalEntity without a predicate restriction / more specific semantic type leads to ~2500 results. It runs on my local instance in ~1.25 min and the response is ~7.2 MB.
{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                     "ids": ["NCIT:C84427"],
                     "categories": ["biolink:Disease"]
                },
                "n1": {
                    "categories": ["biolink:ChemicalEntity"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                }
            }
        }
    }
}

  2. This query is modified to be shorter (no chemicals at the end, Disease rather than DiseaseOrPhenotypicFeature). It runs on my local instance in ~4 min and returns something ~7 MB in size. Adding a ChemicalEntity query to the end of this seems like it will definitely take too long to run / be too large in size...

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                     "ids": ["NCIT:C84427"],
                     "categories": ["biolink:Disease"]
                },
                "n1": {
                    "categories": ["biolink:Disease"]
                },
                "n2": {
                    "categories": ["biolink:Gene"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:has_real_world_evidence_of_association_with"]
                },
                "e02": {
                    "subject": "n1",
                    "object": "n2",
                    "predicates": ["biolink:condition_associated_with_gene"]
                }
            }
        }
    }
}

  3. This is another modified query (also shorter). It takes my local instance ~6 min 24 seconds to run and the response is (!!!) ~38.6 MB in size.

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                     "ids": ["NCIT:C84427"],
                     "categories": ["biolink:Disease"]
                },
                "n1": {
                    "categories": ["biolink:Disease"]
                },
                "n2": {
                    "categories": ["biolink:SmallMolecule"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:has_real_world_evidence_of_association_with"]
                },
                "e02": {
                    "subject": "n1",
                    "object": "n2"
                }
            }
        }
    }
}
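
For comparing query variants like the three above, a small measurement helper can keep the numbers consistent. This Python sketch takes any zero-argument callable that returns a decoded TRAPI response, so it does not assume how the query is sent. `measure` is a hypothetical helper, and the size is that of the re-serialized JSON, which may differ slightly from the size on the wire.

```python
import json
import time


def measure(run_query):
    """Run a query callable and report elapsed seconds, JSON size in MB, and result count."""
    start = time.perf_counter()
    response = run_query()
    elapsed = time.perf_counter() - start
    size_mb = len(json.dumps(response).encode("utf-8")) / 1_000_000
    results = len((response.get("message") or {}).get("results") or [])
    return {"seconds": elapsed, "size_mb": size_mb, "results": results}


# Stand-in for a real call to a local BTE instance.
def fake_query():
    return {"message": {"results": [{}, {}]}}


stats = measure(fake_query)
print(stats)
```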

@colleenXu
Collaborator

@karafecho Just to check, the two ICEES COVID SmartAPI registrations look like they are now going to the same API (server url). Previously, we discussed how they included data from different times/cohorts though, and were separate APIs...

@karafecho

Thanks, @colleenXu . Greatly appreciate the input!

You are correct that the data for each cohort/time period should point to a distinct URL. I suspect the issue relates to the way we've restructured the directories. Will post an issue.

@colleenXu
Collaborator

I suggest closing this issue and opening separate issues for demo queries or ingesting ICEES COVID, when needed.

As noted above, there are some issues with the demo query that Kara asked us to run.


I've confirmed that a query like this returns edges from ICEES DILI that preserve the edge attributes.

BTE is also correctly using the SRI-based ID resolver to find different semantic types for the IDs returned (compared to what ICEES DILI provides). One example is MESH:C040391, which the ID resolver identifies as malignancy-associated nucleolar antigen, human (a ChemicalEntity), not a Disease.

{
    "message": {
        "query_graph": {
            "edges": {
                "e0": {
                    "subject": "n0",
                    "object": "n1"
                }
            },
            "nodes": {
                "n0": {
                    "ids": ["MONDO:0005359"],
                    "name": "drug-induced liver injury"
                },
                "n1": {
                    "categories": ["biolink:Disease"],
                    "name": "Disease Or Phenotypic Feature"
                }
            }
        }
    }
}
