CTD processing 4 (optional / less important): handling gene -> gene's response #586

Open
colleenXu opened this issue Mar 14, 2023 · 0 comments

Collaborator

colleenXu commented Mar 14, 2023

Intro: see intro section of #583 (comment). Originally noted in #558 (comment)

4. (optional / less important): handling gene -> gene's response

I haven't written an operation for gene -> gene queries because the direction of the associations in the response isn't consistent. This is low priority: the underlying source of this info is BioGRID, and Biolink API / Monarch already covers this information.

For example, if I query CTD with gene CYSLTR1 (10800)...

some associations have the input CYSLTR1 in the source (Src*) fields

    {
        "Assay": "Co-localization",
        "Input": "10800",
        "InteractionType": "physical",
        "PubMedId": "21203429",
        "SrcGeneId": "10800",
        "SrcGeneSymbol": "CYSLTR1",
        "SrcOrganism": "Homo sapiens",
        "SrcOrganismId": "9606",
        "TgtGeneId": "57105",
        "TgtGeneSymbol": "CYSLTR2",
        "TgtOrganism": "Homo sapiens",
        "TgtOrganismId": "9606",
        "Throughput": "low"
    },
    {
        "Assay": "Affinity Capture-Western",
        "Input": "10800",
        "InteractionType": "physical",
        "PubMedId": "17693579",
        "SrcGeneId": "10800",
        "SrcGeneSymbol": "CYSLTR1",
        "SrcOrganism": "Homo sapiens",
        "SrcOrganismId": "9606",
        "TgtGeneId": "57105",
        "TgtGeneSymbol": "CYSLTR2",
        "TgtOrganism": "Homo sapiens",
        "TgtOrganismId": "9606",
        "Throughput": "low"
    },
but others have the input CYSLTR1 in the target (Tgt*) fields

    {
        "Assay": "Affinity Capture-Western",
        "Input": "10800",
        "InteractionType": "physical",
        "PubMedId": "21203429",
        "SrcGeneId": "57105",
        "SrcGeneSymbol": "CYSLTR2",
        "SrcOrganism": "Homo sapiens",
        "SrcOrganismId": "9606",
        "TgtGeneId": "10800",
        "TgtGeneSymbol": "CYSLTR1",
        "TgtOrganism": "Homo sapiens",
        "TgtOrganismId": "9606",
        "Throughput": "low"
    },
    {
        "Assay": "Affinity Capture-Western",
        "Input": "10800",
        "InteractionType": "physical",
        "PubMedId": "19561298",
        "SrcGeneId": "2840",
        "SrcGeneSymbol": "GPR17",
        "SrcOrganism": "Mus musculus",
        "SrcOrganismId": "10090",
        "TgtGeneId": "10800",
        "TgtGeneSymbol": "CYSLTR1",
        "TgtOrganism": "Mus musculus",
        "TgtOrganismId": "10090",
        "Throughput": "low"
    }

Perhaps post-processing could put all associations "in the same direction" (input ID -> output ID), and the response-mapping could then point to those post-processed fields. However, we'd want to check that this doesn't change the meaning (does an association mean something different when a gene is the source vs. the target?).
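A rough sketch of what that post-processing could look like, in Python purely for illustration. The function name `normalize_direction` and the added `Input*` / `Output*` / `InputRole` fields are hypothetical, not part of CTD's response or any existing operation. It assumes each record carries the `Input`, `SrcGeneId`, and `TgtGeneId` fields shown in the examples above, and it keeps the original Src/Tgt fields so the original direction isn't lost.

```python
# Hypothetical post-processing sketch: the function name and the added
# Input*/Output*/InputRole fields are assumptions made for illustration.
PAIRED_SUFFIXES = ("GeneId", "GeneSymbol", "Organism", "OrganismId")


def normalize_direction(record):
    """Return a copy of `record` with Input*/Output* fields added.

    The queried gene (record["Input"]) always lands in the Input* fields,
    whichever side (Src or Tgt) it was reported on. Returns None when the
    input ID matches neither side.
    """
    input_id = record["Input"]
    if record["SrcGeneId"] == input_id:
        in_prefix, out_prefix = "Src", "Tgt"
    elif record["TgtGeneId"] == input_id:
        in_prefix, out_prefix = "Tgt", "Src"
    else:
        return None

    normalized = dict(record)  # keep the original Src*/Tgt* fields untouched
    for suffix in PAIRED_SUFFIXES:
        normalized["Input" + suffix] = record[in_prefix + suffix]
        normalized["Output" + suffix] = record[out_prefix + suffix]
    # Record which side the input was on, in case direction carries meaning.
    normalized["InputRole"] = "source" if in_prefix == "Src" else "target"
    return normalized
```

Response-mapping could then point at `OutputGeneId` regardless of which side the input gene appeared on, and `InputRole` would still be available if source vs. target turns out to matter.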

I think Biolink API / Monarch's post-query processing, together with its SmartAPI yaml response-mapping (which maps to fields that exist only after the post-processing), is able to handle this situation, so a similar solution may work here.
