new API for InnateDB #17

Closed
newgene opened this issue Nov 19, 2019 · 10 comments
Labels
api deployment done data source Data source pending to create a new API On Test Match https://github.com/biothings/biothings_explorer/labels x-bte
Milestone

Comments

@newgene
Member

newgene commented Nov 19, 2019

https://www.innatedb.com/

Innate immunity interactions

@newgene newgene added the data source Data source pending to create a new API label Nov 19, 2019
@newgene newgene added this to the Segment 2 milestone Nov 19, 2019
@ericz1803
Contributor

Example documents:

{
    "subject": {
        "unique_identifier": "innatedb:IDBG-25842",
        "alt_identifier": "ensembl:ENSG00000154589",
        "alias": "uniprotkb:LY96_HUMAN|refseq:NP_056179|uniprotkb:Q9Y6Y9|refseq:NP_001182726|hgnc:LY96(display_short)",
        "ncbi_taxid": "taxid:9606(Human)",
        "biological_role": "psi-mi:\"MI:0499\"(unspecified role)",
        "exp_role": "psi-mi:\"MI:0498\"(prey)",
        "interactor_type": "psi-mi:\"MI:0326\"(protein)",
        "participant_identification_method": "psi-mi:\"MI:0363\"(inferred by author)"
    },
    "object": {
        "unique_identifier": "innatedb:IDBG-82738",
        "alt_identifier": "ensembl:ENSG00000136869",
        "alias": "refseq:NP_612564|refseq:NP_612567|uniprotkb:O00206|uniprotkb:TLR4_HUMAN|refseq:NP_003257|hgnc:TLR4(display_short)",
        "ncbi_taxid": "taxid:9606(Human)",
        "biological_role": "psi-mi:\"MI:0499\"(unspecified role)",
        "exp_role": "psi-mi:\"MI:0496\"(bait)",
        "interactor_type": "psi-mi:\"MI:0326\"(protein)",
        "participant_identification_method": "psi-mi:\"MI:0363\"(inferred by author)"
    },
    "relation": {
        "interaction_detection_method": "psi-mi:\"MI:0007\"(anti tag coimmunoprecipitation)",
        "author": "Shimazu et al. (1999)",
        "pmid": "pubmed:10359581",
        "interaction_type": "psi-mi:\"MI:0915\"(physical association)",
        "source_database": "MI:0974(innatedb)",
        "idinteraction_in_source_db": "innatedb:IDB-113240",
        "confidence_score": "lpr:3|hpr:3|np:1|",
        "ncbi_taxid_host_organism": "taxid:10090",
        "creation_date": "2008/03/30",
        "update_date": "2008/03/30",
        "negative": "false"
    },
    "_id": "innatedb:IDBG-25842-innatedb:IDBG-82738"
}
{
    "subject": {
        "unique_identifier": "innatedb:IDBG-28022",
        "alt_identifier": "ensembl:ENSG00000198001",
        "alias": "uniprotkb:IRAK4_HUMAN|uniprotkb:Q9NWZ3|refseq:NP_001138728|refseq:NP_001138729|refseq:NP_001107654|refseq:NP_001138730|refseq:NP_057207|hgnc:IRAK4(display_short)",
        "ncbi_taxid": "taxid:9606(Human)",
        "biological_role": "psi-mi:\"MI:0501\"(enzyme)",
        "exp_role": "psi-mi:\"MI:0499\"(unspecified role)",
        "interactor_type": "psi-mi:\"MI:0326\"(protein)",
        "participant_identification_method": "psi-mi:\"MI:0363\"(inferred by author)"
    },
    "object": {
        "unique_identifier": "innatedb:IDBG-28022",
        "alt_identifier": "ensembl:ENSG00000198001",
        "alias": "uniprotkb:IRAK4_HUMAN|uniprotkb:Q9NWZ3|refseq:NP_001138728|refseq:NP_001138729|refseq:NP_001107654|refseq:NP_001138730|refseq:NP_057207|hgnc:IRAK4(display_short)",
        "ncbi_taxid": "taxid:9606(Human)",
        "biological_role": "psi-mi:\"MI:0502\"(enzyme target)",
        "exp_role": "psi-mi:\"MI:0499\"(unspecified role)",
        "interactor_type": "psi-mi:\"MI:0326\"(protein)",
        "participant_identification_method": "psi-mi:\"MI:0363\"(inferred by author)"
    },
    "relation": {
        "interaction_detection_method": "psi-mi:\"MI:0423\"(in-gel kinase assay)",
        "author": "Cheng et al. (2007)",
        "pmid": "pubmed:17141195",
        "interaction_type": "psi-mi:\"MI:0217\"(phosphorylation reaction)",
        "source_database": "MI:0974(innatedb)",
        "idinteraction_in_source_db": "innatedb:IDB-113326",
        "confidence_score": "lpr:1|hpr:1|np:1|",
        "ncbi_taxid_host_organism": "taxid:7108",
        "creation_date": "2008/09/28",
        "update_date": "2008/09/28",
        "negative": "false"
    },
    "_id": "innatedb:IDBG-28022-innatedb:IDBG-28022"
}

Statistics for interaction_type:

{
'psi-mi:"MI:0915"(physical association)': 31086, 
'psi-mi:"MI:0217"(phosphorylation reaction)': 1201, 
'psi-mi:"MI:0220"(ubiquitination reaction)': 423, 
'psi-mi:"MI:0570"(protein cleavage)': 324, 
'psi-mi:"MI:0194"(cleavage reaction)': 147, 
'psi-mi:"MI:0203"(dephosphorylation reaction)': 40, 
'psi-mi:"MI:0179"(other modification)': 14, 
'psi-mi:"MI:0566"(sumoylation)': 9, 
'psi-mi:"MI:0408"(disulfide bond)': 8, 
'psi-mi:"MI:0204"(deubiquitination reaction)': 7, 
'psi-mi:"MI:0401"(biochemical)': 6, 
'psi-mi:"MI:0213"(methylation reaction)': 6, 
'psi-mi:"MI:0414"(enzymatic reaction)': 6, 
'psi-mi:"MI:0192"(acetylation reaction)': 5, 
'psi-mi:"MI:0195"(covalent binding)': 5, 
'psi-mi:"MI:0566"(sumoylation reaction)': 3, 
'psi-mi:"MI:0914"(association)': 2, 
'psi-mi:"MI:1126"(self interaction)': 2,
'psi-mi:"MI:0557"(adp ribosylation reaction)': 1
}

@colleenXu

colleenXu commented Aug 1, 2022

I think someone else (Andrew?) should take the lead on changes.

I have some thoughts but I don't know if others would agree (starting with the subject / object sections first)


Subject (actually applies to the object section too)

  • change subject.unique_identifier -> subject.innatedb. Then remove "innatedb:" prefix from the value. so it would be "subject": {"innatedb": "IDBG-25842" }
  • change subject.alt_identifier -> subject.ensembl. Then remove "ensembl:" prefix from the value. so it would be "subject": {"ensembl": "ENSG00000154589" }
  • break down the "alias" value (currently a string) and turn it into an object (note that I combined the values for repeated keys into a list and changed some key names):
"subject": {
    "alias": {
        "uniprotkb_name": "LY96_HUMAN",
        "refseq": ["NP_056179", "NP_001182726"],
        "uniprotkb_id": "Q9Y6Y9",
        "hgnc_name": "LY96(display_short)"
    }
}
  • change value for subject.ncbi_taxid to just have the ID. So it'd be "subject": {"ncbi_taxid": "9606" }
  • change the rest of the properties to separate the "id" from the prefix and label (aka human readable thing)
"subject": {
    "biological_role": {
        "psi-mi": "MI:0499",
        "label": "unspecified role"
    },
    "exp_role": {
        "psi-mi": "MI:0498",
        "label": "prey"
    },
    "interactor_type": {
        "psi-mi": "MI:0326",
        "label": "protein"
    },
    "participant_identification_method": {
        "psi-mi": "MI:0363",
        "label": "inferred by author"
    }
}
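
A minimal sketch of how the subject/object restructuring proposed above could be done in a parser. The function names (`split_psi_mi`, `strip_prefix`, `restructure_participant`) are hypothetical and not taken from the actual InnateDB parser; the sketch also assumes every PSI-MI value follows the `psi-mi:"MI:NNNN"(label)` shape seen in the example documents. The "alias" field is left out here.

```python
import re

# Matches values such as: psi-mi:"MI:0499"(unspecified role)
# Assumption: all PSI-MI values in the data follow this shape.
PSI_MI_PATTERN = re.compile(r'^psi-mi:"(?P<id>MI:\d+)"\((?P<label>.*)\)$')

PSI_MI_FIELDS = (
    "biological_role",
    "exp_role",
    "interactor_type",
    "participant_identification_method",
)


def split_psi_mi(value: str) -> dict:
    """Split a PSI-MI string into its ID and human-readable label."""
    match = PSI_MI_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unexpected PSI-MI value: {value!r}")
    return {"psi-mi": match.group("id"), "label": match.group("label")}


def strip_prefix(value: str) -> str:
    """Drop the 'prefix:' part and any trailing '(label)'.

    Example: 'taxid:9606(Human)' becomes '9606'.
    """
    without_prefix = value.split(":", 1)[1]
    return without_prefix.split("(", 1)[0]


def restructure_participant(raw: dict) -> dict:
    """Hypothetical helper: reshape one subject or object section."""
    out = {
        "innatedb": strip_prefix(raw["unique_identifier"]),
        "ensembl": strip_prefix(raw["alt_identifier"]),
        "ncbi_taxid": strip_prefix(raw["ncbi_taxid"]),
    }
    for key in PSI_MI_FIELDS:
        out[key] = split_psi_mi(raw[key])
    return out
```

Raising on an unexpected PSI-MI value, rather than passing it through, is one way to surface records that do not fit the assumed format during parsing.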

@colleenXu

Relation

  • change relation.interaction_detection_method / relation.interaction_type to separate the "id" from the prefix and label (aka human readable thing) (similar to the "other properties" above)
"relation": {
    "interaction_detection_method": {
        "psi-mi": "MI:0007",
        "label": "anti tag coimmunoprecipitation"
    },
    "interaction_type": {
        "psi-mi": "MI:0915",
        "label": "physical association"
    }
}
  • relation.pmid: remove the "pubmed:" prefix from the value.
  • handle source database stuff differently (1 object):
"relation": {
    "source_database": {
        "id": "MI:0974",
        "label": "innatedb",
        "interaction_id": "IDB-113240"
    }
}
  • break down the "confidence_score" value (currently a string) and turn it into an object (then the numbers can be int)
"relation": {
    "confidence_score": {
        "lpr": 3,
        "hpr": 3,
        "np": 1
    }
}
  • change value for relation.ncbi_taxid_host_organism to just have the ID. So it'd be "relation": {"ncbi_taxid_host_organism": "10090" }
  • change date format to YYYY-MM-DD (aka use dashes, not slashes)? so 2008-03-30
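
The relation changes listed above could be sketched roughly as follows. The helper names are hypothetical, and the sketch assumes that every confidence score is an integer and that every date uses the `YYYY/MM/DD` form shown in the example documents.

```python
from datetime import datetime


def parse_confidence_score(value: str) -> dict:
    """Turn 'lpr:3|hpr:3|np:1|' into {'lpr': 3, 'hpr': 3, 'np': 1}."""
    scores = {}
    for part in value.split("|"):
        if not part:  # the source value ends with a trailing '|'
            continue
        name, number = part.split(":", 1)
        scores[name] = int(number)  # assumption: scores are always integers
    return scores


def to_iso_date(value: str) -> str:
    """Convert 'YYYY/MM/DD' to 'YYYY-MM-DD', validating the date on the way."""
    return datetime.strptime(value, "%Y/%m/%d").date().isoformat()


def restructure_source(source_database: str, interaction_id: str) -> dict:
    """Combine 'MI:0974(innatedb)' and 'innatedb:IDB-113240' into one object."""
    mi_id, _, rest = source_database.partition("(")
    return {
        "id": mi_id,
        "label": rest.rstrip(")"),
        "interaction_id": interaction_id.split(":", 1)[1],
    }
```

Going through `strptime` instead of a plain string replace means a malformed date raises an error rather than being stored silently.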

_id: be careful with _ids like this. Could multiple records end up with the same _id (because they share the same subject/object IDs)? Also, use a different "delimiter", since the IDs already have dashes in them. Maybe _?
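
One way to check the _id concern before loading the data, sketched with hypothetical helper names (`pair_id`, `find_colliding_ids`). It assumes the documents still carry `unique_identifier` under subject and object, as in the example documents at the top of the thread, and uses `_` as the delimiter suggested above.

```python
from collections import Counter


def pair_id(doc: dict, delimiter: str = "_") -> str:
    """Build an _id from the subject and object identifiers."""
    subject_id = doc["subject"]["unique_identifier"]
    object_id = doc["object"]["unique_identifier"]
    return f"{subject_id}{delimiter}{object_id}"


def find_colliding_ids(docs: list) -> dict:
    """Return every pair-based _id that more than one record would share."""
    counts = Counter(pair_id(doc) for doc in docs)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}
```

If `find_colliding_ids` returns anything, the records sharing an _id would need to be merged or given a more specific key; which of the two is appropriate is a design decision this sketch does not settle.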

@ericz1803
Contributor

I made the changes. The only difference is I slightly modified the alias field to look like:

"alias": {
    "refseq": [
        "NP_004336"
    ],
    "uniprotkb": [
        "P49913"
    ],
    "uniprotkb_name": "CAMP_HUMAN",
    "hgnc_name": "CAMP"
},
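
For illustration, a sketch that turns the original pipe-delimited alias string into the shape shown above. `parse_alias` is a hypothetical name, not the deployed parser's function. Treating a `uniprotkb` value that contains an underscore as the entry name (for example `CAMP_HUMAN`) and everything else as an accession is an assumption made for this sketch.

```python
def parse_alias(alias: str) -> dict:
    """Parse 'prefix:value|prefix:value|...' into a nested alias object."""
    parsed = {}
    for entry in alias.split("|"):
        if not entry:
            continue
        prefix, _, value = entry.partition(":")
        if prefix == "hgnc":
            # Drop the trailing '(display_short)' marker.
            parsed["hgnc_name"] = value.split("(", 1)[0]
        elif prefix == "uniprotkb" and "_" in value:
            # Assumption: UniProtKB entry names contain '_', accessions do not.
            parsed["uniprotkb_name"] = value
        else:
            # Repeated prefixes (refseq, uniprotkb accessions) become lists.
            parsed.setdefault(prefix, []).append(value)
    return parsed
```

Applied to the LY96 alias string from the first example document, this yields two `refseq` entries, one `uniprotkb` accession, `uniprotkb_name` set to `LY96_HUMAN`, and `hgnc_name` set to `LY96`.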

@ericz1803
Contributor

Deployed.

@ericz1803
Contributor

API link: https://biothings.ncats.io/innatedb

@andrewsu
Member

Per https://github.com/biothings/BioThings_Explorer_TRAPI/blob/main/docs/README-contributing-new-data-source.md, the next step would be to write the SmartAPI metadata file. @ericz1803 do you want to take a crack at that? Or have you tried and hit problems?

@colleenXu

I wrote the SmartAPI yaml w/ x-bte annotation for BioThings InnateDB.

Noting some decisions I made

  • ~93% of the records have the interaction_type physical_association (31086/33295)
  • I also wrote operations for 5 other interaction_types with > 10 records (excluding "other modification" because I wasn't sure what it meant). Ref: the list of interaction_type values at the bottom of this comment
    • but for now, they all use physically_interacts_with as the biolink predicate (+ no qualifiers)...because I had trouble figuring out how to map them
    • biolink-model qualifiers are only for chem-affects-gene and gene-regulates-gene, but this data seems like protein-protein interactions
    • it's not clear if the data has directionality (ex: is it "subject ubiquitinates object" or vice versa?)
      • I have notes from my analysis of the subject/object biological role values for each interaction_type (ex: here, /blob/49030afd133688184ec193bffb827c47a540ab4f/innatedb/smartapi.yaml#L694, for phosphorylation)
      • I think directionality was clear when the biological role combo was enzyme/enzyme-target, enzyme-target/enzyme, or self/self
      • but this wasn't useful for ubiquitination where unspecified-role/unspecified-role seemed most common
  • for now, for all operations I used ENSEMBL (ENSG) IDs and assigned all categories as Gene (reasoning on lines 586-591)
    • but the physical_association data included different subject/object interactor_type values (protein vs dna vs rna). Noted here and here in case we want to use them in the future.
  • I included my notes on other fields I thought could be useful, as yaml comments starting on line 599
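
The counts behind the first two decisions above can be reproduced from the interaction_type statistics posted earlier in this thread. This is only a sketch: the variable and function names are made up for illustration, and the exclusion list simply mirrors the choices described above.

```python
# Counts copied from the interaction_type statistics earlier in this issue.
interaction_type_counts = {
    'psi-mi:"MI:0915"(physical association)': 31086,
    'psi-mi:"MI:0217"(phosphorylation reaction)': 1201,
    'psi-mi:"MI:0220"(ubiquitination reaction)': 423,
    'psi-mi:"MI:0570"(protein cleavage)': 324,
    'psi-mi:"MI:0194"(cleavage reaction)': 147,
    'psi-mi:"MI:0203"(dephosphorylation reaction)': 40,
    'psi-mi:"MI:0179"(other modification)': 14,
    'psi-mi:"MI:0566"(sumoylation)': 9,
    'psi-mi:"MI:0408"(disulfide bond)': 8,
    'psi-mi:"MI:0204"(deubiquitination reaction)': 7,
    'psi-mi:"MI:0401"(biochemical)': 6,
    'psi-mi:"MI:0213"(methylation reaction)': 6,
    'psi-mi:"MI:0414"(enzymatic reaction)': 6,
    'psi-mi:"MI:0192"(acetylation reaction)': 5,
    'psi-mi:"MI:0195"(covalent binding)': 5,
    'psi-mi:"MI:0566"(sumoylation reaction)': 3,
    'psi-mi:"MI:0914"(association)': 2,
    'psi-mi:"MI:1126"(self interaction)': 2,
    'psi-mi:"MI:0557"(adp ribosylation reaction)': 1,
}

# Labels handled separately or skipped, per the decisions described above.
EXCLUDED_LABELS = {"physical association", "other modification"}


def label_of(term: str) -> str:
    """Extract the human-readable label from a PSI-MI string."""
    return term.split("(", 1)[1].rstrip(")")


total = sum(interaction_type_counts.values())
share = interaction_type_counts['psi-mi:"MI:0915"(physical association)'] / total

# Other interaction types with more than 10 records.
selected = [
    term
    for term, count in interaction_type_counts.items()
    if count > 10 and label_of(term) not in EXCLUDED_LABELS
]

print(f"{total} records, physical association share: {share:.1%}")
print(f"{len(selected)} other interaction types with > 10 records")
```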

This yaml is registered in SmartAPI Registry and the PR to add this KP to BTE's regular use is here.

I tested the PR locally and the SmartAPI yaml/x-bte annotation works.

Example test

Send a POST request to the API-specific endpoint (BioThings InnateDB only), like http://localhost:3000/v1/smartapi/e9eb40ff7ad712e4e6f4f04b964b5966/query

Put this in the request body. It queries with the gene PIK3C3 (aka NCBIGene:5289):

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": ["ENSEMBL:ENSG00000078142"],
                    "categories": ["biolink:Gene"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                }
            }
        }
    }
}
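
The same request can be sent from a script. The sketch below uses only the Python standard library; `build_query` and `post_query` are hypothetical helper names, and the URL is the local endpoint quoted above, so it assumes a BTE instance is running on localhost:3000.

```python
import json
import urllib.request

# Local, API-specific BTE endpoint quoted in the comment above.
LOCAL_ENDPOINT = (
    "http://localhost:3000/v1/smartapi/"
    "e9eb40ff7ad712e4e6f4f04b964b5966/query"
)


def build_query(ensembl_gene_id: str) -> dict:
    """Build the one-hop Gene -> Gene query graph shown above."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {
                        "ids": [f"ENSEMBL:{ensembl_gene_id}"],
                        "categories": ["biolink:Gene"],
                    },
                    "n1": {"categories": ["biolink:Gene"]},
                },
                "edges": {"e01": {"subject": "n0", "object": "n1"}},
            }
        }
    }


def post_query(query: dict, url: str = LOCAL_ENDPOINT, timeout: float = 60.0) -> dict:
    """POST the query as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a local BTE instance running, `post_query(build_query("ENSG00000078142"))` would send the PIK3C3 query from this comment.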

You should get a response with this edge (from this set of records from the BioThings API), based on this operation's example:

  • subject: PIK3C3 aka NCBIGene:5289, ENSEMBL:ENSG00000078142
  • object: ATG14 aka NCBIGene:22863, ENSEMBL:ENSG00000126775
                "352f144dd72e3f14f4fd4d628e73b552": {
                    "predicate": "biolink:physically_interacts_with",
                    "subject": "NCBIGene:5289",
                    "object": "NCBIGene:22863",
                    "attributes": [
                        {
                            "attribute_type_id": "biolink:publications",
                            "value": [
                                "PMID:18843052",
                                "PMID:19050071"
                            ],
                            "value_type_id": "linkml:Uriorcurie"
                        }
                    ],
                    "sources": [
                        {
                            "resource_id": "infores:innatedb",
                            "resource_role": "primary_knowledge_source"
                        },
                        {
                            "resource_id": "infores:biothings-innatedb",
                            "resource_role": "aggregator_knowledge_source",
                            "upstream_resource_ids": [
                                "infores:innatedb"
                            ]
                        },
                        {
                            "resource_id": "infores:service-provider-trapi",
                            "resource_role": "aggregator_knowledge_source",
                            "upstream_resource_ids": [
                                "infores:biothings-innatedb"
                            ]
                        }
                    ]
                },
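
To check a response programmatically, the publications and primary knowledge source can be pulled out of an edge shaped like the one above. `summarize_edge` is a hypothetical helper, and the sketch relies only on the fields visible in this example edge.

```python
def summarize_edge(edge: dict) -> dict:
    """Collect the predicate, publications, and primary source of one edge."""
    publications = []
    for attribute in edge.get("attributes", []):
        if attribute.get("attribute_type_id") == "biolink:publications":
            publications.extend(attribute.get("value", []))

    primary_sources = [
        source["resource_id"]
        for source in edge.get("sources", [])
        if source.get("resource_role") == "primary_knowledge_source"
    ]

    return {
        "predicate": edge.get("predicate"),
        "subject": edge.get("subject"),
        "object": edge.get("object"),
        "publications": publications,
        "primary_sources": primary_sources,
    }
```

For the edge above, the summary would list both PMIDs and `infores:innatedb` as the only primary knowledge source.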

@colleenXu

colleenXu commented Dec 5, 2023

EDIT: related infores stuff is being deployed for this release cycle! so we can go forward with incorporating this resource.

Related infores stuff is ready:

@colleenXu colleenXu added the On CI Match https://github.com/biothings/biothings_explorer/labels label Dec 6, 2023
@tokebe tokebe added On CI Match https://github.com/biothings/biothings_explorer/labels and removed On CI Match https://github.com/biothings/biothings_explorer/labels labels Dec 21, 2023
@colleenXu colleenXu added On Test Match https://github.com/biothings/biothings_explorer/labels and removed On CI Match https://github.com/biothings/biothings_explorer/labels labels Dec 22, 2023
@colleenXu

colleenXu commented Feb 21, 2024

Closing this issue since the changes have been deployed to Prod with the Feb 2024 release.

I've confirmed that I can query BioThings InnateDB through BTE prod https://bte.transltr.io/v1/team/Service Provider/query with the example in #17 (comment) and get the expected response.
