Questions about hetionet: metabolomics / side effects versus diseases #15
Thanks @gcsh86 for the questions.
The only reason is that I wasn't aware of an omics-wide resource for metabolite nodes / edges. I also wasn't sure whether metabolites would be redundant with compounds. Metabolomics is an area that I don't know much about, so there may well be an opportunity I overlooked. When considering adding a data resource, I recommend first drawing out which node/edge types that resource would contribute to the metagraph (Figure 1A of the manuscript). @gcsh86 if you have a specific proposal of what node/edge types could be generated from HMDB, I could provide more tailored feedback.
Some entities can conceptually belong to multiple node types. This is especially true for diseases, side effects, and symptoms. For these three node types, you could imagine a single concept being all three types. For example, sepsis or chronic fatigue could potentially be all three. For Hetionet, we created separate nodes for diseases, side effects, and symptoms. Therefore, it is possible for fatigue to be three separate nodes depending on its context. Do the following query at https://neo4j.het.io/browser/ and you will see "fatigue" shows up in the names of both side effects and symptoms:

MATCH (node)
WHERE node.name =~ '(?i).*fatigue.*'
RETURN node

You can learn more about how we created our disease catalog in this discussion. Briefly quoting from the manuscript:
So fatigue and sepsis did not make this cut. More generally, one could allow a single node to have multiple types (or labels in neo4j parlance). For simplicity, we did not allow this when building Hetionet. The number of nodes that this affects is relatively low, and the one-type-per-node assumption helped simplify the metagraph and computations.
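To make the one-type-per-node idea concrete, here is a minimal Python sketch. It is not Hetionet code: the `NODES` list is toy data invented for illustration (not an export of the database) and `types_by_name` is a hypothetical helper. It mirrors the case-insensitive, whole-string regex match from the query above and groups the matching names by node type, so a concept stored as several single-type nodes shows up once with several types.

```python
import re
from collections import defaultdict

# Toy (node type, name) pairs. Illustrative only, NOT real Hetionet content.
NODES = [
    ("Side Effect", "Fatigue"),
    ("Symptom", "Fatigue"),
    ("Side Effect", "Chronic fatigue syndrome"),
    ("Disease", "asthma"),
]


def types_by_name(nodes, pattern):
    """Map each lowercased matching name to the set of node types it appears under.

    The pattern must match the whole name, case-insensitively, which is how
    the `=~ '(?i)...'` comparison in the Cypher query behaves.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    found = defaultdict(set)
    for node_type, name in nodes:
        if regex.fullmatch(name):
            found[name.lower()].add(node_type)
    return dict(found)


matches = types_by_name(NODES, r".*fatigue.*")
for name, node_types in sorted(matches.items()):
    print(name, sorted(node_types))
```

In a multi-label design, the two toy "Fatigue" entries would instead be a single node carrying both labels; keeping them separate is what keeps each node's type unambiguous.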
Hi Daniel,
For Hetionet, I have two brief questions and would like to hear about your insights:
On the metabolomics side, why didn't you use the HMDB database for linking metabolites, diseases, variants, genes, etc.?
For sepsis and chronic fatigue, why are they categorized as side effects rather than diseases?