investigate addition of supporting sentence to semmeddb API #563
Comments
Some previous discussion of "sentence" info:
We can load the SENTENCE table and join the SENTENCE records to our documents by Currently our PREDICATION parser discards the
@erikyao, are you at all worried about the explosion in index size that would result?
@andrewsu I think the additional SENTENCE field(s) won't take up much index space. I am more worried about memory usage when loading the SENTENCE table. But we can always preprocess it and extract smaller intermediate files if memory usage becomes a real problem.
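A minimal sketch of the preprocessing idea mentioned above, assuming both tables are available as CSV exports. The function names, file paths, and column positions (SENTENCE_ID as the second PREDICATION column and the first SENTENCE column, sentence text as the sixth SENTENCE column) are assumptions for illustration and should be checked against the actual dump; this is not the parser's real code. Only the set of referenced sentence IDs is held in memory, while the SENTENCE rows are streamed one at a time.

```python
import csv


def extract_sentences(predication_rows, sentence_rows,
                      pred_sentence_col=1, sent_id_col=0, sent_text_col=5):
    """Yield (sentence_id, text) for sentences referenced by a predication.

    Column indices are assumed defaults, not confirmed against the dump.
    """
    # Keep only the IDs in memory, never the sentence text itself.
    wanted = {row[pred_sentence_col] for row in predication_rows}
    for row in sentence_rows:
        if row[sent_id_col] in wanted:
            yield row[sent_id_col], row[sent_text_col]


def write_intermediate(predication_path, sentence_path, out_path):
    """Stream both CSV files and write a smaller two-column intermediate file."""
    with open(predication_path, newline="", encoding="utf-8") as pred_file, \
         open(sentence_path, newline="", encoding="utf-8") as sent_file, \
         open(out_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        writer.writerows(
            extract_sentences(csv.reader(pred_file), csv.reader(sent_file))
        )
```

The intermediate file could then be joined to the predication documents by sentence ID without loading the full SENTENCE table.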
Super, thanks. Let's wait until we make a decision on #569 (next week) so we can possibly make both changes together.
…names and directories
Sentence context has now been added to the semmeddb2 API (which will soon replace the semmeddb API), so I am closing this issue: https://biothings.ncats.io/semmeddb2/association/C0007642-ISA-C0410013
Here's where I noted that I added sentence support to the x-bte annotations: #569 (comment)
SemMedDB is a text-mined resource of relationships (triples) extracted from the literature. The schema for SemMedDB is described at https://lhncbc.nlm.nih.gov/ii/tools/SemRep_SemMedDB_SKR/dbinfo.html. Our current semmeddb API (http://biothings.ncats.io/semmeddb; parser: https://github.com/biothings/semmeddb) primarily focuses on the "PREDICATION" table. The Translator consortium would like to explore adding the actual sentence used to infer each triple, which is found in the "SENTENCE" table. I remember that we briefly explored this previously, and the substantial increase in size was a key consideration. (The PREDICATION file is 3 GB; the SENTENCE file is 15 GB.)