Get phenio relations from BBOP-Sqlite, then filter #369

Closed
kevinschaper opened this issue Oct 26, 2022 · 6 comments · Fixed by #413

@kevinschaper
Member

Instead of generating our own relation graph outside the pipeline, we can use https://s3.amazonaws.com/bbop-sqlite/phenio-relation-graph.db.gz.

This is a full expansion, and we only want a subset. My initial guess is that we can filter on just rdfs:subClassOf and BFO:0000050 (part of).

We'll likely do this in Python rather than in the shell, but if awk is handy, here it is:

awk '{ if ($2 == "rdfs:subClassOf" || $2 == "BFO:0000050") { print } }' phenio-relation-graph.tsv > phenio-relations.tsv
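
For the Python route, a minimal sketch of the same filter might look like the following. It assumes the relations file is tab-separated with the predicate in the second column (awk splits on any whitespace by default, so this is slightly stricter). The names `filter_relations`, `filter_file`, and `DEFAULT_PREDICATES` are hypothetical, not existing pipeline code.

```python
# Predicates named in this issue so far. Any additional predicate
# (e.g. a UPHENO one) would be passed in by the caller.
DEFAULT_PREDICATES = frozenset({"rdfs:subClassOf", "BFO:0000050"})


def filter_relations(lines, predicates=DEFAULT_PREDICATES):
    """Yield only the lines whose second tab-separated field is a wanted predicate.

    Assumption: each line is `subject<TAB>predicate<TAB>object`.
    Lines are yielded unchanged, including their trailing newline.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[1] in predicates:
            yield line


def filter_file(src_path, dst_path, predicates=DEFAULT_PREDICATES):
    """Stream src_path to dst_path, keeping only rows with a wanted predicate."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(filter_relations(src, predicates))
```

Something like `filter_file("phenio-relation-graph.tsv", "phenio-relations.tsv")` would then mirror the awk command above, and the predicate set can be extended without touching the function.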
@kevinschaper
Member Author

Also include the UPHENO predicate that connects anatomy to phenotype.

@kevinschaper
Member Author

We can get the relation graph output directly from the latest phenio release (phenio, not kg-phenio).

Making this high priority for this week: Chris noticed that the closure field wasn't working for MONDO:0000508.

sqlite> select count(*) from edges where object= 'MONDO:0000508' and predicate='biolink:subclass_of';
409
sqlite> select count(*) from denormalized_edges where subject_closure like '%MONDO:0000508%';
0

Our relations file didn't have any entries for MONDO:0000508, but the newer file has plenty of rdfs:subClassOf triples.
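
The same sanity check could be scripted so it can be rerun after swapping in the newer relations file. This is a rough sketch using Python's standard `sqlite3` module. It assumes only the table and column names visible in the queries above (`edges.object`, `edges.predicate`, `denormalized_edges.subject_closure`). The function name `closure_counts` is hypothetical.

```python
import sqlite3


def closure_counts(conn, term):
    """Return (subclass_edge_count, closure_row_count) for a term.

    A non-zero first value with a zero second value reproduces the
    symptom described above: subclass edges exist, but the term never
    shows up in any subject closure.
    """
    subclass_edges = conn.execute(
        "SELECT count(*) FROM edges "
        "WHERE object = ? AND predicate = 'biolink:subclass_of'",
        (term,),
    ).fetchone()[0]
    closure_rows = conn.execute(
        "SELECT count(*) FROM denormalized_edges "
        "WHERE subject_closure LIKE ?",
        (f"%{term}%",),
    ).fetchone()[0]
    return subclass_edges, closure_rows
```

Given a connection opened with `sqlite3.connect(...)` on the KG database, `closure_counts(conn, "MONDO:0000508")` should return a non-zero second value once the relations file is fixed.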

The important thing to remember with the phenio relations file is that it needs to be filtered to include only rdfs:subClassOf, BFO:0000050, and the UPHENO predicate that connects anatomy to phenotype.

@caufieldjh
Member

Just so I can be sure I'm parsing this correctly and that all outputs are working as expected: does the phenio relation-graph file contain the necessary closures?

@kevinschaper
Member Author

@caufieldjh I think it does have the necessary closures. I don't know if I generated something wrong in the one that we're using right now, or if it's just out of date.

@kevinschaper
Member Author

I'm not sure whether it's UPHENO:0000003 (phenotype is associated with entity) or UPHENO:0000001 (phenotype affects entity) that I should include.

I think I may go forward on this PR with "affects" (UPHENO:0000001)?

Does that sound right, @matentzn @sabrinatoro?

@matentzn
Member

matentzn commented Feb 8, 2023
