AssociationSetFactory does not handle associations from multiple relations for a subject #370

Open
deepakunni3 opened this issue Sep 14, 2019 · 1 comment · May be fixed by #384

deepakunni3 commented Sep 14, 2019

When fetching associations, say gene to phenotype associations, AssociationSetFactory populates an association map for each subject.

But this code does not handle a subject whose associations span multiple relations.
For example, HGNC:6764 has two sets of associations:

# objects from association with relation RO:0002200
'HP:0000238', 'HP:0100587', 'HP:0000364', 'HP:0000864', 'HP:0000130', 'HP:0006501', 'HP:0008678', 'HP:0001392', 'HP:0000047', 'HP:0001903', 'HP:0007400', 'HP:0007565', 'HP:0000582', 'HP:0005522', 'HP:0003220', 'HP:0006824', 'HP:0000252', 'HP:0001639', 'HP:0002997', 'HP:0001873', 'HP:0001199', 'HP:0001631', 'HP:0001347', 'HP:0100542', 'HP:0001636', 'HP:0001562', 'HP:0000483', 'HP:0000268', 'HP:0000813', 'HP:0002863', 'HP:0000286', 'HP:0000508', 'HP:0010293', 'HP:0002245', 'HP:0001679', 'HP:0000175', 'HP:0002827', 'HP:0004322', 'HP:0000486', 'HP:0000453', 'HP:0001871', 'HP:0002007', 'HP:0001510', 'HP:0002251', 'HP:0100760', 'HP:0001643', 'HP:0002023', 'HP:0000010', 'HP:0100026', 'HP:0000135', 'HP:0006254', 'HP:0002664', 'HP:0000324', 'HP:0007874', 'HP:0000027', 'HP:0000365', 'HP:0001882', 'HP:0001053', 'HP:0001000', 'HP:0001511', 'HP:0004209', 'HP:0002575', 'HP:0010469', 'HP:0004349', 'HP:0000520', 'HP:0000083', 'HP:0001824', 'HP:0001671', 'HP:0000340', 'HP:0000316', 'HP:0002823', 'HP:0001646', 'HP:0012745', 'HP:0000072', 'HP:0008572', 'HP:0006101', 'HP:0000568', 'HP:0000218', 'HP:0001537', 'HP:0002817', 'HP:0000478', 'HP:0012041', 'HP:0000505', 'HP:0012639', 'HP:0000504', 'HP:0006265', 'HP:0005528', 'HP:0008053', 'HP:0012210', 'HP:0100867', 'HP:0005344', 'HP:0001875', 'HP:0001263', 'HP:0001763', 'HP:0000079', 'HP:0002414', 'HP:0001770', 'HP:0000518', 'HP:0000035', 'HP:0001760', 'HP:0003022', 'HP:0000347', 'HP:0002650', 'HP:0000639', 'HP:0000492', 'HP:0001249', 'HP:0001172', 'HP:0002119', 'HP:0000028'

and

# objects from association with relation RO:0003304
'EFO:0004340', 'EFO:0004765'

When populating the association map, the earlier entry is overwritten because the keys used in the association map do not take the relation into account. This yields an incorrect representation of the fetched associations, and of any downstream analyses performed with this association set.

Block of code where this is happening:

for a in assocs:
    rel = a['relation']
    subj = a['subject']
    subject_label_map[subj] = a['subject_label']
    amap[subj] = a['objects']
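To make the overwrite concrete, here is a minimal standalone reproduction. It assumes only that each record is a dict with `subject`, `relation` and `objects` keys, as in the loop above; the object lists are shortened to a couple of IDs from the lists quoted earlier, and nothing here calls the real factory.

```python
# Two toy records for one subject, shaped like the dicts in the loop above.
# Object lists are shortened to a couple of IDs taken from the lists above.
assocs = [
    {"subject": "HGNC:6764", "relation": "RO:0002200",
     "objects": ["HP:0000238", "HP:0100587"]},
    {"subject": "HGNC:6764", "relation": "RO:0003304",
     "objects": ["EFO:0004340", "EFO:0004765"]},
]

amap = {}
for a in assocs:
    # Keyed on subject only, so the second record replaces the first.
    amap[a["subject"]] = a["objects"]

print(amap["HGNC:6764"])  # ['EFO:0004340', 'EFO:0004765']
```

Only the objects of whichever record comes last survive, so the HP terms are silently dropped in this ordering.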

@cmungall FYI

cmungall commented Oct 4, 2019

It should not overwrite, that's a bug. The default behavior should be to take the union.
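A rough sketch of what taking the union could look like, assuming the same record shape as the loop in the report. The function name `build_union_maps` is made up for illustration and is not part of the library; the actual fix may be structured differently.

```python
from collections import defaultdict


def build_union_maps(assocs):
    """Hypothetical helper: merge objects per subject across all relations."""
    amap = defaultdict(set)
    subject_label_map = {}
    for a in assocs:
        subj = a["subject"]
        # .get() because the toy records below carry no label.
        subject_label_map[subj] = a.get("subject_label")
        # Accumulate instead of assigning, so earlier records are kept.
        amap[subj].update(a["objects"])
    return dict(amap), subject_label_map


# Toy records; object lists are shortened for brevity.
toy_assocs = [
    {"subject": "HGNC:6764", "relation": "RO:0002200",
     "objects": ["HP:0000238", "HP:0100587"]},
    {"subject": "HGNC:6764", "relation": "RO:0003304",
     "objects": ["EFO:0004340", "EFO:0004765"]},
]

union_map, labels = build_union_maps(toy_assocs)
print(sorted(union_map["HGNC:6764"]))
```

Using a set per subject also removes duplicates when the same object appears under more than one relation.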

The ASet model is a simple entity-termset one by design. Making it truly relation-aware is nontrivial, as you would want to include inference.

If clients want to do separate analyses for separate ASets, they can create multiple ASets, with different relation filtering parameters each time.
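As an illustration of the one-set-per-relation approach, the sketch below filters records by relation before building each map. `build_amap` and its `relations` argument are hypothetical stand-ins; the factory's actual filtering parameters are not shown in this thread, and no inference over relations is attempted.

```python
from collections import defaultdict


def build_amap(assocs, relations=None):
    """Hypothetical helper: subject -> set of objects, optionally
    restricted to records whose relation is in `relations`."""
    amap = defaultdict(set)
    for a in assocs:
        if relations is not None and a["relation"] not in relations:
            continue  # skip records outside the requested relations
        amap[a["subject"]].update(a["objects"])
    return dict(amap)


# Toy records; object lists are shortened for brevity.
records = [
    {"subject": "HGNC:6764", "relation": "RO:0002200",
     "objects": ["HP:0000238", "HP:0100587"]},
    {"subject": "HGNC:6764", "relation": "RO:0003304",
     "objects": ["EFO:0004340", "EFO:0004765"]},
]

# One map per relation, each usable for its own analysis.
map_2200 = build_amap(records, relations={"RO:0002200"})
map_3304 = build_amap(records, relations={"RO:0003304"})
# No filter: falls back to the union across relations.
map_all = build_amap(records)
```

Each filtered map can then back its own analysis, while the unfiltered call gives the union behavior.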
