Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filtering by ClinVar clinsig + CLINSIG Confident + ClinVar hits has a bug #4996

Open
northwestwitch opened this issue Nov 5, 2024 · 4 comments · May be fixed by #5006
Open

Filtering by ClinVar clinsig + CLINSIG Confident + ClinVar hits has a bug #4996

northwestwitch opened this issue Nov 5, 2024 · 4 comments · May be fixed by #5006
Assignees
Labels

Comments

@northwestwitch
Copy link
Member

northwestwitch commented Nov 5, 2024

Example to reproduce on a local demo:

On SNV variants page, this filter works:

image

And you get 3 variants which have likely benign clinsig

This filter also works

image

You get no variants, since all likely benign vars in the demo instance have no criteria

This filter returns all variants with any ClinVar significance, which is wrong:

image
@northwestwitch northwestwitch self-assigned this Nov 5, 2024
@northwestwitch
Copy link
Member Author

northwestwitch commented Nov 5, 2024

Inspecting the code, the error happens because of this line

The query build process is quite complicated, also because apparently we have this criterion that whenever the user specifies a clinsig (clinsign_filter) plus secondary criteria (secondary_filter), then the query becomes:

mongo_query["$or"] = [ {"$and": secondary_filter}, clinsign_filter]

Which will return any variant matched by the first $or or the second.

In the case of the bug the query becomes the following:

....{'$or': 
[
  {'$and': [{'clnsig': {'$exists': True}}, {'clnsig': {'$ne': None}}]}, {'clnsig': {'$elemMatch': {'$and': [{'$or': [{'value': {'$in': [3, 
  'Likely benign']}}, {'value': re.compile('Likely benign')}]}, 
   
{'revstat':re.compile('mult|multiple_submitters|single|single_submitter|exp|reviewed_by_expert_panel|guideline|practice_guideline')}]}}}

]}

All variants, regardless of the confidence level are returned because of the first element of the query .. 🤔

@northwestwitch northwestwitch linked a pull request Nov 6, 2024 that will close this issue
2 tasks
@dnil
Copy link
Collaborator

dnil commented Nov 12, 2024

Bug is perhaps a strong word since the interface doesn't clearly specify the priority for the "clinvar hits" checkbox. But it is sure hard to figure out a simple expression describing the design.

All three have use cases: being able to fully select/deselect "confident" annotations seems like a good feature, as does selecting any tag. Perhaps that could be an option for the multi select instead/later? Either a button that just selects all, or an "All" option like for the chromosomes?

@northwestwitch
Copy link
Member Author

All three have use cases: being able to fully select/deselect "confident" annotations seems like a good feature, as does selecting any tag. Perhaps that could be an option for the multi select instead/later? Either a button that just selects all, or an "All" option like for the chromosomes?

I would call it a bug, because if you filter:

  • variants with clinsig
  • likely benign
  • high confidence

You would expect no variants returned, not all the variants with any clinsig?

But I also understand the design, which is: return variants with clinsig no matter what. But I start to dislike this criterion more and more because you don't really get the chance to decide for yourself, like: I want to really make sure no variants are present if I filter by those 3 criteria.

Regarding including "All" in the clinsig multiselect, isn't it the same as not selecting anything at all? I think I would rather keep it as it is (with the ClinVar hits checkbox), to be able to easily exclude those variants that have no ClinVar annotations

@dnil
Copy link
Collaborator

dnil commented Nov 12, 2024

Right, two pictures here, one as it is, one as one might want to refactor it.

As it is, I would expect the above to return at least any high significance hits, but I feel a conflict in design between "clinvar hits"and "likely benign". It's kind of obvious the one named "clinsig confidence always returned" is inclusive. The others less clear.

I do agree that designs like "always" are really hard to get into a flexible system. It would make sense to refactor to retain good functionality for the clinical filter, but make the interface option a bit less strained. Meeting now! 😓

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants