FILTER and QUAL should be taken into account #5

Open
d-cameron opened this issue Nov 24, 2020 · 1 comment
d-cameron commented Nov 24, 2020

We weren't able to find any clear documentation on how to use QUAL, and FILTER so those were ignored

If there's no documentation in the caller, then you should use the specification's definitions of those fields:

6. QUAL — quality: Phred-scaled quality score for the assertion made in ALT. i.e. −10log10 prob(call in ALT is
wrong). If ALT is ‘.’ (no variant) then this is −10log10 prob(variant), and if ALT is not ‘.’ this is −10log10
prob(no variant). If unknown, the MISSING value must be specified. (Float)

7. FILTER — filter status: PASS if this position has passed all filters, i.e., a call is made at this position.
Otherwise, if the site has not passed all filters, a semicolon-separated list of codes for filters that fail. e.g.
“q10;s50” might indicate that at this site the quality is below 10 and the number of samples with data is below
50% of the total number of samples. ‘0’ is reserved and must not be used as a filter String. If filters have
not been applied, then this field must be set to the MISSING value. (String, no white-space or semi-colons
permitted, duplicate values not allowed.)

It seems unfair to penalise callers for following the VCF file format specifications.

QUAL could be ignored if you just want the 'default' call set, but it can be used to generate a ROC curve for each caller instead of a single point.
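As a rough illustration of the ROC idea, here is a minimal sketch that sweeps a QUAL threshold over a caller's calls and reports one point per distinct QUAL value. It assumes each call has already been matched against a truth set (the `is_true` flag and `n_truth` count are hypothetical inputs, not something the benchmark currently produces), and it reports precision and recall rather than a classical false-positive rate, since true negatives are not well defined for variant calls. Calls with a missing QUAL are skipped in this sketch.

```python
from typing import List, Optional, Tuple


def phred_to_error_prob(qual: float) -> float:
    """Convert a Phred-scaled QUAL to prob(call in ALT is wrong), per the VCF definition."""
    return 10 ** (-qual / 10)


def qual_curve(
    calls: List[Tuple[Optional[float], bool]], n_truth: int
) -> List[Tuple[float, float, float]]:
    """Return (qual_threshold, precision, recall) for each distinct QUAL value.

    calls   -- (QUAL, is_true) pairs; QUAL is None when the VCF field is '.'
    n_truth -- number of variants in the truth set (hypothetical input)
    """
    # Highest-confidence calls first, so lowering the threshold adds calls.
    scored = sorted(
        (c for c in calls if c[0] is not None), key=lambda c: c[0], reverse=True
    )
    points = []
    tp = fp = 0
    i = 0
    while i < len(scored):
        threshold = scored[i][0]
        # Consume every call that shares this QUAL value before emitting a point.
        while i < len(scored) and scored[i][0] == threshold:
            if scored[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((threshold, tp / (tp + fp), tp / n_truth))
    return points
```

Each returned tuple corresponds to keeping only calls with QUAL at or above that threshold, so plotting recall against precision over the list gives a curve per caller rather than a single point.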

FILTER should definitely be respected: calls that fail a filter are not part of the caller's 'default' call set, because the caller itself already considers them bad for the reason given in the FILTER field.
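Respecting FILTER could look something like the sketch below, which reads plain-text VCF lines and keeps only records whose FILTER column (the seventh tab-separated field) is `PASS`. The function name and the `keep_unfiltered` option are hypothetical. Treating the MISSING value `.` as "keep" is an assumption based on the specification saying it means no filters were applied; a benchmark might reasonably choose otherwise.

```python
from typing import Iterable, Iterator


def default_call_set(
    vcf_lines: Iterable[str], keep_unfiltered: bool = True
) -> Iterator[str]:
    """Yield VCF data lines that belong to the caller's 'default' call set."""
    for line in vcf_lines:
        # Header and meta-information lines start with '#'.
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # Not a complete VCF record; skip rather than guess.
        status = fields[6]  # FILTER is the seventh fixed column.
        # Assumption: '.' (filters not applied) is kept unless the caller opts out.
        if status == "PASS" or (keep_unfiltered and status == "."):
            yield line
```

For example, `list(default_call_set(open("calls.vcf")))` would return only the passing records of a hypothetical `calls.vcf`, dropping anything the caller tagged with a filter code such as `q10`.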

@d-cameron d-cameron changed the title Benchmark is unrepresentive of results for some callers Benchmark is unrepresentative of results for some callers Nov 24, 2020
@d-cameron d-cameron changed the title Benchmark is unrepresentative of results for some callers FILTER and QUAL should be taken into account Nov 26, 2020
Member

smangul1 commented Dec 2, 2020

We will consider this in the next release of the analysis.

Thanks for pointing this out.
