Fixed error in FilterIntervals logic when filtering on annotations with -XL. #7046
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes a bug that I inadvertently introduced long ago in #5408. A method there originally assumed that the intervals to be filtered coincide with the annotated intervals when filtering on annotations, but this assumption can be violated when e.g. using
-XL
to further exclude intervals. This breaks the annotation-based filtering and will result in incorrectly retained/dropped intervals. Unfortunately, the integration test cases were not quite complete enough to catch this.Fortunately, for typical data and parameters, the number of affected intervals is typically a very small percentage, especially in human (e.g., the default GC filters of [0.1, 0.9] affect <0.1% of 250bp bins); I only caught this when running on malaria data, since GC filtering has more impact there. I would also expect count-based filters to mitigate some of the effects (e.g., a bin with extreme GC that was erroneously retained when filtering on annotations might later be filtered because it has poor coverage). Downstream results are also all correct---they're just given for a slightly different set of intervals than would be expected.
This bug would affect users that made use of the
blacklist_intervals_for_filter_intervals
option in the gCNV WDLs, but my feeling is this functionality is not used that frequently.@droazen I think this is a relatively minor bug, but it might be good to mention it in the release notes.