Low-level/general solution for --ignore-variants-starting-outside-interval? #8063

Open
bbimber opened this issue Oct 19, 2022 · 1 comment

bbimber commented Oct 19, 2022

When running GATK with specific interval(s), the default behavior is to include any variant that overlaps those interval(s), even if it starts outside them. When running scatter/gather jobs, this is generally not what one wants, since a variant spanning the boundary between two job intervals gets included by both jobs and therefore appears twice in the gathered output.
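
To make the duplication concrete, here is a minimal sketch in plain Java. It does not use any GATK classes: `Span`, `overlaps`, `startsWithin`, and `countShardsEmitting` are hypothetical names made up for illustration, and coordinates are assumed to be closed and 1-based on a single contig.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not GATK classes.
final class ScatterOverlapDemo {

    // A closed, 1-based range on a single contig (used for variants and shards).
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Overlap-based selection: keep a variant if it touches the shard at all.
    static boolean overlaps(Span variant, Span shard) {
        return variant.start <= shard.end && variant.end >= shard.start;
    }

    // Start-based selection: keep a variant only if its start lies in the shard.
    static boolean startsWithin(Span variant, Span shard) {
        return variant.start >= shard.start && variant.start <= shard.end;
    }

    // Count how many shards would emit the variant under the chosen rule.
    static int countShardsEmitting(Span variant, List<Span> shards, boolean startOnly) {
        int count = 0;
        for (Span shard : shards) {
            boolean keep = startOnly ? startsWithin(variant, shard) : overlaps(variant, shard);
            if (keep) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Span> shards = new ArrayList<>();
        shards.add(new Span(1, 1000));
        shards.add(new Span(1001, 2000));

        // A deletion that spans the boundary between the two shards.
        Span deletion = new Span(995, 1005);

        System.out.println("overlap rule: " + countShardsEmitting(deletion, shards, false));
        System.out.println("start rule:   " + countShardsEmitting(deletion, shards, true));
    }
}
```

Under the overlap rule both shards emit the deletion; under the start rule only the first shard does, so the gathered output contains it once.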

In a handful of GATK tools, there is support for something like --ignore-variants-starting-outside-interval, which is probably designed to solve this problem. GenotypeGVCFs supports this. However, the implementation/support is generally tool-level, and I don't believe all tools support it. For example, SelectVariants does not appear to. If you want to run a scatter/gather task that doesn't start with a GATK tool that supports --ignore-variants-starting-outside-interval, you're out of luck.

My questions are:

  1. Am I completely missing some existing capability?

  2. There is already some low-level support in the engine for control over intervals. Would you be receptive to a PR that pushes support for "--ignore-variants-starting-outside-interval" lower into GATK? Perhaps into VariantWalkerBase? One possibility would be to create a StartsWithinIntervalsVariantFilter, and override makeVariantFilter() to inject it. I don't think this would be particularly invasive, and it could be pretty useful across many tools. As part of this, MultiVariantWalkerGroupedOnStart's existing argument would be merged into the general one.
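
Roughly, the idea in point 2 could look like the sketch below. This is plain Java with stand-in types, not the actual GATK API: `Locus` is a hypothetical substitute for a variant or interval, the filter is modeled as a `java.util.function.Predicate`, and the static `makeVariantFilter` here only mimics the shape of the walker hook. How this would really be wired into VariantWalkerBase is an assumption, not something I have implemented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Illustration only: stand-in types, no GATK dependencies.
final class StartsWithinIntervalsFilterSketch {

    // Hypothetical stand-in for a variant or interval: contig plus closed, 1-based range.
    static final class Locus {
        final String contig;
        final int start;
        final int end;

        Locus(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    // Sketch of the proposed filter: pass only variants whose start position
    // falls inside one of the requested intervals.
    static final class StartsWithinIntervalsVariantFilter implements Predicate<Locus> {
        private final List<Locus> intervals;

        StartsWithinIntervalsVariantFilter(List<Locus> intervals) {
            this.intervals = intervals;
        }

        @Override
        public boolean test(Locus variant) {
            for (Locus interval : intervals) {
                if (interval.contig.equals(variant.contig)
                        && variant.start >= interval.start
                        && variant.start <= interval.end) {
                    return true;
                }
            }
            return false;
        }
    }

    // Stand-in for the walker hook: the tool's own filter is combined with the
    // interval filter only when the (assumed) engine-level option is enabled.
    static Predicate<Locus> makeVariantFilter(Predicate<Locus> toolFilter,
                                              List<Locus> intervals,
                                              boolean ignoreVariantsStartingOutsideInterval) {
        if (!ignoreVariantsStartingOutsideInterval) {
            return toolFilter;
        }
        return toolFilter.and(new StartsWithinIntervalsVariantFilter(intervals));
    }

    public static void main(String[] args) {
        List<Locus> intervals = Arrays.asList(new Locus("chr1", 1001, 2000));
        Predicate<Locus> allowAll = v -> true;
        Locus spanning = new Locus("chr1", 995, 1005); // starts before the interval

        System.out.println(makeVariantFilter(allowAll, intervals, false).test(spanning));
        System.out.println(makeVariantFilter(allowAll, intervals, true).test(spanning));
    }
}
```

A linear scan over the intervals keeps the sketch short; a real implementation would presumably reuse whatever interval lookup structure the engine already has.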

bbimber changed the title from "Low-level general solution for --ignore-variants-starting-outside-interval?" to "Low-level/general solution for --ignore-variants-starting-outside-interval?" on Oct 19, 2022

droazen commented Oct 24, 2022

@bbimber Yes, this is an excellent suggestion and would be very useful! We actually already do have an open PR that adds such a general argument for VariantWalkers, but it's been languishing for a while: #6388

I'll see how difficult it would be to resurrect this old PR.

droazen self-assigned this on Oct 24, 2022