Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate Targets from new/legacy CNV tools. #3246

Closed
4 tasks
samuelklee opened this issue Jul 11, 2017 · 5 comments · Fixed by #3820
Closed
4 tasks

Eliminate Targets from new/legacy CNV tools. #3246

samuelklee opened this issue Jul 11, 2017 · 5 comments · Fixed by #3820

Comments

@samuelklee
Copy link
Contributor

samuelklee commented Jul 11, 2017

The use of Targets to refer to genomic intervals is unnecessary and confusing. It obfuscates the fact that most of the tools and code can be applied to not only counts from WES targets, but also counts from WES baits, WGS bins, etc. Requiring that Targets be named also adds unnecessary storage and memory burden. We should just use SimpleIntervals everywhere. We should also get rid of the target file format.

In terms of external visibility, we can just rename tools and edit javadoc. Internally, there will be many classes that need to be both renamed and refactored. I instead suggest that we rebuild new versions of the classes and tools as necessary in the tools/copynumber package.

  • Rename tools: AnnotateTargets -> AnnotateIntervals, TargetCoverageSexGenotyper -> ReadCountSexGenotyper.
  • Deprecate tools: CalculateTargetCoverage, ConvertBedToTargetFile, and PadTargets will be replaced by @asmirnov239's new CollectReadCounts tool and on-the-fly padding specified by --interval_padding parameters.
  • Deprecate target file format and change all other affected file formats.
  • Refactor/rename/rebuild classes.
@sooheelee
Copy link
Contributor

Hey remember the GATK engine's -L parameter for specifying intervals.

@samuelklee
Copy link
Contributor Author

Definitely! This would allow us to take intervals as input in more standard ways throughout.

@samuelklee samuelklee mentioned this issue Jul 11, 2017
1 task
@samuelklee
Copy link
Contributor Author

Actually, one thing we will have to consider is that -L will take the union or the intersection of input intervals. We will want to keep adjacent intervals separate. Optional checks for overlaps would also be nice. @droazen @lbergelson let me know what you think?

@droazen
Copy link
Contributor

droazen commented Jul 11, 2017

@samuelklee We can easily expose the IntervalMergingRule in IntervalArgumentCollection (which is where -L is defined) as an argument to prevent the merging of adjacent intervals. We could also add a way for individual tools to set a default value for IntervalMergingRule themselves.

@samuelklee
Copy link
Contributor Author

Sounds good to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants