-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
prevent sequence dictionary validation when aligning reads #4308
Conversation
Codecov Report
@@ Coverage Diff @@
## master #4308 +/- ##
===============================================
+ Coverage 80.073% 80.091% +0.017%
- Complexity 17420 17437 +17
===============================================
Files 1080 1081 +1
Lines 63131 63201 +70
Branches 10200 10215 +15
===============================================
+ Hits 50551 50618 +67
Misses 8587 8587
- Partials 3993 3996 +3
|
Before I review, can you move the new arg collection out of the |
@droazen Good call. |
ac93a3b
to
3195b02
Compare
@droazen I think this is ready for review |
import org.broadinstitute.hellbender.cmdline.StandardArgumentDefinitions; | ||
|
||
/** | ||
* interface for argument collections that control |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmn, this trails off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, very non committal...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor things, plus I think we should make the arg collection overridable in GATKTool as well as GATKSparkTool.
@@ -22,8 +23,8 @@ | |||
import java.util.List; | |||
|
|||
@DocumentedFeature | |||
@CommandLineProgramProperties(summary = "Runs BWA", | |||
oneLineSummary = "BWA on Spark", | |||
@CommandLineProgramProperties(summary = "align reads using BWA", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align -> Align
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@CommandLineProgramProperties(summary = "Runs BWA", | ||
oneLineSummary = "BWA on Spark", | ||
@CommandLineProgramProperties(summary = "align reads using BWA", | ||
oneLineSummary = "align reads to a given reference using BWA on Spark", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align > Align
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
import org.broadinstitute.hellbender.cmdline.StandardArgumentDefinitions; | ||
|
||
/** | ||
* interface for argument collections that control |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, very non committal...
* most tools will want to use this, it defaults to performing sequence dictionary validation but provides the option | ||
* to disable it | ||
*/ | ||
class StandardValidationCollection implements SequenceDictionaryValidationArgumentCollection { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like having these as inner classes so we don't have separate files for them. I assume that its the case that an inner class inside of an interface is static wrt the containing/implementing class ? Can/should these be static ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're static by default in an interface and intelli complains that it's redundnat if you mark them as static. I can add it in though if you think it's clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to change anything. I just didn't recall seeing that before, but thats about what I suspected.
@Argument(fullName = StandardArgumentDefinitions.DISABLE_SEQUENCE_DICT_VALIDATION_NAME, shortName = StandardArgumentDefinitions.DISABLE_SEQUENCE_DICT_VALIDATION_NAME, doc = "If specified, do not check the sequence dictionaries from our inputs for compatibility. Use at your own risk!", optional = true, common = true) | ||
private boolean disableSequenceDictionaryValidation = false; | ||
@ArgumentCollection | ||
SequenceDictionaryValidationArgumentCollection seqValidationArguments = new SequenceDictionaryValidationArgumentCollection.StandardValidationCollection(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this delegate to a getSequenceDictionaryValidationArgumentCollection()
method like the Spark one does ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, although we don't have any use case for it yet.
(knownSites.isEmpty() ? "": " --known-sites " + knownSites) + | ||
" -O %s"; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was totally unused.
3195b02
to
df35942
Compare
@Argument(fullName = StandardArgumentDefinitions.DISABLE_SEQUENCE_DICT_VALIDATION_NAME, shortName = StandardArgumentDefinitions.DISABLE_SEQUENCE_DICT_VALIDATION_NAME, doc = "If specified, do not check the sequence dictionaries from our inputs for compatibility. Use at your own risk!", optional = true, common = true) | ||
private boolean disableSequenceDictionaryValidation = false; | ||
@ArgumentCollection | ||
SequenceDictionaryValidationArgumentCollection seqValidationArguments = getSequenceDictionaryValidationArgumentCollection(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed one access modifier thing, otherwise 👍 .
previously, tools that align reads required you to manually disable sequence dictionary validation if you didn't, they would fail because the unaligned bam didn't have the required sequence dictionary extracting out a SequenceDictionaryValidationArgumentCollection and providing a method for GATKSparkTools to configure it ReadsPipeline couldn't easily make use of this, so instead it overrides the method that does validation BwaSpark / BwaAndMarkDuplicatesPipelineSpark now do not require or allow dictionary validation fixes #4131
df35942
to
21322db
Compare
made it private, will merge when tests pass, thanks @cmnbroad |
previously, tools that align reads required you to manually disable sequence dictionary validation
if you didn't, they would fail because the unaligned bam didn't have the required sequence dictionary
extracting out a SequenceDictionaryValidationArgumentCollection and providing a method for GATKSparkTools to configure it
ReadsPipeline couldn't easily make use of this, so instead it overrides the method that does validation
BwaSpark / BwaAndMarkDuplicatesPipelineSpark now do not require or allow dictionary validation
fixes #4131