-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CRAM queryAlignmentStart/queryMate fix. #1164
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1164 +/- ##
===============================================
+ Coverage 67.863% 67.879% +0.016%
- Complexity 8289 8294 +5
===============================================
Files 563 564 +1
Lines 33718 33729 +11
Branches 5659 5659
===============================================
+ Hits 22882 22895 +13
+ Misses 8656 8655 -1
+ Partials 2180 2179 -1
|
|
||
/** | ||
* Subclasses must call this method in their constructors AFTER construction of this class is complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you not have a "abstract callInitialize()" method in the base class and then call that method from the base constructor? that would force an subclass to implement this method and read this comment...otherwise I'm not sure it will be noticed....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is error prone but at least it's a private class, so it's likely a future subclasser will read this comment before writing code.
I don't see why this class still has CRAMIntervalIteratorBase(final QueryInterval[] queries, final boolean contained, final long[] coordinates)
as coordinates
is ignored. Removing that constructor will also be a way to tell callers that they must call this method, which you could rename to something like setCoordinates()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that constructor appears to be redundant. Removed. Not sure I understand how that helps with the requirement to call initialize though (or perhaps I misunderstand what the suggestion is). This is a private class though, and this whole containing class is targeted for a large refactoring as part of the cram overhaul I'm starting.
return FilteringIteratorState.CONTINUE_ITERATION; | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra nl (from previous code)
@@ -0,0 +1,43 @@ | |||
package htsjdk.samtools; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not htsjdk.samtools.filter ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, this seems to have more affinity with BAMIteratorFilter
and it's other subclass BAMQueryMultipleIntervalsIteratorFilter
, which are all in this package. Happy to move it if you still think its better there though.
@@ -492,21 +494,30 @@ void enableFileSource(final SamReader reader, final boolean enabled) { | |||
return BAMFileSpan.merge(spanArray).toCoordinateArray(); | |||
} | |||
|
|||
private class CRAMIntervalIterator extends BAMQueryMultipleIntervalsIteratorFilter | |||
private abstract class CRAMIntervalIteratorBase extends BAMQueryMultipleIntervalsIteratorFilter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be good to note here the reason for the abstractness, not only in line 513....or better, actually have an abstract method whose implementation will resolve the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment added. Not sure I see understand how the suggestion to add another abstract method resolves the initialize issue though. Can you give a more specific example ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey! work on CRAM is happening!! hurray!
case MATCHES_FILTER: | ||
nextRec = nextRecord; | ||
break; | ||
case CONTINUE_ITERATION: | ||
continue; | ||
case STOP_ITERATION: | ||
break; | ||
nextRec = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under what circumstances can you reach this line and have nextRec
not be null? It's not modified outside of this method. Making it a private
field would make this a little clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the assignment seems to be redundant. fixed.
@@ -51,43 +51,43 @@ | |||
@Test | |||
public void testConstructors () throws IOException { | |||
CRAMFileReader reader = new CRAMFileReader(cramFile, indexFile, source, ValidationStringency.SILENT); | |||
CloseableIterator<SAMRecord> iterator = reader.queryAlignmentStart("chrM", 1500); | |||
CloseableIterator<SAMRecord> iterator = reader.queryAlignmentStart("chrM", 1519); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a named constant instead of a magic number would avoid the need to modify it in many places the next time this changes. And also make it more apparent that the same value is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, including using a constant for the sequence name.
Assert.assertTrue(iterator.hasNext()); | ||
SAMRecord record = iterator.next(); | ||
|
||
Assert.assertEquals(record.getReferenceName(), "chrM"); | ||
Assert.assertTrue(record.getAlignmentStart() >= 1500); | ||
Assert.assertEquals(record.getAlignmentStart(),1519); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing space after ,
on many lines here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
final String queryContig, | ||
final int alignmentStart, | ||
final int expectedReadCount) throws IOException | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
open brace should be at end of previous line, not on its own line
23807f1
to
ef537e8
Compare
Fixes an issue discovered as part of debugging #1065. The CRAM implementation of queryAlignmentStart (which is in turn used by queryMate) currently returns all reads in a container that start after the requested alignment start. This PR pulls out the
BAMStartingAtFilterIterator
into a top-level class so it can be reused by the CRAM implementation to filter, and then stop iterating after, all reads with the desired start have been returned.