Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove hardcoded limit on max homopolymer call #8088

Merged
merged 9 commits into from
Dec 7, 2022

Conversation

ilyasoifer
Copy link
Collaborator

No description provided.

Copy link
Contributor

@meganshand meganshand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Do we have test cases that include this max class value in the header? If not it would be good to add at least one test file with the new header.

@ilyasoifer ilyasoifer force-pushed the fix_flow_matrix_mods_hc branch from eb84f1d to 232a84f Compare November 21, 2022 05:58
@ilyasoifer ilyasoifer force-pushed the fix_flow_matrix_mods_hc branch from acd85de to 097ee71 Compare December 1, 2022 20:18
@ilyasoifer ilyasoifer requested a review from meganshand December 3, 2022 15:53
@ilyasoifer
Copy link
Collaborator Author

@meganshand can you take a look again please?

  1. Added the test for parsing mc tag in the header correctly
  2. Generated a mechanism that automatically determines the correct value of flow-collapse-hmer-length parameter
  3. (unrelated) fixed a small bug in the assembly complexity annotation that created inconsistent results between SDK11 and SDK8

Copy link
Contributor

@meganshand meganshand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor questions about expected test output and default argument values.

@@ -264,7 +264,7 @@ public enum FlowMode {
MIN_BASE_QUALITY_SCORE_SHORT_NAME, "0",
FILTER_ALLELES, "true",
FILTER_ALLELES_SOR_THRESHOLD, "3",
FLOW_ASSEMBLY_COLLAPSE_HMER_SIZE_LONG_NAME, "12",
FLOW_ASSEMBLY_COLLAPSE_HMER_SIZE_LONG_NAME, String.valueOf(AssemblyBasedCallerUtils.DETERMINE_COLLAPSE_THRESHOLD),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change the default for STANDARD/ADVANCED flow mode for older samples that don't have the mc tag?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not, if there is no mc - then it will be 12

@@ -3879,7 +3879,7 @@ chr9 81153765 . G <NON_REF> . . END=81153766 GT:DP:GQ:MIN_DP:PL 0/0:27:69:25:0,6
chr9 81153767 . A <NON_REF> . . END=81153783 GT:DP:GQ:MIN_DP:PL 0/0:25:72:24:0,72,802
chr9 81153784 . A <NON_REF> . . END=81153785 GT:DP:GQ:MIN_DP:PL 0/0:28:22:27:0,22,996
chr9 81153786 . G <NON_REF> . . END=81153800 GT:DP:GQ:MIN_DP:PL 0/0:28:81:28:0,81,1175
chr9 81153801 . C CA,<NON_REF> 0 . ASSEMBLED_HAPS=10;AS_RAW_BaseQRankSum=|0.0,1|NaN;AS_RAW_MQ=90000.00|10800.00|0.00;AS_RAW_MQRankSum=|0.0,1|NaN;AS_RAW_ReadPosRankSum=|0.9,1|NaN;AS_SB_TABLE=8,17|3,0|0,0;BaseQRankSum=0.000;DP=29;ExcessHet=0.0000;FILTERED_HAPS=6;HAPCOMP=4,0;HAPDOM=0.500,0.00;HEC=55,3,2,1,0,0;MLEAC=0,0;MLEAF=0.00,0.00;MQRankSum=0.000;RAW_MQandDP=104400,29;ReadPosRankSum=0.929;SUSP_NOISY_ADJACENT_TP_VARIANT GT:AD:DP:GQ:PL:SB 0/0:25,3,0:28:18:0,18,58,75,692,115:8,17,3,0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change in HAPCOMP annotations is expected?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I fixed a small bug in assemblycomplexity annotation that caused a different result between two different java versions, it subtly changed the output, but it should be fine

? new LongHomopolymerHaplotypeCollapsingEngine(argumentCollection.flowAssemblyCollapseHKerSize, argumentCollection.flowAssemblyCollapsePartialMode, fullReferenceWithPadding,
int collapseHmerSize = argumentCollection.flowAssemblyCollapseHKerSize;
if (collapseHmerSize == DETERMINE_COLLAPSE_THRESHOLD){
collapseHmerSize = AssemblyBasedCallerUtils.determineFlowAssemblyColapseHmer(header);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
collapseHmerSize = AssemblyBasedCallerUtils.determineFlowAssemblyColapseHmer(header);
collapseHmerSize = AssemblyBasedCallerUtils.determineFlowAssemblyCollapseHmer(header);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -411,6 +417,18 @@ public static AssemblyResultSet assembleReads(final AssemblyRegion region,
}
}

private static int determineFlowAssemblyColapseHmer(SAMFileHeader readsHeader) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static int determineFlowAssemblyColapseHmer(SAMFileHeader readsHeader) {
private static int determineFlowAssemblyCollapseHmer(SAMFileHeader readsHeader) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@ilyasoifer ilyasoifer merged commit 84ade40 into broadinstitute:master Dec 7, 2022
@ilyasoifer ilyasoifer deleted the fix_flow_matrix_mods_hc branch December 7, 2022 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants