-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Yf fix GenotypeAlleleCount's toString #6376
Conversation
@@ -371,13 +371,12 @@ public String toUnphasedGenotypeString() { | |||
return ""; | |||
} | |||
final StringBuilder sb = new StringBuilder(distinctAlleleCount * 3); | |||
for (int i = 0; i < distinctAlleleCount; i += 2) { | |||
for (int i = 0; i < distinctAlleleCount * 2; i += 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you avoid the multiplication * 2 at each iteration? Don't know if the VM will optimize this out. I wonder whether you could compare against sortedAlleleCounts.length but need to see the rest of the code to confirm.... too lazy
If you change this loop consider to skip the +2 increment on i and simple increase as it is need to access the index and count:
for (int i = 0; i < stop; ) {
int index = sortedAlleleCounts[i++];
int count = sortedAlleleCounts[i++];
....
}
@@ -371,13 +371,12 @@ public String toUnphasedGenotypeString() { | |||
return ""; | |||
} | |||
final StringBuilder sb = new StringBuilder(distinctAlleleCount * 3); | |||
for (int i = 0; i < distinctAlleleCount; i += 2) { | |||
for (int i = 0; i < distinctAlleleCount * 2; i += 2) { | |||
final int alleleIndex = sortedAlleleCounts[i]; | |||
final int alleleCount = sortedAlleleCounts[i + 1]; | |||
for (int j = 0; j < alleleCount; j++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since counts is guaranteed not to be 0, then the string composition could be rather:
sb.append(index);
for (int j = 1; j < count; j++) {
sb.append('/').append(index);
}
return sb.toString();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, you need to add a /
after every allele, not only the alleles with counts >1. think about 0/1, whose sortedAlleleCounts array is (0,1,1,1)....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
index is 0 and 1, count is 1 and 1. ... (index=0, count=1) ... then (index=1, count=1). count is always 1 or larger or that is my understanding....
at least in the example that you are giving count for both alleles (0 and 1) is 1 (the second number of the pair. (0, 1, 1, 1) is decomposed into (0, 1) and (1, 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case you could leave the code as it is... just noticed this optimization opportunity near your change so I suggested the edit but it has no concern with fix, so.
- made tests more explicit - added some more tests
done. |
@@ -371,13 +371,12 @@ public String toUnphasedGenotypeString() { | |||
return ""; | |||
} | |||
final StringBuilder sb = new StringBuilder(distinctAlleleCount * 3); | |||
for (int i = 0; i < distinctAlleleCount; i += 2) { | |||
for (int i = 0; i < distinctAlleleCount * 2; i += 2) { | |||
final int alleleIndex = sortedAlleleCounts[i]; | |||
final int alleleCount = sortedAlleleCounts[i + 1]; | |||
for (int j = 0; j < alleleCount; j++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
index is 0 and 1, count is 1 and 1. ... (index=0, count=1) ... then (index=1, count=1). count is always 1 or larger or that is my understanding....
at least in the example that you are giving count for both alleles (0 and 1) is 1 (the second number of the pair. (0, 1, 1, 1) is decomposed into (0, 1) and (1, 1)
@@ -371,13 +371,12 @@ public String toUnphasedGenotypeString() { | |||
return ""; | |||
} | |||
final StringBuilder sb = new StringBuilder(distinctAlleleCount * 3); | |||
for (int i = 0; i < distinctAlleleCount; i += 2) { | |||
for (int i = 0; i < distinctAlleleCount * 2; i += 2) { | |||
final int alleleIndex = sortedAlleleCounts[i]; | |||
final int alleleCount = sortedAlleleCounts[i + 1]; | |||
for (int j = 0; j < alleleCount; j++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case you could leave the code as it is... just noticed this optimization opportunity near your change so I suggested the edit but it has no concern with fix, so.
GenotypeAlleleCount's toString function was broken, leading to a misleading debugging information showing up in the debugger
This PR fixes the bug and adds tests for toString output.