My understanding is that the manipulateGTC option currently updates the genotype calls and the base calls (byte arrays 1002 and 1003, respectively) based on the input alleles in the updates file and the allele combination in the manifest ([A/B] in the manifest). However, the logic in manipulateGTC currently fails in two scenarios:
Indel updates are not performed correctly (neither the genotype call nor the base calls).
Base calls are not updated for some SNPs.
Indel update
When the updates.txt file specifies indel updates with an AA/BB combination (II, DD), snpUpdate ignores them. I assume this is because of a conditional check in the snpUpdate function that accepts only A, T, G, and C in the input line, so indels are skipped.
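To illustrate the kind of check I mean, here is a minimal sketch of an allele filter that also accepts the indel codes I and D. It is not the actual snpUpdate code: the function names and the tab-separated `name, allele1, allele2` line layout are hypothetical and only serve the example.

```python
# Sketch only: names and the update-line layout are assumptions,
# not taken from the GThaCk source.
NUCLEOTIDES = {"A", "C", "G", "T"}
INDEL_CODES = {"I", "D"}  # insertion / deletion allele codes


def is_supported_allele(allele, allow_indels=True):
    """Return True if the allele is a nucleotide or, optionally, an indel code."""
    allowed = NUCLEOTIDES | INDEL_CODES if allow_indels else NUCLEOTIDES
    return allele.strip().upper() in allowed


def parse_update_line(line, allow_indels=True):
    """Parse a hypothetical '<snp_name>\\t<allele1>\\t<allele2>' line.

    Returns (name, allele1, allele2), or None when the line is malformed
    or contains an unsupported allele.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        return None
    name, first, second = (field.strip() for field in fields)
    if not (is_supported_allele(first, allow_indels)
            and is_supported_allele(second, allow_indels)):
        return None
    return name, first.upper(), second.upper()
```

With `allow_indels=False` the filter behaves like a nucleotide-only check, so a line such as `rs1<TAB>I<TAB>I` is dropped, which matches the behaviour I am seeing.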
Base calls
Currently, the base calls in the GTC file (byte array 1003) are updated from the alleles given in the input updates file. This sometimes produces wrong base calls, because GTC files use the TOP strand allele combination to generate the base call for a SNP, and that can differ from the base call generated from the allele combination listed for the SNP in the BeadPoolManifest.
I've managed to find a workaround for this by updating the base calls using the TOP strand combination found in the CSV format of the BeadPoolManifest.
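Roughly, the workaround amounts to the following sketch. It assumes the TOP strand allele pair is already available as a string such as `[A/G]` and that a no-call is written as `--`; the function names are hypothetical, and how the pair is looked up in the CSV manifest is left out.

```python
import re

# Sketch only: assumes the TOP strand pair looks like "[A/G]" (or "[D/I]"
# for indels) and that "--" represents a no-call base call.
_ALLELE_PAIR = re.compile(r"^\[([ACGTID])/([ACGTID])\]$")


def parse_allele_pair(pair):
    """Split a '[A/B]'-style string into its (A allele, B allele) tuple."""
    match = _ALLELE_PAIR.match(pair.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised allele pair: {pair!r}")
    return match.group(1), match.group(2)


def base_call_from_genotype(genotype, top_pair):
    """Translate an AA/AB/BB genotype into a two-character base call."""
    allele_a, allele_b = parse_allele_pair(top_pair)
    calls = {
        "AA": allele_a + allele_a,
        "AB": allele_a + allele_b,
        "BB": allele_b + allele_b,
    }
    # Anything that is not AA/AB/BB is treated as a no-call.
    return calls.get(genotype.strip().upper(), "--")
```

For example, a BB genotype with a TOP pair of `[A/G]` yields `GG`, whatever allele order the updates file uses.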
I have raised a PR with the changes. Please let me know if you need any clarifications on anything. Thank you for your response and offering to take a look at the changes, @tbrunetti!
GTC file documentation reference: https://github.com/Illumina/BeadArrayFiles/blob/develop/docs/GTC_File_Format_v5.pdf (TOC Entry table)
BeadPoolManifest file documentation for reference: https://knowledge.illumina.com/microarray/general/microarray-general-reference_material-list/000001565
I've temporarily addressed both scenarios and pushed the changes to a fork of this repo: https://github.com/sgopalan98/GThaCk/tree/fixing-bug-manipulate-gtc
It would be really helpful if you could look at these bugs and let me know whether there is a better fix. Thank you!