Gene_Centic_Coding Unable to Analyze Gene #9
Comments
Hi @kwdoyle, Thanks for your questions. I haven't encountered this issue before, since I don't think Best,
On another note, what are those variants with Thank you!
So all variants do indeed have a value for
I think this is another data-loading issue.
Hi @kwdoyle, Thanks so much for your input. The issue might still be related to auto-assigning the column classes. Please feel free to send me an email, and if you'd like, we can do a quick call to get it right. Best,
In brief, if you annotate your genotype data using the FAVOR Essential Database, this issue should not persist.
Yes, so this issue was due to Since I chose to add all 160 annotations from the FAVOR database, it would have been inconvenient to assign the column classes for each one within For reference, I checked which columns were read in differently between Anything read in by
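The check described above (finding which columns are read in with different classes by two loading methods) can be sketched in base R. This is only an illustration under stated assumptions: the CSV text, the column names, and the two `read.csv` calls are hypothetical stand-ins, not the readers or annotation columns actually used in this thread.

```r
# Hypothetical toy data; the column names are NOT real FAVOR annotation names.
csv_text <- "variant,cadd_phred,category\n1,25.1,plof\n2,NA,missense\n3,10.4,synonymous\n"

# Loader A: let R guess the class of each column.
guessed <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Loader B: force every column to character (stand-in for a loader
# that assigns different classes than loader A).
as_text <- read.csv(text = csv_text, colClasses = "character")

# Tabulate the first class of every column under both loaders.
classes <- data.frame(
  column  = names(guessed),
  guessed = vapply(guessed, function(x) class(x)[1], character(1)),
  as_text = vapply(as_text, function(x) class(x)[1], character(1)),
  row.names = NULL
)

# Keep only the columns whose class differs between the two loaders.
mismatched <- classes[classes$guessed != classes$as_text, ]
print(mismatched)
```

Columns listed in `mismatched` are the ones worth assigning an explicit class to when loading, rather than relying on auto-detection.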
Thank you @kwdoyle. This is very helpful! If you would like to contribute some documents/scripts that you use, please let me know. I can add you as a collaborator of the STAARpipeline-Tutorial repo so that you can contribute to this section. Best,
That would be great, as I've been making some modifications to these scripts to be generally applicable to other scenarios. Mainly, being independent from the Harvard cluster job IDs used to select the current chromosome to analyze.
Sounds perfect, thank you @kwdoyle! I've invited you to be part of the STAARpipeline-Tutorial repo. Look forward to your contributions! p.s. I'll close this issue and the other issue in the STAARpipeline-Tutorial repo. Best,
Hello,

While running the `Gene_Centic_Coding` function, I noticed a strange issue while processing a list of genes for a specific chromosome. On any given gene, the function seems to work properly until the internal `coding` function attempts to run the `STAAR` function:

I am receiving the following error, and thus no results from the current gene:

This error occurs for virtually all genes. Looking into this, it appears the issue is how the annotation data is subset for the final list of variants that are lof in plof:

When I run this, `lof.in.plof` is a vector of NAs, TRUEs, and FALSEs, with the number of TRUEs corresponding to the final filtered number of variants to use (in my case, 5). When the annotation data in `Anno.Int.PHRED.sub` is subset using this vector, however, the final dimensions of the table still contain the number of rows that correspond to the previous number of variants (which, in my case, was 129).

The `Geno` matrix has the dimensions [n samples x 5 variants]. When `Anno.Int.PHRED.sub.category` is passed to the `STAAR` function, however, its dimensions are still [n samples x 129 variants], causing the error.

If I wrap the `which` function around `lof.in.plof`, the dimensions of the resulting table are [n samples x 5] and `STAAR` is able to run properly and gives no error:

I assume this fix makes sense and there shouldn't be a reason `Anno.Int.PHRED.sub.category` should still contain rows with NA data..? The final dimensions of this annotation table should indeed match that of the genotype matrix, no?
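The behavior described above comes from how R indexes with a logical vector that contains `NA`: every `NA` in the index yields an all-`NA` row instead of being dropped, whereas `which()` returns only the positions that are `TRUE`. The following is a minimal sketch with made-up values. The object names `lof.in.plof` and `Anno.Int.PHRED.sub` are borrowed from the issue, but the data, column names, and sizes are assumptions for illustration, not the pipeline's actual code.

```r
# Toy stand-ins for the objects named in the issue (all values are made up).
lof.in.plof <- c(TRUE, NA, FALSE, TRUE, NA, TRUE)
Anno.Int.PHRED.sub <- data.frame(
  score_a = c(25.1, 3.2, 10.4, 31.0, 7.7, 28.3),  # hypothetical annotation
  score_b = c(12.0, 1.5, 4.4, 18.9, 2.1, 15.6)    # hypothetical annotation
)

# Logical indexing: one row per TRUE plus one all-NA row per NA in the index.
by_logical <- Anno.Int.PHRED.sub[lof.in.plof, , drop = FALSE]

# which() keeps only the TRUE positions; NA and FALSE are both dropped.
by_which <- Anno.Int.PHRED.sub[which(lof.in.plof), , drop = FALSE]

# Equivalent alternative: explicitly treat NA as FALSE.
keep <- !is.na(lof.in.plof) & lof.in.plof
by_flag <- Anno.Int.PHRED.sub[keep, , drop = FALSE]

print(nrow(by_logical))  # TRUE rows plus NA rows
print(nrow(by_which))    # TRUE rows only
```

With this toy index (three `TRUE`, two `NA`, one `FALSE`), plain logical indexing returns five rows, two of them entirely `NA`, while the `which()` version returns the three intended rows. How many extra rows appear in a real run depends on how many `NA` values the index contains.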