I ran into an error when running this on an object whose row names are Ensembl IDs rather than gene symbols. During the annotateGenes() function, I got the following error:
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 7104, 7107
I narrowed this down to this line, which removes any genes that have a duplicated gene symbol in the reference edb matrix. However, the same filtering is not applied to the mtx variable, so the two objects can end up with different numbers of rows.
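To illustrate how filtering only one of the two objects produces this kind of error, here is a minimal sketch with toy data. The objects below only stand in for edb and mtx; the column names gene_id and gene_name and the filtering expression are assumptions for illustration, not the actual SCEVAN code.

```r
# Toy annotation table standing in for edb (assumed column names):
# Ensembl-style IDs are unique, but two IDs share the same gene symbol.
edb <- data.frame(
  gene_id   = c("ENSG01", "ENSG02", "ENSG03", "ENSG04"),
  gene_name = c("A", "B", "B", "C"),
  stringsAsFactors = FALSE
)

# Toy counts matrix standing in for mtx, with the IDs as row names.
mtx <- matrix(1:8, nrow = 4,
              dimnames = list(edb$gene_id, c("cell1", "cell2")))

# Drop rows with a duplicated gene symbol from edb only (illustrative filter).
edb_filtered <- edb[!duplicated(edb$gene_name), ]

# edb_filtered now has 3 rows while mtx still has 4, so combining them fails.
result <- tryCatch(
  data.frame(edb_filtered, mtx, check.names = FALSE),
  error = function(e) conditionMessage(e)
)
print(result)
```

The captured message is the same "arguments imply differing number of rows" error, just with the toy row counts instead of 7104 and 7107.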
This is probably necessary with gene symbols, since the dimensions of edb and mtx may not match if duplicated values are present in edb. However, Ensembl IDs are never duplicated, so this step isn't necessary when using them. Also, duplicates should only be removed based on the column indicated by use_geneID, although if that column is gene_id, I would skip this step altogether.
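One possible way to express this suggestion is sketched below. The helper dedup_annotation() is hypothetical and not part of SCEVAN; it assumes that edb and mtx have their rows in the same order, that use_geneID names a column of edb, and that "gene_name" is the symbol column (only "gene_id" is mentioned above).

```r
# Hypothetical helper, not SCEVAN's API: remove duplicated identifiers only
# when the chosen column can contain duplicates, and filter both objects
# with the same row index so their dimensions stay aligned.
dedup_annotation <- function(edb, mtx, use_geneID = "gene_name") {
  if (use_geneID == "gene_id") {
    # Ensembl IDs are assumed to be unique, so nothing is filtered.
    return(list(edb = edb, mtx = mtx))
  }
  keep <- !duplicated(edb[[use_geneID]])
  list(edb = edb[keep, , drop = FALSE],
       mtx = mtx[keep, , drop = FALSE])
}

# Toy data: unique IDs, one duplicated gene symbol.
edb <- data.frame(
  gene_id   = c("ENSG01", "ENSG02", "ENSG03", "ENSG04"),
  gene_name = c("A", "B", "B", "C"),
  stringsAsFactors = FALSE
)
mtx <- matrix(1:8, nrow = 4,
              dimnames = list(edb$gene_id, c("cell1", "cell2")))

by_symbol <- dedup_annotation(edb, mtx, use_geneID = "gene_name")
by_id     <- dedup_annotation(edb, mtx, use_geneID = "gene_id")
```

With gene symbols, both returned objects lose the same duplicated row; with gene_id, both are returned unchanged, so the row counts match in either case.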
allyhawkins changed the title from "Remove filtering by duplicate gene symbols when rownames of counts matrix are Ensembl IDs" to "Remove filtering of duplicate gene symbols when rownames of counts matrix are Ensembl IDs" on May 16, 2024
SCEVAN/R/preProcessing.R
Line 32 in 228beea