Using run_ceres directly #11

@yesitsjess

I can't get prepare_ceres_inputs to work consistently behind my proxy and, on the rare occasions it gets past that, it can't find my bowtie and samtools installations:

> prepare_ceres_inputs(inputs_dir="ceres_inputs",
+                      dep_file="ceres_inputs/ceres_LFC_input.gct",
+                      cn_seg_file="ceres_inputs/ceres_CN_input.tsv",
+                      gene_annot_file="example_data/CCDS.current.txt",
+                      rep_map_file="ceres_inputs/ceres_rep_input.tsv",
+                      genome_id="hg19",
+                      chromosomes=paste0("chr", 1:22),
+                      dep_normalize="zmad")
loading dependency data...

Parsed with column specification:
cols(
  Replicate = col_character(),
  CellLine = col_character()
)
loading copy number data...

mapping sgRNAs to the genome...

sh: bowtie: command not found
sh: samtools: command not found
Error in value[[3L]](cond) : 
  failed to open BamFile: file(s) do not exist:
  '/tmp/RtmpfQL4OU/guides.bam'
In addition: Warning messages:
1: In system(bowtie_cmd) : error in running command
2: In system(samtools_cmd) : error in running command
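
The `command not found` lines suggest that the shell started by R's `system()` does not see bowtie or samtools on its PATH, even if they work in an interactive terminal. A minimal check from inside the same R session, as a sketch (the commented-out directory is a hypothetical placeholder, not a real install location):

```r
# Ask R where it resolves each external tool; an empty string means
# the tool is not on the PATH that R (and therefore system()) sees.
tools <- c("bowtie", "samtools")
tool_paths <- Sys.which(tools)
print(tool_paths)

missing_tools <- names(tool_paths)[tool_paths == ""]
if (length(missing_tools) > 0) {
  message("Not on PATH as seen by R: ", paste(missing_tools, collapse = ", "))
  # Hypothetical directory: replace with the real install location before uncommenting.
  # Sys.setenv(PATH = paste("/path/to/tools/bin", Sys.getenv("PATH"),
  #                         sep = .Platform$path.sep))
}
```

If both tools resolve to a path here and the mapping step still fails, the problem is probably elsewhere.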

As a result, I've tried to assemble the correct data myself and supply it directly to run_ceres. It fails with this error:

> run_ceres(sg_data=sg_data, cn_data=cn_data, 
+           guide_locus=guide_locus, locus_gene=locus_gene, replicate_map=repmap)
Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent
In addition: There were 50 or more warnings (use warnings() to see the first 50)

and the warnings are:

Warning messages:
1: In mean.default(x, na.rm = T) :
  argument is not numeric or logical: returning NA

Obviously I'm using real data, but I thought dummy data would help you to spot what I'm doing wrong:


# log fold change calc from plasmid of each gRNA in each sample
dum_sg_lfc <- as.matrix(sapply(1:6, function(x) rnorm(4)))
rownames(dum_sg_lfc) <- c("ATCGA", "ATCGT", "ATCGC", "ATCGG")
colnames(dum_sg_lfc) <- c("A1", "A2", "B1", "B2", "C1", "C2")

# log2ratio copy number at each gRNA cut site in each cell line given as chr:pos
dum_cn_lr <- as.matrix(sapply(1:3, function(x) rnorm(4)))
rownames(dum_cn_lr) <- c("1:100", "1:200", "1:300", "1:400")
colnames(dum_cn_lr) <- c("A", "B", "C")

# dummy data using chr:pos as locus, entrez gene id as gene and sample to cell line names
dum_gl <- data.frame(Guide=rownames(dum_sg_lfc), Locus=rownames(dum_cn_lr))
dum_lg <- data.frame(Locus=rownames(dum_cn_lr), Gene=paste0("eg", 1:nrow(dum_cn_lr)))
dum_rep <- data.frame(Replicate=colnames(dum_sg_lfc), CellLine=gsub("[[:digit:]]*", "", colnames(dum_sg_lfc)))

run_ceres(sg_data=dum_sg_lfc, cn_data=dum_cn_lr, 
          guide_locus=dum_gl, locus_gene=dum_lg, replicate_map=dum_rep)
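
Since the warnings mention a non-numeric argument to `mean`, a basic consistency check on the five inputs may help narrow things down. The helper below, `check_ceres_inputs`, is hypothetical and not part of the package; the conditions it tests are only my assumptions about how the inputs are meant to line up, not documented requirements of run_ceres:

```r
# Hypothetical helper: returns a character vector describing any mismatches found.
check_ceres_inputs <- function(sg_data, cn_data, guide_locus, locus_gene, replicate_map) {
  problems <- character(0)
  # Record a message whenever a condition does not hold.
  note <- function(ok, msg) if (!isTRUE(ok)) problems <<- c(problems, msg)

  # Both data inputs are assumed to be numeric matrices with dimnames.
  note(is.matrix(sg_data) && is.numeric(sg_data), "sg_data is not a numeric matrix")
  note(is.matrix(cn_data) && is.numeric(cn_data), "cn_data is not a numeric matrix")

  # as.character() guards against factor columns in the mapping data frames.
  guides     <- as.character(guide_locus$Guide)
  loci       <- as.character(guide_locus$Locus)
  replicates <- as.character(replicate_map$Replicate)
  cell_lines <- as.character(replicate_map$CellLine)

  note(all(guides %in% rownames(sg_data)), "some Guide values are not rownames of sg_data")
  note(all(loci %in% rownames(cn_data)), "some Locus values are not rownames of cn_data")
  note(all(loci %in% as.character(locus_gene$Locus)), "some Locus values are missing from locus_gene")
  note(all(replicates %in% colnames(sg_data)), "some Replicate values are not colnames of sg_data")
  note(all(cell_lines %in% colnames(cn_data)), "some CellLine values are not colnames of cn_data")

  problems
}
```

It can be called as `check_ceres_inputs(dum_sg_lfc, dum_cn_lr, dum_gl, dum_lg, dum_rep)`. An empty result only means these basic checks pass; it says nothing about other expectations run_ceres may have.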

