fail to create scTAR_cellranger env #10
Comments
Hi Antoine, The It looks like the error you're getting is related to your Best, |
Hi Michael, It turns out the genes.gtf was zipped; after unzipping and rerunning the snakemake I got this error: Error in rule convertToRefFlat:
thanks a lot for your help, Antoine |
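A minimal sketch of the fix described above, where the `MissingInputException` went away once `genes.gtf` was unzipped. It assumes the compressed annotation sits next to the expected path as `genes.gtf.gz`; the thread does not give the actual file name, and `ensure_unzipped` is a hypothetical helper, not part of the pipeline.

```python
import gzip
import shutil
from pathlib import Path


def ensure_unzipped(gtf_path):
    """Make sure an uncompressed GTF exists at gtf_path.

    Assumption: if the plain file is missing, a gzip-compressed copy may
    exist under the same name with a ".gz" suffix appended.
    Returns True if the plain file exists afterwards, else False.
    """
    gtf = Path(gtf_path)
    gz = gtf.with_name(gtf.name + ".gz")
    if not gtf.exists() and gz.exists():
        # Stream-decompress so a large annotation is not loaded into memory.
        with gzip.open(gz, "rb") as src, open(gtf, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return gtf.exists()
```

Calling `ensure_unzipped(".../genes/genes.gtf")` before starting snakemake would leave the compressed copy in place and only create the plain file when it is missing.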
Can you share the error message you get when you run the following commands on the command line? The snakemake log file does not provide the error message info.
Thanks, |
I get this |
Hmm, seems like an issue with the libssl library. Try conda installing with Michael |
So I installed libssh2 and rerun, here is the log file: [Fri Oct 8 10:07:32 2021] [Fri Oct 8 12:18:55 2021] [Fri Oct 8 12:18:55 2021] [Fri Oct 8 12:18:55 2021]
Shutting down, this might take some time.
and what I get during execution:
(snakemake) Perrys-MacBook-Pro:from_cellranger perrylabmac$ snakemake -j --rerun-incomplete
HMM_refFlat_to_gtf_WITHDIR 1 1 1
Select jobs to execute...
[Fri Oct 8 10:07:32 2021]
Number of aligned reads is 594024319
Reads spanning over splicing junction will join HMM blocks
Start to run groHMM on each individual chromosome...
Merging HMM blocks within 500bp...
Calculating the coverage...
Filtering the HMM blocks by coverage...
Please examine if major chromosomes are all present in the final TAR_reads.bed.gz file
zcat: can't stat: TAR_reads.bed.gz (TAR_reads.bed.gz.Z): No such file or directory
Link the final TAR_reads.bed.gz file to the working directory
Move intermediate files to /Users/perrylabmac/M1_Musca/Lib_10X_Musca_larva_2019_02_25_M1_1sttry/TAR/possorted_genome_bam_HMM_features/toremove ...
/Users/perrylabmac/M1_Musca/Lib_10X_Musca_larva_2019_02_25_M1_1sttry/TAR/possorted_genome_bam_HMM_features/toremove can be deleted if no error message in SingleCellHMM_Run log file and
gzip: toremove is a directory
done!
[Fri Oct 8 12:18:55 2021]
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
Shutting down, this might take some time. thanks! Antoine |
Hi Antoine, I think this issue may again be related to your MacOS system. The
Let me know if that fixes it. Michael |
Hi Michael, I tried both, I still get this error: The flag 'directory' used in rule getMatsSteps is only valid for outputs, not inputs. HMM_refFlat_to_gtf_WITHDIR 1 1 1 Select jobs to execute... [Tue Oct 12 09:22:09 2021] Error in read.table(file = file, header = header, sep = sep, quote = quote, :
Shutting down, this might take some time. Thanks! Antoine |
Can you share the content of your Thanks, |
Hi! (base) Perrys-MacBook-Pro:TAR perrylabmac$ ls -hlt and the content of the inner folder: (base) Perrys-MacBook-Pro:possorted_genome_bam_HMM_features perrylabmac$ ls -hlt thanks, Antoine |
Can you make sure you have the necessary R packages, listed below, installed? BiocManager Can you also share what is inside the Thanks, |
Hi, So I already had these R packages installed via Rstudio but I reinstalled them within miniconda3 just to make sure; I get the same error message as before. thanks, Antoine |
Can you share the SingleCellHMM_Run*.log file generated after implementing the awk fix? I would also recommend running the pipeline on a Linux system if possible. Michael |
ok I will try to run this on Linux. Where can I find the SingleCellHMM log file? |
I made a change to Michael |
Here it is:
Path to SingleCellHMM.R /Users/perrylabmac/TAR-scRNA-seq/from_cellranger/scripts
Reads spanning over splicing junction will join HMM blocks
Start to run groHMM on each individual chromosome...
Merging HMM blocks within 500bp...
Calculating the coverage...
Filtering the HMM blocks by coverage...
Please examine if major chromosomes are all present in the final TAR_reads.bed.gz file
zcat: can't stat: TAR_reads.bed.gz (TAR_reads.bed.gz.Z): No such file or directory
Link the final TAR_reads.bed.gz file to the working directory
Move intermediate files to /Users/perrylabmac/M1_Musca/Lib_10X_Musca_larva_2019_02_25_M1_1sttry/TAR/possorted_genome_bam_HMM_features/toremove ...
/Users/perrylabmac/M1_Musca/Lib_10X_Musca_larva_2019_02_25_M1_1sttry/TAR/possorted_genome_bam_HMM_features/toremove can be deleted if no error message in SingleCellHMM_Run log file and
Hi, So I still haven't managed to get it to work on macOS, but I was able to run the pipeline on Linux.
Path to SingleCellHMM.R /home/adonati/TAR-scRNA-seq/from_cellranger/scripts
Reads spanning over splicing junction will join HMM blocks
Start to run groHMM on each individual chromosome...
Merging HMM blocks within 500bp...
Calculating the coverage...
Filtering the HMM blocks by coverage...
Please examine if major chromosomes are all present in the final TAR_reads.bed.gz file
NW_004754939.1
Link the final TAR_reads.bed.gz file to the working directory
Move intermediate files to /home/adonati/Desktop/M1/Lib_10X_Musca_larva_2019_02_25_M1_1sttry/TAR/possorted_genome_bam_HMM_features/toremove ...
/home/adonati/Desktop/M1/Lib_10X_Musca_larva_2019_02_25_M1_1sttry/TAR/possorted_genome_bam_HMM_features/toremove can be deleted if no error message in SingleCellHMM_Run log file and
gzip: possorted_genome_bam_split.sorted.bed.gz already has .gz suffix -- unchanged
Looking at TAR_reads.bed, it seems many scaffolds are indeed absent. Any idea why the program skipped so many scaffolds? Thanks! Antoine
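The check the log asks for ("examine if major chromosomes are all present in the final TAR_reads.bed.gz file") can be sketched as below. This is only an illustration: it assumes the file is a gzip-compressed, tab-separated BED whose first column is the scaffold name, and `scaffold_counts` and `missing_scaffolds` are hypothetical helpers. Reading through Python's `gzip` module also sidesteps the `zcat` behaviour seen in the macOS log earlier in the thread, where `zcat` looked for a `.Z` suffix.

```python
import gzip
from collections import Counter


def scaffold_counts(bed_gz_path):
    """Count BED records per scaffold (first tab-separated column)."""
    counts = Counter()
    with gzip.open(bed_gz_path, "rt") as fh:
        for line in fh:
            line = line.rstrip("\n")
            # Skip blank lines and common BED header lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            counts[line.split("\t", 1)[0]] += 1
    return counts


def missing_scaffolds(bed_gz_path, expected):
    """Return the expected scaffold names that have no record in the file."""
    present = scaffold_counts(bed_gz_path)
    return [name for name in expected if name not in present]
```

Passing the list of scaffold names from the reference as `expected` would show which ones have no TAR at all, which may help narrow down why they were skipped.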
Hi Antoine, I'm glad the pipeline worked better on Linux. I updated Best, |
Hi Michael, I am trying to get the pipeline to work on UCSD supercomputer Expanse, I installed miniconda3 and all the software required except R, which I can't figure out how to install without sudo (which I can't use on Expanse); do you think the pipeline could work if I install R via conda? Thanks! Antoine |
A conda installed R should work fine, as long as you have all the required R packages as well. Let me know if it works out. Michael |
Hi Michael, Sorry about the stupid question, but am I supposed to activate the scTAR_cellranger environment before running snakemake? I do have R and installed all the required libraries, but within the scTAR_cellranger environment. The problem is that to run snakemake I need to use "conda activate snakemake", which deactivates scTAR_cellranger... Antoine
Hi Michael, I get this error when I run the pipeline after installing R with conda: wildcards: CR_REF=/expanse/lustre/scratch/adonati2/temp_project/10XMusca/mapping/Musca_ref_genome [Sat Oct 23 18:45:20 2021] [Sat Oct 23 18:45:20 2021] /usr/bin/bash: line 1: Rscript: command not found
Shutting down, this might take some time. it looks like the Rscript command is not working from within the snakemake pipeline even if I do have Rscript in miniconda3/bin... thanks! Antoine |
Hi Antoine, It looks like Rscript is not available within the snakemake environment. You can try simply replacing In regards to the snakemake environment, it may be worthwhile to install snakemake via pip so you can activate the scTAR_cellranger environment without conflicting with the snakemake environment. Best, |
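A quick way to confirm the diagnosis above is to ask, from inside the same environment that runs snakemake, whether `Rscript` resolves on `PATH`. The sketch below uses only the Python standard library; `find_executable` and `rscript_version` are hypothetical helper names, and the sketch does not change how the pipeline itself invokes R.

```python
import shutil
import subprocess


def find_executable(name="Rscript"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def rscript_version(path):
    """Return the text printed by `Rscript --version`.

    Both streams are captured because some R releases print the
    version banner to stderr rather than stdout.
    """
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("Rscript is not on PATH in this environment")
    else:
        print(path)
        print(rscript_version(path))
```

Running this with the snakemake environment active shows whether the `Rscript` in `miniconda3/bin` is visible to the shell that snakemake spawns, and which R version it points to.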
Hi Michael, I had a problem with R because for some reason conda installed R 3.2
Path to SingleCellHMM.R /home/adonati2/TAR-scRNA-seq/from_cellranger/scripts
Reads spanning over splicing junction will join HMM blocks
I was wondering if the pipeline is really supposed to generate and keep all these .bed files? Is it normal that the pipeline took so long even running on Expanse (I ran it on just one node, which has two 64-core AMD EPYC 7742 processors and 256 GB of DDR4 memory)? Antoine
Hi Antoine, Yes, the pipeline is supposed to generate those bed files. The groHMM algorithm is run on individual chromosomes/scaffolds. There is a limitation with the tool if you have a very high number of scaffolds (over 20,000 in your case). Are you interested in the expression across all scaffolds? Perhaps you can filter for the major scaffolds where you'd expect the most expression. Best,
Hi Michael, I guess it would be nice to have the TARs for all scaffolds. I do have many scaffolds, but most of them are small: only 35 of them are over 1 Mb, and the total genome size is only 750 Mb. Will the pipeline run slower with a more fragmented genome, or is the speed only dependent on genome size?
Hi Antoine, The pipeline would run slower with more fragmented genome. It wouldn't take as long if you ran the pipeline on, for example, the human or mouse genome. Best, |
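The suggestion to filter for the major scaffolds can be illustrated with a small sketch that picks scaffolds above a length cutoff. It assumes a FASTA index in the `samtools faidx` layout (tab-separated, name in column 1, length in column 2) is available for the reference; `major_scaffolds` is a hypothetical helper, the 1 Mb default simply mirrors the figure mentioned in the thread, and how the resulting list is then fed into the pipeline is not covered here.

```python
def major_scaffolds(fai_lines, min_length=1_000_000):
    """Return names of scaffolds whose length is at least min_length.

    fai_lines: an iterable of lines in FASTA-index (.fai) format, where
    column 1 is the sequence name and column 2 its length in bases.
    """
    keep = []
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, length = fields[0], int(fields[1])
        if length >= min_length:
            keep.append(name)
    return keep
```

With an index file on disk this could be called as `major_scaffolds(open("genome.fa.fai"))`, where `genome.fa.fai` is a placeholder name.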
Hi Michael, Are all these bedfiles deleted at the end of the pipeline or are they all merged together and zipped? Antoine |
Hi Antoine, These bed files are zipped, not merged, and moved to the Best, |
Hi,
I am trying to get the from_cellranger pipeline to work on macOS Catalina 10.15.7.
When I run
conda env create -f scTAR_cellranger.yml
(after modifying the yml file prefix to /home/miniconda3/envs/scTAR_cellranger)
I get the following error message:
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
any idea about what is wrong?
I tried to run snakemake without creating the scTAR_cellranger environment but I get a bunch of errors
for example after doing
conda activate snakemake
snakemake -j
I get:
The flag 'directory' used in rule getMatsSteps is only valid for outputs, not inputs.
Building DAG of jobs...
MissingInputException in line 99 of /Users/perrylabmac/TAR-scRNA-seq/from_cellranger/Snakefile:
Missing input files for rule convertToRefFlat:
/Users/perrylabmac/Musca_ref_genome_Cellranger/genes/genes.gtf
thanks!
Antoine