prune2df running for more than 140h #142
Comments
I'm having a similar issue. The progress bar creeps up relatively fast to a point and then stalls. There is no error message, but no output either. This happened both on Linux and with Anaconda on Windows.
Hello @jk86754, did you let it finish? I had to stop it; I think 145h is quite a lot for a small set of samples. Jp
Hi @JPcerapio, @jk86754, this step should definitely not take 145 hours. This seems to be a bug in the pruning step, similar to #104. Running this step via the CLI seems to have worked for others; could you try this?
Hey @cflerin, thanks for your answer. I will try it, but the problem with this option is that we do not have access to the intermediate files or results that we would like to have. I don't know if anyone has figured out whether the error comes from a missing dependency or library. Jp
Hi @JPcerapio, which intermediate files are you referring to? When you run this step in the CLI, you can still get the motif and regulon information. Although the CLI outputs only one of these, you can convert to the other without re-running, for example: #100
Hello, thanks for your help,
Hi @morganee261, have you solved this problem? I'm also running the CLI ( Thanks,
Hi @liboxun, unfortunately no, I haven't had any luck. It has been (and still is) running for a month now, and I have not had an answer from the developers of this package. Morgane
Hi @morganee261, @liboxun, this step should definitely not take this long. If it's been running for a month there's clearly something wrong and I would stop it. I've seen this issue a few times before, but I haven't been able to reproduce the problem to see where and why this step hangs, so I can't offer you a good solution. A few suggestions:
Thanks a lot @cflerin! Since I'm already running the CLI version, I'll try switching to the Docker image or using just a single feather database. I'll post an update here once I have results.
Hello @cflerin, I have been running the CLI version of pyscenic ctx, and that is what got stuck running for over a month. I stopped it and started running it with a single feather database. I am also trying to run the Docker image, but I am not very familiar with it and I run into an error:
docker run -it --rm \
docker: invalid reference format.
Could you please advise? Thanks for your reply and your help, Morgane
Hi @cflerin, I went back and ran the CLI with a single feather database, and it didn't help. It still got stuck forever at:
But when I tried using the Singularity image of pySCENIC 0.10.0 (since Docker isn't available on our HPC system), it certainly helped. Now I actually got a progress bar, despite it failing at 57%:
It failed because it ran out of memory:
I used a node with 32GB of memory, with 32 workers. Is that too little? What would you recommend? Thanks!
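For reference, 32GB shared by 32 workers comes to about 1GB per worker. The following is only a back-of-the-envelope sketch of that arithmetic; the helper names and the 4GB-per-worker figure are hypothetical and are not a documented pySCENIC requirement.

```python
def gb_per_worker(total_gb, workers):
    """Memory available to each worker if it is shared evenly."""
    return total_gb / workers


def workers_that_fit(total_gb, needed_gb_per_worker, cores):
    """Largest worker count that stays within both memory and core limits."""
    by_memory = int(total_gb // needed_gb_per_worker)
    return max(1, min(cores, by_memory))


# Node described above: 32GB of memory, 32 workers.
print(gb_per_worker(32, 32))

# Hypothetical assumption: each worker needs about 4GB.
print(workers_that_fit(32, 4, 32))
```

If the job runs out of memory, either lowering the value passed to --num_workers or moving to a node with more memory raises the per-worker share.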
Hi @liboxun, I got it to run in less than 14 min by using the Docker image. I used 20 cores, so the more the better, I think. Here is my code (note that each command is on a single line without "\"; the code in the tutorial did not work for me):
sudo docker pull aertslab/pyscenic:0.10.0
sudo docker run -it --rm -v /path/to/data:/scenicdata aertslab/pyscenic:0.10.0 pyscenic grn --num_workers 20 --transpose -o /scenicdata/expr_mat.adjacencies.tsv /scenicdata/ex_matrix.csv /scenicdata/hgnc_tfs.txt
I have to transpose my expression matrix to get it in the right format, but you might not have to.
sudo docker run -it --rm -v /path/to/data:/scenicdata aertslab/pyscenic:0.10.0 pyscenic ctx scenicdata/expr_mat.adjacencies.tsv /scenicdata/hg19-tss-centered-10kb-7species.mc9nr.feather /scenicdata/hg19-500bp-upstream-7species.mc9nr.feather --annotations_fname /scenicdata/motifs-v9-nr.hgnc-m0.001-o0.0.tbl --expression_mtx_fname /scenicdata/ex_matrix.csv --transpose --mode "dask_multiprocessing" --output /scenicdata/regulons.csv --num_workers 20
# this ran in 14 min on a server with 1TB of RAM, using 20 out of 64 cores
sudo docker run -it --rm -v /path/to/data:/scenicdata aertslab/pyscenic:0.10.0 pyscenic aucell /scenicdata/ex_matrix.csv --transpose /scenicdata/regulons.csv -o /scenicdata/auc_mtx.csv --num_workers 20
# this took less than 10 min
Hope this helps! Morgane
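One way to sidestep the line-continuation problem is to build each command as an argument list instead of a multi-line shell string. This is a minimal sketch, not part of pySCENIC: the helper docker_pyscenic is a hypothetical name, and the image tag, mount point, and flags are copied from the grn command above.

```python
import shlex

IMAGE = "aertslab/pyscenic:0.10.0"  # image tag quoted in the comment above


def docker_pyscenic(data_dir, *pyscenic_args, mount="/scenicdata"):
    """Hypothetical helper: argument list for one `docker run ... pyscenic ...` call."""
    return [
        "docker", "run", "-it", "--rm",
        "-v", f"{data_dir}:{mount}",
        IMAGE, "pyscenic", *pyscenic_args,
    ]


grn_cmd = docker_pyscenic(
    "/path/to/data",
    "grn", "--num_workers", "20", "--transpose",
    "-o", "/scenicdata/expr_mat.adjacencies.tsv",
    "/scenicdata/ex_matrix.csv",
    "/scenicdata/hgnc_tfs.txt",
)

# Prints the command on one line, with no "\" continuations to get mangled.
print(shlex.join(grn_cmd))
```

The printed line can be pasted into a terminal (prefixed with sudo if your setup needs it), or the list can be passed to subprocess.run(grn_cmd, check=True) from a terminal session.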
Hi @morganee261, thanks for that tip! Glad to hear it eventually worked for you. I also got it to run (~23 min) when I moved the task over to a node with 128GB of memory (using 32 out of 32 cores). Best,
Hi @cflerin, I am trying to import the results of the pyscenic CLI (3 csv files) into R for further analysis, but I am having a lot of problems. It seems that having a loom file helps with the import; however, your CLI tutorial exports csv. Could you please provide a brief tutorial on how to import them into R, so that I can run the rest of the SCENIC script and look at the data? Thanks for your help, Morgane
Hi @liboxun, I am having issues with the downstream analysis. I was wondering what platform you are using and whether you have had any luck with it. Thanks,
Hi @morganee261, I use Python. I haven't done any downstream analysis yet. I'll let you know how it goes in the next couple of weeks. Best of luck,
Hi @morganee261, I was able to run the example Jupyter notebook successfully for the 10x PBMC dataset: This notebook was written in Python, and was meant for analysis downstream of While there were several issues (some were due to wrong versions of dependencies, which thankfully were easy enough for me to fix myself), I could largely run through the notebook smoothly. Hopefully this helps! I'm not sure if there's an equivalent example in R, but I'd assume there is, since the original SCENIC was written in R. Best,
Hi @liboxun, many thanks. Weijian
Hi @ureyandy2009, for me, a combination of two changes worked:
Hopefully this helps! Best,
Thank you very much. I think RAM may be the main problem. Many thanks.
I faced the same issue recently and spent 3 days trying to figure it out. The Singularity build wouldn't run for me on my institute's HPC; I kept getting this error: conda installation of my data set: 14766 cells × 23011 genes
1- specified an interactive session;
2- activated the conda environment where pyscenic is installed;
3- ran this script: I just added:
total time consumed: 50 minutes
Hello,
I managed to get as far as Phase II of your tutorial with your data.
But after it had run for 145h I stopped the process. I don't know if it is normal for it to run that long.
Thanks for your help.
Jp
Here is some info:
PHASE I
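# imports needed by the calls below
from arboreto.algo import grnboost2
from pyscenic.utils import modules_from_adjacencies

# GRN inference; this call took about 6h to run
network = grnboost2(expression_data=ex_matrix2,
                    gene_names=gene_names,
                    tf_names=tf_names)
modules = list(modules_from_adjacencies(network, ex_matrix))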
PHASE II
[####################################### ] | 98% Completed | 25min 37.2s
2020-02-12 15:05:46,854 - pyscenic.transform - WARNING - Less than 80% of the genes in Tcf21 could be mapped to mm9-tss-centered-5kb-10species.mc9nr. Skipping this module.
[####################################### ] | 98% Completed | 25min 45.6s
2020-02-12 15:05:55,227 - pyscenic.transform - WARNING - Less than 80% of the genes in Mef2d could be mapped to mm9-tss-centered-5kb-10species.mc9nr. Skipping this module.
[####################################### ] | 98% Completed | 25min 46.4s
2020-02-12 15:05:56,007 - pyscenic.transform - WARNING - Less than 80% of the genes in Meox2 could be mapped to mm9-tss-centered-5kb-10species.mc9nr. Skipping this module.
[####################################### ] | 99% Completed | 18hr 5min 13.5s
[####################################### ] | 99% Completed | 145hr 0min 28.9s^CProcess ForkPoolWorker-446:
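The progress lines above can be checked mechanically to see how long the run sat at its final percentage. This is a rough sketch that assumes the line format shown in the log; parse_progress and stalled_for are hypothetical helper names, not part of pySCENIC or dask.

```python
import re

# Matches progress lines such as
# "[####   ] | 99% Completed | 18hr 5min 13.5s"
PROGRESS = re.compile(
    r"\|\s*(?P<pct>\d+)% Completed\s*\|\s*"
    r"(?:(?P<hr>\d+)hr\s+)?(?:(?P<min>\d+)min\s+)?(?P<sec>\d+(?:\.\d+)?)s"
)


def parse_progress(line):
    """Return (percent, elapsed_seconds), or None for non-progress lines."""
    m = PROGRESS.search(line)
    if m is None:
        return None
    seconds = (int(m["hr"] or 0) * 3600
               + int(m["min"] or 0) * 60
               + float(m["sec"]))
    return int(m["pct"]), seconds


def stalled_for(lines):
    """Seconds between the first and last sighting of the final percentage."""
    points = [p for p in map(parse_progress, lines) if p is not None]
    if not points:
        return 0.0
    last_pct = points[-1][0]
    first_seen = next(t for pct, t in points if pct == last_pct)
    return points[-1][1] - first_seen


log = [
    "[######## ] | 98% Completed | 25min 46.4s",
    "2020-02-12 15:05:56,007 - pyscenic.transform - WARNING - Skipping this module.",
    "[######## ] | 99% Completed | 18hr 5min 13.5s",
    "[######## ] | 99% Completed | 145hr 0min 28.9s^CProcess ForkPoolWorker-446:",
]
print(stalled_for(log) / 3600)  # hours spent at the last percentage
```

A wrapper script could use a check like this to stop a run once the final percentage has not changed for some chosen number of hours.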