PBS qsub Node Specification Error #118

Open
CriticalSci opened this issue May 13, 2021 · 4 comments

Comments

@CriticalSci

Hi there,

I installed the ATAC-seq pipeline on a PBS cluster.
Caper is configured with the pbs backend, and the leader job was submitted successfully.
The pipeline seems to fail when submitting the first sub-job; this is what the generated submission script (script.submit) looks like:

#!/bin/bash
if [ -z \"$SINGULARITY_BINDPATH\" ]; then export SINGULARITY_BINDPATH=; fi; \
if [ -z \"$SINGULARITY_CACHEDIR\" ]; then export SINGULARITY_CACHEDIR=; fi;

echo "/bin/bash /scratch/username/KMS11_ATAC/runQSZNCL.ctrls/atac/cc317db4-d5f2-4ad7-ad32-5b95862f02c7/call-read_genome_tsv/execution/script" | \
qsub \
    -N cromwell_cc317db4_read_genome_tsv \
    -o /scratch/username/KMS11_ATAC/runQSZNCL.ctrls/atac/cc317db4-d5f2-4ad7-ad32-5b95862f02c7/call-read_genome_tsv/execution/stdout \
    -e /scratch/username/KMS11_ATAC/runQSZNCL.ctrls/atac/cc317db4-d5f2-4ad7-ad32-5b95862f02c7/call-read_genome_tsv/execution/stderr \
    -lnodes=1:ppn=1:mem=2048mb \
    -lwalltime=1:0:0 \
     \
    -q q32 \
     -P sbs_liyh \
    -V

When this is submitted, stderr.background comes back with "qsub: node(s) specification error".
Looks like some spaces are missing?

@leepc12
Contributor

leepc12 commented Sep 7, 2021

Sorry about the very late response. What is the node specification (-l) for your PBS cluster? What node specification do you usually use when you submit your jobs without Caper?
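For example (just a sketch, not from your cluster; the exact spelling depends on whether the scheduler is Torque/PBS-classic or PBS Pro):

# Torque / PBS classic style:
qsub -l nodes=1:ppn=4,mem=8gb -l walltime=24:0:0 job.sh

# PBS Pro "select" style:
qsub -l select=1:ncpus=4:mem=8gb -l walltime=24:0:0 job.sh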

@CriticalSci
Author

No problem. If I remember correctly, I got it working by making a hacky fix to the cromwell_backend.py script:

@@ -788,19 +788,19 @@ class CromwellBackendPBS(CromwellBackendLocal):
                 if [ -z \\"$SINGULARITY_BINDPATH\\" ]; then export SINGULARITY_BINDPATH=${singularity_bindpath}; fi; \\
                 if [ -z \\"$SINGULARITY_CACHEDIR\\" ]; then export SINGULARITY_CACHEDIR=${singularity_cachedir}; fi;
 
-                echo "${if !defined(singularity) then '/bin/bash ' + script
+                echo '${if !defined(singularity) then 'module load anaconda2020/python3 && eval \"$(/usr/local/anaconda3-2020/bin/conda shell.bash hook)\" && conda activate encode-atac-seq-pipeline && /bin/bash ' + script
                         else
                           'singularity exec --cleanenv ' +
                           '--home ' + cwd + ' ' +
                           (if defined(gpu) then '--nv ' else '') +
-                          singularity + ' /bin/bash ' + script}" | \\
+                          singularity + ' /bin/bash ' + script}' | \\
                 qsub \\
                     -N ${job_name} \\
                     -o ${out} \\
                     -e ${err} \\
-                    ${true="-lnodes=1:ppn=" false="" defined(cpu)}${cpu}${true=":mem=" false="" defined(memory_mb)}${memory_mb}${true="mb" false="" defined(memory_mb)} \\
-                    ${'-lwalltime=' + time + ':0:0'} \\
-                    ${'-lngpus=' + gpu} \\
+                    ${true="-l select=1:ncpus=" false="" defined(cpu)}${cpu}${true=":mem=" false="" defined(memory_mb)}${memory_mb}${true="MB" false="" defined(memory_mb)} \\
+                    ${'-l walltime=' + time + ':0:0'} \\
+                    ${'-l ngpus=' + gpu} \\
                     ${'-q ' + pbs_queue} \\
                     ${pbs_extra_param} \\
                     -V

It adds the missing spaces, switches the resource request to the -l select= syntax, and adds an extra conda activation call (which is quite specific to this cluster), since the environment did not seem to be inherited by sub-jobs.
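With that patch, script.submit for the same read_genome_tsv task comes out roughly like this (module/conda prefix and the long paths abbreviated with ...), and qsub accepts the node specification:

echo "module load anaconda2020/python3 && ... && conda activate encode-atac-seq-pipeline && /bin/bash .../call-read_genome_tsv/execution/script" | \
qsub \
    -N cromwell_cc317db4_read_genome_tsv \
    -o .../call-read_genome_tsv/execution/stdout \
    -e .../call-read_genome_tsv/execution/stderr \
    -l select=1:ncpus=1:mem=2048MB \
    -l walltime=1:0:0 \
    -q q32 \
    -P sbs_liyh \
    -V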

@leepc12
Contributor

leepc12 commented Oct 25, 2021

Sorry about the late response.
I have actually been working on refactoring Caper for the HPC backends and will release a new version of Caper today or tomorrow.
You can customize the resource parameters for any HPC type (pbs, sge, slurm) in the conf file ~/.caper/default.conf,
so we won't need any hacky workaround to run Conda + PBS.

The new Caper runs like caper run .. --conda ENV_NAME. ENV_NAME can be skipped if it's defined in the WDL workflow's meta (default_conda).
It will internally call conda run -n ENV_NAME TASK_SCRIPT.
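For example (atac.wdl and input.json below are just placeholders for your workflow and inputs; the env name is the one used earlier in this thread):

caper run atac.wdl -i input.json --conda encode-atac-seq-pipeline
# each task's script then gets wrapped roughly as:
#   conda run -n encode-atac-seq-pipeline TASK_SCRIPT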

@CriticalSci
Author

Great! Thanks for letting me know. I will test it out soon :)
