This repository has been archived by the owner on Mar 16, 2022. It is now read-only.

Configuring to run with SLURM #74

Open
graffico opened this issue Mar 23, 2017 · 1 comment

Comments

graffico commented Mar 23, 2017

Hi, I'm getting the following error running FALCON_unzip on a cluster using SLURM. I'm assuming my .cfg isn't configured properly. Any help would be greatly appreciated.

$ fc_unzip.py fc_unzip.cfg
[INFO]Setup logging from file "None".
[WARNING]In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.fs_based' from '/data/harrison/FALCON-integrate/pypeFLOW/pwatcher/fs_based.pyc'>
[INFO]In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.fs_based' from '/data/harrison/FALCON-integrate/pypeFLOW/pwatcher/fs_based.pyc'>
[INFO]job_type='slurm', job_queue='default', sge_option=None, use_tmpdir=None, squash=False
[INFO]Num unsatisfied: 1, graph: 1
[INFO]About to submit: Node(3-unzip/reads)
[INFO]starting job Job(jobid='P030fee073fa368', cmd='/bin/bash run.sh', rundir='/data/harrison/falcon/3-unzip/reads', options={'job_queue': 'default', 'sge_option': ' -pe smp 12 -q bigmem', 'job_type': 'slurm'})
[INFO]!sbatch -J P030fee073fa368 -pe smp 12 -D /data/harrison/falcon/mypwatcher/jobs/P030fee073fa368 -o stdout -e stderr --wrap="/bin/bash /data/harrison/falcon/mypwatcher/wrappers/run-P030fee073fa368.bash"
sbatch: error: Unable to open file smp
[ERROR]In pwatcher.fs_based.cmd_run(), failed to submit background-job:
MetaJobSlurm(MetaJob(job=Job(jobid='P030fee073fa368', cmd='/bin/bash run.sh', rundir='/data/harrison/falcon/3-unzip/reads', options={'job_queue': 'default', 'sge_option': ' -pe smp 12 -q bigmem', 'job_type': 'slurm'}), lang_exe='/bin/bash'))
Traceback (most recent call last):
  File "/data/harrison/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 530, in cmd_run
    state.submit_background(bjob)
  File "/data/harrison/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 117, in submit_background
    bjob.submit(self, exe, script_fn) # Can raise
  File "/data/harrison/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 419, in submit
    system(sge_cmd, checked=True) # TODO: Capture sbatch-jobid
  File "/data/harrison/FALCON-integrate/pypeFLOW/pwatcher/fs_based.py", line 549, in system
    raise Exception('{} <- {!r}'.format(rc, call))
Exception: 256 <- 'sbatch -J P030fee073fa368 -pe smp 12 -D /data/harrison/falcon/mypwatcher/jobs/P030fee073fa368 -o stdout -e stderr --wrap="/bin/bash /data/harrison/falcon/mypwatcher/wrappers/run-P030fee073fa368.bash"'
[ERROR]Failed to enqueue 1 of 1 jobs: set([Node(3-unzip/reads)])
[WARNING]Nothing is happening, and we had 0 failures. Should we quit? Instead, we will just sleep.
[INFO]sleep 0.1s
Traceback (most recent call last):
  File "/data/harrison/FALCON-integrate/fc_env/bin/fc_unzip.py", line 4, in <module>
    __import__('pkg_resources').run_script('falcon-unzip==0.4.0', 'fc_unzip.py')
  File "/data/harrison/FALCON-integrate/fc_env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 738, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/data/harrison/FALCON-integrate/fc_env/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1499, in run_script
    exec(code, namespace, namespace)
  File "/data/harrison/FALCON-integrate/fc_env/lib/python2.7/site-packages/falcon_unzip-0.4.0-py2.7.egg/EGG-INFO/scripts/fc_unzip.py", line 4, in <module>
    main(sys.argv)
  File "/data/harrison/FALCON-integrate/fc_env/lib/python2.7/site-packages/falcon_unzip-0.4.0-py2.7.egg/falcon_unzip/unzip.py", line 384, in main
    unzip_all(config)
  File "/data/harrison/FALCON-integrate/fc_env/lib/python2.7/site-packages/falcon_unzip-0.4.0-py2.7.egg/falcon_unzip/unzip.py", line 222, in unzip_all
    with open('./3-unzip/reads/ctg_list') as f:
IOError: [Errno 2] No such file or directory: './3-unzip/reads/ctg_list'

Here are the contents of fc_unzip.cfg:

[General]
job_type = slurm

# list of fasta files
input_fofn = test.fofn

smrt_bin = /data/harrison/programs/virtualenv-15.1.0/my_env/bin/

sge_phasing = --ntasks=1 --nodes=1 --cpus-per-task=4
sge_quiver = --ntasks=1 --nodes=1 --cpus-per-task=4
sge_track_reads = --ntasks=1 --nodes=1 --cpus-per-task=4
sge_blasr_aln = --ntasks=1 --nodes=1 --cpus-per-task=4
sge_hasm = --ntasks=1 --nodes=1 --cpus-per-task=4

unzip_concurrent_jobs = 1
quiver_concurrent_jobs = 1
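The root cause is visible in the log: the job was submitted with sge_option=' -pe smp 12 -q bigmem', which is Grid Engine syntax, so sbatch misparses it ("Unable to open file smp"). That string does not appear in the cfg shown, so it is presumably a default or carried over from an fc_run.cfg; wherever it is set, it needs SLURM equivalents. A minimal sketch, assuming standard sbatch flags (--cpus-per-task, --partition) and that the 3-unzip/reads step reads the sge_track_reads key (verify which sge_* key your FALCON version uses for that step):

```ini
# Hypothetical SLURM-syntax replacement for the SGE-style option seen in the
# log (' -pe smp 12 -q bigmem'). --cpus-per-task and --partition are standard
# sbatch flags; adjust the key name to whichever sge_* option the failing
# step actually consumes.
sge_track_reads = --ntasks=1 --nodes=1 --cpus-per-task=12 --partition=bigmem
```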
@pb-cdunn

Instead of job_type=slurm, I recommend using a "blocking" process-watcher, with your own blocking call to Slurm.

That might be easier to debug. But either way, you need to go into the task directory to see stderr/stdout (possibly under pwatcher.dir/ in your case).
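For reference, a blocking process-watcher moves job submission out of pypeFLOW's fs_based watcher and into a shell template you control, so the exact srun/sbatch invocation is yours to debug. A sketch of such a [General] section, assuming pypeFLOW's blocking watcher and its job_queue template substitution (the exact template variable names, e.g. ${JOB_ID}, ${CMD}, ${NPROC}, depend on your pypeFLOW version, so check pwatcher/blocking.py before copying this):

```ini
[General]
# Use pypeFLOW's blocking watcher instead of fs_based (assumption: this
# option name matches your pypeFLOW version).
pwatcher_type = blocking
job_type = string
# job_queue becomes a command template; variable names here are illustrative
# and must match what your pwatcher/blocking.py actually substitutes.
job_queue = srun --wait=0 --job-name=${JOB_ID} --ntasks=1 --cpus-per-task=${NPROC} bash -c "${CMD}"
```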
