- Notation remarks:
- ">" denotes the command line prompt (whatever it may be in your profile)
- ">>" denotes the output of a command.
- On a command line, "<x>" is an example of an argument value.
- MCCE_Benchmarking pip-installed into a dedicated environment, e.g.
mce
;
(mce) >pip install git+https://github.com/GunnerLab/MCCE_Benchmarking.git#egg=mcce_benchmark
- The environment is activated:
>conda activate mce
(mce) >cd /home/muddin/raihan_work/complexI/complexI_E-channel/md_frames/mq_all_frames
# List its contents (truncated):
(mce) >ls -l
drwxrwxr-x 3 muddin muddin 31 Jul 30 2023 fr0/
drwxrwxr-x 7 muddin muddin 4.0K Aug 2 2023 frame0/
drwxrwxr-x 7 muddin muddin 4.0K Jul 30 2023 frame10/
drwxrwxr-x 6 muddin muddin 4.0K Jul 30 2023 frame20/
drwxrwxr-x 6 muddin muddin 4.0K Jul 30 2023 frame30/
drwxrwxr-x 6 muddin muddin 4.0K Jul 30 2023 frame40/
drwxrwxr-x 6 muddin muddin 4.0K Jul 30 2023 frame50/
drwxrwxr-x 6 muddin muddin 4.0K Mar 19 01:44 frame60/
#[...]
(mce) >mkdir ./pdblist
# examples:
(mce) >cp frame0/pqrs.pdb ./pdblist/frame0.pdb
(mce) >cp frame10/pqrs.pdb ./pdblist/frame10.pdb
(mce) >ls -l ./pdblist/
rwxrwxrwx2 muddin muddin 25 Mar 20 16:53 frame0.pdb
rwxrwxrwx2 muddin muddin 25 Mar 20 16:53 frame10.pdb
#[...]
The 'setup' mode prepares the pdbs in their own folders where mcce will be run, and creates a custom run script if requested.
The default run script is the following:
#!/bin/bash
step1.py --dry prot.pdb
step2.py -d 4
step3.py -d 4
step4.py --xts
sleep 10
- A directory where the runs will be setup (does not have to exists);
- The name of the folder (or file) listing the pdbs to use, i.e. 'pdblist' in our example;
Note: Changing even just one option, say "-d 4" to "-d 8" makes the script a custom one, in which case a "job name" must be passed to the `-job_name" option.
In our current use case, Raihan needs to run steps 1 and 2 to add a membrane (to be placed in another protein), so we know that the final script will look like:
#!/bin/bash
step1.py --dry prot.pdb [..]
step2.py -d 4
step3.py --norun
step4.py --norun
sleep 10
-job_name
-sentinel_file
: it is "pK.out" if you are running step4, here it must be "step2_out.pdb"- The "off switch" for steps 3 and 4:
--s3_norun
and--s4_norun
-u
: Accepts all "not often used" parameters, e.g. to add a membrane
The fully specified command line for this task is the following (press Enter to run it):
(mce) >bench_setup user_pdbs -bench_dir ./S12M -pdbs_list ./pdblist -job_name s12m -sentinel_file step2_out.pdb --s3_norun --s4_norun -u IPECE_ADD_ME=t,IPECE_MEM_THICKNESS=28
(mce) >ls -l S12M
drwxr-xr-x 5 muddin muddin 4.0K Mar 17 13:42 runs/
(mce) >ls -l S12M/runs
drwxr-xr-x 5 muddin muddin 4.0K Mar 17 13:42 runs/
-rwxr-xr-x 5 muddin muddin 90 Mar 3 10:34 default_run.sh*
-rwxr-xr-x 5 muddin muddin 87 Mar 17 13:42 s12m.sh*
drwxr-xr-x 5 muddin muddin 4.0K Mar 17 13:51 FRAME0/
drwxr-xr-x 5 muddin muddin 4.0K Mar 17 13:51 FRAME10/
drwxr-xr-x 5 muddin muddin 4.0K Mar 17 13:55 FRAME20/
# [...]
-rw-r--r-- 5 muddin muddin 27 Mar 17 13:56 book.txt
(You could also modify it.)
(mce) >cat S12M/runs/s12m.sh
#!/bin/bash
step1.py prot.pdb --dry -u IPECE_ADD_ME=t,IPECE_MEM_THICKNESS=28
step2.py -u IPECE_ADD_ME=t,IPECE_MEM_THICKNESS=28
step3.py --norun -u IPECE_ADD_ME=t,IPECE_MEM_THICKNESS=28
step4.py --xts --norun -u IPECE_ADD_ME=t,IPECE_MEM_THICKNESS=28
sleep 10
Dev TODO: For ibench
: Make -u
step-specific => mapping of keys per step needed.
# first batch:
(mce) >bench_batch -bench_dir ./S12M -job_name s12m -sentinel_file step2_out.pdb
# monitor the bookkeeping file; a state of 'r' (running), means mcce has been launched in that folder:
(mce) >cat S12M/runs/book.txt
FRAME0 r
FRAME10 r
FRAME20 r
#[...]
Wait a few minutes...
# second batch:
(mce) >bench_batch -bench_dir ./S12M -job_name s12m -sentinel_file step2_out.pdb
# book file, some runs show a completed state ('c'):
(mce) >cat small/runs/book.txt
FRAME0 c
FRAME10 r
FRAME20 c
#[...]
Wait a few minutes...
# third batch:
(mce) >bench_batch -bench_dir ./S12M -job_name s12m -sentinel_file step2_out.pdb
# book file:
(mce) >cat small/runs/book.txt
FRAME0 c
FRAME10 c
FRAME20 c
#[...]
If all the folders listed in the book file have a state of 'c' (complete) or 'e' (error): they all have been processed.
You can list one of them to verify that no mcce output file for step3 or step4 was created.
Here is a short survey of the available options.
1. To create a set using the pdbs from pKaDBv1, use the setup command & sub-command bench_setup pkdb_pdbs
.
Note: bench_setup
creates the run script with the command line options for the mcce steps; if none are given, the default run script is used.
-
Minimal input
-bench_dir
:(mce) >bench_setup pkdb_pdbs -bench_dir <some/dir> # -bench_dir is a required arg
You can limit the number of curated proteins (folder) to setup using
-n_pdbs
. Give a low number for testing (max is 120, default):(mce) >bench_setup pkdb_pdbs -bench_dir <some/dir> -n_pdbs <2>
-
-job_name
option The default job name is 'default_run'; you can change it using-job_name
.- If you use
-job_name
without amending any of the steps args, the default_run.sh file will be soft-linked as ; - If you pass one or more non-default mcce steps parameters, then you MUST have a job name (& not 'default_run'):
(mce) >bench_setup pkdb_pdbs -bench_dir <some/dir> -n_pdbs <2> -d 8 -job_name <foo_e8>
- If you use
-
-sentinel_file
option: The default sentinel_file is pK.out (means the run completed Step 4). It is part of script setup to ensure it is deleted prior to launching. You need to include it if you are not running all 4 steps. Example running only steps 1 & 2:
(mce) >bench_setup pkdb_pdbs -bench_dir <some/dir> -s3_norun True -s4_norun True -job_name <up_to_s2> -sentinel_file step2_out.pdb
# OR
(mce) >bench_setup user_pdbs -bench_dir <some/dir> -pdbs_list ./pdbs -s3_norun True -s4_norun True -job_name <up_to_s2> -sentinel_file step2_out.pdb
DO NOT USE --launch
or bench_setup launch
: the scheduling cannot run due to issues in Stable-MCCE
packaging.
Instead, launch a batch of 10 runs every time this command is used:
(mce) >bench_batch -bench_dir <some/dir> -job_name <foo_e8> [-n_batch if not 10] # 10 is the default batch size
--launch
option (flag) means "create a crontab entry to scheduling the processing of all pdbs in the set". Launching the scheduled processing of the set withbench_setup
is possible with the--launch
flag:
(mce) >bench_setup pkdb_pdbs -bench_dir <some/dir> -d 8 -job_name <foo_e8> --launch
# OR
(mce) >bench_setup user_pdbs -bench_dir <some/dir> -pdbs_list ./pdbs -d 8 -job_name <foo_e8> --launch
In this case, you CANNOT review or amend the run script. You would not launch the job this way if, for instance,
you want to add Step 6.
TODO: Step 6 will be part of the script once a hydrogen-bond network comparison procedure is finalized.
2. To create a set using a user-provided list of pdbs, use the setup command & sub-command bench_setup user_pdbs
.
- Minimal input
-bench_dir
,-pdbs_list
:
(mce) >bench_setup user_pdbs -bench_dir <some/dir> -pdbs_list ./path/to/dir or ./path/to/file/of/pdbs filepaths
-
-job_name
option: same usage as withpkdb_pdbs
sub-command -
-sentinel_file
option: same usage as withpkdb_pdbs
sub-command -
--launch
option (flag): same usage as withpkdb_pdbs
sub-command
- Use the automated scheduling to process all the pdbs in batches of size n_batch (default 10):
(mce) >bench_setup launch -bench_dir <some/dir> -job_name <foo> -n_batch <4> [-sentinel_file x] # needed if used in setup
The command line interface for analyzing a job setup via bench_analyze [pkdb_pdbs or user_pdbs]
; requires 1 argument: -bench_dir
.
The first step of the analysis checks that all the runs are completed. Incidentally, it's an easy way to check how far along the processing is.
- Output files & sub-commands:
- user_pdbs: Collated pK.out and sum_crg.out, res and confs count files & related figures
- pkdb_pdbs: Analysis additionally includes res and confs stats viz experimental pKa values and related figures.
To compare two (completed) sets of runs, use bench_compare
:
- Options: -di1, -dir2 -o (output folder), and two flags:
--user_db_pdbs
# absence means 'pkdb_pdbs'--dir2_is_refset
# if used, --user_pdbs must NOT be present, e.g.:
(mce) >bench_compare -dir1 <d1> dir2 parse.e4 --dir2_is_refset -o <d1/comp>
Please, open an issue with any problem installing or using the app, along with enhancement requests on the MCCE_Benchmarking issues page. Thanks.