Feature/logging #86
Conversation
@@ -1,2 +1,6 @@
#!/bin/bash
sbatch {options_args:s} --array=1-{jobs_no:d} --output="{log_dir:s}/output_%j_%a.log" --error="{log_dir:s}/error_%j_%a.log" {script_path:s}

LOGFILE="$(dirname $0)/submit.log"
Add a comment saying where LOGFILE will be located; the user may not know what "$(dirname $0)" is.
It would also be nice to print something on screen like:
Saving logs to ....
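A minimal sketch of both suggestions combined (the locating comment plus the on-screen message); the exact comment wording is an assumption, not the PR's final text:

```shell
#!/bin/bash
# submit.log is created next to this script: "$(dirname "$0")" is the
# directory containing submit.sh, resolved to an absolute path so the
# user sees exactly where the log will be written
LOGFILE="$(cd "$(dirname "$0")" && pwd)/submit.log"
echo "Saving logs to $LOGFILE"
```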
Codecov Report

@@           Coverage Diff           @@
##           master     #86   +/-   ##
=======================================
  Coverage   64.32%   64.32%
=======================================
  Files          12       12
  Lines         855      855
=======================================
  Hits          550      550
  Misses        305      305
=======================================

Continue to review full report at Codecov.
Why does this PR have so many commits? They have already been correctly merged. Please fix all git-related issues before proceeding. I see your master branch is not up to date. Maybe reading https://help.github.com/articles/syncing-a-fork/ will help.
Force-pushed from 394dd0d to c3e11be
OK, I see only two commits now, so I guess it's fixed.
LOGFILE="$(cd $(dirname $0) && pwd)/submit.log"
printf "Job ID: " > "$LOGFILE"

sbatch {options_args:s} --array=1-{jobs_no:d} --output="{log_dir:s}/output_%j_%a.log" --error="{log_dir:s}/error_%j_%a.log" --parsable {script_path:s} | cut -d ";" -f 1 >> "$LOGFILE" && echo "Saving logs to $LOGFILE"
I'd suggest also saving the stderr stream (i.e. using 2>&1, see http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-3.html).
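A small illustration of the suggested redirection, using a deliberately failing `ls` as a stand-in for a failing sbatch call (the path is made up):

```shell
#!/bin/bash
# 2>&1 sends stderr (fd 2) to wherever stdout (fd 1) currently points,
# so error messages from the command land in the log file as well,
# instead of only appearing on the terminal
LOG=$(mktemp)
ls /no/such/path_xyz >> "$LOG" 2>&1 || true   # ls fails; its error message goes to $LOG
```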
I don't like this approach with &&. It won't work if sbatch returns a non-zero exit code. See e.g.:
→ echo "Job id:" > 1.log
→ ls aaaa >> 1.log && echo "Saving logs"
ls: cannot access 'aaaa': No such file or directory
→ cat 1.log
Job id:
→ echo "Job id:" > 2.log
→ ls >> 2.log && echo "Saving logs"
Saving logs
→ cat 2.log
Job id:
1.csv
1.log
1.txt
I'd put this echo statement one line before printf "Job ID: " > "$LOGFILE" (as that is where writing to the log actually happens).
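The suggested reordering could look like the sketch below: the message is printed unconditionally, right before the first write to the log, with no `&&` chain that a failing sbatch would short-circuit:

```shell
#!/bin/bash
# announce the log location first, then start writing the log;
# this runs regardless of whether a later sbatch call succeeds
LOGFILE="$(cd "$(dirname "$0")" && pwd)/submit.log"
echo "Saving logs to $LOGFILE"     # printed unconditionally
printf "Job ID: " > "$LOGFILE"     # first actual write to the log
```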
@@ -1,2 +1,7 @@
#!/bin/bash
sbatch {options_args:s} --array=1-{jobs_no:d} --output="{log_dir:s}/output_%j_%a.log" --error="{log_dir:s}/error_%j_%a.log" {script_path:s}

# Log file submit.log will be created in the same directory submit.sh is located
Add a clear description of the difference between the submit.log and "{log_dir:s}/output_%j_%a.log" files. The first will contain stderr and stdout from the sbatch command (including the job ID and probably the job execution status). The second contains stdout from the MC engine executable (doing our simulation).
Concerning the start and end time of a job - what about dumping such information at the beginning and end of https://github.com/DataMedSci/mcpartools/blob/master/mcpartools/scheduler/data/run_slurm.sh ?
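A hypothetical sketch of what that could look like inside run_slurm.sh; the placement and message wording are assumptions, and the `:` line stands in for the actual MC engine invocation:

```shell
#!/bin/bash
# record wall-clock start and end times around the simulation command,
# so each job's log shows how long it actually ran
START_TS=$(date +'%Y-%m-%d %H:%M:%S')
echo "Job started: $START_TS"
: # stand-in for the real MC engine invocation
echo "Job ended:   $(date +'%Y-%m-%d %H:%M:%S')"
```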
# Log file submit.log will be created in the same directory submit.sh is located
LOGFILE="$(cd $(dirname $0) && pwd)/submit.log"
printf "Job ID: " > "$LOGFILE"
Also log the job submission time, either at this place or after the sbatch command.
echo "Saving logs to $LOGFILE"
printf "Job ID: " > "$LOGFILE"

sbatch {options_args:s} --array=1-{jobs_no:d} --output="{log_dir:s}/output_%j_%a.log" --error="{log_dir:s}/error_%j_%a.log" --parsable {script_path:s} | cut -d ";" -f 1 >> "$LOGFILE"
I'd suggest the following:
First write sbatch's stderr and stdout to some temporary file (see e.g. the mktemp command).
Then parse its output using cut. If a job ID was found, save it to the log file; otherwise put there whatever was generated.
Move printf "Job ID" after the sbatch command. Do not forget about the stderr stream.
See e.g. the case when sbatch fails due to wrong user arguments:
# plgkongruencj at login01.pro.cyfronet.pl in /net/scratch/people/plgkongruencj/mcpartools on git:feature/logging ✖︎ [20:33:02]
→ PYTHONPATH=. python mcpartools/generatemc.py -j 10 -p 100 tests/res/shieldhit -s "[--ttime=1:90:90]" -w aa
# plgkongruencj at login01.pro.cyfronet.pl in /net/scratch/people/plgkongruencj/mcpartools on git:feature/logging ✖︎ [20:33:34]
→ cat aa/run_20170902_203334/submit.sh
#!/bin/bash
# Log file submit.log will be created in the same directory submit.sh is located
LOGFILE="$(cd $(dirname $0) && pwd)/submit.log"
echo "Saving logs to $LOGFILE"
printf "Job ID: " > "$LOGFILE"
sbatch --ttime=1:90:90 --array=1-10 --output="/net/scratch/people/plgkongruencj/mcpartools/aa/run_20170902_203334/workspace/log/output_%j_%a.log" --error="/net/scratch/people/plgkongruencj/mcpartools/aa/run_20170902_203334/workspace/log/error_%j_%a.log" --parsable /net/scratch/people/plgkongruencj/mcpartools/aa/run_20170902_203334/workspace/main_run.sh | cut -d ";" -f 1 >> "$LOGFILE"
printf "Submission time: `date +"%Y-%m-%d %H:%M:%S"`" >> "$LOGFILE"
# plgkongruencj at login01.pro.cyfronet.pl in /net/scratch/people/plgkongruencj/mcpartools on git:feature/logging ✖︎ [20:33:42]
→ ./aa/run_20170902_203334/submit.sh
Saving logs to /net/scratch/people/plgkongruencj/mcpartools/aa/run_20170902_203334/submit.log
sbatch: unrecognized option '--ttime=1:90:90'
Try "sbatch --help" for more information
# plgkongruencj at login01.pro.cyfronet.pl in /net/scratch/people/plgkongruencj/mcpartools on git:feature/logging ✖︎ [20:33:55]
→ echo $?
0
# plgkongruencj at login01.pro.cyfronet.pl in /net/scratch/people/plgkongruencj/mcpartools on git:feature/logging ✖︎ [20:34:09]
→ cat /net/scratch/people/plgkongruencj/mcpartools/aa/run_20170902_203334/submit.log
Job ID: Submission time: 2017-09-02 20:33:54%
Instead of parsing stdout and stderr together, I used separate files for them and then log them. With this, if some error occurs but the command itself still ends with success, there are no problems with parsing the Job ID.
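A self-contained sketch of that separate-temp-files approach, assuming the shape described above; the real sbatch call is replaced here by an echo that mimics `--parsable` output (`jobid;cluster`), so this is an illustration rather than the PR's exact script:

```shell
#!/bin/bash
# stdout and stderr of the submit command go to separate temp files,
# so a message on stderr cannot corrupt the parsed job ID
OUT=$(mktemp)   # will hold sbatch stdout
ERR=$(mktemp)   # will hold sbatch stderr
trap 'rm -f "$OUT" "$ERR"' EXIT   # temp files removed on any exit
LOGFILE=$(mktemp)
echo "12345;cluster" > "$OUT" 2> "$ERR"   # stand-in for the real sbatch call
if [ -s "$ERR" ]; then
    # sbatch wrote to stderr: keep its message verbatim in the log
    cat "$ERR" >> "$LOGFILE"
else
    # clean run: extract the numeric job ID from the --parsable output
    printf "Job ID: " > "$LOGFILE"
    cut -d ";" -f 1 "$OUT" >> "$LOGFILE"
fi
```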
Force-pushed from d642309 to b9b6c5b
cat $ERR >> "$LOGFILE"
fi

rm $OUT
What if the script is killed or sbatch crashes? Consider adding a trap like this one: https://stackoverflow.com/questions/687014/removing-created-temp-files-in-unexpected-bash-exit
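A minimal sketch of that trap idiom: the cleanup command is registered once and runs on any exit path (normal end, error, or a catchable termination signal), so a crashed sbatch cannot leave stale temp files behind:

```shell
#!/bin/bash
# register cleanup BEFORE using the temp file; the EXIT trap makes an
# explicit rm at the end of the script unnecessary
TMPFILE=$(mktemp)
trap 'rm -f "$TMPFILE"' EXIT
echo "captured sbatch output" > "$TMPFILE"
# ... parsing would happen here; no explicit rm needed afterwards
```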
LOGFILE="$(cd $(dirname $0) && pwd)/submit.log"
echo -n "" > "$LOGFILE"

OUT=`mktemp`
Add a bit of a comment about what is happening here. The code does a much better job now, but it is starting to become much harder to read.
Something like: "We now create two temporary files, which will store the stderr and stdout of the sbatch command. Their content will be used for further analysis, i.e. extraction of the job ID. The files will be removed on script exit (see the trap command)."
I changed the submit templates and tested them on a system with Slurm, but unfortunately I don't have access to a cluster with Torque, so that part only works in theory.
I haven't figured out how to log the start and end times of batch jobs as requested in the corresponding issue (#84) yet, but I'll put my mind to it.