testsuite: test prolog timeout after cancel
Problem: No tests check that the perilog plugin handles the case
where a job prolog is canceled and the cancel times out.

Add a test to t2274-manager-perilog-per-rank.t that cancels a prolog
that ignores SIGTERM and ensures the job does not start.
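
The condition this test exercises can be reproduced outside Flux with a minimal shell sketch. It is illustrative only and not part of the testsuite: the durations are scaled to whole seconds for portability, the variable names are hypothetical, and the final SIGKILL is just cleanup for the sketch, not a statement about what perilog.so does once kill-timeout expires.

```shell
#!/bin/sh
# Illustrative sketch only; not part of the Flux testsuite.

# The child ignores SIGTERM (signal 15), like the prolog command in the test.
# "exec" keeps the sleep in the same process; ignored signals stay ignored
# across exec.
sh -c 'trap "" 15; exec sleep 5' &
pid=$!

sleep 1                 # give the child time to install its trap
kill -TERM "$pid"       # cancel attempt, ignored by the child
sleep 1                 # stands in for kill-timeout = 1

# If the child is still alive here, the cancel has timed out.
if kill -0 "$pid" 2>/dev/null; then
    survived=yes
else
    survived=no
fi

kill -KILL "$pid" 2>/dev/null   # cleanup for this sketch only
status=0
wait "$pid" || status=$?
echo "survived SIGTERM: $survived (wait status $status)"
```

A process that shrugs off SIGTERM for longer than the kill-timeout window is exactly the case in which the job must not proceed to a start event.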
grondo committed Nov 3, 2024
1 parent b7bd48e commit 2811174
Showing 1 changed file with 23 additions and 0 deletions.
23 changes: 23 additions & 0 deletions t/t2274-manager-perilog-per-rank.t
@@ -345,12 +345,35 @@ test_expect_success 'perilog: epilog failure drains ranks' '
test "$(flux resource drain -no {reason})" = "epilog failed for job $jobid" &&
undrain_all
'
test_expect_success 'perilog: job does not start when prolog cancel times out' '
undrain_all &&
flux config load <<-EOF &&
[job-manager.prolog]
per-rank = true
command = [ "sh",
"-c",
"trap \"\" 15; sleep 2" ]
kill-timeout = 1
timeout = ".25s"
[job-manager.epilog]
per-rank = true
command = [ "true" ]
EOF
flux jobtap query perilog.so | jq &&
jobid=$(flux submit hostname) &&
flux job wait-event -vHt 30 $jobid clean &&
flux job eventlog -H $jobid >prolog-cancel-eventlog.out &&
cat prolog-cancel-eventlog.out &&
test_must_fail grep "start$" prolog-cancel-eventlog.out &&
grep epilog-start prolog-cancel-eventlog.out
'
test_expect_success 'perilog: load offline plugin before perilog.so' '
flux jobtap remove perilog.so &&
flux jobtap load $OFFLINE_PLUGIN &&
flux jobtap load perilog.so
'
test_expect_success 'perilog: load simple prolog for offline rank testing' '
undrain_all &&
flux config load <<-EOF
[job-manager.prolog]
per-rank = true