When a suite is restarted on a different host, jobs that were running locally (on the original suite host) may be marked as failed after the restart.
The following test (which currently fails) outlines the issue:
#!/bin/bash
# THIS FILE IS PART OF THE CYLC SUITE ENGINE.
# Copyright (C) 2008-2018 NIWA & British Crown (Met Office) & Contributors.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#-------------------------------------------------------------------------------
. "$(dirname "$0")/test_header"
#-------------------------------------------------------------------------------
export CYLC_TEST_HOST=$( \
    cylc get-global-config -i '[test battery]remote host with shared fs' \
    2>'/dev/null')
if [[ -z "${CYLC_TEST_HOST}" ]]; then
    skip_all '"[test battery]remote host with shared fs": not defined'
fi
set_test_number 3
#-------------------------------------------------------------------------------
# Test Cylc's handling of local jobs when the suite is restarted on a different
# host.
TEST_DIR="$HOME/cylc-run/" init_suite "${TEST_NAME_BASE}" <<< '
[scheduling]
    [[dependencies]]
        graph = """
            foo:start => bar
            foo & bar => pub
        """
[runtime]
    [[foo]]
        # innocent bystander
        script = sleep 40
    [[bar]]
        script = """
            function poll() {
                local TIMEOUT="$(($(date +%s) + 60))"  # wait 1 minute
                while (($(date +%s) < TIMEOUT)) && eval "$@"; do
                    sleep 1
                done
            }
            cylc stop "${CYLC_SUITE_NAME}" --now
            poll test -f "${CYLC_SUITE_RUN_DIR}/.service/contact"
            sleep 1
            cylc restart "${CYLC_SUITE_NAME}" --host="'"${CYLC_TEST_HOST}"'"
        """
'
cylc run "${SUITE_NAME}" --host='localhost'
# wait for suite to stop
FILE=$(cylc cat-log "${SUITE_NAME}" -m p | xargs readlink -f)
log_scan "${TEST_NAME_BASE}-stop" "${FILE}" 20 1 \
    'Suite shutting down - REQUEST(NOW)'
# wait for suite to restart
poll ! test -f "${SUITE_RUN_DIR}/.service/contact"
sleep 2
# wait for suite to stop
FILE=$(cylc cat-log "${SUITE_NAME}" -m p | xargs readlink -f)
log_scan "${TEST_NAME_BASE}-restart" "${FILE}" 60 1 \
    '\[pub.1\] -(current:running)> succeeded' \
    'Suite shutting down - AUTOMATIC'
cylc stop "${SUITE_NAME}" --now --now
poll test -f "${SUITE_RUN_DIR}/.service/contact"
sleep 1
purge_suite "${SUITE_NAME}"
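The `poll` calls in the test can read backwards at first glance: the loop keeps sleeping while the given command succeeds, so `poll test -f FILE` waits for FILE to disappear and `poll ! test -f FILE` waits for it to appear. Below is a minimal standalone sketch of that behaviour, modelled on the `poll` function defined in the `bar` task script above. The `POLL_TIMEOUT` variable and the `marker` temporary file are hypothetical names introduced only for this illustration, and the `poll` provided by `test_header` is not shown in this issue and may differ.

```shell
#!/bin/bash
# Sketch only: mirrors the poll() defined in the "bar" task script.
# POLL_TIMEOUT is a hypothetical knob added for this demo (default 60 s).
poll() {
    local timeout="$(( $(date +%s) + ${POLL_TIMEOUT:-60} ))"
    # Keep waiting while the command SUCCEEDS and the timeout has not passed.
    # eval is used (as in the original) so a leading "!" negates the command;
    # note that it re-splits arguments, so paths containing spaces would break.
    while (( $(date +%s) < timeout )) && eval "$@"; do
        sleep 1
    done
}

# Stand-in for the suite contact file: it exists while the "suite" is running.
marker="$(mktemp)"

# Simulate the suite shutting down after 2 seconds by removing the file.
( sleep 2; rm -f "${marker}" ) &

# Returns once the file has gone (or after the timeout).
poll test -f "${marker}"
wait

if [[ ! -e "${marker}" ]]; then
    echo "marker removed, poll returned"
fi
```

In the test this corresponds to waiting for `.service/contact` to vanish after `cylc stop`, and then, with the negated form, waiting for it to reappear once the suite has restarted on the other host.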