move suite servers to scheduler section of global config #3962

Merged
merged 7 commits into from
Nov 26, 2020
161 changes: 86 additions & 75 deletions cylc/flow/cfgspec/globalcfg.py
@@ -101,6 +101,92 @@
desc='''
The number of old run directory trees to retain at start-up.
''')
Conf('auto restart delay', VDR.V_INTERVAL, desc='''
Relates to Cylc's auto stop-restart mechanism (see
:ref:`auto-stop-restart`). When a host is set to automatically
shutdown/restart it will first wait a random period of time
between zero and ``auto restart delay`` seconds before
beginning the process. This is to prevent large numbers of
suites from restarting simultaneously.
''')
with Conf('run hosts', desc='''
Configure allowed suite hosts and ports for starting up (running or
restarting) suites. Additionally configure host selection settings
specifying how to determine the most suitable run host at any given
time from those configured.
'''):
Conf('available', VDR.V_SPACELESS_STRING_LIST, desc='''
A list of allowed suite run hosts. One of these hosts will be
appointed for a suite to start up on if an explicit host is not
provided as an option to a ``run`` or ``restart`` command.
''')
Conf('ports', VDR.V_INTEGER_LIST, list(range(43001, 43101)),
desc='''
A list of allowed ports for Cylc to use to run suites.
''')
Conf('condemned', VDR.V_ABSOLUTE_HOST_LIST, desc='''
Hosts specified in ``condemned`` will not be considered
as suite run hosts. If suites are already running on
``condemned`` hosts they will be automatically shut down and
restarted (see :ref:`auto-stop-restart`).
''')
Conf('ranking', VDR.V_STRING, desc='''
A multiline string containing Python expressions to filter
and/or rank hosts. For example:
cpu_percent() < 70
cpu_percent()
This keeps only hosts where ``cpu_percent()`` is below 70, then
ranks them by ``cpu_percent()``.
''')

with Conf('host self-identification', desc='''
The suite host's identity must be determined locally by cylc and
passed to running tasks (via ``$CYLC_SUITE_HOST``) so that task
messages can target the right suite on the right host.

.. todo
Is it conceivable that different remote task hosts at the same site
might see the suite host differently? If so we would need to be
able to override the target in suite configurations.
'''):
Conf(
'method', VDR.V_STRING, 'name',
options=['name', 'address', 'hardwired'],
desc='''
This item determines how cylc finds the identity of the
suite host. For the default *name* method cylc asks the
suite host for its host name. This should resolve on remote
task hosts to the IP address of the suite host; if it
doesn't, adjust network settings or use one of the other
methods. For the *address* method, cylc attempts to use a
special external "target address" to determine the IP
address of the suite host as seen by remote task hosts. And
finally, as a last resort, you can choose the *hardwired*
method and manually specify the host name or IP address of
the suite host.

Options:

name
Self-identified host name.
address
Automatically determined IP address (requires *target*).
hardwired
Manually specified host name or IP address (requires
*host*).
''')
Conf('target', VDR.V_STRING, 'google.com', desc='''
This item is required for the *address* self-identification
method. If your suite host sees the internet, a common address
such as ``google.com`` will do; otherwise choose a host
visible on your intranet.
''')
Conf('host', VDR.V_STRING, desc='''
Use this item to explicitly set the name or IP address of the
suite host if you have to use the *hardwired*
self-identification method.
''')

with Conf('events', desc='''
You can define site defaults for each of the following options,
@@ -536,81 +622,6 @@
Conf('from', VDR.V_STRING)
Conf('to', VDR.V_STRING)

# suite
with Conf('suite host self-identification', desc='''
The suite host's identity must be determined locally by cylc and passed
to running tasks (via ``$CYLC_SUITE_HOST``) so that task messages can
target the right suite on the right host.

.. todo
Is it conceivable that different remote task hosts at the same site
might see the suite host differently? If so we would need to be able
to override the target in suite configurations.
'''):
Conf('method', VDR.V_STRING, 'name',
options=['name', 'address', 'hardwired'], desc='''
This item determines how cylc finds the identity of the suite host.
For the default *name* method cylc asks the suite host for its host
name. This should resolve on remote task hosts to the IP address of
the suite host; if it doesn't, adjust network settings or use one
of the other methods. For the *address* method, cylc attempts to
use a special external "target address" to determine the IP address
of the suite host as seen by remote task hosts. And finally, as a
last resort, you can choose the *hardwired* method and manually
specify the host name or IP address of the suite host.

Options:

name
Self-identified host name.
address
Automatically determined IP address (requires *target*).
hardwired
Manually specified host name or IP address (requires *host*).
''')
Conf('target', VDR.V_STRING, 'google.com', desc='''
This item is required for the *address* self-identification method.
If your suite host sees the internet, a common address such as
``google.com`` will do; otherwise choose a host visible on your
intranet.
''')
Conf('host', VDR.V_STRING, desc='''
Use this item to explicitly set the name or IP address of the suite
host if you have to use the *hardwired* self-identification method.
''')

# suite
with Conf('suite servers', desc='''
Configure allowed suite hosts and ports for starting up (running or
restarting) suites. Additionally configure host selection settings
specifying how to determine the most suitable run host at any given
time from those configured.
'''):
Conf('run hosts', VDR.V_SPACELESS_STRING_LIST, desc='''
A list of allowed suite run hosts. One of these hosts will be
appointed for a suite to start up on if an explicit host is not
provided as an option to a ``run`` or ``restart`` command.
''')
Conf('run ports', VDR.V_INTEGER_LIST, list(range(43001, 43101)),
desc='''
A list of allowed ports for Cylc to use to run suites.
''')
Conf('condemned hosts', VDR.V_ABSOLUTE_HOST_LIST, desc='''
Hosts specified in ``condemned hosts`` will not be considered as
suite run hosts. If suites are already running on ``condemned
hosts`` they will be automatically shutdown and restarted (see
:ref:`auto-stop-restart`).
''')
Conf('auto restart delay', VDR.V_INTERVAL, desc='''
Relates to Cylc's auto stop-restart mechanism (see
:ref:`auto-stop-restart`). When a host is set to automatically
shutdown/restart it will first wait a random period of time between
zero and ``auto restart delay`` seconds before beginning the
process. This is to prevent large numbers of suites from restarting
simultaneously.
''')
Conf('ranking', VDR.V_STRING)


def upg(cfg, descr):
"""Upgrader."""
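
Note for readers updating an existing `global.cylc`: the moves made in this file can be summarised as a mapping from old setting paths to new ones. The sketch below is illustrative only. `KEY_MOVES` and `migrate` are hypothetical names, the function works on plain nested dicts rather than Cylc's config objects, and it is not the project's own upgrader.

```python
# Hypothetical helper, not part of Cylc: maps the old setting paths
# removed in this diff to the new paths added under [scheduler].
KEY_MOVES = {
    ('suite servers', 'run hosts'): ('scheduler', 'run hosts', 'available'),
    ('suite servers', 'run ports'): ('scheduler', 'run hosts', 'ports'),
    ('suite servers', 'condemned hosts'):
        ('scheduler', 'run hosts', 'condemned'),
    ('suite servers', 'ranking'): ('scheduler', 'run hosts', 'ranking'),
    ('suite servers', 'auto restart delay'):
        ('scheduler', 'auto restart delay'),
    # The whole self-identification section moves as one unit.
    ('suite host self-identification',):
        ('scheduler', 'host self-identification'),
}


def migrate(old):
    """Return a new nested dict holding only the relocated settings."""
    new = {}
    for old_path, new_path in KEY_MOVES.items():
        node = old
        for key in old_path:
            if not isinstance(node, dict) or key not in node:
                break  # setting not present in the old config
            node = node[key]
        else:
            # Walk/create the parent sections, then set the value.
            target = new
            for key in new_path[:-1]:
                target = target.setdefault(key, {})
            target[new_path[-1]] = node
    return new
```

Settings that are not listed in `KEY_MOVES` are dropped by this sketch, so it only shows the renames, not a full config conversion.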
10 changes: 7 additions & 3 deletions cylc/flow/host_select.py
@@ -61,12 +61,16 @@ def select_suite_host(cached=True):

return select_host(
# list of suite hosts
global_config.get(['suite servers', 'run hosts']) or ['localhost'],
global_config.get([
'scheduler', 'run hosts', 'available'
]) or ['localhost'],
# rankings to apply
ranking_string=global_config.get(['suite servers', 'ranking']),
ranking_string=global_config.get([
'scheduler', 'run hosts', 'ranking'
]),
# list of condemned hosts
blacklist=global_config.get(
['suite servers', 'condemned hosts']
['scheduler', 'run hosts', 'condemned']
),
blacklist_name='condemned host'
)
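
The `ranking` string read here mixes filters and rankings in one multiline value. The following is a minimal sketch of one way such a string could be interpreted, under the assumption that an expression evaluating to a boolean acts as a filter, any other value is used as a sort key, and lower keys rank first. `rank_hosts` and the pre-collected `metrics` dict are hypothetical; this is not the code in `host_select.py`.

```python
def rank_hosts(ranking, metrics):
    """Filter then rank hosts using a multiline expression string.

    ranking: one Python expression per line.
    metrics: {host: {metric_name: value}}, hypothetical pre-collected data.
    """
    expressions = [
        line.strip() for line in ranking.splitlines() if line.strip()
    ]
    ranked = []
    for host, values in metrics.items():
        # Expose each metric as a zero-argument function, e.g. cpu_percent().
        namespace = {
            name: (lambda v=value: v) for name, value in values.items()
        }
        keys = []
        keep = True
        for expr in expressions:
            # Evaluate with no builtins available (assumed trusted config).
            result = eval(expr, {'__builtins__': {}}, namespace)
            if isinstance(result, bool):
                if not result:
                    keep = False  # host fails a filter expression
                    break
            else:
                keys.append(result)  # non-boolean results are sort keys
        if keep:
            ranked.append((keys, host))
    ranked.sort()  # assumption: smaller key values rank first
    return [host for _, host in ranked]
```

With the example from the `ranking` description, `rank_hosts("cpu_percent() < 70\ncpu_percent()", metrics)` drops any host at 70 or above and orders the rest by their `cpu_percent` value.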
6 changes: 3 additions & 3 deletions cylc/flow/hostuserutil.py
@@ -125,13 +125,13 @@ def _get_host_info(self, target=None):
@staticmethod
def _get_identification_cfg(key):
"""Return the [scheduler][host self-identification]key global conf."""
return glbl_cfg().get(['suite host self-identification', key])
return glbl_cfg().get(['scheduler', 'host self-identification', key])

def get_host(self):
"""Return the preferred identifier for the suite (or current) host.

As specified by the "suite host self-identification" settings in the
site/user global.cylc files. This is mainly used for suite host
As specified by the "[scheduler][host self-identification]" settings in
the site/user global.cylc files. This is mainly used for suite host
identification by task jobs.

"""
27 changes: 15 additions & 12 deletions cylc/flow/main_loop/auto_restart.py
@@ -16,7 +16,7 @@
"""Automatically restart suites if they are running on bad servers.

Loads in the global configuration to check if the server a suite is running
on is listed in :cylc:conf:`global.cylc[suite servers]condemned hosts`.
on is listed in :cylc:conf:`global.cylc[scheduler][run hosts]condemned`.

This is useful if a host needs to be taken off-line e.g. for scheduled
maintenance.
@@ -26,11 +26,11 @@

.. cylc-scope:: global.cylc

- :cylc:conf:`[suite servers]auto restart delay`
- :cylc:conf:`[suite servers]condemned hosts`
- :cylc:conf:`[suite servers]run hosts`
- :cylc:conf:`[scheduler]auto restart delay`
- :cylc:conf:`[scheduler][run hosts]condemned`
- :cylc:conf:`[scheduler][run hosts]available`

.. cylc-scope:: global.cylc[suite servers]
.. cylc-scope:: global.cylc[scheduler]

The auto stop-restart feature has two modes:

@@ -56,9 +56,10 @@

.. code-block:: cylc

[suite servers]
run hosts = pub
condemned hosts = foo, bar!
[scheduler]
[[run hosts]]
available = pub
condemned = foo, bar!

.. warning::

@@ -67,9 +68,11 @@
are evaluated on the suite host server.

To prevent large numbers of suites attempting to restart simultaneously the
:cylc:conf:`auto restart delay` setting defines a period of time in seconds.
:cylc:conf:`[scheduler]auto restart delay` setting defines a period of time in
seconds.
Suites will wait for a random period of time between zero and
:cylc:conf:`auto restart delay` seconds before attempting to stop and restart.
:cylc:conf:`[scheduler]auto restart delay` seconds before attempting to stop
and restart.

Suites that are started up in no-detach mode cannot auto stop-restart on a
different host - as it will still end up attached to the condemned host.
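
The random wait described in the docstring above can be sketched as follows. This is an illustration under stated assumptions, not the implementation in `auto_restart.py`: `pick_restart_delay` is a hypothetical name, and the delay is assumed to have already been converted to a number of seconds.

```python
import random


def pick_restart_delay(auto_restart_delay, rng=random):
    """Return a random wait in seconds between 0 and the configured delay.

    auto_restart_delay: configured maximum in seconds, or None if unset.
    rng: any object with a ``uniform(a, b)`` method (hypothetical hook
    that allows a seeded generator to be passed in).
    """
    if auto_restart_delay is None or auto_restart_delay <= 0:
        # No delay configured: restart immediately.
        return 0.0
    return rng.uniform(0, auto_restart_delay)
```

Because each suite draws its own value, restarts triggered by the same configuration change are spread across the whole interval instead of firing at once.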
@@ -107,7 +110,7 @@ async def auto_restart(scheduler, _):
_set_auto_restart(
scheduler,
restart_delay=current_glbl_cfg.get(
['suite servers', 'auto restart delay']
['scheduler', 'auto restart delay']
),
mode=mode
)
@@ -117,7 +120,7 @@ def _should_auto_restart(scheduler, current_glbl_cfg):
# check if suite host is condemned - if so auto restart
if scheduler.stop_mode is None:
for host in current_glbl_cfg.get(
['suite servers', 'condemned hosts']
['scheduler', 'run hosts', 'condemned']
):
if host.endswith('!'):
# host ends in an `!` -> force shutdown mode
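
The `!` suffix handling above can be illustrated in isolation. A minimal sketch, assuming entries are compared to the current host by exact string match; `restart_mode` and the returned mode labels are hypothetical and do not mirror the constants used in `auto_restart.py`.

```python
def restart_mode(condemned, current_host):
    """Return how a suite on ``current_host`` should react, or None.

    condemned: entries from ``[scheduler][run hosts]condemned``; a
    trailing ``!`` marks a host for forced shutdown.
    """
    for entry in condemned:
        forced = entry.endswith('!')
        host = entry[:-1] if forced else entry
        if host == current_host:
            # Hypothetical labels for the two modes.
            return 'force-stop' if forced else 'restart'
    return None  # current host is not condemned
```

For the `condemned = foo, bar!` example in the module docstring, a suite on `foo` would get `'restart'` and a suite on `bar` would get `'force-stop'`.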
2 changes: 1 addition & 1 deletion cylc/flow/scheduler.py
@@ -528,7 +528,7 @@ async def configure(self):

async def start_servers(self):
"""Start the TCP servers."""
port_range = glbl_cfg().get(['suite servers', 'run ports'])
port_range = glbl_cfg().get(['scheduler', 'run hosts', 'ports'])
self.server.start(port_range[0], port_range[-1])
self.publisher.start(port_range[0], port_range[-1])
# wait for threads to setup socket ports before continuing
2 changes: 1 addition & 1 deletion cylc/flow/scheduler_cli.py
@@ -207,7 +207,7 @@ def get_option_parser(is_restart, add_std_opts=False):
help=(
"Specify the host on which to start-up the suite. "
"If not specified, a host will be selected using "
"the 'suite servers' global config."
"the '[scheduler]run hosts' global config."
),
metavar="HOST", action="store", dest="host")

12 changes: 7 additions & 5 deletions tests/flakyfunctional/restart/39-auto-restart-no-suitable-host.t
@@ -44,18 +44,20 @@ init_suite "${TEST_NAME_BASE}" <<< '

create_test_global_config '' "
${BASE_GLOBAL_CONFIG}
[suite servers]
run hosts = localhost
[scheduler]
[[run hosts]]
available = localhost
"

cylc run "${SUITE_NAME}" --debug
poll_suite_running

create_test_global_config '' "
${BASE_GLOBAL_CONFIG}
[suite servers]
run hosts = localhost
condemned hosts = $(localhost_fqdn)
[scheduler]
[[run hosts]]
available = localhost
condemned = $(localhost_fqdn)
"

FILE=$(cylc cat-log "${SUITE_NAME}" -m p |xargs readlink -f)
12 changes: 7 additions & 5 deletions tests/flakyfunctional/restart/40-auto-restart-force-stop.t
@@ -40,18 +40,20 @@ init_suite "${TEST_NAME_BASE}" <<< '

create_test_global_config '' "
${BASE_GLOBAL_CONFIG}
[suite servers]
run hosts = localhost
[scheduler]
[[run hosts]]
available = localhost
"

cylc run "${SUITE_NAME}" --hold
poll_suite_running

create_test_global_config '' "
${BASE_GLOBAL_CONFIG}
[suite servers]
run hosts = localhost
condemned hosts = $(localhost_fqdn)!
[scheduler]
[[run hosts]]
available = localhost
condemned = $(localhost_fqdn)!
"

FILE=$(cylc cat-log "${SUITE_NAME}" -m p |xargs readlink -f)
2 changes: 1 addition & 1 deletion tests/functional/lib/bash/test_header
@@ -700,7 +700,7 @@ create_test_global_config() {
# Tidy in case of previous use of this function.
rm -fr 'etc'
mkdir 'etc'
# Suite host self-identification method.
# Scheduler host self-identification method.
echo "$PRE" >'etc/global.cylc'
USER_TESTS_CONF_FILE="${HOME}/.cylc/flow/$(cylc version)/global-tests.cylc"
if [[ -f "${USER_TESTS_CONF_FILE}" ]]; then
10 changes: 6 additions & 4 deletions tests/functional/restart/34-auto-restart-basic.t
@@ -33,8 +33,9 @@ BASE_GLOBAL_CONFIG="
abort on timeout = True
inactivity = PT1M
timeout = PT1M
[suite servers]
run hosts = localhost, ${CYLC_TEST_HOST}"
[scheduler]
[[run hosts]]
available = localhost, ${CYLC_TEST_HOST}"

TEST_NAME="${TEST_NAME_BASE}"
TEST_DIR="$HOME/cylc-run/" init_suite "${TEST_NAME}" - <<'__FLOW_CONFIG__'
@@ -55,8 +56,9 @@ cylc suite-state "${SUITE_NAME}" --task='task_foo01' \
# condemn localhost
create_test_global_config '' "
${BASE_GLOBAL_CONFIG}
[suite servers]
condemned hosts = $(hostname)
[scheduler]
[[run hosts]]
condemned = $(hostname)
"
# test shutdown procedure - scan the first log file
FILE=$(cylc cat-log "${SUITE_NAME}" -m p |xargs readlink -f)