-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve comm #2582
Improve comm #2582
Conversation
e491e2a
to
c425e52
Compare
(Trying to figure out how to write an automated test for the suite UUID change. Otherwise, this PR is good for review.) |
56b0de5
to
60012c7
Compare
(Re-based.) |
60012c7
to
14e8f63
Compare
(Re-based. Codacy failure is nonsense.) |
afadadb
to
881762a
Compare
(Codacy failure is nonsense.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still working my way through this but looks pretty good so far.
if func_name == 'put_messages': | ||
results = self._call_server(func_name, payload=payload) | ||
elif func_name == 'put_message': # API 1, 7.5.0 compat | ||
cycle, name = payload['task_job'].split('/')[0:2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be easier to provide these as separate dictionary keys to avoid having to split the value in multiple places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The task job or task id is used here for consistency with the other API methods that deal with tasks. We should definitely improve this in the future.
@cherrypy.expose | ||
@cherrypy.tools.json_in() | ||
@cherrypy.tools.json_out() | ||
def put_messages(self, task_job=None, event_time=None, messages=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be args rather than kwargs (else we could end up passing 'None'
strings into the scheduler or does it make little difference anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes no difference here, because the data actually come in via the payload JSON, not really via the arguments.
command.append(user_at_host) | ||
|
||
# Pass cylc version through. | ||
command += ['env', quote(r'CYLC_VERSION=%s' % CYLC_VERSION)] | ||
if 'CYLC_UTC' in os.environ: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a change in behaviour?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is functionality ported from the old SSH functionality of cylc.task_message
. It is beneficial to other commands as well.
(Sorry to keep re-basing here. Still trying to get Travis CI to pass.) |
(Travis CI has finally passed, but still reporting as in progress for some reason.) |
bin/cylc-client
Outdated
suite, options.owner, options.host, options.port, | ||
options.comms_timeout, my_uuid=options.set_uuid, | ||
print_uuid=options.print_uuid, auth=auth) | ||
kwargs = json.load(sys.stdin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whilst cylc client
is an internal command it would be nice if we could call it without providing stdin if it is not required i.e.:
cylc client get_suite_state_summary <reg>
Rather than:
cylc client get_suite_state_summary <reg> <<< '{}'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
json.load
raises a ValueError
on an empty file. Although the error message can be used to deduce an empty file, the message may change in the future and break out logic here.
I have now introduced a -n
option for those API functions that do not expect extra keyword arguments. It is not as clever, but is good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This solution might work https://stackoverflow.com/a/3763257.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only as a solution between interactive versus batch usage. Not quite the intention here.
bin/cylc-client
Outdated
print_uuid=options.print_uuid, auth=auth) | ||
kwargs = json.load(sys.stdin) | ||
sys.stdin.close() | ||
sys.stdout.write(json.dumps(getattr(pclient, func)(**kwargs), indent=4)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for pretty printing!
(Re-based and resolved conflicts.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(started reviewing; I'll try to finish tomorrow.)
bin/cylc-help
Outdated
@@ -270,6 +270,8 @@ control_commands['set-verbosity'] = ['set-verbosity'] | |||
control_commands['broadcast'] = ['broadcast', 'bcast'] | |||
control_commands['ext-trigger'] = ['ext-trigger', 'external-trigger'] | |||
control_commands['checkpoint'] = ['checkpoint'] | |||
control_commands['message'] = ['message', 'task-message'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'message' is already categorized as a task_command
I have added some doc strings to the method where Codacy is reporting the security issues related to usage of |
(Re-based + Resolved Conflicts.) |
yep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good, no problems found. I still need to do a bit of testing...
typeset run_err_script="$3" | ||
shift 3 | ||
typeset run_err_script="$2" | ||
shift 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments documenting these arguments needs updating (in the comments just above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lib/cylc/remote.py
Outdated
self.owner = arg.replace('--user=', '') | ||
elif arg.startswith('--host='): | ||
self.host = arg.replace('--host=', '') | ||
elif arg == '--ssh-cylc=': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ssh command generated for ssh task messaging is screwed up. This line is the problem - it should use startswith()
not ==
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me now.
Generate new suite UUID on normal run. Load previous value on restart. In a task job environment, check that the environment value of the suite UUID matches the one in the suite contact file. Don't send message on mismatch.
cylc client: new command to call network API client. * Replace usage of `curl` in tests with this command. When SSH messaging is already set up for a task remote, other client commands will pick up the set up automatically without having to use `--use-ssh` and related options. Client logic will run locally, with only network logic done via `cylc client --use-ssh`. cylc message: * Expect suite name and task job (cycle/task/submit) as arguments. * Can now send multiple messages at different severities. * API method `put_messages`. * Multiple messages in one call. * Event time as separate argument. * Task job (cycle/task/submit) instead of ID (task.cycle). Server program will ignore task messages from old submits. Reduce job failed message to just say `failed/SIGNAL`. cylc remote-init: now handles SSH comm set up. Job files no longer contain SSH comm settings. Rationalise `cylc.task_message` and `cylc.network.httpclient`. Location of network logic and retry can now be found in the latter.
Branch re-based. Conflicts resolved. Now use new logic of #2503 for SSH cylc client call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review in progress - on priority so will complete tomorrow. General proposal as I have understood it so far seems very sensible. Reporting a few possible minor issues I have spotted at this stage.
bin/cylc-client
Outdated
#!/usr/bin/env python | ||
|
||
# THIS FILE IS PART OF THE CYLC SUITE ENGINE. | ||
# Copyright (C) 2008-2016 NIWA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy New Year x2: 2016 2018
bin/cylc-client
Outdated
if remrun(): | ||
sys.exit(0) | ||
|
||
import cylc.flags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused import (assuming the standardcylc.flags.debug
catch in the main()
function is not necessary/advised as this command is only for internal use)?
bin/cylc-message
Outdated
failure. Applications run by task jobs can use this command to report messages | ||
and to report registered task outputs. | ||
|
||
Messages can be specified as arguments. A '-' indicates that messages the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is amiss with the second sentence: '...that messages the...' seems likely.
lib/cylc/network/httpclient.py
Outdated
# * Exhausted number of tries. | ||
# * Contact info file not found, suite probably not running. | ||
# Don't bother with retry, suite restart will poll any way. | ||
if i >= max_tries or isinstance(exc, ClientInfoError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClientInfoError
is raised separately prior to it's base exception class ClientError
on 287:288, so I believe the second condition will never be True
as it would have already have been caught?
Fix copyright year. Fix doc for new commands. Fix bad exception handling. Remove pointless warning message.
Minor issues and comments addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review complete. Test battery passes locally for me, with new tests suitable. Feedback from first sub-review addressed appropriately. I have several new comments, though they are largely minor so can be addressed separately once this has been merged if this would help ease the PR 'logjam', hence approving.
lib/cylc/network/httpclient.py
Outdated
"""Call server via "cylc client --use-ssh". | ||
|
||
Call "cylc client --use-ssh" using `subprocess.Popen`. Payload and | ||
arguments of the API method are serialized as JSON and is written to a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few tiny docstring issues that we could ignore as it still makes sense but ideally we could change: '...is are written...' here & then 'even gets a chance' on line 534.
lib/cylc/network/httpclient.py
Outdated
json.dump(kwargs, stdin) | ||
stdin.seek(0) | ||
else: | ||
stdin = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stdin = open(os.devnull)
instead unless there is a reason that it is unnecessary or wrong in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now handled by remote_cylc_cmd
, which will set this to open(os.devnull)
and add -n
to the SSH command line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add some extra comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, sorry - my bad! By all means add some comments if you think others might not see this but having had a second look it seems fairly obvious.
lib/cylc/network/httpclient.py
Outdated
# Check mismatch suite UUID | ||
env_suite = os.getenv(self.srv_files_mgr.KEY_NAME) | ||
env_uuid = os.getenv(self.srv_files_mgr.KEY_UUID) | ||
if (self.suite and env_suite and env_uuid and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would I think be easier to comprehend the list of conditions if env_suite == self.suite
came before the env_uuid
truth condition so that line 628 is a related group (or an re-ordering along a similar line).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. It proves that our brains are wired differently. I'll make the change to keep you happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need, it's your code & obviously an inconsequential change. It seemed a much more natural grouping to me personally but I can only use my own brain - I can only guess how others might regard things :)
lib/cylc/network/httpserver.py
Outdated
@cherrypy.tools.json_in() | ||
@cherrypy.tools.json_out() | ||
def put_messages(self, task_job=None, event_time=None, messages=None): | ||
"""Put task messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description should be improved with a bit more context.
lib/cylc/remote.py
Outdated
@@ -81,21 +85,24 @@ def remote_cylc_cmd(cmd, user=None, host=None, capture=False, | |||
# The command is read from the site/user global config file, but we check | |||
# above that it ends in 'cylc', and in any case the user could execute | |||
# any such command directly via ssh. | |||
proc = Popen(command, stdout=stdout, stdin=open(os.devnull)) | |||
if stdin is None: | |||
stdin = open(os.devnull) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could just set this at the end of the block on line 54 (unless it is particularly important that the code there deals entirely with the commands & the stdin/out in the subsequent code).
lib/cylc/task_message.py
Outdated
job_log_name = os.path.join( | ||
glbl_cfg().get_derived_host_item(suite, 'suite job log directory'), | ||
'job') | ||
job_status_file = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? And if so shouldn't it get set to an empty file & not None
so that if not caught as an IOError
the write()
method isn't called on a non-file object on lines 111 etc?
if job_id > 1: | ||
# If os.getppid() returns 1, the original job process | ||
# is likely killed already | ||
job_status_file.write('%s=%s\n' % (CYLC_JOB_PID, job_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems quite unwieldy to keep applying the string formatting template '%s=%s\n'
so could we assign it as a regex to a constant to apply?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is actually clearer in the context here to use a string literal for format strings instead of introducing a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough! I was mainly thinking it would make it much simpler if we ever wanted to adjust the 'template' but would probably make the code a little less clear, yes.
# is likely killed already | ||
job_status_file.write('%s=%s\n' % (CYLC_JOB_PID, job_id)) | ||
job_status_file.write('%s=%s\n' % (CYLC_JOB_INIT_TIME, event_time)) | ||
elif message == TASK_OUTPUT_SUCCEEDED: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This elif
statements use the same format & variables to write
to job_status_file
as the following two, with only the 'message' changing, so 112:127 could definitely be consolidated (at some point in the future at least).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not worth doing as part of this change, given that this is not really new code (- just re-indented), and it is unlikely that we can reduce the line count by consolidation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
@@ -56,6 +58,11 @@ def remote_init(uuid_str, rund): | |||
tarhandle.close() | |||
finally: | |||
os.chdir(oldcwd) | |||
if indirect_comm: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if indirect_comm is not None
given default is None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a boolean, so can be set to False as well.
env no_proxy=* curl -A "${AGENT}" -v --cacert "${SRV_D}/ssl.cert" \ | ||
--digest -u "cylc:$(<"${SRV_D}/passphrase")" \ | ||
"https://${HOST}:${PORT}/get_latest_state" | ||
cylc client -n --set-uuid="${UUID}" 'get_latest_state' "${SUITE_NAME}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could (eventually) create a function for a run_ok
test of incremental name calling this exact command (as is done throughout this test file) after sleep 1
or other activity to make this test a bit more intuitive & tidy .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed especially if we end up doing more of these.
And removal of some unnecessary logic.
All review issues fixed or addressed. |
Re-approving after looking at the one response commit. Advised to merge given trivial nature of changes in the commits subsequent to review from @hjoliver who is now on short-term leave. |
Fix docstring typo in #2582
cylc client: new command to call network API client.
When SSH messaging is already set up for a task remote, other client
commands will pick up the set up automatically without having to use
--use-ssh
and related options. Client logic will run locally, withonly network logic done via
cylc client --use-ssh
.cylc message:
put_messages
(replacesput_message
).Server program will ignore task messages from old submits.
Reduce job failed message to just say
failed/SIGNAL
, etc.started
andsucceeded
messages.cylc remote-init: now handles SSH comm set up. Job files no longer
contain SSH comm settings.
Rationalise
cylc.task_message
andcylc.network.httpclient
. Locationof network logic and retry can now be found in the latter.
Create a suite UUID on start up of a new run. This is stored in the
suite runtime database file and written to the contact file. Restart will
load and use the same UUID. The UUID is exported to the environment
of a job script. In a job environment, the client will only attempt a
connection to the suite if the UUID value in the environment matches
with that of the contact file.
Close #439. Close #2214. Close #2502. Close #2528. Tick final box in and close #2302.