Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task state code refactor. #1775

Merged
merged 5 commits into from
May 12, 2016
Merged

Conversation

hjoliver
Copy link
Member

@hjoliver hjoliver commented Apr 1, 2016

Refactor the code associated with handling of task state/status.

[updated 13 April - original lines crossed out]

Motivation: working on another current pull request, I was bitten by a typo in a task status string comparison, which wouldn't happen if we used (constant) variables to refer to states like "waiting" etc. (which would be better for internationalization too, if it comes to that at some point...). Further, each task proxy currently holds its status in a "task_state" object that is (a) next to useless - it contains just a status string, and some minimal logic that is largely duplicated in wrapper methods in the task proxy class anyway; and (b) also holds "spawned/not spawned" status in a silly two-key dict along with "status", which doesn't make much sense. This is all dates back a long way, before I learned a few things off @cylc/core members 😬

So: this PR does the following:

  • separates spawning from task_state
  • deprecates "cylc reset --state=spawn" to a new command "cylc spawn"
  • gets rid of the task state objects (now just a status string held by task proxies)
  • the task_state module now defines a TaskState class that (for each task proxy) holds the simple status string, and prerequisites and outputs (which are intimately connected to task status), plus associated logic.
  • the task state module also defines constants for each status string, e.g. STATUS_WAITING = 'waiting', and these are now used throughout the code.
  • replaces the state.is_currently() and state.get_status() methods with static set membership tests, for performance reasons (as pointed out by @benfitzpatrick below).

@matthewrmshin - this isn't quite ready - I still need to use STATUS_WAITING etc. throughout the code instead of hard-coded strings "waiting" etc.; add a test for "cylc spawn", and address a few TODO comments left in the code (mainly for checking valid status at various points). However, as this grew a bit bigger than I was expecting, I just wanted to post the branch as-is before the weekend in case of any fundamental objections, or suggestions for improvement. Test battery passes on this branch at the moment.

@hjoliver hjoliver self-assigned this Apr 1, 2016
@hjoliver hjoliver added this to the soon milestone Apr 1, 2016
@arjclark
Copy link
Contributor

arjclark commented Apr 1, 2016

+1 from me for moving from strings to constants (e.g. state.WAITING_STATE over 'waiting') - pretty much for the typos reason you give above

@benfitzpatrick
Copy link
Contributor

This would be good for performance - is_currently and get_status are pretty slow according to the profiler (could be profile overhead, could be Python function call overhead).

@hjoliver
Copy link
Member Author

hjoliver commented Apr 7, 2016

I'll get rid of is_currently and get_status.


(almost done - tests passing, just pep8 and some tidying to do)

@hjoliver hjoliver force-pushed the task-state-refactor branch 8 times, most recently from afc796f to dc9774e Compare April 12, 2016 06:43
@hjoliver
Copy link
Member Author

@matthewrmshin - this is ready to review, test battery passes here. Description updated above. (note I backed out of a more extensive refactoring of task_proxy module, will email re this later...).

(I do intend to add some more docstrings yet)

@hjoliver hjoliver assigned matthewrmshin and unassigned hjoliver Apr 13, 2016
@matthewrmshin
Copy link
Contributor

Test battery OK (with and without global.rc) with the exception the PEP8 test, which is failing with something like this in my environment:

-/home/matt/cylc.git/lib/cylc/task_pool.py:276:20: E713 test for membership should be 'not in'
-/home/matt/cylc.git/lib/cylc/task_pool.py:1485:21: E713 test for membership should be 'not in'
-/home/matt/cylc.git/lib/cylc/task_pool.py:1491:21: E713 test for membership should be 'not in'
-/home/matt/cylc.git/lib/cylc/task_proxy.py:1379:17: E713 test for membership should be 'not in'
-/home/matt/cylc.git/lib/cylc/task_state.py:356:16: E713 test for membership should be 'not in'
-/home/matt/cylc.git/bin/cylc-restart:483:20: E713 test for membership should be 'not in'

@matthewrmshin
Copy link
Contributor

(I'll attempt to look at this change in detail tomorrow.)

@hjoliver hjoliver changed the title Task state code refactor (in progress...) Task state code refactor. Apr 13, 2016
@hjoliver
Copy link
Member Author

note I backed out of a more extensive refactoring of task_proxy module, will email re this later...

The handling of state, prerequisites, outputs, retries, incoming messages, events, DB writes, etc., in the task_proxy module has become increasingly detailed and unnecessarily tangled, and IMO the whole thing could use a complete redesign, to abstract out the messy detail. I actually made a start on it in this branch (because moving prereqs and outputs to task_state.py goes a little way toward that goal) but I backed out of it pretty quickly because it could be a lot of work and a difficult change to review, and for no functional purpose at this point. However, I thought I'd just flag that this should probably be done at some point in the future...


suite = args.pop(0)

prompt('Reset task(s) %s in %s' % (args, suite), options.force)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spawn instead of Reset?

@hjoliver
Copy link
Member Author

Rebased.

@matthewrmshin
Copy link
Contributor

Branch needs rebase again, I am afraid.

self.legal_task_states = TaskState.legal_for_restricted_monitoring
else:
self.legal_task_states = TaskState.legal
self.legal_task_states = TASK_STATUSES_RESTRICTED
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think self.legal_task_states needs to be set regardless. I am getting a traceback on View > Task Filtering.

Traceback (most recent call last):
  File "/home/matt/cylc.git/lib/cylc/gui/app_gcylc.py", line 3150, in popup_filter_dialog
    self.create_task_filter_widgets()
  File "/home/matt/cylc.git/lib/cylc/gui/app_gcylc.py", line 2911, in create_task_filter_widgets
    n_states = len(self.legal_task_states)
AttributeError: 'ControlApp' object has no attribute 'legal_task_states'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spotting, I should have tested that.

@hjoliver
Copy link
Member Author

Branch rebased, bug fixed.

A minor thing I forget to mention above - this PR also makes the gcylc task right-click View menu a bit more context sensitive: for submitted, submit-failed, and submit-retrying tasks you can view job script and activity log, but not job logs.

@hjoliver
Copy link
Member Author

Branched rebased, tests passing.

@benfitzpatrick
Copy link
Contributor

Sorry for conflicting this one up!

@hjoliver
Copy link
Member Author

Sorry for conflicting this one up!

Again!

But never mind, it's worth it!

@hjoliver hjoliver modified the milestones: next release, soon May 4, 2016
@hjoliver hjoliver force-pushed the master branch 3 times, most recently from 17fe934 to 3bdb067 Compare May 5, 2016 01:37
@matthewrmshin
Copy link
Contributor

@hjoliver Sorry, but as usual, this branch is in conflicts again. We should aim to merge this in as soon as possible so that we can be testing with it in for most of the new release cycle.

@hjoliver
Copy link
Member Author

hjoliver commented May 8, 2016

(I'll sort out the conflicts as soon as possible...)

@hjoliver
Copy link
Member Author

@matthewrmshin - branch rebased, tests pass here. It might be a good idea for @benfitzpatrick to take a quick look at this too, to check my resolution of the conflicts with his changes #1813 and #1819.

@matthewrmshin
Copy link
Contributor

Looks OK. Tests OK (with and without global configuration) in my environment.

@benfitzpatrick please review 2.

@@ -333,7 +337,8 @@ class restart(Scheduler):
for line in task_lines:
# instance variables
try:
(id_, state) = line.split(' : ')
(id_, state_string) = line.split(' : ')
# state_string e.g. 'status=running, spawned=false'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=False

@benfitzpatrick
Copy link
Contributor

Looks like it leads to a speedup! Great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants