This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

New TaskComputer #4510

Merged 7 commits from new_taskcomputer into develop on Jul 31, 2019

Conversation

@Wiezzel Wiezzel commented Jul 18, 2019

Created the NewTaskComputer class for computing tasks created with the new Task API. Support for old-style tasks is kept as well. TaskComputerAdapter was introduced to dispatch tasks between the new and the old task computer.

@Wiezzel Wiezzel added the clay label Jul 18, 2019
@Wiezzel Wiezzel self-assigned this Jul 18, 2019
@Wiezzel Wiezzel force-pushed the new_taskcomputer branch 9 times, most recently from 8ec549d to 392128f on July 23, 2019 at 14:56
@Wiezzel Wiezzel force-pushed the new_taskcomputer branch from 392128f to 63ce003 on July 23, 2019 at 15:11
@Wiezzel Wiezzel changed the title from [WIP] New TaskComputer to New TaskComputer on Jul 23, 2019
@Wiezzel Wiezzel marked this pull request as ready for review July 23, 2019 15:30
@Wiezzel Wiezzel (Author) commented Jul 23, 2019

@Krigpl @mfranciszkiewicz Unit tests for the NewTaskComputer are yet to come but you can already start the review.

@@ -419,7 +419,7 @@ def resource_failure(self, task_id: str, reason: str) -> None:

         self.task_computer.task_interrupted()
         self.send_task_failed(
-            self.task_computer.assigned_subtask['subtask_id'],
+            self.task_computer.assigned_subtask_id,
Contributor:

This is suspicious. It either means that assigned_subtask_id doesn't get cleared after task_interrupted or it is cleared and it's empty here.

Author:

Hmm... This is actually a race condition. In most cases assigned_subtask_id won't be cleared yet. 😛
Good catch!

Contributor:

I still think it's better to get this fixed - like by grabbing the subtask_id before calling task_interrupted and using that value here.

Author:

Yea, for sure.
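A minimal sketch of the fix agreed on above: grab the subtask ID before task_interrupted() can clear it. The surrounding method body mirrors the diff context shown earlier; the remaining arguments to send_task_failed are elided because they are not visible in this excerpt.

def resource_failure(self, task_id: str, reason: str) -> None:
    # Capture the subtask ID before task_interrupted() clears the
    # assigned subtask, so the failure message never carries an empty ID.
    subtask_id = self.task_computer.assigned_subtask_id
    self.task_computer.task_interrupted()
    self.send_task_failed(
        subtask_id,
        # ... remaining arguments unchanged ...
    )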

task_id = self.assigned_task_id
subtask_id = self.assigned_subtask_id
computation = self._new_computer.compute()
self._handle_computation_results(task_id, subtask_id, computation)
Contributor:

That returns a deferred but it's not yielded anywhere - expected? If so, please put a comment.

Author:

That's on purpose. The computation deferred resolves once the computation is complete, not when it's started. And this method is supposed to just start the computation. I remember adding a comment about it but must've deleted it by accident.

Contributor:

It still needs an error handler, right? twisted complains when there are unyielded deferreds with unhandled errors.

Author:

_handle_computation_results() has a catch-all block so it shouldn't raise any error. I can put the _send_results() inside this block as well.

Author (@Wiezzel, Jul 29, 2019):

Wait, there is a comment but somehow it got three lines up.
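For reference, the fire-and-forget pattern discussed in this thread, with an errback attached so twisted does not report an unhandled error if the un-yielded deferred fails. This is a self-contained sketch, not the actual NewTaskComputer code; all names are illustrative.

import logging

from twisted.internet import defer

logger = logging.getLogger(__name__)


@defer.inlineCallbacks
def _run_computation(task_id, subtask_id):
    # The deferred returned here resolves when the whole computation
    # finishes, not when it starts.
    result = yield defer.succeed('result of %s/%s' % (task_id, subtask_id))
    return result


def start_computation(task_id, subtask_id):
    # Start the computation and return immediately, without yielding
    # the deferred.
    deferred = _run_computation(task_id, subtask_id)
    # The errback keeps twisted from reporting "Unhandled error in
    # Deferred" when the computation fails.
    deferred.addErrback(
        lambda failure: logger.error('Computation failed: %s', failure))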

return self._assigned_task.env_id

@defer.inlineCallbacks
def change_config(
Contributor:

So what's the decision with change_config? I believe it doesn't necessarily belong here but it's too awkward to remove just now?

Author:

There's no final decision yet. @mfranciszkiewicz is researching this topic. This is most probably a temporary solution. And it's compatible with existing code.

Contributor:

Let's keep the method in this PR

@Wiezzel Wiezzel requested a review from shadeofblue as a code owner July 25, 2019 15:19
@Wiezzel Wiezzel force-pushed the new_taskcomputer branch from 28c07cf to 7e42786 on July 25, 2019 at 15:43
@maaktweluit (Contributor) left a comment:

LGTM!

Some small nitpicky comments, still approving.

Resolved review threads: golem/task/taskcomputer.py (outdated), requirements_to-freeze.txt, tests/golem/task/test_taskcomputeradapter.py (outdated)
codecov bot commented Jul 29, 2019

Codecov Report

Merging #4510 into develop will increase coverage by 0.02%.
The diff coverage is 98.02%.

@@             Coverage Diff             @@
##           develop    #4510      +/-   ##
===========================================
+ Coverage    90.29%   90.31%   +0.02%     
===========================================
  Files          225      225              
  Lines        19715    19932     +217     
===========================================
+ Hits         17802    18002     +200     
- Misses        1913     1930      +17

@Wiezzel Wiezzel force-pushed the new_taskcomputer branch from 166049c to 54cec12 on July 29, 2019 at 13:42
def subtask_deadline(self):
return int(time.time()) + 3600

def get_task_header(self, **kwargs):
Contributor:

I guess you could just use factories from GM, here and for ctd, and remove these helpers.

Author:

Will take a look at these factories.

Author:

Well, after a look I don't see much value in using these factories because they generate some random IDs and I want to use particular ones so that I can easily make asserts without passing them around.

Contributor:

You can set the values of the fields you're interested in. Like TaskHeaderFactory(task_id=<your id>), so basically very similar logic to this one here.
Not insisting but I feel it's somewhat redundant to have these helpers.
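The override pattern described above, sketched with factory_boy conventions. This is a generic illustration rather than the actual TaskHeaderFactory, and the TaskHeader fields shown here are made up for the example.

import factory


class TaskHeader:  # stand-in model for the illustration
    def __init__(self, task_id, deadline):
        self.task_id = task_id
        self.deadline = deadline


class TaskHeaderFactory(factory.Factory):
    class Meta:
        model = TaskHeader

    # Default values are generated...
    task_id = factory.Faker('uuid4')
    deadline = 0


# ...but any field can be pinned to a known value in a test, so the ID
# does not need to be passed around for assertions:
header = TaskHeaderFactory(task_id='test_task_id')
assert header.task_id == 'test_task_id'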

deadline=kwargs.get('subtask_deadline') or self.subtask_deadline
)

def patch_async(self, name, *args, **kwargs):
Contributor:

What is async supposed to mean here?

Author:

It's supposed to mean that this patch method is suitable for asynchronous functions, unlike the ordinary @patch decorator.

Contributor:

afaict it's only used with non-async functions here so I got confused, that's why I asked.
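For context, one way such a helper can be built: patch the target with a mock whose return value is wrapped in an already-fired Deferred, so callers that yield the result keep working. This is only an illustration of the idea, not the helper from this PR; the dotted path in the usage comment is hypothetical.

from unittest import mock

from twisted.internet import defer


def patch_async(target, return_value=None, **kwargs):
    # Like mock.patch(), but the mock returns an already-fired Deferred
    # so the patched call can still be yielded by inlineCallbacks code.
    return mock.patch(
        target, return_value=defer.succeed(return_value), **kwargs)


# Usage (the patched path is hypothetical):
# with patch_async('golem.task.taskcomputer.NewTaskComputer.compute',
#                  return_value='ok') as compute_mock:
#     ...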

@Wiezzel Wiezzel force-pushed the new_taskcomputer branch from 583788b to 0ac8c24 on July 30, 2019 at 10:34
Resolved review threads: golem/task/taskcomputer.py (outdated, 2 threads)
@Wiezzel Wiezzel force-pushed the new_taskcomputer branch from 35e3c7e to 11e053d on July 30, 2019 at 13:54
@mfranciszkiewicz (Contributor) left a comment:

One small thing

sync_wait(self._new_computer.prepare())

# Should this node behave as provider and compute tasks?
self.compute_tasks = task_server.config_desc.accept_tasks \
Contributor:

compute_tasks should become a @property because afair task_server.config_desc.in_shutdown is changed dynamically during runtime

Author:

This is a part of the config, and the config needs to be explicitly changed by calling change_config. That has been the case with the old TaskComputer.

If some code modifies the in_shutdown setting without calling update_config then probably it should be fixed but that's not a part of this PR.

Contributor:

change_config is called on the client after updating this setting, so should be good :)

https://github.com/golemfactory/golem/blob/develop/golem/node.py#L377
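The flow the author describes, roughly: compute_tasks is cached when the computer is created and refreshed only when change_config() delivers a new config. A simplified sketch, not the actual TaskComputerAdapter code; the exact combination of accept_tasks and in_shutdown is assumed from the visible diff context.

from twisted.internet import defer


class TaskComputerAdapter:  # simplified sketch

    def __init__(self, task_server):
        self._task_server = task_server
        # Cached from the config; only change_config() refreshes it.
        self.compute_tasks = task_server.config_desc.accept_tasks \
            and not task_server.config_desc.in_shutdown

    @defer.inlineCallbacks
    def change_config(self, config_desc, **kwargs):
        # Called by the client after the user toggles accept_tasks or
        # requests shutdown, so the cached flag stays in sync.
        self.compute_tasks = config_desc.accept_tasks \
            and not config_desc.in_shutdown
        yield defer.succeed(None)  # propagate to the wrapped computers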

* Refactored _get_task_api_service() and _get_task_dir() to use
  _assigned_task instead of method parameters.
* Added support_direct_computation property to TaskComputerAdapter
  (needed by dummy task computation).
* Fixed DummyTask.computation_failed(), which was passing a None value as
  a non-None parameter to DummyTask.computation_finished().
* Added unit test for NewTaskComputer._get_task_api_service()
* Moved _runtime assignment after calling _app_client.compute() in
  NewTaskComputer.compute() (otherwise it would be always None).
@Wiezzel Wiezzel force-pushed the new_taskcomputer branch from 11e053d to cadc186 on July 31, 2019 at 09:11
@Wiezzel Wiezzel merged commit 4fd68a6 into develop on Jul 31, 2019
@etam etam deleted the new_taskcomputer branch July 31, 2019 14:06