Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use max_cores in taskQueue instead of system cores #2038

Merged
merged 5 commits into from
Nov 11, 2024

Conversation

kmavrommatis
Copy link
Contributor

The class TascQueue accepts an argument thread_count that is used for the max size of a queue.
In the MultithreadedJobExecutor class there is a variable defined (max_cores) that is getting its value from the available cores of the machine. Further in the same class when TaskQueue is called instead of using the max_cores it is using psutil.cpu_count().
I propose to use the max_cores variable when initializing the TaskQueue.

Use case. one can override the max_cores in the MutlithreadedJobExecutor after initialization and force cwltool to use only up to a certain number of cores.

The class TascQueue accepts an argument `thread_count` that is used for the max size of a queue.

In the `MultithreadedJobExecutor `class there is a variable defined (`max_cores`) that is getting its value from the available cores of the machine. Further down in the same class when TaskQueue is called instead of using the `max_cores` it is using `psutil.cpu_count()`. 

I suggest to use  the self.max_cores in the call of TaskQueue in the file executor.py instead of psutil.cpu_count()

Use case:
when a job executor is setup as MultithreadedJobExecutor one can override the max_cores after the initialization of the object and limit the use to the specified cores.

Additional enhancement would be to include an argument to allow the use to provide the number of cores available for use
@mr-c mr-c requested a review from tetron September 6, 2024 19:20
@kmavrommatis
Copy link
Contributor Author

Hello, is there any update on this?
Does this modification work - to your knowledge - with the overall architecture of the tool?

@tetron
Copy link
Member

tetron commented Oct 23, 2024

hi @kmavrommatis, I'm taking a look at it.

It should already be the case that it will not start more than max_cores worth of work, because MultithreadedJobExecutor separately accounts for resource (cores/RAM) usage when deciding whether to put a job into the task queue. So if max_cores < cpu_count() I don't think it would make a difference in how many parallel jobs it starts. Although if you are seeing something different happen that your pull request addresses, please provide more information.

On the other hand if you want to start more than cpu_count() worth of work (i.e. max_cores > cpu_count()), then the number of parallel processes would be limited by the thread count being set to cpu_count() instead of the greater value of max_cores.

That said, it would be a useful feature to add command line options like --max-cores and --max-ram that would let you set your own limits instead of the defaults that are chosen for you. Is that the feature you are looking for?

@kmavrommatis
Copy link
Contributor Author

Hi @tetron
my use case requires to use more cores than the ones that are physically available on the system (because the tasks running in the pipeline use the CPU very sparingly while waiting for network signals).
I have modified cwltool to allow to specify the number of max cores (as you suggest) in both places in the TaskQueue and MultithreadedJobExecutor to a number that is higher than the actual cpu count (in my typical case it is 256 on a 32 core instance.
Adding the arguments you suggest would work - provided that it is allowed to specify higher number of cores than the ones physically available to the instance :)
Thanks

@tetron
Copy link
Member

tetron commented Nov 5, 2024

@kmavrommatis just to confirm, are you requesting a fractional core in your CWL file? e.g. minCores: 0.125 ? Then it would just need to no longer limit the number of processes it launches to cpu_count().

@mr-c
Copy link
Member

mr-c commented Nov 11, 2024

Hey @kmavrommatis , thank you for opening this PR.

Why do you want to run so many CWL Processes at the same time? Are they very short processes and you want to launch many to avoid delays from the setup and tear-down? Or are they IO-bound, so by over-subscribing the system it is still faster to be processing other jobs while some are waiting for IO?

Ah, I see that you are working with cwl-tes. I think you want to issue any many jobs as possible, because all of them are executed elsewhere.

Okay, this PR makes more sense to me now. It should produce identical behavior for everyone, except those that override ‎MultithreadedJobExecutor.max_cores to some other value.

@kmavrommatis
Copy link
Contributor Author

Hey @kmavrommatis , thank you for opening this PR.

Why do you want to run so many CWL Processes at the same time? Are they very short processes and you want to launch many to avoid delays from the setup and tear-down? Or are they IO-bound, so by over-subscribing the system it is still faster to be processing other jobs while some are waiting for IO?

Ah, I see that you are working with cwl-tes. I think you want to issue any many jobs as possible, because all of them are executed elsewhere.

Okay, this PR makes more sense to me now. It should produce identical behavior for everyone, except those that override ‎MultithreadedJobExecutor.max_cores to some other value.

Exactly right, cwl-tes is waiting for network events and it can handle many tasks at the same time, so it makes sense to make it assume that it can use many more cores than the host provides.
Thanks

Copy link

codecov bot commented Nov 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.01%. Comparing base (0b64935) to head (53b73e1).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2038   +/-   ##
=======================================
  Coverage   84.01%   84.01%           
=======================================
  Files          46       46           
  Lines        8302     8302           
  Branches     1957     1957           
=======================================
  Hits         6975     6975           
  Misses        853      853           
  Partials      474      474           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mr-c mr-c enabled auto-merge (squash) November 11, 2024 12:25
@mr-c mr-c merged commit 048eb55 into common-workflow-language:main Nov 11, 2024
46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants