
Let's add a suggested_num_workers() method? #2196

Closed
williamFalcon opened this issue Jun 15, 2020 · 13 comments · Fixed by #18591
Assignees
Labels
design (Includes a design discussion) · feature (Is an improvement or enhancement) · good first issue (Good for newcomers)

Comments

@williamFalcon
Contributor

V1 could be:

import multiprocessing

def suggest_num_workers(num_accelerators):
    # One worker per CPU core, scaled by the number of accelerators.
    num_cpus = multiprocessing.cpu_count()
    return num_cpus * num_accelerators

@PyTorchLightning/core-contributors
Any other heuristics you guys use?

@williamFalcon williamFalcon added the feature Is an improvement or enhancement label Jun 15, 2020
@williamFalcon williamFalcon added this to the 0.9.0 milestone Jun 15, 2020
@justusschock
Member

@williamFalcon I tend to do some stress testing, e.g. repeatedly loading samples from my dataset with an increasing number of workers while monitoring disk I/O. Especially when loading large files (e.g. medical DICOMs), the CPU is not the bottleneck; disk I/O is.
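
For illustration, a minimal sketch of that kind of stress test (assuming psutil is available for the disk I/O counters and `dataset` is an ordinary PyTorch Dataset; all names here are illustrative):

import time
import psutil  # third-party; used only for the disk I/O counters
from torch.utils.data import DataLoader

def io_stress_test(dataset, num_workers, num_batches=50, batch_size=8):
    # Load a fixed number of batches and report elapsed time plus
    # bytes read from disk, to see whether I/O rather than CPU saturates.
    loader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)
    read_before = psutil.disk_io_counters().read_bytes
    start = time.perf_counter()
    for i, _ in enumerate(loader):
        if i + 1 >= num_batches:
            break
    elapsed = time.perf_counter() - start
    bytes_read = psutil.disk_io_counters().read_bytes - read_before
    return elapsed, bytes_read

Calling this for num_workers = 0, 1, 2, ... and watching where throughput stops improving would surface an I/O-bound loader quickly.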

@williamFalcon
Contributor Author

but there's some upper bound on num_workers based on num_cpus, no?

maybe we can do something like the learning rate finder, but for num_workers?
@SkafteNicki

@justusschock
Member

Technically there isn't. If you have too many workers/threads, they simply get scheduled by your OS; there is no hard limit. Even with more workers than cores, it can sometimes pay off to have another process context loading in the background and communicating with the main process, since loading itself sometimes takes a while.

I'm also not sure whether cpu_count gives you logical or physical cores.
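
(For reference, `multiprocessing.cpu_count()` and `os.cpu_count()` both report logical cores; getting the physical count portably needs something like psutil:)

import multiprocessing
import psutil  # third-party

print(multiprocessing.cpu_count())      # logical cores, hyper-threads included
print(psutil.cpu_count(logical=False))  # physical cores (may be None on some platforms)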

@SkafteNicki
Member

@williamFalcon I had the same idea when I saw your PR yesterday.
We could iteratively increase the number of workers and log the training time over a few batches. I'd guess that beyond a certain number of workers the training time will plateau (not decrease any further), and we could suggest the num_workers value just before that happens.
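
A minimal sketch of that search (the function name, thresholds, and defaults here are illustrative, not an agreed API):

import time
from torch.utils.data import DataLoader

def find_num_workers(dataset, batch_size=32, num_batches=20, tol=0.05, max_workers=16):
    # Increase num_workers until the time for a fixed number of batches
    # stops improving by more than `tol` (relative), then return the
    # last setting that still gave a real speed-up.
    best_time, best_workers = float("inf"), 0
    for workers in range(max_workers + 1):
        loader = DataLoader(dataset, batch_size=batch_size, num_workers=workers)
        start = time.perf_counter()
        for i, _ in enumerate(loader):
            if i + 1 >= num_batches:
                break
        elapsed = time.perf_counter() - start
        if elapsed < best_time * (1 - tol):
            best_time, best_workers = elapsed, workers
        else:
            break  # timing has plateaued
    return best_workers

Note that worker startup cost is included in each measurement, which slightly penalizes higher worker counts; a warm-up pass per setting would make the comparison fairer.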

@williamFalcon
Contributor Author

exactly. that would be ideal.

All the more reason to get this Tuner object separated @tullie

@tullie
Contributor


Yeah that’d be awesome. I often do a small benchmark on this when I create a new data loader.

@SkafteNicki
Member

So I guess the first step is to get v1 of the tuner merged to master (PR #1998).
Then we can extend it with this functionality. (I will be happy to do it, but if you have some code to get me started, @tullie, it would be more than welcome.)

@tullie
Contributor


Yep, that sounds good. BTW, a related PR to the Tuner is the one I sent out last week, #2107. It shows another potential way of decomposing the trainer and having shared arguments.

@Borda Borda added the good first issue Good for newcomers label Aug 4, 2020
@edenlightning edenlightning modified the milestones: 0.9.0, 0.9.x Aug 18, 2020
@edenlightning edenlightning modified the milestones: 0.9.x, 1.1 Sep 22, 2020
@edenlightning edenlightning added design Includes a design discussion v1.0 post labels Sep 22, 2020
@Borda Borda removed the v1.0 post label Oct 13, 2020
@edenlightning edenlightning modified the milestones: 1.1, 1.0.3, 1.2 Oct 19, 2020
@edenlightning edenlightning modified the milestones: 1.2, 1.3 Feb 8, 2021
@edenlightning
Contributor

@SkafteNicki still relevant?

@edenlightning edenlightning removed this from the 1.3 milestone Feb 22, 2021
@SkafteNicki
Member

Let's keep it alive; I already have a partial implementation ready.

@SkafteNicki SkafteNicki self-assigned this Feb 24, 2021
@williamFalcon
Contributor Author

we already show a warning, no? is the implementation different from that?

@GeorgePearse

I'd be keen to try to improve this a bit if no one's got code they're happy to contribute already.

@justusschock
Member

@GeorgePearse feel free to give it a try. This obviously hasn't been a priority for us.

Just remember that you want to keep one CPU core/thread free for the main process so it doesn't get scheduled too badly.

Ideally you'd also consider RAM, as it usually scales linearly with the number of workers (each worker gets a copy of the dataset and loads data at the same time). However, this is something that can come later.

Also, make sure to open a draft PR as soon as you have something to discuss, so you get feedback as early as possible.
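
Putting those constraints together, a starting heuristic might look like the following (illustrative only: `dataset_size_bytes` is a hypothetical rough estimate supplied by the caller, and the memory check assumes psutil):

import os
import psutil  # third-party; used for the available-memory check

def suggest_num_workers(dataset_size_bytes):
    # Keep one core free for the main process so it isn't starved.
    cpu_bound = max(1, (os.cpu_count() or 1) - 1)
    # Each worker holds its own copy of the dataset, so RAM usage
    # scales roughly linearly with the number of workers.
    ram_bound = psutil.virtual_memory().available // max(1, dataset_size_bytes)
    return max(1, min(cpu_bound, int(ram_bound)))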
