Let's add a suggested_num_workers() method? #2196
@williamFalcon I tend to do some stress testing, e.g. repeatedly loading samples from my dataset with an increasing number of workers while monitoring disk I/O. Especially when loading large files (e.g. medical DICOMs), the CPU is not the bottleneck; disk I/O is.
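For illustration, a rough sketch of that kind of stress test (the function name, batch size, and batch count below are placeholder assumptions, not an existing Lightning API): time a fixed number of batches from a plain `torch.utils.data.DataLoader` for each worker count and keep the fastest.

```python
# Hypothetical benchmarking sketch: measure loading time for increasing worker
# counts and return the fastest setting. All names and defaults are placeholders.
import os
import time

from torch.utils.data import DataLoader


def benchmark_num_workers(dataset, batch_size=32, max_workers=None, num_batches=100):
    max_workers = max_workers if max_workers is not None else (os.cpu_count() or 1)
    timings = {}
    for workers in range(max_workers + 1):
        loader = DataLoader(dataset, batch_size=batch_size, num_workers=workers)
        start = time.perf_counter()
        for i, _ in enumerate(loader):
            if i >= num_batches:
                break
        timings[workers] = time.perf_counter() - start
    # worker count with the lowest wall-clock time, plus the full timing table
    return min(timings, key=timings.get), timings
```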
But there's some upper bound on num_workers based on the number of CPUs, no? Maybe we can do something like the learning rate finder, but for num_workers?
Technically there isn't. If you have too many workers/threads, they will just be scheduled by your OS; there is no hard limit. Even if you use more workers, spawning another process context and handling the inter-process communication in the background can itself take a while. I'm also not sure whether cpu_count gives you logical or physical cores.
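As a side note, a quick way to check that distinction (psutil is a third-party dependency, not something Lightning requires):

```python
# os.cpu_count() reports logical cores (hyper-threads included);
# psutil can additionally report physical cores.
import os

import psutil

print("logical cores:", os.cpu_count())
print("physical cores:", psutil.cpu_count(logical=False))
```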
@williamFalcon I had the same idea when I saw your PR yesterday.
Exactly, that would be ideal. More of a reason to get this Tuner object separated, @tullie.
Yeah, that'd be awesome. I often do a small benchmark on this when I create a new data loader.
Yep, that sounds good. BTW, a related PR to the Tuner is the one I sent out last week, #2107. It shows another potential way of decomposing the trainer and having shared arguments.
@SkafteNicki still relevant?
Let's keep it alive, I already have a partial implementation ready.
We already show a warning, no? Is the implementation different from that?
I'd be keen to try to improve this a bit if no one's got code they're happy to contribute already.
@GeorgePearse feel free to give it a try; this obviously hasn't been a priority for us. Just remember that you want to keep one CPU core/thread free for the main process so it doesn't get scheduled too badly. Ideally you'd also consider RAM, since usage usually scales linearly with the number of workers (each worker gets a copy of the dataset and loads data at the same time). However, that can also come later. Also, make sure to open a draft PR as soon as you have something to discuss, so you get feedback as early as possible.
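To make that concrete, here is a minimal sketch of such a heuristic; the function name and the 2 GB-per-worker memory budget are purely illustrative assumptions, and psutil is an optional dependency used only to read available RAM.

```python
# Illustrative heuristic only: leave one core for the main process and cap the
# worker count by available RAM, assuming a rough per-worker memory budget.
import os

import psutil


def rough_num_workers(mem_per_worker_bytes: int = 2 * 1024**3) -> int:
    cpu_bound = max((os.cpu_count() or 1) - 1, 0)  # keep one core free
    ram_bound = psutil.virtual_memory().available // mem_per_worker_bytes
    return int(min(cpu_bound, ram_bound))
```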
V1 could be:
@PyTorchLightning/core-contributors
Any other heuristics you guys use?