I have a GitHub repo where I define some models and a hubconf.py file for access via the torch.hub.load() API. The models work fine without multiprocessing (i.e. num_workers=0 for the DataLoader), but fail with a pickling-related ModuleNotFoundError if two conditions are both true: (1) num_workers>0 and (2) the module that defines the model class uses an absolute import of another module in my repo.
For example, with the structure:
my_repo/
    utils.py
    a/
        model_a.py  # contains class ModelA
    ...
If model_a.py contains the line from my_repo import utils, and we load ModelA and use a DataLoader with num_workers>0, the error looks like this:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File ".../python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File ".../python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
ModuleNotFoundError: No module named 'my_repo'
This makes some sense, because we never installed my_repo as a package when loading ModelA - somewhere, it seems multiprocessing tries to recreate or reimport things and does not find my_repo.
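To illustrate what I suspect is happening, here is a minimal sketch that uses only the standard library. It does not call torch.hub or DataLoader, and it is an assumption on my part that the same mechanism is behind the error above. The module name fake_repo_mod and the class Thing are hypothetical stand-ins for my_repo and ModelA. The point is that pickle records only the module and class name, so unpickling has to import the module again, and that fails if the directory is no longer on sys.path.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path


def demo_unpickle_without_path():
    """Pickle an object whose class is importable only via a temporary
    sys.path entry, then try to unpickle it after that entry is removed."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for a repo that is never installed as a package.
        Path(tmp, "fake_repo_mod.py").write_text("class Thing:\n    pass\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            mod = importlib.import_module("fake_repo_mod")
            # The pickle stores a reference to fake_repo_mod.Thing, not its code.
            payload = pickle.dumps(mod.Thing())
        finally:
            sys.path.remove(tmp)

        # Roughly simulate a fresh interpreter that has not imported the module.
        del sys.modules["fake_repo_mod"]
        importlib.invalidate_caches()

        try:
            pickle.loads(payload)
        except ModuleNotFoundError as exc:
            return str(exc)
    return None


if __name__ == "__main__":
    print(demo_unpickle_without_path())
```

If this reasoning applies here, the worker process would need the repo directory on its own sys.path before it unpickles anything that refers to my_repo.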
Edit: I thought relative imports might be a workaround, but they don't fix the issue.
Is there a solution to this? Thanks!
Hi @sammlapp, thanks for the report. Just to make sure, can you import and run the model properly without calling torch.hub.load()?
If yes, then sorry, you're probably hitting one of the few edge cases that exist when using torch.hub, and I'm afraid no obvious workaround comes to mind for this.
Yes, if I install or import the package locally there is no issue. This seems like a pretty important and common use case, because (1) packages with more than a few dozen lines of code often import from other modules in the same repo, and (2) multiprocessing via torch.utils.data.DataLoader with num_workers>0 is an extremely common workflow.