🐛 Bug
I tried to use the Horovod accelerator, which is correctly installed in my environment: I can `import horovod.torch` without any problem, but I get an error stating that Horovod is not installed. I checked the source code to see how module availability is checked and was surprised to see that Lightning uses `hasattr` to walk down to child modules. I guess this comes in handy when you are checking not for a module but for a function or a class, but I don't understand why the function does not simply try `import horovod.torch` (which works) instead of `hasattr(horovod, "torch")` (which returns `False`). Indeed, a submodule is not necessarily an attribute of its parent module unless the parent imports it explicitly, which is apparently not the case for horovod. If you still want to handle the case where the path points to an object rather than a module, you can catch `ModuleNotFoundError` and then fall back to `hasattr`.
To Reproduce
With horovod==0.24.1 installed with PyTorch support, call `_module_available("horovod.torch")` (or simply `import horovod; hasattr(horovod, "torch")`).
Expected behavior
The function should return `True`. Note that if `import horovod.torch` has been executed before the function call, it does return `True`, which can serve as a workaround in the meantime.
Environment
PyTorch version: 1.10.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: CentOS Stream release 8 (x86_64)
GCC version: (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10)
Clang version: Could not collect
CMake version: version 3.20.2
Libc version: glibc-2.28
Python version: 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-4.18.0-365.el8.x86_64-x86_64-with-glibc2.28
Is CUDA available: True
CUDA runtime version: 11.6.55
GPU models and configuration:
GPU 0: Tesla P100-PCIE-16GB
GPU 1: Tesla P100-PCIE-16GB
Nvidia driver version: 510.47.03
cuDNN version: Probably one of the following:
/usr/lib64/libcudnn.so.8.3.2
/usr/lib64/libcudnn_adv_infer.so.8.3.2
/usr/lib64/libcudnn_adv_train.so.8.3.2
/usr/lib64/libcudnn_cnn_infer.so.8.3.2
/usr/lib64/libcudnn_cnn_train.so.8.3.2
/usr/lib64/libcudnn_ops_infer.so.8.3.2
/usr/lib64/libcudnn_ops_train.so.8.3.2
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.22.1
[pip3] pytorch-lightning==1.5.10
[pip3] torch==1.10.1
[pip3] torchaudio==0.10.1
[pip3] torchmetrics==0.7.2
[pip3] torchvision==0.11.2
[pip3] horovod==0.24.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py39h7f8727e_0
[conda] mkl_fft 1.3.1 py39hd3c417c_0
[conda] mkl_random 1.2.2 py39h51133e4_0
[conda] mypy-extensions 0.4.3 pypi_0 pypi
[conda] numpy 1.22.1 pypi_0 pypi
[conda] numpy-base 1.21.2 py39h79a1101_0
[conda] pytorch 1.10.1 py3.9_cuda11.3_cudnn8.2.0_0 pytorch
[conda] pytorch-lightning 1.5.10 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchaudio 0.10.1 py39_cu113 pytorch
[conda] torchmetrics 0.7.2 pypi_0 pypi
[conda] torchvision 0.11.2 py39_cu113 pytorch
Additional context
If you confirm that this is unexpected behavior, I can write a PR that should fix this by directly trying to import the module.
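As a sketch of what such a PR could look like (the name `module_available` is hypothetical and not the actual Lightning helper; the fallback branch is only needed if the existing behavior of also accepting paths to classes/functions should be preserved):

```python
import importlib


def module_available(dotted_path: str) -> bool:
    """Return True if `dotted_path` (e.g. "horovod.torch") is importable.

    Hypothetical replacement for `_module_available`: try the import
    directly instead of walking attributes with hasattr(), so submodules
    that are not bound on their parent package are still found.
    """
    try:
        importlib.import_module(dotted_path)
        return True
    except ModuleNotFoundError:
        # The path may end in an object (class/function) rather than a
        # submodule, e.g. "collections.abc.Mapping": import the parent
        # and fall back to an attribute check.
        module_name, _, attr = dotted_path.rpartition(".")
        if not module_name:
            return False
        try:
            module = importlib.import_module(module_name)
        except ModuleNotFoundError:
            return False
        return hasattr(module, attr)
```

An alternative for pure module paths is `importlib.util.find_spec(dotted_path)`, which reports importability without executing the target module's code (though it still imports the parent packages).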
cc @awaelchli