Python version for nuwave #21

Open
eyacov opened this issue Feb 25, 2024 · 5 comments

Comments

@eyacov

eyacov commented Feb 25, 2024

Hey,
What version of Python is this repo compatible with? It doesn't seem to work with Python 3.11.

@junjun3518
Contributor

Hi @eyacov!

I am not sure about the Python version compatibility.
It was tested with Python 3.6 and NVIDIA's Docker image [nvcr.io/nvidia/pytorch:20.09-py3](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch).

Could you provide the error messages you're encountering?

@eyacov
Author

eyacov commented May 8, 2024

Hey @junjun3518
I can't use this container because it isn't available for my GPU, a GeForce RTX 4090.
When I try to run the code without a container, I get the following error:

Traceback (most recent call last):
  File "trainer.py", line 139, in <module>
    train(args)
  File "trainer.py", line 126, in train
    trainer.fit(model)
  File "/home/etay/anaconda3/envs/nuwave/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 510, in fit
    results = self.accelerator_backend.train()
  File "/home/etay/anaconda3/envs/nuwave/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 57, in train
    return self.train_or_test()
  File "/home/etay/anaconda3/envs/nuwave/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 74, in train_or_test
    results = self.trainer.train()
  File "/home/etay/anaconda3/envs/nuwave/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in train
    self.run_sanity_check(self.get_model())
  File "/home/etay/anaconda3/envs/nuwave/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 730, in run_sanity_check
    _, eval_results = self.run_evaluation(max_batches=self.num_sanity_val_batches)
  File "/home/etay/anaconda3/envs/nuwave/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 646, in run_evaluation
    output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
  File "/home/etay/anaconda3/envs/nuwave/lib/python3.6/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 180, in evaluation_step
    output = self.trainer.accelerator_backend.validation_step(args)
  File "/home/etay/anaconda3/envs/nuwave/lib/python3.6/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 73, in validation_step
    return self._step(self.trainer.model.validation_step, args)
  File "/home/etay/anaconda3/envs/nuwave/lib/python3.6/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 65, in _step
    output = model_step(*args)
  File "/media/etay/Daten/localization/nuwave/lightning_model_bird.py", line 239, in validation_step
    0, self.max_step, (wav.shape[0], ), device=self.device) + 1
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

@junjun3518
Contributor

junjun3518 commented May 8, 2024

Hi @eyacov ,

That error message points to a CUDA version issue.

Since this project started in 2020, it uses old versions of torch and CUDA, which do not support the RTX 4090.

For now, I do not have the resources to test this, and I am not authorized to change this repo.

I recommend using a recent base image such as nvcr.io/nvidia/pytorch:23.07-py3 (I am using it with a 4090 now).
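
A quick way to confirm this kind of mismatch is to compare the GPU's compute capability with the architectures the installed torch build was compiled for. The sketch below is only illustrative: `parse_arch` and `has_compatible_kernels` are hypothetical helper names, and the compatibility rule (a binary built for `sm_XY` runs on a device of the same major version with an equal or higher minor version) is a simplification that ignores PTX entries such as `compute_80`. It relies on `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list`, which are only called if torch is installed and a GPU is visible.

```python
import re


def parse_arch(arch):
    """Turn a string such as 'sm_86' into a (major, minor) tuple, or None."""
    match = re.fullmatch(r"sm_(\d+)(\d)[a-z]?", arch)
    if match is None:
        return None  # PTX entries such as 'compute_80' are ignored in this sketch
    return int(match.group(1)), int(match.group(2))


def has_compatible_kernels(capability, arch_list):
    """Simplified check: same major version, built minor <= device minor."""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_arch(arch)
        if parsed is None:
            continue
        if parsed[0] == major and parsed[1] <= minor:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        if torch.cuda.is_available():
            capability = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("device capability:", capability)
            print("build architectures:", archs)
            print("compatible kernels:", has_compatible_kernels(capability, archs))
        else:
            print("no CUDA device visible to torch")
```

An RTX 4090 reports compute capability 8.9, so a build whose architecture list stops at `sm_75` would fail this check, which is consistent with the "no kernel image is available" error above.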

@eyacov
Author

eyacov commented May 14, 2024

Hi @junjun3518

Running the code in this Docker image leads to the following error:
Traceback (most recent call last):
  File "/media/etay/Daten/localization/nuwave/trainer.py", line 1, in <module>
    from lightning_model import NuWave
  File "/media/etay/Daten/localization/nuwave/lightning_model.py", line 9, in <module>
    import pytorch_lightning as pl
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/__init__.py", line 66, in <module>
    from pytorch_lightning import metrics
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/metrics/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.metric import Metric
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/metrics/metric.py", line 23, in <module>
    from pytorch_lightning.metrics.utils import _flatten, dim_zero_cat, dim_zero_mean, dim_zero_sum
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/metrics/utils.py", line 18, in <module>
    from pytorch_lightning.utilities import rank_zero_warn
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/__init__.py", line 24, in <module>
    from pytorch_lightning.utilities.apply_func import move_data_to_device
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>
    from torchtext.data import Batch
ImportError: cannot import name 'Batch' from 'torchtext.data' (/usr/local/lib/python3.10/dist-packages/torchtext/data/__init__.py)

Do I need to change something in requirements.txt to make this work?
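
The last frame of the traceback shows the installed pytorch_lightning importing `Batch` from `torchtext.data`, which the installed torchtext does not provide. A small probe like the one below can confirm that before launching training. It is a rough sketch: `can_import_name` is a hypothetical helper, and it only checks for a module attribute, which is close to, but not exactly, what a `from ... import ...` statement does.

```python
import importlib


def can_import_name(module_name, attr_name):
    """Return True if module_name imports cleanly and exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # the module itself is missing or fails to import
    return hasattr(module, attr_name)


if __name__ == "__main__":
    # The pair taken from the failing import in the traceback.
    print("torchtext.data.Batch available:", can_import_name("torchtext.data", "Batch"))
```

If the probe prints `False`, the installed torchtext and pytorch_lightning versions do not match each other, so the fix is in the pinned versions rather than in the project code.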

@eyacov
Author

eyacov commented May 21, 2024

I managed to solve the issue. The requirements file needs to be written as follows:
ffmpeg
torchtext==0.6.0
pytorch_lightning==1.1.6
prefetch_generator
librosa==0.8.0
omegaconf==2.0.6

You might want to consider changing it.
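
To check whether an environment actually matches these pins, something like the following could be run inside it. This is a minimal sketch, not part of the repo: `parse_pins`, `installed_version`, and `find_mismatches` are hypothetical helpers, only exact `==` pins are understood, and whether `importlib.metadata` finds a package under the name written in the file depends on how that distribution is registered.

```python
from importlib import metadata

# The requirements list from the comment above.
REQUIREMENTS = """\
ffmpeg
torchtext==0.6.0
pytorch_lightning==1.1.6
prefetch_generator
librosa==0.8.0
omegaconf==2.0.6
"""


def parse_pins(text):
    """Map each package name to its pinned version, or None if unpinned."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, version = line.partition("==")
        pins[name.strip()] = version.strip() if sep else None
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for missing or wrongly versioned packages."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None or (wanted is not None and found != wanted):
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(parse_pins(REQUIREMENTS)).items():
        print(f"{name}: wanted {wanted or 'any version'}, found {found or 'nothing'}")
```

The `lookup` parameter is there so the comparison can be exercised with a plain dictionary instead of the real environment.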
