RuntimeError: CUDA error: an illegal memory access was encountered #40

Open
HerculesJL opened this issue Aug 28, 2023 · 2 comments

@HerculesJL

Hello! Thank you for your excellent work. I encounter the following error when I change "cuda:0" to "cuda:1" or any other device index. Can you give me some advice?
The only thing I modified in the sample program was the 'cuda:x' device string.

Traceback (most recent call last):
  File "/data/songzhenbo/NKSR-public/examples/recons_simple.py", line 27, in <module>
    field = reconstructor.reconstruct(input_xyz, input_normal, detail_level=1.0)
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/nksr/__init__.py", line 256, in reconstruct
    feat, svh, udf_svh = self.network.unet(
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/nksr/nn/unet.py", line 238, in forward
    feat, encoder_svh, feat_depth = module(feat, encoder_svh, feat_depth)
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/nksr/nn/modules.py", line 316, in forward
    feat, svh, depth = module(feat, svh, depth, **kwargs)
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/nksr/nn/modules.py", line 316, in forward
    feat, svh, depth = module(feat, svh, depth, **kwargs)
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/nksr/nn/modules.py", line 112, in forward
    nbmap, nbsizes, _ = self._compute_conv_args(in_grid, out_grid)
  File "/data/songzhenbo/.conda/envs/NKSR/lib/python3.10/site-packages/nksr/nn/modules.py", line 75, in _compute_conv_args
    nbmap = torch.nonzero(kmap != -1).contiguous()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
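
The message above notes that CUDA kernel errors can be reported asynchronously, so the line blamed in the traceback (`torch.nonzero`) may not be the call that actually failed. A minimal sketch of following the `CUDA_LAUNCH_BLOCKING=1` suggestion from inside Python is shown below. The helper name `enable_sync_cuda_errors` is hypothetical and not part of NKSR or PyTorch; the only assumption is that the variable is set before any CUDA work starts.

```python
import os


def enable_sync_cuda_errors(env=None):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given environment mapping.

    Hypothetical helper for illustration. With this variable set, CUDA
    kernel launches run synchronously, so an error is raised at the call
    that caused it rather than at some later API call.
    """
    env = os.environ if env is None else env
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


# Call this at the very top of the script, before `import torch`,
# then rerun the failing example to get a more trustworthy traceback.
enable_sync_cuda_errors()
```

Setting the variable in the shell that launches the script has the same effect. This only improves the traceback; it is not expected to make the error go away.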
@heiwang1997
Collaborator

Hey @HerculesJL, thank you for reporting this! This is definitely a bug, and I will look into it ASAP.

@smittyjaggerman

smittyjaggerman commented Jun 12, 2024

Hey @heiwang1997, has this problem been resolved, or is there a workaround for it? I am currently facing the same issue.

When using "cuda:1" in any NKSR-related script, I get the following error:

terminate called after throwing an instance of 'thrust::system::system_error'
what(): CUDA free failed: cudaErrorIllegalAddress: an illegal memory access was encountered
Aborted (core dumped)

Help would be really appreciated! :)
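
No fix is confirmed in this thread. One generic workaround that may be worth trying is to expose only the desired physical GPU to the process, so that it appears as "cuda:0" and the script can keep the device string that is reported to work. The sketch below assumes the failure comes from code that implicitly uses device 0, which the thread does not confirm. The helper name `expose_single_gpu` is hypothetical and not part of NKSR.

```python
import os


def expose_single_gpu(physical_index, env=None):
    """Restrict the process to one physical GPU via CUDA_VISIBLE_DEVICES.

    Hypothetical helper for illustration. After this, the selected GPU is
    the only one visible to CUDA, so it is addressed as "cuda:0".
    Must run before CUDA is initialized (safest: before `import torch`).
    """
    env = os.environ if env is None else env
    env["CUDA_VISIBLE_DEVICES"] = str(physical_index)
    return "cuda:0"


# Select physical GPU 1, then keep using "cuda:0" in the script.
device_str = expose_single_gpu(1)

# import torch                        # import only after the variable is set
# device = torch.device(device_str)   # refers to physical GPU 1
```

Launching the script with `CUDA_VISIBLE_DEVICES=1` set in the shell is equivalent. This has not been tested against NKSR here, so treat it as something to try rather than a known fix.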
