RuntimeError: CUDA error: an illegal memory access was encountered #1

SunHongyang10 · 2024-07-21T07:31:16Z

Hello~ Wonderful Work!

I am trying to run train_coarse.py, and I get the error in the title.

I tried to solve it, but I failed😭

Is there a problem with my virtual environment?

Vilour · 2024-07-21T18:47:40Z

Same error here. Have you solved it?

Snosixtyboo · 2024-07-21T19:02:44Z

Hi,

Unfortunately, this error is not very specific. We have seen it before in 3D Gaussian Splatting and tried our best to replicate it, but we were never able to reproduce it on any of our machines, so we never worked out how to debug it...

Could you let us know your OS / GPU (how many GPUs are in your machine)? Getting the latest NVIDIA drivers might help, but bottom line, without full access to a setup where it happens, it might be really tough to find it.

Vilour · 2024-07-21T19:24:20Z

Hi,

I'm working with Ubuntu 18.04 and one GPU (RTX 3090). I used this setup for 3D Gaussian Splatting before and it worked fine. By the way, my nvcc -V returns 11.6 and I created the virtual environment as the repo describes. Is the error related to the CUDA version? Is visibility_filter really working here? Maybe I can just comment out this line..

Vilour · 2024-07-21T19:27:55Z

The error persists even if I comment out this line..

ameuleman · 2024-07-21T19:28:42Z

Hi,
Did you install pytorch corresponding to cuda 11.x?
pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu118

Vilour · 2024-07-21T19:33:05Z

Yes, I installed PyTorch with this command. By the way, the error on my machine appeared at iteration 10040, not from the beginning.

Vilour · 2024-07-21T20:35:39Z

Hi,

I tried on another machine (Ubuntu 20.04/A6000) with the same dataset, and the error appears again at iteration 10040. I guess this issue is related to the dataset?

ameuleman · 2024-07-21T20:44:38Z

I just tried downloading SmallCity and running full_train.py. The coarse optimization went smoothly. Are we working with the same dataset?

Vilour · 2024-07-21T20:52:25Z

I was working with a dataset which I collected myself. I'm trying with SmallCity right now.

Vilour · 2024-07-21T21:00:51Z

I just tried with SmallCity and the error appeared immediately.

Snosixtyboo · 2024-07-22T03:08:16Z

@Vilour @SunHongyang10 When it fails, could you try to keep an eye on the GPU memory consumption? Is it possible that the system runs out of video memory? This should not happen on a 3090...
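One way to watch memory consumption while training runs is to poll nvidia-smi from a second terminal. The following is a minimal sketch, not part of the repository: the helper names and the 95% threshold are assumptions made for illustration, and it relies only on the standard nvidia-smi query flags.

```python
import subprocess


def parse_memory_csv(text):
    """Parse lines like '3421, 24576' into (used_mib, total_mib) tuples, one per GPU."""
    rows = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for used/total memory in MiB (needs nvidia-smi on PATH)."""
    out = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_memory_csv(out)


def near_limit(used_mib, total_mib, threshold=0.95):
    """Return True when usage is at or above the (arbitrary) threshold fraction."""
    return used_mib >= threshold * total_mib
```

Calling query_gpu_memory() in a loop with a short sleep, and printing a warning when near_limit(...) is true, would show whether usage climbs toward the card's capacity just before the crash.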

SunHongyang10 · 2024-07-22T06:14:40Z

I just tried with the small_city dataset, and it fails immediately. My device is a 3090.

anchun · 2024-07-22T07:57:57Z

Same here on Ubuntu 20.04, with the following call stack:

  File "train_coarse.py", line 190, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
  File "train_coarse.py", line 106, in training
    loss.backward()
  File "/home/anchun/software/miniconda3/envs/hierarchical_3d_gaussians/lib/python3.12/site-packages/torch/_tensor.py", line 525, in backward
    torch.autograd.backward(
  File "/home/anchun/software/miniconda3/envs/hierarchical_3d_gaussians/lib/python3.12/site-packages/torch/autograd/__init__.py", line 267, in backward
    _engine_run_backward(
  File "/home/anchun/software/miniconda3/envs/hierarchical_3d_gaussians/lib/python3.12/site-packages/torch/autograd/graph.py", line 744, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: an illegal memory access was encountered

Vilour · 2024-07-22T08:17:41Z

The video memory consumption is ok. Only takes a few gigabytes.

ameuleman · 2024-07-22T08:20:37Z

Hi,

Could you please provide cuda and nvidia driver versions?

PLUS-WAVE · 2024-07-22T08:25:28Z

I have the same issue. Here are my versions:

3070 Laptop
Driver Version: 560.70
cuda 12.1

Vilour · 2024-07-22T08:27:12Z

The driver version is 525.105.17, and torch.version.cuda returns 11.8

ameuleman · 2024-07-22T09:15:28Z

Thanks for providing details. I managed to replicate the error using nvidia/cuda:12.1.0-devel-ubuntu20.04. We will look into it.

kevintsq · 2024-07-23T00:37:21Z

Quoting @Vilour:

my nvcc -V returns 11.6 and I created the virtual environment as the repo describes

and the install command suggested by @ameuleman:

pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu118

It seems there is a version mismatch here (11.6 vs. 11.8). The CUDA version reported by nvcc -V must be the same as the CUDA version your PyTorch build targets.
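To make this comparison concrete, here is a small sketch that classifies the relationship between the toolkit version printed by nvcc -V and the CUDA version a PyTorch build reports in torch.version.cuda. The function names are hypothetical, and the sketch only reports the mismatch; whether a minor mismatch is actually harmful is not settled in this thread.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc -V` output such as '... release 11.6, V11.6.124'."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def parse_cuda_version(version):
    """Turn a string such as '11.8' (the format of torch.version.cuda) into (11, 8)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def compare_cuda(toolkit, torch_build):
    """Classify two (major, minor) tuples as 'match', 'minor mismatch' or 'major mismatch'."""
    if toolkit == torch_build:
        return "match"
    if toolkit[0] == torch_build[0]:
        return "minor mismatch"
    return "major mismatch"
```

In a real environment the inputs would come from running nvcc -V (for example through subprocess) and from torch.version.cuda; for the setup quoted above, compare_cuda((11, 6), (11, 8)) reports a minor mismatch.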

han-xiangyu · 2024-07-23T05:33:46Z

Hi,

I met the same problem. I was running on an RTX 6000 Ada with Ubuntu 24.04, CUDA 11.8, and driver version 550.90.07. Thanks for the help!

The error message is:

$ python scripts/full_train.py --project_dir dataset/example_dataset/

creating output dir: dataset/example_dataset/output
Optimizing dataset/example_dataset/output/scaffold
Output folder: dataset/example_dataset/output/scaffold [23/07 01:21:16]
Converting point3d.bin to .ply, will happen only the first time you open the scene. [23/07 01:21:16]
Reading camera 1158/1158 [23/07 01:21:17]
0 test images [23/07 01:21:17]
1158 train images [23/07 01:21:17]
Making Training Dataset [23/07 01:21:17]
Making Test Dataset [23/07 01:21:17]
Number of points at initialisation :  329992 [23/07 01:21:17]
Training progress:   0%|                                                                            | 0/30000 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/home/xiangyu/Projects/hierarchical-3d-gaussians/train_coarse.py", line 190, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
  File "/home/xiangyu/Projects/hierarchical-3d-gaussians/train_coarse.py", line 110, in training
    gaussians.max_radii2D[visibility_filter] = torch.max(gaussians.max_radii2D[visibility_filter], radii)
                                                         ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Training progress:   0%|                                                                            | 0/30000 [00:00<?, ?it/s]
Error executing train_coarse: Command 'python train_coarse.py -s dataset/example_dataset/camera_calibration/aligned --save_iterations -1 -i ../rectified/images --skybox_num 100000 --model_path dataset/example_dataset/output/scaffold --alpha_masks ../rectified/masks ' returned non-zero exit status 1.
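The log above notes that CUDA errors can be reported asynchronously and suggests passing CUDA_LAUNCH_BLOCKING=1 so the stack trace points at the kernel that actually failed. A minimal sketch of rerunning a script with that variable set follows; the helper names are made up for illustration, and the variable has to be in the environment before the training process starts.

```python
import os
import subprocess
import sys


def blocking_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA kernel launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking_launches(script_args):
    """Run `python <script_args...>` with CUDA_LAUNCH_BLOCKING=1 and return its exit code."""
    completed = subprocess.run([sys.executable, *script_args], env=blocking_env())
    return completed.returncode
```

For example, run_with_blocking_launches(["train_coarse.py", "-s", ...]) with the same arguments as the failing command would rerun the coarse training step. Expect it to run slower, since every kernel launch becomes synchronous.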

ameuleman · 2024-07-23T07:34:26Z

Hi,
We are still working on a fix. In the meantime, I ran it without issue with Ubuntu 22.04 and CUDA 12.5.
Here is the corresponding Dockerfile:

FROM nvidia/cuda:12.5.1-cudnn-devel-ubuntu22.04
ARG USER_ID=1000
ARG GROUP_ID=1000
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install -y --no-install-recommends git wget unzip bzip2 sudo build-essential ca-certificates openssh-server vim ffmpeg libsm6 libxext6 python3-opencv gcc-11 g++-11 cmake

# conda
ENV PATH=/opt/conda/bin:$PATH
RUN wget --quiet \
    https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    echo 'export PATH=/opt/conda/bin:$PATH' > /etc/profile.d/conda.sh && \
    /bin/bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda && \
    rm -rf /tmp/*

# Create the user
RUN addgroup --gid $GROUP_ID user
RUN useradd --create-home -s /bin/bash --uid $USER_ID --gid $GROUP_ID docker
RUN adduser docker sudo
RUN echo "docker ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
USER docker

# Setup hierarchical_3d_gaussians
RUN /opt/conda/bin/python -m ensurepip
RUN /opt/conda/bin/python -m pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121
RUN /opt/conda/bin/python -m pip install plyfile tqdm joblib exif scikit-learn timm==0.4.5 opencv-python==4.9.0.80 gradio_imageslider gradio==4.29.0 matplotlib

With 125.Dockerfile in hierarchical-3d-gaussians/:

DATASET_DIR=<Path to dataset>
docker build -t hierarchical_3d_gaussians125 -f 125.Dockerfile .
docker run -it --gpus=all --rm -v ${PWD}:/host -v ${DATASET_DIR}:/data --network=host --ipc=host hierarchical_3d_gaussians125 /bin/sh -c "cd /host; bash"
rm -r submodules/hierarchy-rasterizer/build submodules/simple-knn/build submodules/gaussianhierarchy/build
cd submodules/gaussianhierarchy
cmake . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j --config Release
cd ../..
/opt/conda/bin/python scripts/full_train.py --project_dir /data

PLUS-WAVE · 2024-07-23T09:54:36Z

I installed CUDA 12.5, uninstalled the original PyTorch 2.3.0, and then reinstalled the latest version of PyTorch (2.3.1). After that, I reran pip install -r requirements.txt. Now it can run normally on SmallCity.

Commands executed:

conda remove pytorch torchvision torchaudio pytorch-cuda=12.1
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt

pip install -r requirements.txt will reinstall diff_gaussian_rasterization, gaussian_hierarchy-0.0.0, and simple_knn-0.0.0.

I suspect it works because I rebuilt diff_gaussian_rasterization, gaussian_hierarchy-0.0.0, and simple_knn-0.0.0 after upgrading to CUDA 12.5. Upgrading to PyTorch 2.3.1 alone did not solve the issue; it was only resolved after running pip install -r requirements.txt.
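Since the compiled extensions seem to be the part that matters, a quick check of what is currently installed can help before and after rebuilding. This sketch uses only the standard library; it assumes the distribution names match the ones printed by pip above, and it reports presence and version only, not which CUDA toolkit each extension was compiled against.

```python
from importlib import metadata

# Names taken from the pip output quoted above; assumed to be the distribution names.
EXTENSIONS = ("diff_gaussian_rasterization", "gaussian_hierarchy", "simple_knn")


def installed_versions(names=EXTENSIONS):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Printing installed_versions() shows at a glance whether any of the three extensions is missing from the active environment.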

Snosixtyboo · 2024-07-23T19:53:48Z

There seems to be an issue associated with CUB, which is failing to compute the sum over a CUDA array, for no obvious reason. We are checking what can be done.

Snosixtyboo · 2024-07-23T23:31:00Z

There seem to be unspecified PyTorch/CUB compatibility issues on Ubuntu. We will try to figure out where they come from, or whether we can switch to a more robust alternative. In the meantime, if you can, combining PyTorch built for CUDA 12.1 with a CUDA Toolkit 12.5 installation (yes, this should be fine, minor version mismatches are allowed) seems like a good choice on Ubuntu, judging by the Docker setup above.

Linkersem · 2024-07-24T03:29:12Z

Hi, I built a Docker image from the provided Dockerfile, modified to match my graphics driver, to run the code. This is my Dockerfile:

FROM nvcr.io/nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04
ARG USER_ID=1000
ARG GROUP_ID=1000
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install -y --no-install-recommends git wget unzip bzip2 sudo build-essential ca-certificates openssh-server vim ffmpeg libsm6 libxext6 python3-opencv gcc-11 g++-11 cmake

# conda
ENV PATH=/opt/conda/bin:$PATH
RUN wget --quiet \
    https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    echo 'export PATH=/opt/conda/bin:$PATH' > /etc/profile.d/conda.sh && \
    /bin/bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda && \
    rm -rf /tmp/*

# Create the user
RUN addgroup --gid $GROUP_ID user
RUN useradd --create-home -s /bin/bash --uid $USER_ID --gid $GROUP_ID docker
RUN adduser docker sudo
RUN echo "docker ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
# USER docker

# Setup hierarchical_3d_gaussians
RUN /opt/conda/bin/python -m ensurepip
RUN /opt/conda/bin/python -m pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121
RUN /opt/conda/bin/python -m pip install plyfile tqdm joblib exif scikit-learn timm==0.4.5 opencv-python==4.9.0.80 gradio_imageslider gradio==4.29.0 matplotlib

but it keeps getting stuck here

Linkersem · 2024-07-24T05:48:40Z

Well, same question, after about half an hour of waiting.

Snosixtyboo · 2024-07-24T06:45:45Z

@Linkersem

Hi,
would you mind trying to make a new Docker image, but with CUDA 12.5.1? There seem to be issues with CUDA 12.1.

Linkersem · 2024-07-24T06:55:06Z

Hi, I'm sorry, but this is a bit difficult for me. These operations run on a shared workstation, and if I modify the graphics driver and CUDA (the maximum available version is 12.2), it may affect other people.

Linkersem · 2024-07-25T10:21:31Z

Hello, it doesn't seem to be necessary to run in a CUDA 12.5 environment. I have a machine with CUDA 12.3 and PyTorch 2.3.0 that works fine. Hope that helps.

SunHongyang10 · 2024-07-27T11:30:33Z

So far, CUDA 12.3 + PyTorch 2.3.0 works.

kevintsq · 2024-07-28T11:47:48Z

Currently CUDA 12.4 + PyTorch 2.4 works on Windows.

ForeverAurorak · 2024-07-29T08:43:18Z

Hi, I solved the problem.
In submodules/gaussianhierarchy/setup.py, I changed "extra_compile_args" to
{"cxx": ["-Xcompiler", "-fno-gnu-unique","-I" + os.path.join(os.path.dirname(os.path.abspath(file)), "dependencies/eigen/")]}
In submodules/hierarchy-rasterizer/setup.py, I changed "extra_compile_args" to
{"nvcc": ["-Xcompiler", "-fno-gnu-unique","-I" + os.path.join(os.path.dirname(os.path.abspath(file)), "third_party/glm/")]})
Then I reinstalled the hierarchy-rasterizer with "pip install submodules/hierarchy-rasterizer".
Related issues:
graphdeco-inria/gaussian-splatting#41
graphdeco-inria/diff-gaussian-rasterization#10

kevintsq · 2024-07-29T10:26:49Z

Yes it works! Just a caveat, -fno-gnu-unique can only be used on Linux and -Xcompiler can only be passed to nvcc.

GoroYeh-HRI · 2024-07-30T18:15:08Z

I got the same error here (Ubuntu 18.04, nvcc -V 11.6)
I installed using command:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

After applying @ForeverAurorak's solution, it works and it's training now! Thank you so much!!!

rowellz · 2024-08-01T16:13:08Z

Thank you to everyone in this thread for their awesome contributions, especially @ameuleman for providing a working Dockerfile. I was able to take it and put together a working docker-compose environment. Everything appears to be working but I still haven't figured out a way to connect to the remote viewer. If anyone is interested in running H3DGS via docker compose, here is the link to the complete diff: https://github.com/graphdeco-inria/hierarchical-3d-gaussians/pull/31/files

BTW I am running a RTX 3060 12GB with CUDA 12.3 installed on my host machine

Gaaaavin · 2024-08-02T19:33:13Z

Regarding @ForeverAurorak's fix: I think it should be __file__ instead of file in the two lines.

Regarding @kevintsq's caveat that -fno-gnu-unique can only be used on Linux and -Xcompiler can only be passed to nvcc: this is a good point. In conclusion, the following code modification works for me on Ubuntu:
line 29 in submodules/hierarchy-rasterizer/setup.py:

extra_compile_args={"nvcc": ["-Xcompiler", "-fno-gnu-unique","-I" + os.path.join(os.path.dirname(os.path.abspath(__file__)), "third_party/glm/")]})

line 29 in submodules/gaussianhierarchy/setup.py:

extra_compile_args={"cxx": ["-fno-gnu-unique","-I" + os.path.join(os.path.dirname(os.path.abspath(__file__)), "dependencies/eigen/")]}

After modifying the two files above and reinstalling via pip install -r requirements.txt, I got it to run smoothly.
I'm using an Ubuntu 20.04 machine with PyTorch 2.3.0 (built for CUDA 12.1) and a CUDA 12.1 runtime (nvcc --version reports 12.1).
Note that this doesn't work if I'm using a CUDA 12.5 runtime as instructed above.
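The two caveats from this thread (-fno-gnu-unique is Linux-only, and -Xcompiler is only understood by nvcc) can be folded into a small helper, so the same setup.py would not pass unsupported flags on other platforms. This is a sketch under those assumptions, not the fix that was pushed to the repository; build_compile_args is a hypothetical name.

```python
import sys


def build_compile_args(compiler, include_dir, platform=sys.platform):
    """Build an extra_compile_args dict for one compiler key ("nvcc" or "cxx").

    On Linux, -fno-gnu-unique is added: directly for the host compiler,
    and forwarded through -Xcompiler for nvcc. Elsewhere only the include
    path is passed.
    """
    if compiler not in ("nvcc", "cxx"):
        raise ValueError("compiler must be 'nvcc' or 'cxx'")
    flags = []
    if platform.startswith("linux"):
        if compiler == "nvcc":
            flags += ["-Xcompiler", "-fno-gnu-unique"]
        else:
            flags.append("-fno-gnu-unique")
    flags.append("-I" + include_dir)
    return {compiler: flags}
```

Inside a setup.py, include_dir would be built as in the lines above, for example os.path.join(os.path.dirname(os.path.abspath(__file__)), "third_party/glm/"), and the result passed as extra_compile_args.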

alanvinx · 2024-08-07T14:50:55Z

Hi, thank you for your feedback. I pushed the fix to https://github.com/graphdeco-inria/hierarchy-rasterizer; please update your rasterizer using git submodule update --remote.

Regarding gaussianhierarchy, I could run full_train.py successfully without modifying submodules/gaussianhierarchy/setup.py, using the Dockerfile and commands that @ameuleman posted above (2024-07-23), with CUDA 11.8 and 12.1.

haofengsiji · 2024-08-28T08:52:45Z

@Gaaaavin's modification above fixed my problem, thanks!!!

pytorch 2.3.0+cu121, nvcc 12.1

alancneves · 2024-10-13T16:37:16Z

@Gaaaavin's modification above fixed my problem on Ubuntu 22.04 + CUDA 11.8 + PyTorch 2.3.0 in Docker!
