the environment requirements of this code #1

Closed
Apocalypse2403 opened this issue Dec 6, 2021 · 12 comments

Comments

@Apocalypse2403

Hi Benedikt,
I have read your paper. Your work is great, and thanks for releasing your code!
When I run your code, there seem to be some conflicts between it and my environment, especially in the Chamfer Distance part.

Would you mind sharing the requirements for your code?
I'm using an RTX 3090 GPU, CUDA 11.3, and PyTorch 1.10.

Thanks

@qifang-robotics

qifang-robotics commented Dec 6, 2021

Hi Apocalypse2403,

I recently built the whole project successfully on an RTX 3090 machine with CUDA 11.1 and Ubuntu 20.04.
After following the instructions in the setup doc, I installed torch 1.8.0 with CUDA 11.1 support using

pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html

Also, make sure your g++/gcc version is not higher than 8, since the project won't build successfully with newer g++/gcc versions.

I hope this will help you.

@benemer
Member

benemer commented Dec 6, 2021

Hi Apocalypse2403,

Thank you for your interest in our work! We use a Quadro RTX 5000 with CUDA 10.2, Ubuntu 20.04.

Unfortunately, poetry does not (yet) support torch+cuda build variants, see here and here. However, as n0tva1id pointed out, you can manually pip install PyTorch with the CUDA version you need. In your case, this should be

pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html

after running the poetry install command.

I hope that helped.
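
For reference, a quick way to check that an installed wheel matches the local toolkit is to compare the `+cuNNN` tag in `torch.__version__` with the release reported by `nvcc --version`. The sketch below is only an illustration: the helper names `split_torch_version` and `cuda_matches` are made up here, and it assumes the last digit of the tag is the CUDA minor version (for example `cu113` means 11.3).

```python
import re


def split_torch_version(version):
    """Split a PyTorch version string such as '1.8.0+cu111'.

    Returns (release, cuda), where cuda is a dotted string like '11.1',
    or None when there is no '+cuNNN' local tag (for example CPU-only builds).
    """
    release, _, local = version.partition("+")
    match = re.fullmatch(r"cu(\d+)(\d)", local)
    if match is None:
        return release, None
    # Assumption: the last digit of the tag is the minor version (cu102 -> 10.2).
    return release, f"{match.group(1)}.{match.group(2)}"


def cuda_matches(torch_version, nvcc_release):
    """Return True when the wheel's CUDA tag equals the local nvcc release."""
    _, cuda = split_torch_version(torch_version)
    return cuda is not None and cuda == nvcc_release


if __name__ == "__main__":
    # Values taken from this thread: the suggested wheel and a CUDA 11.1 toolkit.
    print(split_torch_version("1.10.0+cu113"))
    print(cuda_matches("1.8.0+cu111", "11.1"))
```

In a real environment the first argument would come from `torch.__version__` and the second from the "release" field of `nvcc --version`.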

@Apocalypse2403
Author

@qifang-robotics, thanks for your reply. I have already set up the torch 1.8.0 + CUDA 11.1 environment, but when I run train.py I get an error that seems to be related to the gcc version:

ImportError: /home/jz/.cache/torch_extensions/cd/cd.so: undefined symbol: _ZNK2at6Tensor4sizeEl

My gcc/g++ version is 7.5.0. Do you have any idea what causes this?

@Apocalypse2403
Author

@benemer, thanks for your reply!

I sorted out the runtime environment, but hit another issue when building the C++ code for the Chamfer Distance part.

Traceback (most recent call last):
  File "/home/jz/point-cloud-prediction-main/pcf/train.py", line 14, in <module>
    from pcf.models.TCNet import TCNet
  File "/home/jz/point-cloud-prediction-main/pcf/models/TCNet.py", line 9, in <module>
    from pcf.models.base import BasePredictionModel
  File "/home/jz/point-cloud-prediction-main/pcf/models/base.py", line 13, in <module>
    from pcf.models.loss import Loss
  File "/home/jz/point-cloud-prediction-main/pcf/models/loss.py", line 9, in <module>
    from pyTorchChamferDistance.chamfer_distance import ChamferDistance
  File "/home/jz/point-cloud-prediction-main/pcf/pyTorchChamferDistance/chamfer_distance/__init__.py", line 1, in <module>
    from .chamfer_distance import ChamferDistance
  File "/home/jz/point-cloud-prediction-main/pcf/pyTorchChamferDistance/chamfer_distance/chamfer_distance.py", line 6, in <module>
    cd = load(name="cd",
  File "/home/jz/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1079, in load
    return _jit_compile(
  File "/home/jz/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1317, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "/home/jz/anaconda3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1703, in _import_module_from_library
    return imp.load_module(module_name, file, path, description) # type: ignore
  File "/home/jz/anaconda3/lib/python3.8/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/home/jz/anaconda3/lib/python3.8/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: /home/jz/.cache/torch_extensions/cd/cd.so: undefined symbol: _ZNK2at6Tensor4sizeEl

This looks like a gcc/g++ version problem. My gcc/g++ version is 7.5.0, but I also tried other versions, such as 6.5 and 5.5, and it still didn't work.

Do you have any idea what is wrong? Thanks again.
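
For reference, the missing symbol in the ImportError is an Itanium-ABI mangled C++ name and can be decoded to see which function the loader could not find. The sketch below is a deliberately tiny demangler written for this example (`demangle_simple` and `BUILTIN_TYPES` are made-up names). It covers only plain nested names with an optional const qualifier and builtin parameter types; a tool such as `c++filt` handles the full scheme.

```python
import re

# Itanium C++ ABI codes for a few builtin parameter types (subset only).
BUILTIN_TYPES = {
    "b": "bool", "c": "char", "i": "int", "j": "unsigned int",
    "l": "long", "m": "unsigned long", "f": "float", "d": "double",
}


def demangle_simple(symbol):
    """Demangle a simple nested symbol such as _ZNK2at6Tensor4sizeEl.

    Supports only: the _Z prefix, an N...E nested name, an optional K
    (const member function), and builtin parameter types.
    Anything else raises ValueError.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("only nested names (_ZN...E) are supported")
    pos = 3
    is_const = symbol.startswith("K", pos)
    if is_const:
        pos += 1
    parts = []
    # Each name component is encoded as <decimal length><identifier>.
    while not symbol.startswith("E", pos):
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            raise ValueError(f"unsupported construct at offset {pos}")
        pos += match.end()
        length = int(match.group())
        parts.append(symbol[pos:pos + length])
        pos += length
    codes = symbol[pos + 1:]  # everything after the closing 'E'
    if codes == "v":  # a lone 'v' means an empty parameter list
        codes = ""
    params = []
    for code in codes:
        if code not in BUILTIN_TYPES:
            raise ValueError(f"unsupported parameter type code {code!r}")
        params.append(BUILTIN_TYPES[code])
    text = "::".join(parts) + "(" + ", ".join(params) + ")"
    return text + " const" if is_const else text


if __name__ == "__main__":
    print(demangle_simple("_ZNK2at6Tensor4sizeEl"))
```

Applied to the symbol from the traceback, this yields `at::Tensor::size(long) const`, a core PyTorch tensor method. A missing core symbol like this typically suggests that the cached `cd.so` was compiled against a different PyTorch build than the one now installed, rather than a gcc problem as such, so removing `~/.cache/torch_extensions/cd` and letting the extension rebuild is a reasonable first thing to try.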

@benemer
Member

benemer commented Dec 7, 2021

I have now tested on two machines, one with Ubuntu 18.04 and gcc/g++ 7.5.0 and one with Ubuntu 20.04 and gcc/g++ 9.3.0, and both worked. Can you try reinstalling the poetry environment?

To do so, first remove the existing virtual environment:

rm -rf ~/.cache/pypoetry/virtualenvs/point-cloud-prediction-*

Then clear poetry's PyPI cache by running

poetry cache clear pypi --all

and remove the compiled Chamfer distance files from the cache:

rm -rf ~/.cache/torch_extensions/cd

Finally, reinstall with

poetry install

and activate the environment by

poetry shell
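
For reference, the step that removes the compiled Chamfer distance files can also be scripted, which is handy when switching PyTorch versions often. This is a minimal sketch under stated assumptions: `clear_cached_extension` is a hypothetical helper, the default cache root is the `~/.cache/torch_extensions` path seen in this thread, and the `TORCH_EXTENSIONS_DIR` environment variable is honored when set. Newer PyTorch versions may nest builds one directory deeper, so pass `cache_root` explicitly if the layout differs.

```python
import os
import shutil
from pathlib import Path


def clear_cached_extension(name, cache_root=None):
    """Delete the cached JIT build directory of one extension.

    Returns True when a directory was removed and False when none existed.
    """
    if Path(name).name != name:
        # Refuse names like '../x' so nothing outside cache_root is touched.
        raise ValueError("name must be a plain directory name")
    if cache_root is None:
        # Assumption: default location as seen in this thread.
        cache_root = os.environ.get(
            "TORCH_EXTENSIONS_DIR", Path.home() / ".cache" / "torch_extensions"
        )
    build_dir = Path(cache_root) / name
    if not build_dir.is_dir():
        return False
    shutil.rmtree(build_dir)
    return True
```

Calling `clear_cached_extension("cd")` would correspond to the `rm -rf ~/.cache/torch_extensions/cd` command above; the extension is then rebuilt on the next import.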

@benemer
Member

benemer commented Dec 7, 2021

For further diagnostics, please run this Python diagnostic script (modified from here) in your poetry environment and post the result:

import sys
import os
import platform
import subprocess


def parse_nvidia_smi():
    sp = subprocess.Popen(
        ["nvidia-smi", "-q"], stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    out_dict = dict()
    for item in sp.communicate()[0].decode("utf-8").split("\n"):
        if item.count(":") == 1:
            key, val = [i.strip() for i in item.split(":")]
            out_dict[key] = val
    return out_dict


def print_diagnostics():
    print("==========System==========")
    print(platform.platform())
    os.system("cat /etc/lsb-release")
    print(sys.version)

    print("==========Pytorch==========")
    try:
        import torch

        print(torch.__version__)
        print(f"torch.cuda.is_available(): {torch.cuda.is_available()}")
    except ImportError:
        print("torch not installed")

    print("==========NVIDIA-SMI==========")
    os.system("which nvidia-smi")
    for k, v in parse_nvidia_smi().items():
        if "version" in k.lower():
            print(k, v)

    print("==========NVCC==========")
    os.system("which nvcc")
    os.system("nvcc --version")

    print("==========CC==========")
    CC = "c++"
    if "CC" in os.environ or "CXX" in os.environ:
        # distutils only checks CC not CXX
        if "CXX" in os.environ:
            os.environ["CC"] = os.environ["CXX"]
            CC = os.environ["CXX"]
        else:
            CC = os.environ["CC"]
        print(f"CC={CC}")
    os.system(f"which {CC}")
    os.system(f"{CC} --version")

    print("==========ChamferDistance==========")
    try:
        from pyTorchChamferDistance.chamfer_distance import ChamferDistance
        print("ChamferDistance is installed")
    except ImportError:
        print("ChamferDistance not installed")


if __name__ == "__main__":
    print_diagnostics()

@Apocalypse2403
Author

Here's the result:

==========System==========
Linux-4.15.0-162-generic-x86_64-with-glibc2.10
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0]
==========Pytorch==========
1.8.0+cu111
torch.cuda.is_available(): True
==========NVIDIA-SMI==========
/usr/bin/nvidia-smi
Driver Version 470.74
CUDA Version 11.4
VBIOS Version 94.02.26.40.92
Image Version N/A
GSP Firmware Version N/A
==========NVCC==========
/usr/local/cuda/bin/nvcc
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
==========CC==========
/usr/bin/c++
c++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

==========ChamferDistance==========
ChamferDistance is installed

@Apocalypse2403
Author

After that, I ran train.py and got:

fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
  File "pcf/train.py", line 104, in <module>
    subprocess.check_output(["git", "rev-parse", "--short", "HEAD"]).strip()
  File "/home/jz/anaconda3/lib/python3.8/subprocess.py", line 411, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/home/jz/anaconda3/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'rev-parse', '--short', 'HEAD']' returned non-zero exit status 128.

Thanks for your help.

@Apocalypse2403
Author

@qifang-robotics, after setting up the environment as you did, I ran train.py and got:

fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
  File "pcf/train.py", line 104, in <module>
    subprocess.check_output(["git", "rev-parse", "--short", "HEAD"]).strip()
  File "/home/jz/anaconda3/lib/python3.8/subprocess.py", line 411, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/home/jz/anaconda3/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'rev-parse', '--short', 'HEAD']' returned non-zero exit status 128.

Do you have any idea what causes this?

@benemer
Member

benemer commented Dec 7, 2021

Since the diagnostics script returns

==========ChamferDistance==========
ChamferDistance is installed

the problem with the Chamfer distance submodule seems to be resolved. Can you confirm that?

Your recent error message

fatal: not a git repository (or any of the parent directories): .git

indicates that your point-cloud-prediction directory is not a git repository, so the git commit ID cannot be retrieved for the log ID. Did you clone the repo as described in the README?

Can you run git status in the point-cloud-prediction directory?
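
For reference, code that tags logs with the commit ID can be made tolerant of a missing repository. The sketch below is not the project's train.py; `get_git_commit_id` is a hypothetical helper that wraps the same `git rev-parse --short HEAD` call seen in the traceback and falls back to a placeholder instead of raising.

```python
import subprocess


def get_git_commit_id(cwd=None, default="unknown"):
    """Return the short commit hash of the repo at cwd, or default on failure."""
    try:
        output = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=cwd,
            stderr=subprocess.DEVNULL,  # hide git's "fatal: not a git repository"
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        # CalledProcessError: git exited non-zero, e.g. status 128 outside a repo.
        # FileNotFoundError: the git executable (or the cwd) does not exist.
        return default
    return output.decode("utf-8").strip()


if __name__ == "__main__":
    print(get_git_commit_id())
```

Whether a fallback is desirable depends on how much the experiment logs rely on an exact commit ID; cloning the repository as the README describes avoids the question entirely.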

@Apocalypse2403
Author

Hi benemer,

I found where my problems came from. I had just downloaded the code to my PC, as I used to do. Now I have cloned it as described in the README, and the code runs. It is only processing the sequences so far, but at least the environment issue is solved. I may need your help with other problems in the code later.

I also have a question about the paper. I noticed that only the Chamfer Distance and the L1 loss are used to evaluate the prediction results. Are there any other metrics for comparing the ground truth and the prediction?

Thanks for your help.

@benemer
Member

benemer commented Dec 8, 2021

I am happy to hear that the environment issues are solved. I will now close this issue. If you have other questions not related to the environment, please open a new issue so that topics don't get mixed up.

Regarding your question: Since the pixel-wise L1 loss on the range image is not suitable for 3D point-based prediction methods, we only use the Chamfer distance for comparison. Another metric would be the Earth mover's distance, which is not evaluated in our paper.
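
For reference, the Chamfer distance between two point sets can be written in a few lines. The sketch below is a brute-force, pure-Python illustration of one common convention: the mean squared distance from each point to its nearest neighbor in the other set, summed over both directions. The function names are made up for this example, and the paper or the pyTorchChamferDistance CUDA module may normalize differently (for example, unsquared distances or sums instead of means).

```python
def _mean_nearest_sq_dist(source, target):
    """Mean over source points of the squared distance to the nearest target point."""
    total = 0.0
    for p in source:
        total += min(sum((pi - qi) ** 2 for pi, qi in zip(p, q)) for q in target)
    return total / len(source)


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty sequences of points."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    return _mean_nearest_sq_dist(points_a, points_b) + _mean_nearest_sq_dist(
        points_b, points_a
    )


if __name__ == "__main__":
    ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    prediction = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0)]
    print(chamfer_distance(ground_truth, prediction))
```

This runs in O(N·M) time, which is fine for small examples but far too slow for full LiDAR scans; that is why GPU implementations are used in practice.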

@benemer closed this as completed Dec 8, 2021