Error Building torch_tvm [NGC Container] #112

SrivastavaKshitij · 2019-09-06T15:03:50Z

I am trying to build torch_tvm inside the PyTorch NGC container [19.08-py3]. However, I am encountering the same error as in #77.

CMakeFiles/_torch_tvm.dir/build.make:218: recipe for target 'CMakeFiles/_torch_tvm.dir/torch_tvm/fusion_pass.cpp.o' failed
make[2]: *** [CMakeFiles/_torch_tvm.dir/torch_tvm/fusion_pass.cpp.o] Error 1
In file included from /tvm/torch_tvm/compiler.h:13:0,
                 from /tvm/torch_tvm/register.cpp:8:
/tvm/torch_tvm/memory_utils.h: In member function ‘void torch_tvm::utils::DLManagedTensorDeleter::operator()(DLManagedTensor*)’:
/tvm/torch_tvm/memory_utils.h:22:24: warning: deleting ‘void*’ is undefined [-Wdelete-incomplete]
       delete dl_tensor.data;
                        ^~~~
CMakeFiles/Makefile2:73: recipe for target 'CMakeFiles/_torch_tvm.dir/all' failed
make[1]: *** [CMakeFiles/_torch_tvm.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2
Traceback (most recent call last):
  File "setup.py", line 273, in <module>
    url='https://github.com/pytorch/tvm',
  File "/opt/conda/lib/python3.6/site-packages/setuptools/__init__.py", line 145, in setup
    return distutils.core.setup(**attrs)
  File "/opt/conda/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/opt/conda/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/opt/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "setup.py", line 203, in run
    setuptools.command.install.install.run(self)
  File "/opt/conda/lib/python3.6/site-packages/setuptools/command/install.py", line 65, in run
    orig.install.run(self)
  File "/opt/conda/lib/python3.6/distutils/command/install.py", line 545, in run
    self.run_command('build')
  File "/opt/conda/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/opt/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/opt/conda/lib/python3.6/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/opt/conda/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/opt/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "setup.py", line 187, in run
    self.run_command('cmake_build')
  File "/opt/conda/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/opt/conda/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "setup.py", line 176, in run
    self._run_build()
  File "setup.py", line 165, in _run_build
    subprocess.check_call(build_args)
  File "/opt/conda/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/cmake', '--build', '.', '--', '-j', '12']' returned non-zero exit status 2.

I tried the different methods described here, here and here, but I haven't had any success.

How can this issue be fixed?

SrivastavaKshitij · 2019-09-06T15:05:07Z

I downloaded LLVM using wget http://releases.llvm.org/8.0.0/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-16.04.tar.xz and made sure it's on the PATH when building torch_tvm.
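
One way to confirm that LLVM really is visible to the build is to look up llvm-config from the same environment that runs setup.py. The snippet below is a minimal sketch, not part of torch_tvm: the helper names find_tool and tool_version are made up for illustration, and it only assumes that llvm-config accepts --version.

```python
import shutil
import subprocess


def find_tool(name):
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def tool_version(path):
    """Run `<path> --version` and return its trimmed output."""
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # Python 3.6 compatible spelling of text=True
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    path = find_tool("llvm-config")
    if path is None:
        print("llvm-config not found on PATH")
    else:
        print(path, tool_version(path))
```

If this reports that llvm-config is missing, the PATH export or symlink probably did not take effect in the shell that launches the build.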

SrivastavaKshitij · 2019-09-10T20:05:59Z

I tried the nightly build container from PyTorch's Docker Hub, pytorch/pytorch:nightly-devel-cuda10.0-cudnn7, and I encounter the same errors.

kimishpatel · 2019-09-11T14:23:26Z

Can you paste repro instructions? It is not clear where the error is coming from. The memory_utils.h output is only a warning, and it does not seem that treating warnings as errors is enabled either.

SrivastavaKshitij · 2019-09-11T16:57:50Z

Repro instructions:

docker pull nvcr.io/nvidia/pytorch:19.06-py3
docker_image=nvcr.io/nvidia/pytorch:19.06-py3
docker run -e NVIDIA_VISIBLE_DEVICES=0 --gpus 0 -it --shm-size=1g --ulimit memlock=-1  --rm  -v $PWD:/workspace/work $docker_image

[Inside the container] I go to the base directory: cd /
wget http://releases.llvm.org/8.0.0/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-16.04.tar.xz
tar -xf clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-16.04.tar.xz
export PATH=$PATH:/clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-16.04/bin/
ln -s /clang+llvm-8.0.0-x86_64-linux-gnu-ubuntu-16.04/bin/llvm-config /usr/bin/llvm-config
git clone --recursive https://github.com/pytorch/tvm.git
cd tvm/
python setup.py install --cmake

I have attached the full output:

build.txt

kimishpatel · 2019-09-11T17:12:18Z

@SrivastavaKshitij, the error seems to be coming from a change in the PyTorch API.

/tvm/torch_tvm/compiler.cpp: In static member function ‘static tvm::relay::Var TVMCompiler::convertToRelay(torch::jit::Value*, TVMContext)’:
/tvm/torch_tvm/compiler.cpp:130:39: error: ‘using element_type = struct c10::TensorType {aka struct c10::TensorType}’ has no member named ‘device’
     auto optional_device_type = pt_t->device();
                                       ^~~~~~

Maybe try with the latest release?
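
With a parallel build (-j 12) the real compiler error can be buried between warnings and make's own "Error 1" lines. The following is a rough sketch of filtering a saved log down to compiler diagnostics only. It assumes GCC/Clang style "file:line:col: error: message" lines; the function name compiler_errors is hypothetical, and the sample lines are copied from this thread.

```python
import re

# Matches GCC/Clang diagnostics such as "/path/file.cpp:130:39: error: message".
# Lines like "make[2]: *** [...] Error 1" and "warning:" lines do not match.
ERROR_RE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?P<col>\d+): (?:fatal )?error: (?P<msg>.*)$"
)

SAMPLE = "\n".join([
    "/tvm/torch_tvm/memory_utils.h:22:24: warning: deleting ‘void*’ is undefined [-Wdelete-incomplete]",
    "make[2]: *** [CMakeFiles/_torch_tvm.dir/torch_tvm/fusion_pass.cpp.o] Error 1",
    "/tvm/torch_tvm/compiler.cpp:130:39: error: ‘using element_type = struct c10::TensorType "
    "{aka struct c10::TensorType}’ has no member named ‘device’",
])


def compiler_errors(log_text):
    """Return (file, line, message) for every compiler error line in the log."""
    found = []
    for line in log_text.splitlines():
        match = ERROR_RE.match(line)
        if match:
            found.append((match.group("file"), int(match.group("line")), match.group("msg")))
    return found


if __name__ == "__main__":
    for path, lineno, msg in compiler_errors(SAMPLE):
        print(path, lineno, msg)
```

To run it against a saved log such as the attached build.txt, read the file into a string and pass it to compiler_errors in place of SAMPLE.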

kimishpatel · 2019-09-11T17:12:40Z

@bwasti ^^

SrivastavaKshitij · 2019-09-11T17:29:59Z

@kimishpatel: I tried the latest NGC container [19.08-py3] and have the same error.

SrivastavaKshitij · 2019-09-16T17:31:07Z

I was wondering if there is any update?

bwasti · 2019-09-17T16:10:16Z

I'm not entirely sure what version of PT NGC containers are shipping, but we've kept this repo up to date with PyTorch's master branch. Would you be able to try building PyTorch from source first? There is an API mismatch in the build that indicates you are using too old a version of PT.

SrivastavaKshitij · 2019-09-17T19:06:04Z

I have to try torch_tvm on different GPUs in different workstations, so the feasible way for me is to build one Docker image and pass it around. The latest Docker image from PyTorch on Docker Hub was released 4 days ago. I used the 1.2-cuda10.0-cudnn7-devel tag and I still get the same error.

bwasti · 2019-09-17T20:09:40Z

That image is shipped with PT 1.2, which is unfortunately not compatible with torch_tvm. Can you build a Docker image with PT built from source with a recent master checkout instead?
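
Before starting a long torch_tvm build, it can help to check whether the PyTorch in the image is a tagged release or a source/nightly build. The sketch below rests on an assumption about version strings: release wheels report a plain "X.Y.Z" (optionally with a local tag like "+cu100"), while nightly and source builds carry a ".dev<date>" or "a0+<hash>" marker. The function names are hypothetical, and a non-release version does not by itself prove the checkout is recent enough.

```python
import re

# Assumption: a plain "X.Y.Z" (optionally "+cuNNN" or "+cpu") means a tagged
# release; nightly builds look like "1.3.0.dev20190917" and source builds
# like "1.3.0a0+ee77ccb".
_RELEASE_RE = re.compile(r"^\d+\.\d+\.\d+(\+(cu\d+|cpu))?$")


def looks_like_release(version):
    """True if the version string looks like a tagged PyTorch release."""
    return bool(_RELEASE_RE.match(str(version)))


def installed_torch_version():
    """Return torch.__version__ as a string, or None if torch is not importable."""
    try:
        import torch
    except ImportError:
        return None
    return str(torch.__version__)


if __name__ == "__main__":
    version = installed_torch_version()
    if version is None:
        print("PyTorch is not installed in this environment")
    elif looks_like_release(version):
        print("PyTorch", version, "looks like a release build; torch_tvm may not compile against it")
    else:
        print("PyTorch", version, "looks like a nightly or source build")
```

The check is only a quick filter for the situation described above, where the image ships a tagged release while torch_tvm tracks PyTorch master.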

SrivastavaKshitij · 2019-09-18T00:45:35Z

Hey @bwasti: I was able to create a Docker image as you suggested. It works. Here are the steps if anybody wants to install torch_tvm inside a container.

Also, is it possible to package torch_tvm as part of the PyTorch container in the future? Reason: it's a very cumbersome process to install torch_tvm inside a container, phew!

doublejtoh · 2019-12-19T07:26:56Z

Hi @SrivastavaKshitij,
Thanks for your steps for installing torch_tvm. Following them, I installed torch_tvm successfully,

but I got the import error below, the same one you ran into previously.

Can you tell me the exact version of PyTorch you built?

SrivastavaKshitij · 2019-12-19T21:29:14Z

I did it many months ago, but I think it was PyTorch 1.2 from master.

SrivastavaKshitij closed this as completed Oct 1, 2019
