ImportError: cannot import name 'tutel_custom_kernel' from 'tutel.impls.jit_compiler' #198

Open
zhaojiancheng007 opened this issue Mar 30, 2023 · 12 comments

Comments

@zhaojiancheng007

No description provided.

@ghostplant
Contributor

This is usually caused by an environment issue in which PyTorch fails to find the CUDA SDK.
Can you post the log of the installation command below?

python3 -m pip install --verbose --user --upgrade git+https://github.com/microsoft/tutel@main
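Alongside the install log, it can help to record which CUDA version the installed PyTorch was built for and which toolkit path its extension builder would pick up. The following is a minimal sketch, not part of tutel; the helper name `describe_torch_cuda` is hypothetical, and the function only reports what PyTorch itself exposes:

```python
def describe_torch_cuda():
    """Collect the CUDA-related facts PyTorch reports, or the import error if it cannot load."""
    info = {"torch": None, "built_for_cuda": None, "cuda_home": None,
            "cuda_available": None, "error": None}
    try:
        import torch
        from torch.utils.cpp_extension import CUDA_HOME
    except (ImportError, OSError) as exc:
        info["error"] = str(exc)  # e.g. torch is missing, or a CUDA shared library was not found
        return info
    info["torch"] = torch.__version__
    info["built_for_cuda"] = torch.version.cuda  # CUDA version the PyTorch build targets
    info["cuda_home"] = CUDA_HOME                # toolkit path used when building extensions
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If `built_for_cuda` and the toolkit under `cuda_home` differ, that mismatch is worth including in the report.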

@zhaojiancheng007
Author

Using pip 23.0.1 from /home/ubuntu/anaconda3/envs/snerf/lib/python3.8/site-packages/pip (python 3.8)
Looking in indexes: https://mirrors.bfsu.edu.cn/pypi/web/simple/
Collecting git+https://github.com/microsoft/tutel@main
Cloning https://github.com/microsoft/tutel (to revision main) to /tmp/pip-req-build-f3vo8y7s
Running command git version
git version 2.25.1
Running command git clone --filter=blob:none https://github.com/microsoft/tutel /tmp/pip-req-build-f3vo8y7s
Cloning into '/tmp/pip-req-build-f3vo8y7s'...
Updating files: 100% (61/61), done.
Running command git show-ref main
1456b49 refs/heads/main
1456b49 refs/remotes/origin/main
Running command git symbolic-ref -q HEAD
refs/heads/main
Resolved https://github.com/microsoft/tutel to commit 1456b49
Running command git rev-parse HEAD
1456b49
Running command python setup.py egg_info
running egg_info
creating /tmp/pip-pip-egg-info-aiosqnkd/tutel.egg-info
writing manifest file '/tmp/pip-pip-egg-info-aiosqnkd/tutel.egg-info/SOURCES.txt'
writing manifest file '/tmp/pip-pip-egg-info-aiosqnkd/tutel.egg-info/SOURCES.txt'
Preparing metadata (setup.py) ... done
Building wheels for collected packages: tutel
Running command git rev-parse HEAD
1456b49
Running command python setup.py bdist_wheel
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.8
creating build/lib.linux-x86_64-3.8/tutel
copying tutel/system.py -> build/lib.linux-x86_64-3.8/tutel
copying tutel/net.py -> build/lib.linux-x86_64-3.8/tutel
copying tutel/jit.py -> build/lib.linux-x86_64-3.8/tutel
copying tutel/moe.py -> build/lib.linux-x86_64-3.8/tutel
copying tutel/__init__.py -> build/lib.linux-x86_64-3.8/tutel
creating build/lib.linux-x86_64-3.8/tutel/jit_kernels
copying tutel/jit_kernels/gating.py -> build/lib.linux-x86_64-3.8/tutel/jit_kernels
copying tutel/jit_kernels/sparse.py -> build/lib.linux-x86_64-3.8/tutel/jit_kernels
copying tutel/jit_kernels/__init__.py -> build/lib.linux-x86_64-3.8/tutel/jit_kernels
creating build/lib.linux-x86_64-3.8/tutel/parted
copying tutel/parted/patterns.py -> build/lib.linux-x86_64-3.8/tutel/parted
copying tutel/parted/spmdx.py -> build/lib.linux-x86_64-3.8/tutel/parted
copying tutel/parted/__init__.py -> build/lib.linux-x86_64-3.8/tutel/parted
copying tutel/parted/solver.py -> build/lib.linux-x86_64-3.8/tutel/parted
creating build/lib.linux-x86_64-3.8/tutel/examples
copying tutel/examples/moe_mnist.py -> build/lib.linux-x86_64-3.8/tutel/examples
copying tutel/examples/helloworld_from_scratch.py -> build/lib.linux-x86_64-3.8/tutel/examples
copying tutel/examples/helloworld.py -> build/lib.linux-x86_64-3.8/tutel/examples
copying tutel/examples/helloworld_amp.py -> build/lib.linux-x86_64-3.8/tutel/examples
copying tutel/examples/moe_cifar10.py -> build/lib.linux-x86_64-3.8/tutel/examples
copying tutel/examples/helloworld_ddp_tutel.py -> build/lib.linux-x86_64-3.8/tutel/examples
copying tutel/examples/helloworld_deepspeed.py -> build/lib.linux-x86_64-3.8/tutel/examples
copying tutel/examples/__init__.py -> build/lib.linux-x86_64-3.8/tutel/examples
copying tutel/examples/helloworld_ddp.py -> build/lib.linux-x86_64-3.8/tutel/examples
creating build/lib.linux-x86_64-3.8/tutel/experts
copying tutel/experts/ffn.py -> build/lib.linux-x86_64-3.8/tutel/experts
copying tutel/experts/__init__.py -> build/lib.linux-x86_64-3.8/tutel/experts
creating build/lib.linux-x86_64-3.8/tutel/checkpoint
copying tutel/checkpoint/scatter.py -> build/lib.linux-x86_64-3.8/tutel/checkpoint
copying tutel/checkpoint/__init__.py -> build/lib.linux-x86_64-3.8/tutel/checkpoint
copying tutel/checkpoint/gather.py -> build/lib.linux-x86_64-3.8/tutel/checkpoint
creating build/lib.linux-x86_64-3.8/tutel/custom
copying tutel/custom/__init__.py -> build/lib.linux-x86_64-3.8/tutel/custom
creating build/lib.linux-x86_64-3.8/tutel/launcher
copying tutel/launcher/run.py -> build/lib.linux-x86_64-3.8/tutel/launcher
copying tutel/launcher/execl.py -> build/lib.linux-x86_64-3.8/tutel/launcher
copying tutel/launcher/__init__.py -> build/lib.linux-x86_64-3.8/tutel/launcher
creating build/lib.linux-x86_64-3.8/tutel/gates
copying tutel/gates/cosine_top.py -> build/lib.linux-x86_64-3.8/tutel/gates
copying tutel/gates/top.py -> build/lib.linux-x86_64-3.8/tutel/gates
copying tutel/gates/__init__.py -> build/lib.linux-x86_64-3.8/tutel/gates
creating build/lib.linux-x86_64-3.8/tutel/impls
copying tutel/impls/fast_dispatch.py -> build/lib.linux-x86_64-3.8/tutel/impls
copying tutel/impls/jit_compiler.py -> build/lib.linux-x86_64-3.8/tutel/impls
copying tutel/impls/moe_layer.py -> build/lib.linux-x86_64-3.8/tutel/impls
copying tutel/impls/overlap.py -> build/lib.linux-x86_64-3.8/tutel/impls
copying tutel/impls/communicate.py -> build/lib.linux-x86_64-3.8/tutel/impls
copying tutel/impls/__init__.py -> build/lib.linux-x86_64-3.8/tutel/impls
copying tutel/impls/losses.py -> build/lib.linux-x86_64-3.8/tutel/impls
creating build/lib.linux-x86_64-3.8/tutel/parted/backend
copying tutel/parted/backend/__init__.py -> build/lib.linux-x86_64-3.8/tutel/parted/backend
creating build/lib.linux-x86_64-3.8/tutel/parted/backend/torch
copying tutel/parted/backend/torch/config.py -> build/lib.linux-x86_64-3.8/tutel/parted/backend/torch
copying tutel/parted/backend/torch/executor.py -> build/lib.linux-x86_64-3.8/tutel/parted/backend/torch
copying tutel/parted/backend/torch/__init__.py -> build/lib.linux-x86_64-3.8/tutel/parted/backend/torch
running build_ext
creating /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8
creating /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/tutel
creating /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/tutel/custom
Emitting ninja build file /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/tutel/custom/custom_kernel.o.d -pthread -B /home/ubuntu/anaconda3/envs/snerf/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/TH -I/home/ubuntu/.local/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-11.6/include -I/home/ubuntu/anaconda3/envs/snerf/include/python3.8 -c -c /tmp/pip-req-build-f3vo8y7s/tutel/custom/custom_kernel.cpp -o /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/tutel/custom/custom_kernel.o -Wno-sign-compare -Wno-unused-but-set-variable -Wno-terminate -Wno-unused-function -Wno-strict-aliasing -DUSE_GPU -DUSE_NCCL -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=tutel_custom_kernel -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
g++ -pthread -shared -B /home/ubuntu/anaconda3/envs/snerf/compiler_compat -L/home/ubuntu/anaconda3/envs/snerf/lib -Wl,-rpath=/home/ubuntu/anaconda3/envs/snerf/lib -Wl,--no-as-needed -Wl,--sysroot=/ /tmp/pip-req-build-f3vo8y7s/build/temp.linux-x86_64-3.8/./tutel/custom/custom_kernel.o -L/usr/local/cuda/lib64/stubs -L/home/ubuntu/.local/lib/python3.8/site-packages/torch/lib -L/usr/local/cuda-11.6/lib64 -lcuda -lnvrtc -lnccl -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda_cu -ltorch_cuda_cpp -o build/lib.linux-x86_64-3.8/tutel_custom_kernel.cpython-38-x86_64-linux-gnu.so
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/tutel
creating build/bdist.linux-x86_64/wheel/tutel/jit_kernels
creating build/bdist.linux-x86_64/wheel/tutel/parted
creating build/bdist.linux-x86_64/wheel/tutel/parted/backend
creating build/bdist.linux-x86_64/wheel/tutel/parted/backend/torch
creating build/bdist.linux-x86_64/wheel/tutel/examples
creating build/bdist.linux-x86_64/wheel/tutel/experts
creating build/bdist.linux-x86_64/wheel/tutel/checkpoint
creating build/bdist.linux-x86_64/wheel/tutel/custom
creating build/bdist.linux-x86_64/wheel/tutel/launcher
creating build/bdist.linux-x86_64/wheel/tutel/gates
creating build/bdist.linux-x86_64/wheel/tutel/impls
running install_egg_info
running egg_info
creating tutel.egg-info
writing manifest file 'tutel.egg-info/SOURCES.txt'
writing manifest file 'tutel.egg-info/SOURCES.txt'
Copying tutel.egg-info to build/bdist.linux-x86_64/wheel/tutel-0.1-py3.8.egg-info
running install_scripts
creating build/bdist.linux-x86_64/wheel/tutel-0.1.dist-info/WHEEL
creating '/tmp/pip-wheel-fsgwko7i/tutel-0.1-cp38-cp38-linux_x86_64.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'tutel_custom_kernel.cpython-38-x86_64-linux-gnu.so'
adding 'tutel/__init__.py'
adding 'tutel/jit.py'
adding 'tutel/moe.py'
adding 'tutel/net.py'
adding 'tutel/system.py'
adding 'tutel/checkpoint/__init__.py'
adding 'tutel/checkpoint/gather.py'
adding 'tutel/checkpoint/scatter.py'
adding 'tutel/custom/__init__.py'
adding 'tutel/examples/__init__.py'
adding 'tutel/examples/helloworld.py'
adding 'tutel/examples/helloworld_amp.py'
adding 'tutel/examples/helloworld_ddp.py'
adding 'tutel/examples/helloworld_ddp_tutel.py'
adding 'tutel/examples/helloworld_deepspeed.py'
adding 'tutel/examples/helloworld_from_scratch.py'
adding 'tutel/examples/moe_cifar10.py'
adding 'tutel/examples/moe_mnist.py'
adding 'tutel/experts/__init__.py'
adding 'tutel/experts/ffn.py'
adding 'tutel/gates/__init__.py'
adding 'tutel/gates/cosine_top.py'
adding 'tutel/gates/top.py'
adding 'tutel/impls/__init__.py'
adding 'tutel/impls/communicate.py'
adding 'tutel/impls/fast_dispatch.py'
adding 'tutel/impls/jit_compiler.py'
adding 'tutel/impls/losses.py'
adding 'tutel/impls/moe_layer.py'
adding 'tutel/impls/overlap.py'
adding 'tutel/jit_kernels/__init__.py'
adding 'tutel/jit_kernels/gating.py'
adding 'tutel/jit_kernels/sparse.py'
adding 'tutel/launcher/__init__.py'
adding 'tutel/launcher/execl.py'
adding 'tutel/launcher/run.py'
adding 'tutel/parted/__init__.py'
adding 'tutel/parted/patterns.py'
adding 'tutel/parted/solver.py'
adding 'tutel/parted/spmdx.py'
adding 'tutel/parted/backend/__init__.py'
adding 'tutel/parted/backend/torch/__init__.py'
adding 'tutel/parted/backend/torch/config.py'
adding 'tutel/parted/backend/torch/executor.py'
adding 'tutel-0.1.dist-info/LICENSE'
adding 'tutel-0.1.dist-info/METADATA'
adding 'tutel-0.1.dist-info/WHEEL'
adding 'tutel-0.1.dist-info/top_level.txt'
adding 'tutel-0.1.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
Building wheel for tutel (setup.py) ... done
Created wheel for tutel: filename=tutel-0.1-cp38-cp38-linux_x86_64.whl size=3818720 sha256=c9229a1d4450e51722ce8c3ee1ac1f168c52eb336f50cb8b74541f46db9908d6
Stored in directory: /tmp/pip-ephem-wheel-cache-8p0gi8d8/wheels/fd/b8/fb/efc186bf3c0931e42fd89af67fe0cfcdece6fb5b055e69ec0a
Successfully built tutel
Installing collected packages: tutel
Running command git rev-parse HEAD
1456b49
Successfully installed tutel-0.1

@ghostplant
Contributor

Thanks. What about the standard output of this:

python3 -c 'import torch; import tutel_custom_kernel'
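The one-liner above fails the same way whether the extension file is missing from the import path or is present but cannot be loaded. As a rough sketch for telling those cases apart, the standard-library `importlib.util.find_spec` reports where a module would be loaded from without actually loading it (the helper name `locate_module` is hypothetical):

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None if it is not importable."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print(locate_module("tutel_custom_kernel"))
```

A printed path means the file is on `sys.path`, so a failing import would then point at a loading problem (for example a missing shared library) rather than at the install location. `None` means the interpreter cannot see the file at all.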

@zhaojiancheng007
Author

Thanks! It seems like it doesn't have the module 'torch_custom_tutel'.

@ghostplant
Contributor

ghostplant commented Mar 30, 2023

Can you find where this file is located in your anaconda3 environment:

find /home/ubuntu/anaconda3 | grep tutel_custom_kernel

Your anaconda3 environment doesn't automatically add it to PYTHONPATH.

With a PyPI installation outside anaconda, I don't think this problem would occur; the file is usually installed at a path like:

/usr/local/lib/python3.8/dist-packages/tutel_custom_kernel.cpython-38-x86_64-linux-gnu.so
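The same search can be sketched in Python, which makes it easy to cover more than one root. The helper `find_extension_files` is hypothetical and matches on the file name only, which is slightly narrower than the `find | grep` pipeline. Searching the user site directory as well is an inference from the log (the install command used `--user`, and the build log shows torch under `/home/ubuntu/.local`), not something confirmed in the thread:

```python
import os
import site


def find_extension_files(root, stem="tutel_custom_kernel"):
    """Walk `root` and return the sorted paths of files whose name starts with `stem`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.startswith(stem):
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


if __name__ == "__main__":
    # The anaconda root comes from the thread; the user site directory is an assumption.
    for root in ["/home/ubuntu/anaconda3", site.getusersitepackages()]:
        for path in find_extension_files(root):
            print(path)
```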

@zhaojiancheng007
Author

I'm sorry, I did follow the installation procedure, but I still couldn't find the file 'tutel_custom_kernel' in dist-packages. I'm not sure which part went wrong. I use CUDA 11.6 and torch==1.10.0+cu113.
Another error always shows up: 'ImportError: libnvrtc.so.11.0: cannot open shared object file: No such file or directory'

@ghostplant
Contributor

OK, so the problem is not anaconda's site location; rather, your PyTorch fails to detect the CUDA library environment and its version.

You have several options:

  1. Find the directory that contains libnvrtc.so.11.0 and add it to LD_LIBRARY_PATH.
  2. Find libnvrtc.so.11.6 and create a symbolic link to it named libnvrtc.so.11.0.
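Both options start with locating the library files on disk. A minimal sketch, assuming the CUDA toolkit lives under `/usr/local/cuda-11.6/lib64` as the build log suggests; the helper names `find_libnvrtc` and `link_as` are hypothetical. Linking one library version under another version's name is a workaround, and nothing here checks that the two versions are actually compatible:

```python
import glob
import os


def find_libnvrtc(search_dirs):
    """Return the sorted paths of libnvrtc.so* files found directly inside the given directories."""
    found = []
    for directory in search_dirs:
        found.extend(glob.glob(os.path.join(directory, "libnvrtc.so*")))
    return sorted(set(found))


def link_as(existing, link_dir, link_name="libnvrtc.so.11.0"):
    """Create link_dir/link_name as a symlink to `existing` (option 2) and return the link path."""
    os.makedirs(link_dir, exist_ok=True)
    link_path = os.path.join(link_dir, link_name)
    if not os.path.lexists(link_path):
        os.symlink(os.path.abspath(existing), link_path)
    return link_path


if __name__ == "__main__":
    # Toolkit directory taken from the build log, plus whatever LD_LIBRARY_PATH already lists.
    dirs = ["/usr/local/cuda-11.6/lib64"]
    dirs += [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(":") if d]
    for path in find_libnvrtc(dirs):
        print(path)
```

With option 2, the directory holding the new link still has to be on LD_LIBRARY_PATH (or otherwise known to the dynamic loader) before Python starts.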

@ghostplant
Contributor

Because those shared libraries cannot be located on disk, PyTorch C++ modules can't be loaded at initialization.
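One way to check this directly, sketched here with the standard-library `ctypes` module, is to ask the dynamic loader for each library by name and record the error it gives. The helper `try_load` is hypothetical; the only library name used is the one from the error message in this thread, and others can be added to the list as needed:

```python
import ctypes


def try_load(library_names):
    """Try to load each shared library; map name -> None on success or the loader's error text."""
    results = {}
    for name in library_names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, error in try_load(["libnvrtc.so.11.0"]).items():
        print(name, "->", "ok" if error is None else error)
```

A library that loads here but still fails under PyTorch would suggest the two processes see different environments, for example a different LD_LIBRARY_PATH.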

@zhaojiancheng007
Author

Thanks for your patience. I did what you suggested, but the problem is still unsolved. Could something have gone wrong with the ninja build during installation? I have pasted the installation log below. I use CUDA 10.2 with torch 1.10.0+cu102.
Thanks a lot!

running install
running bdist_egg
running egg_info
writing manifest file 'tutel.egg-info/SOURCES.txt'
running install_lib
running build_py
running build_ext
Emitting ninja build file /home/ubuntu/zcq/tutel/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
g++ -pthread -shared -B /home/ubuntu/anaconda3/envs/snerf/compiler_compat -L/home/ubuntu/anaconda3/envs/snerf/lib -Wl,-rpath=/home/ubuntu/anaconda3/envs/snerf/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/ubuntu/zcq/tutel/build/temp.linux-x86_64-3.8/./tutel/custom/custom_kernel.o -L/usr/local/cuda/lib64/stubs -L/home/ubuntu/.local/lib/python3.8/site-packages/torch/lib -L/usr/local/cuda-11.6/lib64 -ldl -lcuda -lnvrtc -lnccl -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-3.8/tutel_custom_kernel.cpython-38-x86_64-linux-gnu.so
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/tutel
creating build/bdist.linux-x86_64/egg/tutel/jit_kernels
creating build/bdist.linux-x86_64/egg/tutel/parted
creating build/bdist.linux-x86_64/egg/tutel/parted/backend
creating build/bdist.linux-x86_64/egg/tutel/parted/backend/torch
creating build/bdist.linux-x86_64/egg/tutel/examples
creating build/bdist.linux-x86_64/egg/tutel/custom
creating build/bdist.linux-x86_64/egg/tutel/launcher
creating build/bdist.linux-x86_64/egg/tutel/impls
byte-compiling build/bdist.linux-x86_64/egg/tutel/system.py to system.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/jit_kernels/gating.py to gating.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/jit_kernels/sparse.py to sparse.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/jit_kernels/__init__.py to __init__.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/net.py to net.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/jit.py to jit.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/backend/__init__.py to __init__.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/backend/torch/config.py to config.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/backend/torch/executor.py to executor.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/backend/torch/__init__.py to __init__.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/patterns.py to patterns.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/spmdx.py to spmdx.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/__init__.py to __init__.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/parted/solver.py to solver.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld_from_scratch.py to helloworld_from_scratch.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld.py to helloworld.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld_amp.py to helloworld_amp.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld_deepspeed.py to helloworld_deepspeed.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/__init__.py to __init__.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/examples/helloworld_ddp.py to helloworld_ddp.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/custom/__init__.py to __init__.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/moe.py to moe.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/launcher/run.py to run.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/launcher/execl.py to execl.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/launcher/__init__.py to __init__.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/__init__.py to __init__.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/fast_dispatch.py to fast_dispatch.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/jit_compiler.py to jit_compiler.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/moe_layer.py to moe_layer.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/communicate.py to communicate.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel/impls/__init__.py to __init__.cpython-38.pyc
byte-compiling build/bdist.linux-x86_64/egg/tutel_custom_kernel.py to tutel_custom_kernel.cpython-38.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying tutel.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying tutel.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying tutel.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying tutel.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying tutel.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
__pycache__.tutel_custom_kernel.cpython-38: module references __file__
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
creating /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg
Extracting tutel-0.1-py3.8-linux-x86_64.egg to /home/ubuntu/.local/lib/python3.8/site-packages
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel_custom_kernel.py to tutel_custom_kernel.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/__init__.py to __init__.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/jit.py to jit.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/moe.py to moe.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/net.py to net.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/system.py to system.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/custom/__init__.py to __init__.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/__init__.py to __init__.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld.py to helloworld.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld_amp.py to helloworld_amp.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld_ddp.py to helloworld_ddp.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld_deepspeed.py to helloworld_deepspeed.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/examples/helloworld_from_scratch.py to helloworld_from_scratch.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/__init__.py to __init__.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/communicate.py to communicate.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/fast_dispatch.py to fast_dispatch.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/jit_compiler.py to jit_compiler.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/impls/moe_layer.py to moe_layer.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/jit_kernels/__init__.py to __init__.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/jit_kernels/gating.py to gating.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/jit_kernels/sparse.py to sparse.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/launcher/__init__.py to __init__.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/launcher/execl.py to execl.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/launcher/run.py to run.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/__init__.py to __init__.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/patterns.py to patterns.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/solver.py to solver.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/spmdx.py to spmdx.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/backend/__init__.py to __init__.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/backend/torch/__init__.py to __init__.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/backend/torch/config.py to config.cpython-38.pyc
byte-compiling /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg/tutel/parted/backend/torch/executor.py to executor.cpython-38.pyc
Adding tutel 0.1 to easy-install.pth file

Installed /home/ubuntu/.local/lib/python3.8/site-packages/tutel-0.1-py3.8-linux-x86_64.egg
Processing dependencies for tutel==0.1
Finished processing dependencies for tutel==0.1

@zhaojiancheng007
Author

Thanks! I reinstalled CUDA and torch, updated tutel to the latest version, and it works! Thanks for your patience; that really helped me a lot.

@zachary62

Can you share your CUDA and PyTorch versions? I have the same issue, and reinstalling doesn't help.

@ghostplant
Contributor

ghostplant commented May 25, 2023

Most often this happens because PyTorch fails to import a standard C++ extension whose install location is wrong or mixed up.

Here are several possibilities:

  1. The user account is the root cause (e.g. root vs. non-root): PyTorch was installed by a different user than the one running it.
  2. Multiple copies of the C++ extension exist at different site locations (e.g. one version in the root site directory and another in the user site directory), so PyTorch imports the wrong one.
  3. The CUDA environment is not configured correctly, so the C++ extension fails during setup or during library loading. In this case, however, you can usually see related errors in the installation log, e.g. nvcc or libcuda.so not found.
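The first two possibilities can be checked from Python by listing every copy of the extension that sits directly on the import path, together with the owner of each file. This is only a sketch; the helper `find_copies` is hypothetical and is not part of tutel or PyTorch:

```python
import glob
import os
import sys


def find_copies(stem="tutel_custom_kernel", paths=None):
    """Return (path, owner_uid) for every file named `stem*` directly inside the import paths."""
    copies = []
    seen = set()
    for entry in (sys.path if paths is None else paths):
        directory = os.path.realpath(entry or os.getcwd())
        if directory in seen or not os.path.isdir(directory):
            continue  # skip duplicates, zip archives and missing directories
        seen.add(directory)
        pattern = os.path.join(glob.escape(directory), stem + "*")
        for path in sorted(glob.glob(pattern)):
            if os.path.isfile(path):
                copies.append((path, os.stat(path).st_uid))
    return copies


if __name__ == "__main__":
    copies = find_copies()
    for path, uid in copies:
        note = "" if uid == os.getuid() else "  (owned by another user)"
        print(path + note)
    if len(copies) > 1:
        print("More than one copy is importable; the first match on sys.path wins.")
```

If more than one path is printed, or a copy is owned by another user, removing the stale copies before reinstalling is a reasonable next step to try.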

3 participants