mmcv-full not compiled when building inside docker #1154

lingcong-k · 2021-06-28T19:13:49Z

Checklist

I know this error has been brought up several times:

open-mmlab/mmdetection#2686
open-mmlab/mmdetection#4075

But I've checked all the proposed solutions, and none of them worked for me.

I am building mmcv in Docker.
I am using this PyTorch image: FROM nvcr.io/nvidia/pytorch:20.11-py3 (which has PyTorch 1.8.0, CUDA 11.1.0)

I tried this:

FROM nvcr.io/nvidia/pytorch:20.11-py3
........(other commands omitted as irrelevant)...........

RUN git clone https://github.com/open-mmlab/mmcv.git && \
    cd mmcv && \
    MMCV_WITH_OPS=1 pip install -e .

and this:

FROM nvcr.io/nvidia/pytorch:20.11-py3
RUN pip install mmcv-full==1.3.8 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.8.0/index.html

and many more versions; none of them worked.
According to the mmcv installation guide, mmcv-full 1.3.8 should be compatible with PyTorch 1.8.0 and CUDA 11.1.0, shouldn't it?

I have run out of ideas and have been stuck on this for a few days. Can someone please help me out? Thanks.


zhouzaida · 2021-06-29T09:24:48Z

Hi @lingcong-k, could you run echo $CUDA_HOME and post the output?

lingcong-k · 2021-06-29T09:30:33Z

echo $CUDA_HOME

Hi, thanks for the reply.
Do you mean inside the Docker image when I run it? I am running it in the cloud.

On my own machine, which I use to build the Docker image, echo $CUDA_HOME returns an empty string.

lingcong-k · 2021-06-29T09:33:24Z

But nvidia-smi reports CUDA version 11.2.

lingcong-k · 2021-06-29T09:42:46Z

I have multiple CUDA versions installed on the PC where I build the Docker image. Do you mean that I need to point CUDA_HOME at CUDA 11.1 before I build the image, and then it will be all right? @zhouzaida

I assume it is the base image "nvcr.io/nvidia/pytorch:20.11-py3" that defines the CUDA version inside the container, though.

zhouzaida · 2021-07-10T09:23:24Z

Launch the image with docker run -it --runtime=nvidia nvcr.io/nvidia/pytorch:20.11-py3 and run the following commands:

git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
MMCV_WITH_OPS=1 pip install -e .
pytest tests/test_ops/test_nms.py

zhouzaida · 2021-07-12T02:45:19Z

You could try the command docker run -it --runtime=nvidia nvcr.io/nvidia/pytorch:20.11-py3 to launch your image.

lingcong-k · 2021-07-12T14:18:12Z

you could try the command docker run -it --runtime=nvidia nvcr.io/nvidia/pytorch:20.11-py3 to launch your image

@zhouzaida actually, you provided a really good debugging approach with the nms test.

However, I noticed something really weird.

If I build from this Dockerfile:

FROM nvcr.io/nvidia/pytorch:21.02-py3

RUN apt-get update
# libgl1-mesa-glx fixes the OpenCV import error when running test_nms.py
RUN apt install -y libgl1-mesa-glx

RUN git clone https://github.com/open-mmlab/mmcv.git && \
    cd mmcv && \
    MMCV_WITH_OPS=1 pip install -e .

and then launch the image and run

pytest tests/test_ops/test_nms.py

it fails with: RuntimeError: nms is not compiled with GPU support

But if I then go inside the running container and manually run:

cd mmcv
MMCV_WITH_OPS=1 pip install -e .

it uninstalls the copy that was installed during the image build and installs it again. After that there is no error and the nms test passes.

So there seems to be a bug, or something extra needed, when installing mmcv-full during a Docker build (no error is thrown during the build, though). Installing manually always works, whether inside the container or on my local machine.

In my case, though, I need the install to succeed during the Docker build, because my training pipeline launches and autoscales training automatically.
What do you think? :) Thanks in advance.

The log below shows that the install done during the Docker build failed the test, while the manual install inside the container worked:

root@bc530dbd64e2:/workspace# cd mmcv
root@bc530dbd64e2:/workspace/mmcv# pytest tests/test_ops/test_nms.py
=============================================================================== test session starts ================================================================================
platform linux -- Python 3.8.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /workspace/mmcv
plugins: cov-2.11.1, pythonpath-0.7.3, hypothesis-4.50.8
collected 4 items                                                                                                                                                                  

tests/test_ops/test_nms.py F...                                                                                                                                              [100%]

===================================================================================== FAILURES =====================================================================================
____________________________________________________________________________ Testnms.test_nms_allclose _____________________________________________________________________________

self = <test_nms.Testnms object at 0x7ff0ae72dd00>

    def test_nms_allclose(self):
        if not torch.cuda.is_available():
            return
        from mmcv.ops import nms
        np_boxes = np.array([[6.0, 3.0, 8.0, 7.0], [3.0, 6.0, 9.0, 11.0],
                             [3.0, 7.0, 10.0, 12.0], [1.0, 4.0, 13.0, 7.0]],
                            dtype=np.float32)
        np_scores = np.array([0.6, 0.9, 0.7, 0.2], dtype=np.float32)
        np_inds = np.array([1, 0, 3])
        np_dets = np.array([[3.0, 6.0, 9.0, 11.0, 0.9],
                            [6.0, 3.0, 8.0, 7.0, 0.6],
                            [1.0, 4.0, 13.0, 7.0, 0.2]])
        boxes = torch.from_numpy(np_boxes)
        scores = torch.from_numpy(np_scores)
        dets, inds = nms(boxes, scores, iou_threshold=0.3, offset=0)
        assert np.allclose(dets, np_dets)  # test cpu
        assert np.allclose(inds, np_inds)  # test cpu
>       dets, inds = nms(
            boxes.cuda(), scores.cuda(), iou_threshold=0.3, offset=0)

tests/test_ops/test_nms.py:25: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
mmcv/utils/misc.py:330: in new_func
    output = old_func(*args, **kwargs)
mmcv/ops/nms.py:171: in nms
    inds = NMSop.apply(boxes, scores, iou_threshold, offset,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ctx = <torch.autograd.function.NMSopBackward object at 0x7ff08d4e1d60>
bboxes = tensor([[ 6.,  3.,  8.,  7.],
        [ 3.,  6.,  9., 11.],
        [ 3.,  7., 10., 12.],
        [ 1.,  4., 13.,  7.]], device='cuda:0')
scores = tensor([0.6000, 0.9000, 0.7000, 0.2000], device='cuda:0'), iou_threshold = 0.3, offset = 0, score_threshold = 0, max_num = -1

    @staticmethod
    def forward(ctx, bboxes, scores, iou_threshold, offset, score_threshold,
                max_num):
        is_filtering_by_score = score_threshold > 0
        if is_filtering_by_score:
            valid_mask = scores > score_threshold
            bboxes, scores = bboxes[valid_mask], scores[valid_mask]
            valid_inds = torch.nonzero(
                valid_mask, as_tuple=False).squeeze(dim=1)
    
>       inds = ext_module.nms(
            bboxes, scores, iou_threshold=float(iou_threshold), offset=offset)
E       RuntimeError: nms is not compiled with GPU support

mmcv/ops/nms.py:26: RuntimeError
================================================================================= warnings summary =================================================================================
tests/test_ops/test_nms.py::Testnms::test_nms_allclose
  /opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py:3: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp

tests/test_ops/test_nms.py::Testnms::test_nms_allclose
  /workspace/mmcv/mmcv/ops/fused_bias_leakyrelu.py:191: DeprecationWarning: invalid escape sequence \s
    """Fused bias leaky ReLU.

tests/test_ops/test_nms.py::Testnms::test_nms_allclose
  /workspace/mmcv/mmcv/ops/fused_bias_leakyrelu.py:226: DeprecationWarning: invalid escape sequence \s
    """Fused bias leaky ReLU function.

-- Docs: https://docs.pytest.org/en/stable/warnings.html
============================================================================= short test summary info ==============================================================================
FAILED tests/test_ops/test_nms.py::Testnms::test_nms_allclose - RuntimeError: nms is not compiled with GPU support
===================================================================== 1 failed, 3 passed, 3 warnings in 2.95s ======================================================================
root@bc530dbd64e2:/workspace/mmcv# MMCV_WITH_OPS=1 pip install -e .
Obtaining file:///workspace/mmcv
Requirement already satisfied: addict in /opt/conda/lib/python3.8/site-packages (from mmcv-full==1.3.9) (2.4.0)
Requirement already satisfied: numpy in /opt/conda/lib/python3.8/site-packages (from mmcv-full==1.3.9) (1.19.2)
Requirement already satisfied: Pillow in /opt/conda/lib/python3.8/site-packages (from mmcv-full==1.3.9) (8.3.1)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.8/site-packages (from mmcv-full==1.3.9) (5.4.1)
Requirement already satisfied: yapf in /opt/conda/lib/python3.8/site-packages (from mmcv-full==1.3.9) (0.31.0)
Installing collected packages: mmcv-full
  Attempting uninstall: mmcv-full
    Found existing installation: mmcv-full 1.3.9
    Uninstalling mmcv-full-1.3.9:
      Successfully uninstalled mmcv-full-1.3.9
  Running setup.py develop for mmcv-full
Successfully installed mmcv-full
root@bc530dbd64e2:/workspace/mmcv# pytest tests/test_ops/test_nms.py
=============================================================================== test session starts ================================================================================
platform linux -- Python 3.8.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /workspace/mmcv
plugins: cov-2.11.1, pythonpath-0.7.3, hypothesis-4.50.8
collected 4 items                                                                                                                                                                  

tests/test_ops/test_nms.py ....                                                                                                                                              [100%]

================================================================================= warnings summary =================================================================================
tests/test_ops/test_nms.py::Testnms::test_nms_allclose
  /opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py:3: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp

-- Docs: https://docs.pytest.org/en/stable/warnings.html
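
The build in this report finished without errors even though the ops were compiled without CUDA, so the problem only surfaced at test time. A minimal sketch of a check that could run as a final build step to fail early is shown below. It assumes that mmcv-full exposes `get_compiling_cuda_version` in `mmcv.ops` and that it returns the string "not available" for a CPU-only build; both are assumptions to verify against the installed mmcv version. The names `built_with_cuda` and `main` are hypothetical helpers, not part of mmcv.

```python
import sys


def built_with_cuda(compiling_cuda_version: str) -> bool:
    """Interpret the version string reported by the compiled ops.

    Assumption: a CPU-only build reports "not available" (or an empty string).
    """
    value = compiling_cuda_version.strip().lower()
    return value not in ("", "not available")


def main() -> int:
    """Return 0 when the installed mmcv ops were compiled with CUDA, else 1."""
    # Imported lazily so the helper above can be used without mmcv installed.
    from mmcv.ops import get_compiling_cuda_version  # assumed mmcv-full API

    version = get_compiling_cuda_version()
    if built_with_cuda(version):
        print(f"mmcv ops compiled with CUDA {version}")
        return 0
    print("mmcv ops were compiled WITHOUT CUDA support", file=sys.stderr)
    return 1
```

To use it from a Dockerfile, the script would end with `sys.exit(main())` and be invoked in a RUN step after the mmcv install, so that a CPU-only build stops the image build instead of failing later at training time.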

zhouzaida · 2021-07-13T03:16:41Z

Please provide the command you use to build the image.

The command should be docker build --runtime=nvidia

lingcong-k · 2021-07-13T07:34:19Z

--runtime=nvidia
@zhouzaida

I build with "DOCKER_BUILDKIT=1 docker build **********".

So is --runtime=nvidia a must?

I tried adding this flag, but it says unknown flag --runtime

zhouzaida · 2021-07-13T07:47:49Z

--runtime=nvidia
@zhouzaida

I build with "DOCKER_BUILDKIT=1 docker build **********".

So is --runtime=nvidia a must?

Yes, maybe you could give it a try. I think it will work.

lingcong-k · 2021-07-13T21:40:48Z

@zhouzaida
How can you run
docker build --runtime=nvidia, though?
I can only use the runtime flag with docker run, not docker build.
docker build --runtime=nvidia throws an "unknown flag" error for runtime.

My default runtime setting in the Docker config is already nvidia:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}

thanks

zhouzaida · 2021-07-14T06:21:31Z

Refer to https://github.com/NVIDIA/nvidia-docker/wiki/Advanced-topics#default-runtime; it may be helpful.

lingcong-k · 2021-07-15T08:45:45Z

@zhouzaida Thanks, I found the issue.

If anybody else is facing the same issue, check two things:

1. Is the default runtime set to nvidia (in /etc/docker/daemon.json)?

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}

2. If you are building the image with DOCKER_BUILDKIT, it prevents access to the nvidia runtime during the build (see moby/buildkit#1800, "No nvidia GPU access during build"), so don't use it.
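
The daemon.json check described above can be scripted. The following is a minimal sketch, assuming the config lives at /etc/docker/daemon.json as stated in the comment; the function names `default_runtime` and `check_daemon_config` are hypothetical and chosen only for illustration.

```python
import json
from pathlib import Path


def default_runtime(daemon_json_text: str):
    """Return the value of "default-runtime", or None if the key is absent."""
    config = json.loads(daemon_json_text)
    return config.get("default-runtime")


def check_daemon_config(path: str = "/etc/docker/daemon.json") -> bool:
    """True when the Docker daemon config at `path` makes nvidia the default runtime."""
    config_file = Path(path)
    if not config_file.is_file():
        # No config file means Docker falls back to its built-in default runtime.
        return False
    return default_runtime(config_file.read_text()) == "nvidia"
```

Calling `check_daemon_config()` on the build machine reports whether the first condition holds. It says nothing about BuildKit, which has to be checked separately in the build command.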

ganjbakhshali · 2021-10-30T05:28:56Z

In Docker, these commands worked for me:

RUN git clone https://github.com/open-mmlab/mmcv.git
WORKDIR mmcv
RUN MMCV_WITH_OPS=1 pip install -e .

BrianPugh · 2021-11-10T01:31:51Z

FWIW, I was able to resolve this (while still using BuildKit) by adding the following to my Dockerfile (before installing mmcv):

ARG TORCH_CUDA_ARCH_LIST="7.5;6.1"
ENV FORCE_CUDA="1"

You can specify whatever compute capabilities you want, based on the hardware you are going to run on:
https://developer.nvidia.com/cuda-gpus
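
As a rough illustration of how such a value can be assembled, the sketch below formats a list of (major, minor) compute capabilities into a semicolon-separated string like the one used above. The helper name `arch_list` and its `ptx_for_newest` parameter are hypothetical. On a machine that has the target GPU, `torch.cuda.get_device_capability()` returns a tuple of this shape, which could serve as input.

```python
def arch_list(capabilities, ptx_for_newest=False):
    """Format (major, minor) compute capabilities as a TORCH_CUDA_ARCH_LIST value.

    Duplicates are dropped and entries are sorted from oldest to newest.
    With ptx_for_newest=True, "+PTX" is appended to the newest entry.
    """
    unique = sorted(set(capabilities))
    entries = [f"{major}.{minor}" for major, minor in unique]
    if ptx_for_newest and entries:
        entries[-1] += "+PTX"
    return ";".join(entries)


# Capabilities taken from the comment above (7.5 and 6.1).
print(arch_list([(7, 5), (6, 1)]))
```

The resulting string is what would go into the ARG TORCH_CUDA_ARCH_LIST line of the Dockerfile.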

linzy5 · 2024-06-03T03:40:04Z

I encountered the same problem. After some searching and experimenting, I finally solved it by referring to the official Dockerfile: https://github.com/open-mmlab/mmcv/blob/main/docker/dev/Dockerfile

You can add these lines to your Dockerfile:

ENV TORCH_CUDA_ARCH_LIST=7.5+PTX
ENV FORCE_CUDA="1"
RUN cd /home/docker/ && \
    wget https://github.com/open-mmlab/mmcv/archive/refs/tags/v1.7.2.tar.gz && \
    tar -xzf v1.7.2.tar.gz && \
    rm -rf v1.7.2.tar.gz && \
    cd mmcv-1.7.2 && \
    MMCV_WITH_OPS=1 pip install --no-cache-dir -e .[all] -v

jeanchristopheruel · 2024-08-19T12:26:53Z

Hey, just a quick update: if you want to compile for the latest architectures using docker build, add this to your Dockerfile:

ARG TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0 7.5 8.0 8.6+PTX"
ENV FORCE_CUDA="1"

See all the latest arch here.

zhouzaida mentioned this issue Jul 12, 2021

mmcv installation environment issue #1142

Closed

lingcong-k closed this as completed Jul 13, 2021

lingcong-k reopened this Jul 13, 2021

lingcong-k closed this as completed Jul 15, 2021

lingcong-k changed the title ~~RuntimeError: nms is not compiled with GPU support.. plz help~~ mmcv-full not compiled when building inside docker Jul 15, 2021
