Tests fail on GPU (on Windows) #154

Closed
FynnBe opened this issue Nov 22, 2021 · 4 comments · Fixed by #156

@FynnBe
Member

FynnBe commented Nov 22, 2021

While the following test cases pass on CPU, they fail on GPU (on Windows; the only difference between the two runs is CUDA_VISIBLE_DEVICES=""):

  1. tests\prediction_pipeline\test_prediction_pipeline.py:36 (test_prediction_pipeline_torchscript[unet2d_multi_tensor])
  2. tests\prediction_pipeline\test_prediction_pipeline.py:36 (test_prediction_pipeline_torchscript[unet2d_nuclei_broad_model])

@constantinpape I suspect this is independent of the OS and only related to CUDA + torchscript...
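
For context, hiding the GPU via the environment variable is what makes the passing run fall back to the CPU everywhere. A minimal sketch (not from the test code) of that effect:

```python
import os

# Must be set before torch initializes CUDA (i.e. before the first CUDA call).
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch

# With no visible devices torch reports no CUDA, so both the loaded weights
# and the input tensors stay on the CPU and the forward pass succeeds.
print(torch.cuda.is_available())  # False
```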

1.:

tests\prediction_pipeline\test_prediction_pipeline.py:36 (test_prediction_pipeline_torchscript[unet2d_multi_tensor])
any_torchscript_model = WindowsPath('/Users/fbeut/bioimageio_cache/packages/multi-tensorp39.zip')

    def test_prediction_pipeline_torchscript(any_torchscript_model):
>       _test_prediction_pipeline(any_torchscript_model, "pytorch_script")

test_prediction_pipeline.py:38: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
test_prediction_pipeline.py:20: in _test_prediction_pipeline
    outputs = pp.forward(*inputs)
..\..\bioimageio\core\prediction_pipeline\_prediction_pipeline.py:129: in forward
    prediction = self.predict(*preprocessed)
..\..\bioimageio\core\prediction_pipeline\_prediction_pipeline.py:124: in predict
    return self._model.forward(*input_tensors)
..\..\bioimageio\core\prediction_pipeline\_model_adapters\_model_adapter.py:59: in forward
    return self._forward(*input_tensors)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <bioimageio.core.prediction_pipeline._model_adapters._torchscript_model_adapter.TorchscriptModelAdapter object at 0x00000251FDC143A0>
batch = (<xarray.DataArray (b: 1, c: 1, y: 256, x: 256)>
array([[[[-0.34453768, -0.5941573 , -0.85205287, ..., -0.5926006 ,
  ... , ..., -0.4026966 ,
          -0.43629348, -0.40746877]]]], dtype=float32)
Dimensions without coordinates: b, c, y, x)
torch_tensor = [tensor([[[[-0.3445, -0.5942, -0.8521,  ..., -0.5926, -0.7318, -0.7813],
          [-1.1676, -0.3543, -0.4170,  ..., -... -0.5141,  ..., -0.4316, -0.3849, -0.4363],
          [-0.5178, -0.5888, -0.5254,  ..., -0.4027, -0.4363, -0.4075]]]])]

    def _forward(self, *batch: xr.DataArray) -> List[xr.DataArray]:
        with torch.no_grad():
            torch_tensor = [torch.from_numpy(b.data) for b in batch]
>           result = self._model.forward(*torch_tensor)
E           RuntimeError: The following operation failed in the TorchScript interpreter.
E           Traceback of TorchScript, serialized code (most recent call last):
E             File "code/__torch__/multi_tensor_unet.py", line 17, in forward
E               _3 = self.encoder
E               input = torch.cat([argument_1, argument_2], 1)
E               _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, _16, _17, _18, _19, _20, _21, _22, _23, _24, _25, _26, _27, _28, _29, _30, _31, = (_3).forward(input, )
E                                                                                                                                                        ~~~~~~~~~~~ <--- HERE
E               _32 = (_1).forward((_2).forward(_4, ), _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, _16, _17, _18, _19, _20, _21, _22, _23, _24, _25, _26, _27, _28, _29, _30, _31, )
E               _33 = (_0).forward(_32, )
E             File "code/__torch__/multi_tensor_unet.py", line 37, in forward
E               _40 = getattr(self.blocks, "1")
E               _41 = getattr(self.poolers, "0")
E               _42 = (getattr(self.blocks, "0")).forward(input, )
E                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
E               _43 = (_40).forward((_41).forward(_42, ), )
E               _44 = (_38).forward((_39).forward(_43, ), )
E             File "code/__torch__/multi_tensor_unet.py", line 49, in forward
E             def forward(self: __torch__.multi_tensor_unet.ConvBlock2d,
E               input: Tensor) -> Tensor:
E               return (self.block).forward(input, )
E                       ~~~~~~~~~~~~~~~~~~~ <--- HERE
E           class Decoder(Module):
E             __parameters__ = []
E             File "code/__torch__/torch/nn/modules/container.py", line 21, in forward
E               _1 = getattr(self, "2")
E               _2 = getattr(self, "1")
E               _3 = (getattr(self, "0")).forward(input, )
E                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
E               _4 = (_0).forward((_1).forward((_2).forward(_3, ), ), )
E               return _4
E             File "code/__torch__/torch/nn/modules/conv.py", line 10, in forward
E               input: Tensor) -> Tensor:
E               _0 = self.bias
E               input0 = torch._convolution(input, self.weight, _0, [1, 1], [1, 1], [1, 1], False, [0, 0], 1, False, False, True, True)
E                        ~~~~~~~~~~~~~~~~~~ <--- HERE
E               return input0
E           
E           Traceback of TorchScript, original code (most recent call last):
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/conv.py(419): _conv_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/conv.py(423): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/container.py(117): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /g/kreshuk/pape/Work/my_projects/torch-em/scripts/bioimageio-examples/multi-tensor/multi_tensor_unet.py(268): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /g/kreshuk/pape/Work/my_projects/torch-em/scripts/bioimageio-examples/multi-tensor/multi_tensor_unet.py(149): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /g/kreshuk/pape/Work/my_projects/torch-em/scripts/bioimageio-examples/multi-tensor/multi_tensor_unet.py(70): _apply_default
E           /g/kreshuk/pape/Work/my_projects/torch-em/scripts/bioimageio-examples/multi-tensor/multi_tensor_unet.py(88): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/jit/_trace.py(934): trace_module
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
E           /g/kreshuk/pape/Work/bioimageio/python-bioimage-io/bioimageio/core/weight_converter/torch/torchscript.py(92): convert_weights_to_pytorch_script
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch_em/util/modelzoo.py(928): _convert_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch_em/util/modelzoo.py(947): convert_to_pytorch_script
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch_em/util/modelzoo.py(966): add_weight_formats
E           create_example.py(123): export_model
E           create_example.py(159): <module>
E           RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

..\..\bioimageio\core\prediction_pipeline\_model_adapters\_torchscript_model_adapter.py:30: RuntimeError
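
The failing line and the error message point to a device mismatch: the traced torchscript weights live on CUDA, while `torch.from_numpy` always produces CPU tensors. A minimal sketch of one way an adapter could reconcile the two (a hypothetical helper, not necessarily the fix that landed in #156):

```python
from typing import List, Sequence

import torch
import xarray as xr


def to_model_device(model: torch.jit.ScriptModule, batch: Sequence[xr.DataArray]) -> List[torch.Tensor]:
    """Move the input tensors onto the device of the loaded torchscript weights.

    Assumes the scripted module exposes at least one parameter to read the
    device from; falls back to the CPU otherwise.
    """
    try:
        device = next(model.parameters()).device
    except StopIteration:
        device = torch.device("cpu")
    return [torch.from_numpy(b.data).to(device) for b in batch]
```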

2.:

tests\prediction_pipeline\test_prediction_pipeline.py:36 (test_prediction_pipeline_torchscript[unet2d_nuclei_broad_model])
any_torchscript_model = WindowsPath('/Users/fbeut/bioimageio_cache/packages/UNet_2D_Nuclei_Broad_0_1_3p39.zip')

    def test_prediction_pipeline_torchscript(any_torchscript_model):
>       _test_prediction_pipeline(any_torchscript_model, "pytorch_script")

test_prediction_pipeline.py:38: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
test_prediction_pipeline.py:20: in _test_prediction_pipeline
    outputs = pp.forward(*inputs)
..\..\bioimageio\core\prediction_pipeline\_prediction_pipeline.py:129: in forward
    prediction = self.predict(*preprocessed)
..\..\bioimageio\core\prediction_pipeline\_prediction_pipeline.py:124: in predict
    return self._model.forward(*input_tensors)
..\..\bioimageio\core\prediction_pipeline\_model_adapters\_model_adapter.py:59: in forward
    return self._forward(*input_tensors)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <bioimageio.core.prediction_pipeline._model_adapters._torchscript_model_adapter.TorchscriptModelAdapter object at 0x000002519C174880>
batch = (<xarray.DataArray (b: 1, c: 1, y: 512, x: 512)>
array([[[[-0.531971  , -0.5176083 , -0.531971  , ..., -0.58942187,
  ..., ..., -0.48888284,
          -0.5128207 , -0.5176083 ]]]], dtype=float32)
Dimensions without coordinates: b, c, y, x,)
torch_tensor = [tensor([[[[-0.5320, -0.5176, -0.5320,  ..., -0.5894, -0.5894, -0.5942],
          [-0.5080, -0.5655, -0.5415,  ..., -...  1.2203,  ..., -0.5415, -0.5272, -0.5224],
          [ 1.0814,  0.8756,  0.7463,  ..., -0.4889, -0.5128, -0.5176]]]])]

    def _forward(self, *batch: xr.DataArray) -> List[xr.DataArray]:
        with torch.no_grad():
            torch_tensor = [torch.from_numpy(b.data) for b in batch]
>           result = self._model.forward(*torch_tensor)
E           RuntimeError: The following operation failed in the TorchScript interpreter.
E           Traceback of TorchScript, serialized code (most recent call last):
E             File "code/__torch__/module_from_source/unet2d.py", line 24, in forward
E               _9 = getattr(self.encoders, "1")
E               _10 = getattr(self.downsamplers, "0")
E               _11 = (getattr(self.encoders, "0")).forward(input, )
E                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
E               _12 = (_9).forward((_10).forward(_11, ), )
E               _13 = (_8).forward((_10).forward1(_12, ), )
E             File "code/__torch__/torch/nn/modules/container.py", line 19, in forward
E               _0 = getattr(self, "2")
E               _1 = getattr(self, "1")
E               _2 = (getattr(self, "0")).forward(input, )
E                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
E               return (_0).forward((_1).forward(_2, ), )
E             File "code/__torch__/torch/nn/modules/conv.py", line 10, in forward
E               input: Tensor) -> Tensor:
E               _0 = self.bias
E               input0 = torch._convolution(input, self.weight, _0, [1, 1], [1, 1], [1, 1], False, [0, 0], 1, False, False, True, True)
E                        ~~~~~~~~~~~~~~~~~~ <--- HERE
E               return input0
E           
E           Traceback of TorchScript, original code (most recent call last):
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/conv.py(419): _conv_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/conv.py(423): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/container.py(117): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /g/kreshuk/pape/Work/bioimageio/spec-bioimage-io/example_specs/models/unet2d_nuclei_broad/unet2d.py(55): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/jit/_trace.py(934): trace_module
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
E           /g/kreshuk/pape/Work/bioimageio/python-bioimage-io/bioimageio/core/weight_converter/torch/torchscript.py(96): convert_weights_to_pytorch_script
E           /g/kreshuk/pape/Work/bioimageio/python-bioimage-io/bioimageio/core/weight_converter/torch/torchscript.py(115): main
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/bin/bioimageio-convert_torch_to_torchscript(33): <module>
E           RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

..\..\bioimageio\core\prediction_pipeline\_model_adapters\_torchscript_model_adapter.py:30: RuntimeError

@constantinpape
Contributor

I am a bit confused: how do you even run tests on the GPU?

@constantinpape
Contributor

Ok, I see we have tests for this in #155 now.

@FynnBe
Member Author

FynnBe commented Nov 23, 2021

We have the test_prediction_pipeline test already. I ran it locally on GPU and stumbled into this...
The new tests in #155 are the same tests, slightly adapted for multiple inference.

@constantinpape
Contributor

> We have the test_prediction_pipeline test already.

Yes, but it tests without a GPU?! Or does it use a GPU if available?
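
For reference, a common pattern for letting a test use the GPU only when one is visible (a sketch with a hypothetical test name, not how the existing test_prediction_pipeline is written):

```python
import pytest
import torch

# Run the CPU variant always; skip the CUDA variant when no device is
# visible (e.g. when CUDA_VISIBLE_DEVICES="" is set).
@pytest.mark.parametrize(
    "device",
    [
        "cpu",
        pytest.param(
            "cuda",
            marks=pytest.mark.skipif(not torch.cuda.is_available(), reason="no GPU available"),
        ),
    ],
)
def test_forward_pass(device):
    x = torch.zeros(1, 1, 8, 8, device=device)
    assert x.device.type == device
```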
