Tests fail on GPU (on Windows) #154

Closed
FynnBe opened this issue Nov 22, 2021 · 4 comments · Fixed by #156

@FynnBe
Member

FynnBe commented Nov 22, 2021

While the following test cases pass on CPU, they fail on GPU (on Windows; the only difference between the two runs is CUDA_VISIBLE_DEVICES=""):

  1. tests\prediction_pipeline\test_prediction_pipeline.py:36 (test_prediction_pipeline_torchscript[unet2d_multi_tensor])
  2. tests\prediction_pipeline\test_prediction_pipeline.py:36 (test_prediction_pipeline_torchscript[unet2d_nuclei_broad_model])

@constantinpape I suspect this is independent of the OS and only related to CUDA + torchscript...
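
For context, hiding the GPU via the environment variable is what makes the passing run fall back to the CPU everywhere. A minimal sketch (not from the test code) of that effect:

```python
import os

# Must be set before torch initializes CUDA (i.e. before the first CUDA call).
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch

# With no visible devices torch reports no CUDA, so both the loaded weights
# and the input tensors stay on the CPU and the forward pass succeeds.
print(torch.cuda.is_available())  # False
```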

1.:

tests\prediction_pipeline\test_prediction_pipeline.py:36 (test_prediction_pipeline_torchscript[unet2d_multi_tensor])
any_torchscript_model = WindowsPath('/Users/fbeut/bioimageio_cache/packages/multi-tensorp39.zip')

    def test_prediction_pipeline_torchscript(any_torchscript_model):
>       _test_prediction_pipeline(any_torchscript_model, "pytorch_script")

test_prediction_pipeline.py:38: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
test_prediction_pipeline.py:20: in _test_prediction_pipeline
    outputs = pp.forward(*inputs)
..\..\bioimageio\core\prediction_pipeline\_prediction_pipeline.py:129: in forward
    prediction = self.predict(*preprocessed)
..\..\bioimageio\core\prediction_pipeline\_prediction_pipeline.py:124: in predict
    return self._model.forward(*input_tensors)
..\..\bioimageio\core\prediction_pipeline\_model_adapters\_model_adapter.py:59: in forward
    return self._forward(*input_tensors)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <bioimageio.core.prediction_pipeline._model_adapters._torchscript_model_adapter.TorchscriptModelAdapter object at 0x00000251FDC143A0>
batch = (<xarray.DataArray (b: 1, c: 1, y: 256, x: 256)>
array([[[[-0.34453768, -0.5941573 , -0.85205287, ..., -0.5926006 ,
  ... , ..., -0.4026966 ,
          -0.43629348, -0.40746877]]]], dtype=float32)
Dimensions without coordinates: b, c, y, x)
torch_tensor = [tensor([[[[-0.3445, -0.5942, -0.8521,  ..., -0.5926, -0.7318, -0.7813],
          [-1.1676, -0.3543, -0.4170,  ..., -... -0.5141,  ..., -0.4316, -0.3849, -0.4363],
          [-0.5178, -0.5888, -0.5254,  ..., -0.4027, -0.4363, -0.4075]]]])]

    def _forward(self, *batch: xr.DataArray) -> List[xr.DataArray]:
        with torch.no_grad():
            torch_tensor = [torch.from_numpy(b.data) for b in batch]
>           result = self._model.forward(*torch_tensor)
E           RuntimeError: The following operation failed in the TorchScript interpreter.
E           Traceback of TorchScript, serialized code (most recent call last):
E             File "code/__torch__/multi_tensor_unet.py", line 17, in forward
E               _3 = self.encoder
E               input = torch.cat([argument_1, argument_2], 1)
E               _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, _16, _17, _18, _19, _20, _21, _22, _23, _24, _25, _26, _27, _28, _29, _30, _31, = (_3).forward(input, )
E                                                                                                                                                        ~~~~~~~~~~~ <--- HERE
E               _32 = (_1).forward((_2).forward(_4, ), _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, _16, _17, _18, _19, _20, _21, _22, _23, _24, _25, _26, _27, _28, _29, _30, _31, )
E               _33 = (_0).forward(_32, )
E             File "code/__torch__/multi_tensor_unet.py", line 37, in forward
E               _40 = getattr(self.blocks, "1")
E               _41 = getattr(self.poolers, "0")
E               _42 = (getattr(self.blocks, "0")).forward(input, )
E                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
E               _43 = (_40).forward((_41).forward(_42, ), )
E               _44 = (_38).forward((_39).forward(_43, ), )
E             File "code/__torch__/multi_tensor_unet.py", line 49, in forward
E             def forward(self: __torch__.multi_tensor_unet.ConvBlock2d,
E               input: Tensor) -> Tensor:
E               return (self.block).forward(input, )
E                       ~~~~~~~~~~~~~~~~~~~ <--- HERE
E           class Decoder(Module):
E             __parameters__ = []
E             File "code/__torch__/torch/nn/modules/container.py", line 21, in forward
E               _1 = getattr(self, "2")
E               _2 = getattr(self, "1")
E               _3 = (getattr(self, "0")).forward(input, )
E                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
E               _4 = (_0).forward((_1).forward((_2).forward(_3, ), ), )
E               return _4
E             File "code/__torch__/torch/nn/modules/conv.py", line 10, in forward
E               input: Tensor) -> Tensor:
E               _0 = self.bias
E               input0 = torch._convolution(input, self.weight, _0, [1, 1], [1, 1], [1, 1], False, [0, 0], 1, False, False, True, True)
E                        ~~~~~~~~~~~~~~~~~~ <--- HERE
E               return input0
E           
E           Traceback of TorchScript, original code (most recent call last):
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/conv.py(419): _conv_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/conv.py(423): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/container.py(117): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /g/kreshuk/pape/Work/my_projects/torch-em/scripts/bioimageio-examples/multi-tensor/multi_tensor_unet.py(268): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /g/kreshuk/pape/Work/my_projects/torch-em/scripts/bioimageio-examples/multi-tensor/multi_tensor_unet.py(149): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /g/kreshuk/pape/Work/my_projects/torch-em/scripts/bioimageio-examples/multi-tensor/multi_tensor_unet.py(70): _apply_default
E           /g/kreshuk/pape/Work/my_projects/torch-em/scripts/bioimageio-examples/multi-tensor/multi_tensor_unet.py(88): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/jit/_trace.py(934): trace_module
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
E           /g/kreshuk/pape/Work/bioimageio/python-bioimage-io/bioimageio/core/weight_converter/torch/torchscript.py(92): convert_weights_to_pytorch_script
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch_em/util/modelzoo.py(928): _convert_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch_em/util/modelzoo.py(947): convert_to_pytorch_script
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch_em/util/modelzoo.py(966): add_weight_formats
E           create_example.py(123): export_model
E           create_example.py(159): <module>
E           RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

..\..\bioimageio\core\prediction_pipeline\_model_adapters\_torchscript_model_adapter.py:30: RuntimeError
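
The failing line and the error message point to a device mismatch: the traced torchscript weights live on CUDA, while `torch.from_numpy` always produces CPU tensors. A minimal sketch of one way an adapter could reconcile the two (a hypothetical helper, not necessarily the fix that landed in #156):

```python
from typing import List, Sequence

import torch
import xarray as xr


def to_model_device(model: torch.jit.ScriptModule, batch: Sequence[xr.DataArray]) -> List[torch.Tensor]:
    """Move the input tensors onto the device of the loaded torchscript weights.

    Assumes the scripted module exposes at least one parameter to read the
    device from; falls back to the CPU otherwise.
    """
    try:
        device = next(model.parameters()).device
    except StopIteration:
        device = torch.device("cpu")
    return [torch.from_numpy(b.data).to(device) for b in batch]
```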

2.:

tests\prediction_pipeline\test_prediction_pipeline.py:36 (test_prediction_pipeline_torchscript[unet2d_nuclei_broad_model])
any_torchscript_model = WindowsPath('/Users/fbeut/bioimageio_cache/packages/UNet_2D_Nuclei_Broad_0_1_3p39.zip')

    def test_prediction_pipeline_torchscript(any_torchscript_model):
>       _test_prediction_pipeline(any_torchscript_model, "pytorch_script")

test_prediction_pipeline.py:38: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
test_prediction_pipeline.py:20: in _test_prediction_pipeline
    outputs = pp.forward(*inputs)
..\..\bioimageio\core\prediction_pipeline\_prediction_pipeline.py:129: in forward
    prediction = self.predict(*preprocessed)
..\..\bioimageio\core\prediction_pipeline\_prediction_pipeline.py:124: in predict
    return self._model.forward(*input_tensors)
..\..\bioimageio\core\prediction_pipeline\_model_adapters\_model_adapter.py:59: in forward
    return self._forward(*input_tensors)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <bioimageio.core.prediction_pipeline._model_adapters._torchscript_model_adapter.TorchscriptModelAdapter object at 0x000002519C174880>
batch = (<xarray.DataArray (b: 1, c: 1, y: 512, x: 512)>
array([[[[-0.531971  , -0.5176083 , -0.531971  , ..., -0.58942187,
  ..., ..., -0.48888284,
          -0.5128207 , -0.5176083 ]]]], dtype=float32)
Dimensions without coordinates: b, c, y, x,)
torch_tensor = [tensor([[[[-0.5320, -0.5176, -0.5320,  ..., -0.5894, -0.5894, -0.5942],
          [-0.5080, -0.5655, -0.5415,  ..., -...  1.2203,  ..., -0.5415, -0.5272, -0.5224],
          [ 1.0814,  0.8756,  0.7463,  ..., -0.4889, -0.5128, -0.5176]]]])]

    def _forward(self, *batch: xr.DataArray) -> List[xr.DataArray]:
        with torch.no_grad():
            torch_tensor = [torch.from_numpy(b.data) for b in batch]
>           result = self._model.forward(*torch_tensor)
E           RuntimeError: The following operation failed in the TorchScript interpreter.
E           Traceback of TorchScript, serialized code (most recent call last):
E             File "code/__torch__/module_from_source/unet2d.py", line 24, in forward
E               _9 = getattr(self.encoders, "1")
E               _10 = getattr(self.downsamplers, "0")
E               _11 = (getattr(self.encoders, "0")).forward(input, )
E                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
E               _12 = (_9).forward((_10).forward(_11, ), )
E               _13 = (_8).forward((_10).forward1(_12, ), )
E             File "code/__torch__/torch/nn/modules/container.py", line 19, in forward
E               _0 = getattr(self, "2")
E               _1 = getattr(self, "1")
E               _2 = (getattr(self, "0")).forward(input, )
E                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
E               return (_0).forward((_1).forward(_2, ), )
E             File "code/__torch__/torch/nn/modules/conv.py", line 10, in forward
E               input: Tensor) -> Tensor:
E               _0 = self.bias
E               input0 = torch._convolution(input, self.weight, _0, [1, 1], [1, 1], [1, 1], False, [0, 0], 1, False, False, True, True)
E                        ~~~~~~~~~~~~~~~~~~ <--- HERE
E               return input0
E           
E           Traceback of TorchScript, original code (most recent call last):
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/conv.py(419): _conv_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/conv.py(423): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/container.py(117): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /g/kreshuk/pape/Work/bioimageio/spec-bioimage-io/example_specs/models/unet2d_nuclei_broad/unet2d.py(55): forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(709): _slow_forward
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/nn/modules/module.py(725): _call_impl
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/jit/_trace.py(934): trace_module
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/lib/python3.8/site-packages/torch/jit/_trace.py(733): trace
E           /g/kreshuk/pape/Work/bioimageio/python-bioimage-io/bioimageio/core/weight_converter/torch/torchscript.py(96): convert_weights_to_pytorch_script
E           /g/kreshuk/pape/Work/bioimageio/python-bioimage-io/bioimageio/core/weight_converter/torch/torchscript.py(115): main
E           /home/pape/Work/software/conda/miniconda3/envs/torch17/bin/bioimageio-convert_torch_to_torchscript(33): <module>
E           RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

..\..\bioimageio\core\prediction_pipeline\_model_adapters\_torchscript_model_adapter.py:30: RuntimeError

@constantinpape
Contributor

I am a bit confused: how do you even run tests on the GPU?

@constantinpape
Contributor

Ok, I see we have tests for this in #155 now.

@FynnBe
Member Author

FynnBe commented Nov 23, 2021

We have the test_prediction_pipeline test already. I ran it locally on GPU and stumbled into this...
The new tests in #155 are the same tests, slightly adapted for multiple inference.

@constantinpape
Contributor

> We have the test_prediction_pipeline test already.

Yes, but it tests without a GPU?! Or does it use a GPU if available?
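
For reference, a common pattern for letting a test use the GPU only when one is visible (a sketch with a hypothetical test name, not how the existing test_prediction_pipeline is written):

```python
import pytest
import torch

# Run the CPU variant always; skip the CUDA variant when no device is
# visible (e.g. when CUDA_VISIBLE_DEVICES="" is set).
@pytest.mark.parametrize(
    "device",
    [
        "cpu",
        pytest.param(
            "cuda",
            marks=pytest.mark.skipif(not torch.cuda.is_available(), reason="no GPU available"),
        ),
    ],
)
def test_forward_pass(device):
    x = torch.zeros(1, 1, 8, 8, device=device)
    assert x.device.type == device
```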
