Got an error when running bundle inference with TensorRT torchscript #6124

binliunls · 2023-03-10T06:19:25Z

Describe the bug
When I run bundle inference with a TensorRT-based TorchScript model, the following error appears:

  File "/usr/local/lib/python3.8/dist-packages/monai/handlers/stats_handler.py", line 179, in exception_raised
    raise e
  File "/usr/local/lib/python3.8/dist-packages/ignite/engine/engine.py", line 1068, in _run_once_on_dataset_as_gen
    self.state.output = self._process_function(self, self.state.batch)
  File "/usr/local/lib/python3.8/dist-packages/monai/engines/evaluator.py", line 300, in _iteration
    with engine.mode(engine.network):
  File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.8/dist-packages/monai/networks/utils.py", line 389, in eval_mode
    training = [n for n in nets if n.training]
  File "/usr/local/lib/python3.8/dist-packages/monai/networks/utils.py", line 389, in <listcomp>
    training = [n for n in nets if n.training]
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_script.py", line 785, in __getattr__
    return super(RecursiveScriptModule, self).__getattr__(attr)
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_script.py", line 502, in __getattr__
    return super(ScriptModule, self).__getattr__(attr)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1587, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'RecursiveScriptModule' object has no attribute 'training'

This appears to be related to the `with engine.mode(engine.network):` line in evaluator.py, where the `training` attribute of the TorchScript model is read directly without first checking that it exists. I think we should add some checks in evaluator.py to avoid this issue.
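A minimal sketch of the kind of attribute check being proposed, not the actual MONAI code or the merged patch. The names `PlainModule`, `NoTrainingFlagModule` and `eval_mode_checked` are hypothetical stand-ins so the example runs without PyTorch or TensorRT. The list comprehension mirrors the one in the traceback, with a `hasattr` guard added. The gradient handling a real evaluation context would also need is left out.

```python
from contextlib import contextmanager


class PlainModule:
    """Hypothetical stand-in for a regular network with a `training` flag."""

    def __init__(self):
        self.training = True

    def train(self, mode=True):
        self.training = mode
        return self

    def eval(self):
        return self.train(False)


class NoTrainingFlagModule:
    """Hypothetical stand-in for a loaded module that lacks `training`."""

    def eval(self):
        return self


@contextmanager
def eval_mode_checked(*nets):
    # Remember only the networks that expose `training` and are in train mode.
    # Without the hasattr guard, `n.training` raises AttributeError here.
    training = [n for n in nets if hasattr(n, "training") and n.training]
    try:
        # Switch every network to eval mode for the duration of the block.
        yield [n.eval() for n in nets]
    finally:
        # Restore train mode only for the networks that were training on entry.
        for n in training:
            n.train()
```

With this guard, a module without the attribute is still switched to eval mode but is never put back into train mode on exit, which seems to be the safe default for an inference-only model.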

To Reproduce
Steps to reproduce the behavior:

1. Start a MONAI Docker container.
2. Convert a MONAI bundle from the model zoo to a TensorRT-based TorchScript model.
3. Add `import torch_tensorrt` to the imports section of inference.json.
4. Change `network_def` to `torch.jit.load(<model_name>)`.
5. Disable the `CheckpointLoader` handler.
6. Run the command `python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf`

Expected behavior
The bundle runs and produces the inference results.


Nic-Ma · 2023-03-10T06:51:47Z

Hi @yiheng-wang-nv ,

I think this issue is related to the documentation: https://github.com/Project-MONAI/model-zoo/blob/dev/CONTRIBUTING.md#verify-torchscript. Could you please double-check it?

Thanks in advance.

yiheng-wang-nv · 2023-03-13T08:37:07Z

https://github.com/Project-MONAI/model-zoo/blob/dev/CONTRIBUTING.md#verify-torchscript

Hi @Nic-Ma , the doc is correct, and the issue seems to be on the engine side.

Fixes #6124.

### Description
When running inference with TensorRT models wrapped in TorchScript, the evaluator raises an error. The cause is that the `with engine.mode()` code reads the `training` attribute of `engine.network` without checking that it exists. This PR adds an attribute check to cover this case.

### Types of changes
- [x] Non-breaking change (fix or new feature that would not break existing functionality).
- [ ] Breaking change (fix or new feature that would cause existing functionality to change).
- [ ] New tests added to cover the changes.
- [ ] Integration tests passed locally by running `./runtests.sh -f -u --net --coverage`.
- [ ] Quick tests passed locally by running `./runtests.sh --quick --unittests --disttests`.
- [ ] In-line docstrings updated.
- [ ] Documentation updated, tested `make html` command in the `docs/` folder.

---------

Signed-off-by: binliu <binliu@nvidia.com>

Nic-Ma assigned yiheng-wang-nv and binliunls Mar 10, 2023

binliunls mentioned this issue Mar 12, 2023

6124-add-training-attribute-check #6132

Merged

7 tasks

wyli closed this as completed in #6132 Mar 14, 2023
