Got an error when running bundle inference with TensorRT torchscript #6124

Closed
binliunls opened this issue Mar 10, 2023 · 2 comments · Fixed by #6132

@binliunls

Describe the bug
When I run bundle inference with a TensorRT-based TorchScript model, the following error appears:

  File "/usr/local/lib/python3.8/dist-packages/monai/handlers/stats_handler.py", line 179, in exception_raised
    raise e
  File "/usr/local/lib/python3.8/dist-packages/ignite/engine/engine.py", line 1068, in _run_once_on_dataset_as_gen
    self.state.output = self._process_function(self, self.state.batch)
  File "/usr/local/lib/python3.8/dist-packages/monai/engines/evaluator.py", line 300, in _iteration
    with engine.mode(engine.network):
  File "/usr/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.8/dist-packages/monai/networks/utils.py", line 389, in eval_mode
    training = [n for n in nets if n.training]
  File "/usr/local/lib/python3.8/dist-packages/monai/networks/utils.py", line 389, in <listcomp>
    training = [n for n in nets if n.training]
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_script.py", line 785, in __getattr__
    return super(RecursiveScriptModule, self).__getattr__(attr)
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_script.py", line 502, in __getattr__
    return super(ScriptModule, self).__getattr__(attr)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1587, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'RecursiveScriptModule' object has no attribute 'training'

This seems to be related to the with engine.mode(engine.network): line in evaluator.py (line 300 in the traceback), where the training attribute of the TorchScript model is read without first checking that it exists. Per the traceback, the read itself happens in eval_mode in monai/networks/utils.py. I think we should add a check in evaluator.py to avoid this issue.
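
The failure mode can be reproduced without TensorRT. The following is a minimal sketch, not MONAI code: `ScriptedStandIn` and `EagerStandIn` are hypothetical stand-ins for a loaded TorchScript module and an eager module, and `networks_in_training` copies only the list comprehension shown in the traceback.

```python
class ScriptedStandIn:
    """Hypothetical stand-in for a TorchScript module that has no `training` attribute."""

    def __getattr__(self, name):
        # Called only when normal lookup fails, like the last frame of the traceback.
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")

    def eval(self):
        return self


class EagerStandIn:
    """Hypothetical stand-in for an eager module that is currently in training mode."""

    training = True


def networks_in_training(nets):
    # Same expression as the list comprehension in the traceback (utils.py, line 389).
    return [n for n in nets if n.training]


def reproduce():
    """Return the error message raised by the unguarded attribute read, or None."""
    try:
        networks_in_training([EagerStandIn(), ScriptedStandIn()])
    except AttributeError as err:
        return str(err)
    return None
```

Calling `reproduce()` returns the message of the `AttributeError`, which shows that a single object without `training` is enough to break the comprehension.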

To Reproduce
Steps to reproduce the behavior:

  1. Start a MONAI Docker container.
  2. Convert a MONAI bundle from the model zoo to a TensorRT-based TorchScript model.
  3. Add import torch_tensorrt to the imports section of inference.json.
  4. Change network_def to torch.jit.load(<model_name>).
  5. Disable the CheckpointLoader handler.
  6. Run the command python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf
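
Steps 3 to 5 can be sketched as a small script that edits the config as a plain dictionary. This is only an illustration: the key names `imports`, `handlers`, and `_target_`, the `$` expression prefix, and the file name `model_trt.ts` are assumptions based on a typical model-zoo bundle layout and are not taken from this issue. Here the CheckpointLoader handler is disabled by dropping it from the handler list.

```python
import copy


def patch_inference_config(config, model_path):
    """Return a copy of `config` changed as in steps 3 to 5 (assumed key names)."""
    patched = copy.deepcopy(config)

    # Step 3: make torch_tensorrt importable when the config is parsed.
    imports = patched.setdefault("imports", [])
    if "$import torch_tensorrt" not in imports:
        imports.append("$import torch_tensorrt")

    # Step 4: load the converted TorchScript file instead of building the network.
    patched["network_def"] = f"$torch.jit.load({model_path!r})"

    # Step 5: drop the CheckpointLoader handler, since no checkpoint is loaded.
    patched["handlers"] = [
        h for h in patched.get("handlers", []) if h.get("_target_") != "CheckpointLoader"
    ]
    return patched


# Hypothetical, heavily shortened config used only to exercise the function.
sample_config = {
    "imports": ["$import glob"],
    "network_def": {"_target_": "UNet"},
    "handlers": [{"_target_": "CheckpointLoader"}, {"_target_": "StatsHandler"}],
}
patched_config = patch_inference_config(sample_config, "model_trt.ts")
```

The function works on a deep copy, so the original dictionary can still be written back unchanged if the TensorRT run fails.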

Expected behavior
Run the bundle and get the inference results.

@Nic-Ma

Nic-Ma commented Mar 10, 2023

Hi @yiheng-wang-nv ,

I think this issue is related to the documentation: https://github.com/Project-MONAI/model-zoo/blob/dev/CONTRIBUTING.md#verify-torchscript. Could you please help confirm?

Thanks in advance.

@yiheng-wang-nv

https://github.com/Project-MONAI/model-zoo/blob/dev/CONTRIBUTING.md#verify-torchscript

Hi @Nic-Ma, the doc is correct; the issue seems to be on the engine side.

wyli pushed a commit that referenced this issue Mar 14, 2023
Fixes #6124 .

### Description

When running inference with TorchScript-wrapped TensorRT models, the
evaluator raises an error. The cause is that the `with engine.mode()`
code reads the `training` attribute of `engine.network` without checking
that it exists. This PR adds an attribute check to handle this case.
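
An attribute check of this kind could look roughly like the sketch below. It is an illustration under assumptions, not necessarily the change merged in this PR: `eval_mode_sketch`, `EagerNet`, and `ScriptedNet` are hypothetical names, the stand-in classes replace real torch modules, and the `torch.no_grad()` wrapping that an evaluation context would normally use is omitted to keep the example free of dependencies.

```python
from contextlib import contextmanager


@contextmanager
def eval_mode_sketch(*nets):
    """Put networks in eval mode, tolerating objects without a `training` attribute."""
    # Remember only networks that expose `training` and are currently training.
    was_training = [n for n in nets if hasattr(n, "training") and n.training]
    try:
        # Switch to eval mode where the object supports it.
        yield [n.eval() if hasattr(n, "eval") else n for n in nets]
    finally:
        # Restore training mode only for the networks recorded above.
        for n in was_training:
            if hasattr(n, "train"):
                n.train()


class EagerNet:
    """Hypothetical stand-in for an eager module with a `training` flag."""

    def __init__(self):
        self.training = True

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self


class ScriptedNet:
    """Hypothetical stand-in for a TorchScript module with no `training` attribute."""

    def eval(self):
        return self
```

With this guard, a mixed call such as `eval_mode_sketch(EagerNet(), ScriptedNet())` enters and exits without raising, and the eager stand-in is returned to training mode afterwards.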

### Types of changes
- [x] Non-breaking change (fix or new feature that would not break
existing functionality).
- [ ] Breaking change (fix or new feature that would cause existing
functionality to change).
- [ ] New tests added to cover the changes.
- [ ] Integration tests passed locally by running `./runtests.sh -f -u
--net --coverage`.
- [ ] Quick tests passed locally by running `./runtests.sh --quick
--unittests --disttests`.
- [ ] In-line docstrings updated.
- [ ] Documentation updated, tested `make html` command in the `docs/`
folder.

---------

Signed-off-by: binliu <binliu@nvidia.com>