
🐛 [Bug] Fallback for torch.nn.functional.one_hot fails #814

Closed
chaoz-dev opened this issue Jan 18, 2022 · 10 comments · Fixed by #902
Assignees: narendasan
Labels: bug (Something isn't working)

Comments

@chaoz-dev (Contributor)

Bug Description

Fallback for torch.nn.functional.one_hot, whether automatic or forced, appears to fail with the following message:

WARNING: [Torch-TensorRT] - Input type for doing shape analysis could not be determined, defaulting to F32
Traceback (most recent call last):
  File "/home/chaoz/av/experimental/chaoz/examples/test_trtorch.py", line 32, in <module>
    model_trt = torchtrt.compile(
  File "/home/chaoz/.anaconda3/envs/trt-8/lib/python3.9/site-packages/torch_tensorrt/_compile.py", line 97, in compile
    return torch_tensorrt.ts.compile(ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs)
  File "/home/chaoz/.anaconda3/envs/trt-8/lib/python3.9/site-packages/torch_tensorrt/ts/_compiler.py", line 119, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/home/chaoz/av/experimental/chaoz/examples/test_trtorch.py", line 21, in forward
    def forward(self, a):
        return torch.nn.functional.one_hot(a)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
RuntimeError: one_hot is only applicable to index tensor.

It appears that floating-point values are passed to one_hot during compilation, which fails because one_hot only accepts integer index tensors.
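
For reference, a minimal PyTorch-only sketch (independent of Torch-TensorRT) of one_hot's dtype requirement; anything other than an int64 index tensor raises the same error seen in the traceback above:

import torch

# one_hot requires an index (int64) tensor.
idx = torch.tensor([0, 1, 2], dtype=torch.int64)
print(torch.nn.functional.one_hot(idx, num_classes=3))  # works

try:
    # A float tensor reproduces the failure above.
    torch.nn.functional.one_hot(idx.to(torch.float32))
except RuntimeError as e:
    print(e)  # one_hot is only applicable to index tensor.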

To Reproduce

Run the following:

import torch
import torch_tensorrt as torchtrt

import torch_tensorrt.logging as logging
logging.set_reportable_log_level(logging.Level.Warning)

torch.manual_seed(0)

DEVICE = torch.device("cuda:0")
SHAPE = (10,)

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, a):
        return torch.nn.functional.one_hot(a)


if __name__ == "__main__":
    tensor = torch.ones(SHAPE, dtype=torch.int32, device=DEVICE)

    with torch.no_grad():
        model = Model().eval().to(DEVICE)

        model_trt = torchtrt.compile(
            model,
            inputs=[
                torchtrt.Input(shape=SHAPE, dtype=torch.int32),
            ],
            enabled_precisions={torch.float},
            torch_executed_ops = ['aten::one_hot']
        )
        out_trt = model(tensor)

Expected behavior

Expect the above to compile without issues.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 1.0.0
  • PyTorch Version (e.g. 1.0): 1.10
  • CPU Architecture: x86-64
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, libtorch, source): conda
  • Build command you used (if compiling from source): python setup.py install
  • Are you using local sources or building from archives: local
  • Python version: 3.9
  • CUDA version: 11.4
  • GPU models and configuration: T4
  • Any other relevant information:

Additional context

chaoz-dev added the bug label on Jan 18, 2022
narendasan self-assigned this on Feb 22, 2022
@narendasan (Collaborator)

@chaoz-dev I tried this repro without any Torch-TensorRT code and I still get the same error. I do have a fix for the defaulting to FP32 issue however.

narendasan added a commit that referenced this issue Mar 1, 2022
inferred type.

fixes: #814

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
@chaoz-dev (Contributor, Author)

> @chaoz-dev I tried this repro without any Torch-TensorRT code and I still get the same error.

Can you elaborate on what you mean by not using any Torch-TensorRT code?

Regarding the fix, can you elaborate on "user settings"? Will this come from input type annotations during graph compilation?

@narendasan (Collaborator)

I just tried running the model in PyTorch and got the same error (I simply commented out the compile step).

The fix addresses an issue where the type map used for partitioning wasn't populated properly in the case where we couldn't infer the type.

@chaoz-dev (Contributor, Author)

Ah gotcha gotcha 👍🏼

@chaoz-dev (Contributor, Author)

chaoz-dev commented Apr 19, 2022

Updated the above script so that one_hot runs correctly in the non-TRT case:

import torch
import torch_tensorrt as torchtrt

import torch_tensorrt.logging as logging
logging.set_reportable_log_level(logging.Level.Warning)

torch.manual_seed(0)

DEVICE = torch.device("cuda:0")
SHAPE = (10,)

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, a):
        return torch.nn.functional.one_hot(a)


if __name__ == "__main__":
    tensor = torch.ones(SHAPE, dtype=torch.int64, device=DEVICE)

    model = Model().eval().to(DEVICE)
    out = model(tensor)
    print(out)

    model = torchtrt.compile(
        model,
        inputs=[
            torchtrt.Input(shape=SHAPE),
        ],
        torch_executed_ops = ['aten::one_hot']
    )
    out_trt = model(tensor)
    print(out_trt)

@chaoz-dev (Contributor, Author)

I'm still seeing the same error, however:

root@fdce2b183980:/workspace/Torch-TensorRT# python /scripts/one_hot.py
tensor([[0, 1],
        [0, 1],
        [0, 1],
        [0, 1],
        [0, 1],
        [0, 1],
        [0, 1],
        [0, 1],
        [0, 1],
        [0, 1]], device='cuda:0')
WARNING: [Torch-TensorRT] - Cannot infer input type from calcuations in graph for input a.1. Assuming it is Float32. If not, specify input type explicity
WARNING: [Torch-TensorRT] - Input type for doing shape analysis could not be determined, defaulting to F32
Traceback (most recent call last):
  File "/scripts/one_hot.py", line 27, in <module>
    model = torchtrt.compile(
  File "/usr/local/lib/python3.8/dist-packages/torch_tensorrt/_compile.py", line 115, in compile
    return torch_tensorrt.ts.compile(ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch_tensorrt/ts/_compiler.py", line 113, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/scripts/one_hot.py", line 17, in forward
    def forward(self, a):
        return torch.nn.functional.one_hot(a)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
RuntimeError: one_hot is only applicable to index tensor.

@chaoz-dev (Contributor, Author)

chaoz-dev commented Apr 19, 2022

Just tested this in the NGC nvcr.io/nvidia/tensorrt:22.03-py3 container with the latest master commit 7330c4.

Seems like we're still inferring F32 here:

WARNING: [Torch-TensorRT] - Cannot infer input type from calcuations in graph for input a.1. Assuming it is Float32. If not, specify input type explicity
WARNING: [Torch-TensorRT] - Input type for doing shape analysis could not be determined, defaulting to F32

Although it's possible the issue is that we're trying to compile a graph containing just one op, which sits at both the beginning and the end of the graph, so we don't fall back as we should and leave the tensor alone (it's int64, which cannot be converted in TRT).
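
One possible workaround, just a sketch and not verified against this Torch-TensorRT build: declare the compiled input as int32 and cast to int64 inside forward, so the graph input stays TRT-friendly while the fallback one_hot still receives an index tensor:

import torch

class Model(torch.nn.Module):
    def forward(self, a):
        # Cast inside the model: one_hot needs an int64 index tensor,
        # while the graph input itself can remain int32 for TRT.
        return torch.nn.functional.one_hot(a.to(torch.int64))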

@chaoz-dev (Contributor, Author)

Should I reopen this ticket or create a new one?

@narendasan (Collaborator)

I would say create a new one. Also have you tried setting the dtype of the input to int32?
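
Applied to the updated script, that suggestion would look roughly like this (untested sketch; the explicit int32 dtype is paired with the existing fallback setting so shape analysis no longer defaults to F32):

model = torchtrt.compile(
    model,
    inputs=[
        # Explicit dtype so Torch-TensorRT doesn't assume Float32.
        torchtrt.Input(shape=SHAPE, dtype=torch.int32),
    ],
    torch_executed_ops=['aten::one_hot'],
)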

@narendasan (Collaborator)

Also, if it's just one unsupported op in the graph, the expected behavior is to return the original module unchanged.
