
Conversation

inocsin
Contributor

@inocsin inocsin commented Jun 11, 2022

…l change input name

Signed-off-by: inocsin vcheungyi@163.com

Description

When (1) the network contains only an aten::to layer, or (2) the output of aten::to is the same ITensor as its input and that input is a network input, the converter changes the input tensor's name, which causes an error.

import torch
import torch.nn as nn

class Net(nn.Module):
  def __init__(self):
    super(Net, self).__init__()

  def forward(self, data, index):
    index = index.to(torch.int64)  # in TRT, the output ITensor == the input ITensor
    src = 1
    data = data.scatter_(1, index, src)  # runs in Torch (fallback)
    return data
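The renaming problem can be modeled without TensorRT at all. The sketch below is a hypothetical pure-Python re-creation (FakeTensor and the convert functions are illustrative names, not the real converter API): when a converter whose output equals its input overwrites the input tensor's name with the output value's name, the original input binding name disappears; the fixed variant leaves network inputs untouched.

```python
# Minimal model of the renaming bug; all names here are hypothetical.
class FakeTensor:
    def __init__(self, name):
        self.name = name

def convert_aten_to_buggy(tensor, value_name):
    # aten::to with output == input: the buggy path reuses the input
    # ITensor and overwrites its name with the output value's name.
    tensor.name = value_name
    return tensor

def convert_aten_to_fixed(tensor, value_name, is_network_input):
    # Fixed behavior: never rename a tensor that is a network input.
    if not is_network_input:
        tensor.name = value_name
    return tensor

inp = FakeTensor("input_0")
out = convert_aten_to_buggy(inp, "4")
print(out.name)   # "4" -- the binding "input_0" is gone

inp2 = FakeTensor("input_0")
out2 = convert_aten_to_fixed(inp2, "4", is_network_input=True)
print(out2.name)  # "input_0" -- binding preserved
```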

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

…l change input name

Signed-off-by: inocsin <vcheungyi@163.com>
@github-actions github-actions bot added component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: core Issues re: The core compiler component: tests Issues re: Tests labels Jun 11, 2022
@inocsin
Contributor Author

inocsin commented Jun 11, 2022

@narendasan please review this change

@narendasan
Collaborator

This seems fine to me but I think it should be part of more comprehensive changes to catch this class of error. cc: @bowang007

@bowang007
Collaborator

Looks like this issue is related to #982. Both issues are triggered by renaming ITensors.
I'm wondering whether similar issues come from other converters, as we discussed, @narendasan.
If we introduce some kind of detection mechanism that prevents renaming ITensors, this change would become unnecessary.
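A detection mechanism like the one proposed could be sketched as follows. This is a hypothetical pure-Python model (ConversionCtx, FakeTensor, add_input, and rename are all invented names, not the real Torch-TensorRT API): record the names of network inputs when they are added, and refuse any later rename of those tensors.

```python
# Hypothetical sketch of the proposed detection mechanism: track the
# names of network inputs and refuse to rename them during conversion.
class FakeTensor:
    def __init__(self, name):
        self.name = name

class ConversionCtx:
    def __init__(self):
        self.input_names = set()

    def add_input(self, tensor):
        # Called once per network input (cf. conversion.AddInputs).
        self.input_names.add(tensor.name)

    def rename(self, tensor, new_name):
        # Any converter that wants to rename an ITensor goes through here.
        if tensor.name in self.input_names:
            raise RuntimeError(
                f"Refusing to rename network input {tensor.name!r} to {new_name!r}")
        tensor.name = new_name

ctx = ConversionCtx()
t = FakeTensor("input_0")
ctx.add_input(t)
try:
    ctx.rename(t, "4")  # what the aten::to path effectively did
except RuntimeError as e:
    print(e)            # rename rejected, input_0 binding survives
```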

@bowang007
Collaborator

bowang007 commented Jun 22, 2022

what's the error message that you have now? @inocsin
I'm seeing a segmentation fault.

@inocsin
Contributor Author

inocsin commented Jun 26, 2022

what's the error message that you have now? @inocsin I'm seeing a segmentation fault.

The error message is below. Because the input named input_0 is renamed to the output value named 4, the binding fails.

DEBUG: [Torch-TensorRT - Debug Build] - Running JIT version
DEBUG: [Torch-TensorRT - Debug Build] - Running TRT version
DEBUG: [Torch-TensorRT - Debug Build] - Pairing 0: y.1 : Input(shape: [3], dtype: Float32, format: NCHW\Contiguous\Linear)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init CUDA: CPU +318, GPU +0, now: CPU 3018, GPU 1632 (MiB)
INFO: [Torch-TensorRT - Debug Build] - Settings requested for TensorRT engine:
    Enabled Precisions: Float32
    TF32 Floating Point Computation Enabled: 1
    Truncate Long and Double: 0
    Make Refittable Engine: 0
    Debuggable Engine: 0
    GPU ID: 0
    Allow GPU Fallback (if running on DLA): 0
    Min Timing Iterations: 2
    Avg Timing Iterations: 1
    Max Workspace Size: 1073741824
    Device Type: GPU
    GPU ID: 0
    Engine Capability: standard
    Calibrator Created: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Converting Block
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - graph(%y.1 : Tensor):
  %1 : int = prim::Constant[value=6]()
  %2 : bool = prim::Constant[value=0]()
  %3 : NoneType = prim::Constant()
  %4 : Tensor = aten::to(%y.1, %1, %2, %2, %3)
  return (%4)

DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Input Dimension Specs: {
    y.1 : Input(shape: [3], dtype: Float32, format: NCHW\Contiguous\Linear),}
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Input y.1 (named: input_0): Input(shape: [3], dtype: Float32, format: NCHW\Contiguous\Linear) in engine (conversion.AddInputs)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Evaluating %1 : int = prim::Constant[value=6]()
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Found the value to be: 6
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Evaluating %2 : bool = prim::Constant[value=0]()
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Found the value to be: False
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Evaluating %3 : NoneType = prim::Constant()
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Found the value to be: None
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Layer %4 : Tensor = aten::to(%y.1, %1, %2, %2, %3) (ctx.AddLayer)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is an already converted tensor
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT - Debug Build] - ITensor shape: [3]
DEBUG: [Torch-TensorRT - Debug Build] - ITensor type: Float32
DEBUG: [Torch-TensorRT - Debug Build] - [aten::to.dtype] Output tensor shape: [3]
DEBUG: [Torch-TensorRT - Debug Build] - One of the inputs named 4 to the network is marked as an output tensor. Applying an identity layer and marking this tensor as output
INFO: [Torch-TensorRT TorchScript Conversion Context] - Marking Output 4 named output_0 in engine (ctx.MarkOutput)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] Builder begin: CPU 3018 MiB, GPU 1632 MiB
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Applying generic optimizations to the graph for inference.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Original: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After dead-layer removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After Myelin optimization: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After scale fusion: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After vertical fusions: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After dupe layer removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After final dead-layer removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After tensor merging: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After concat removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Graph construction and optimization completed in 0.0130252 seconds.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT TorchScript Conversion Context] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +322, GPU +166, now: CPU 3340, GPU 1798 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuDNN: CPU +454, GPU +204, now: CPU 3794, GPU 2002 (MiB)
WARNING: [Torch-TensorRT TorchScript Conversion Context] - Detected invalid timing cache, setup a local cache instead
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Constructing optimization profile number 0 [1/1].
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - *************** Autotuning format combination: Float(1) -> Float(1) ***************
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - --------------- Timing Runner: (Unnamed Layer* 0) [Identity] (Cast)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Cast has no valid tactics for this config, skipping
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - --------------- Timing Runner: (Unnamed Layer* 0) [Identity] (Reformat)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Tactic: 1002 Time: 0.011776
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Tactic: 0 Time: 0.006272
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Fastest Tactic: 0 Time: 0.006272
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - >>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Formats and tactics selection completed in 0.00822353 seconds.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After reformat layers: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Block size 1073741824
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Total Activation Memory: 1073741824
INFO: [Torch-TensorRT TorchScript Conversion Context] - Detected 1 inputs and 1 output network tensors.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Layer: (Unnamed Layer* 0) [Identity] HostPersistent: 0 DevicePersistent: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Total Host Persistent Memory: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Total Device Persistent Memory: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Total Scratch Memory: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 4 MiB
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT TorchScript Conversion Context] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3794, GPU 2010 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3794, GPU 2018 (MiB)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 2002 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Engine generation completed in 1.7908 seconds.
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Engine Layer Information:
Layer(Reformat): (Unnamed Layer* 0) [Identity], Tactic: 0, 4[Float(3)] -> output_0[Float(3)]
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] Builder end: CPU 3794 MiB, GPU 1984 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Running TRT version
DEBUG: [Torch-TensorRT - Debug Build] - Target Device: Device(ID: 0, Name: Tesla T4, SM Capability: 7.5, Type: GPU)
DEBUG: [Torch-TensorRT - Debug Build] - Setting Device(ID: 0, Name: Tesla T4, SM Capability: 7.5, Type: GPU) as active device
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
INFO: [Torch-TensorRT - Debug Build] - Loaded engine size: 0 MB
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] deserializeCudaEngine begin: CPU 3794 MiB, GPU 1984 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT - Debug Build] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 3794, GPU 1994 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3794, GPU 2002 (MiB)
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Deserialization required 25742 microseconds.
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] deserializeCudaEngine end: CPU 3794 MiB, GPU 1984 MiB
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] ExecutionContext creation begin: CPU 3794 MiB, GPU 1984 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT - Debug Build] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 3794, GPU 1994 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3794, GPU 2002 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Total per-runner device memory is 0
DEBUG: [Torch-TensorRT - Debug Build] - Total per-runner host memory is 0
DEBUG: [Torch-TensorRT - Debug Build] - Allocated activation device memory of size 0
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] ExecutionContext creation end: CPU 3794 MiB, GPU 2002 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Binding name: 4
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
unknown file: Failure
C++ exception with description "[Error thrown at core/runtime/TRTEngine.cpp:65] Expected delim != std::string::npos to be true but got false
Unable to determine binding index for input 4
Ensure module was compiled with Torch-TensorRT.ts or follows Torch-TensorRT Runtime conventions
" thrown in the test body.
[  FAILED  ] Converters.ATenToSingleConvertsCorrectly (7865 ms)
[----------] 1 test from Converters (7865 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (7865 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Converters.ATenToSingleConvertsCorrectly

 1 FAILED TEST
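The failure at the end of the log comes from the runtime's binding-name check (the `Expected delim != std::string::npos` assertion in core/runtime/TRTEngine.cpp). The sketch below is a hypothetical Python re-creation of that logic, not the actual C++ code: Torch-TensorRT engine bindings follow a `<kind>_<index>` convention, so the runtime splits on the delimiter to recover the index. A renamed tensor whose binding is the bare value name `4` has no delimiter, which is exactly the failure above.

```python
# Hypothetical re-creation of the binding-name parse behind
# "Unable to determine binding index for input 4".
def parse_binding_index(name: str) -> int:
    delim = name.rfind("_")
    if delim == -1:  # plays the role of std::string::npos in the C++ runtime
        raise RuntimeError(
            f"Unable to determine binding index for input {name}")
    return int(name[delim + 1:])

print(parse_binding_index("input_0"))   # 0
print(parse_binding_index("output_0"))  # 0
try:
    parse_binding_index("4")            # renamed tensor: no delimiter
except RuntimeError as e:
    print(e)
```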

@bowang007
Collaborator

Same error as #982.

@inocsin
Contributor Author

inocsin commented Jun 29, 2022

@bowang007 Deleting this line also solves the problem: https://github.com/pytorch/TensorRT/blob/master/core/conversion/conversionctx/ConversionCtx.cpp#L133

@bowang007
Collaborator

Yes, we discussed this WAR in the channel.
However, I'm not sure whether this deletion would trigger other issues.

Collaborator

@bowang007 bowang007 left a comment

LGTM

@ncomly-nvidia ncomly-nvidia added the release: v1.2 Tagged to be included in v1.2 label Jul 15, 2022
@github-actions github-actions bot requested a review from bowang007 July 15, 2022 01:20
@narendasan narendasan added the Story: Binding Names Issues related to binding names, format and uniqueness label Jul 15, 2022
Signed-off-by: inocsin <vcheungyi@163.com>
@inocsin
Contributor Author

inocsin commented Jul 22, 2022

@dheerajperi reverted change

@peri044 peri044 merged commit fc04d4a into pytorch:master Jul 22, 2022