
Conversation

inocsin
Contributor

@inocsin inocsin commented Jun 11, 2022

…l change input name

Signed-off-by: inocsin vcheungyi@163.com

Description

When (1) the network contains only an aten::to layer, or (2) the output of aten::to is the same ITensor as its input and that input is a network input, the converter changes the input tensor's name, which causes an error.

import torch
import torch.nn as nn

class Net(nn.Module):
  def __init__(self):
    super(Net, self).__init__()

  def forward(self, data, index):
    index = index.to(torch.int64)  # in TRT, the output ITensor == the input ITensor
    src = 1
    data = data.scatter_(1, index, src)  # runs in Torch (fallback)
    return data
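The renaming problem can be modeled without TensorRT at all. The sketch below is a hypothetical pure-Python re-creation (FakeTensor and the convert functions are illustrative names, not the real converter API): when a converter whose output equals its input overwrites the input tensor's name with the output value's name, the original input binding name disappears; the fixed variant leaves network inputs untouched.

```python
# Minimal model of the renaming bug; all names here are hypothetical.
class FakeTensor:
    def __init__(self, name):
        self.name = name

def convert_aten_to_buggy(tensor, value_name):
    # aten::to with output == input: the buggy path reuses the input
    # ITensor and overwrites its name with the output value's name.
    tensor.name = value_name
    return tensor

def convert_aten_to_fixed(tensor, value_name, is_network_input):
    # Fixed behavior: never rename a tensor that is a network input.
    if not is_network_input:
        tensor.name = value_name
    return tensor

inp = FakeTensor("input_0")
out = convert_aten_to_buggy(inp, "4")
print(out.name)   # "4" -- the binding "input_0" is gone

inp2 = FakeTensor("input_0")
out2 = convert_aten_to_fixed(inp2, "4", is_network_input=True)
print(out2.name)  # "input_0" -- binding preserved
```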

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

…l change input name

Signed-off-by: inocsin <vcheungyi@163.com>
@github-actions github-actions bot added component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: core Issues re: The core compiler component: tests Issues re: Tests labels Jun 11, 2022
@inocsin
Contributor Author

inocsin commented Jun 11, 2022

@narendasan please review this change

@narendasan
Collaborator

This seems fine to me but I think it should be part of more comprehensive changes to catch this class of error. cc: @bowang007

@bowang007
Collaborator

Looks like this issue is related to #982. Both issues are triggered by renaming ITensors.
I'm wondering whether similar issues come from other converters, as we discussed, @narendasan.
If we introduce some kind of detection mechanism that prevents renaming ITensors, this change would become unnecessary.
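A detection mechanism like the one proposed could be sketched as follows. This is a hypothetical pure-Python model (ConversionCtx, FakeTensor, add_input, and rename are all invented names, not the real Torch-TensorRT API): record the names of network inputs when they are added, and refuse any later rename of those tensors.

```python
# Hypothetical sketch of the proposed detection mechanism: track the
# names of network inputs and refuse to rename them during conversion.
class FakeTensor:
    def __init__(self, name):
        self.name = name

class ConversionCtx:
    def __init__(self):
        self.input_names = set()

    def add_input(self, tensor):
        # Called once per network input (cf. conversion.AddInputs).
        self.input_names.add(tensor.name)

    def rename(self, tensor, new_name):
        # Any converter that wants to rename an ITensor goes through here.
        if tensor.name in self.input_names:
            raise RuntimeError(
                f"Refusing to rename network input {tensor.name!r} to {new_name!r}")
        tensor.name = new_name

ctx = ConversionCtx()
t = FakeTensor("input_0")
ctx.add_input(t)
try:
    ctx.rename(t, "4")  # what the aten::to path effectively did
except RuntimeError as e:
    print(e)            # rename rejected, input_0 binding survives
```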

@bowang007
Collaborator

bowang007 commented Jun 22, 2022

what's the error message that you have now? @inocsin
I'm seeing a segmentation fault.

@inocsin
Contributor Author

inocsin commented Jun 26, 2022

what's the error message that you have now? @inocsin I'm seeing a segmentation fault.

The error message is below. Because the input named input_0 is renamed to the output value named 4, the binding fails.

DEBUG: [Torch-TensorRT - Debug Build] - Running JIT version
DEBUG: [Torch-TensorRT - Debug Build] - Running TRT version
DEBUG: [Torch-TensorRT - Debug Build] - Pairing 0: y.1 : Input(shape: [3], dtype: Float32, format: NCHW\Contiguous\Linear)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init CUDA: CPU +318, GPU +0, now: CPU 3018, GPU 1632 (MiB)
INFO: [Torch-TensorRT - Debug Build] - Settings requested for TensorRT engine:
    Enabled Precisions: Float32
    TF32 Floating Point Computation Enabled: 1
    Truncate Long and Double: 0
    Make Refittable Engine: 0
    Debuggable Engine: 0
    GPU ID: 0
    Allow GPU Fallback (if running on DLA): 0
    Min Timing Iterations: 2
    Avg Timing Iterations: 1
    Max Workspace Size: 1073741824
    Device Type: GPU
    GPU ID: 0
    Engine Capability: standard
    Calibrator Created: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Converting Block
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - graph(%y.1 : Tensor):
  %1 : int = prim::Constant[value=6]()
  %2 : bool = prim::Constant[value=0]()
  %3 : NoneType = prim::Constant()
  %4 : Tensor = aten::to(%y.1, %1, %2, %2, %3)
  return (%4)

DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Input Dimension Specs: {
    y.1 : Input(shape: [3], dtype: Float32, format: NCHW\Contiguous\Linear),}
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Input y.1 (named: input_0): Input(shape: [3], dtype: Float32, format: NCHW\Contiguous\Linear) in engine (conversion.AddInputs)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Evaluating %1 : int = prim::Constant[value=6]()
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Found the value to be: 6
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Evaluating %2 : bool = prim::Constant[value=0]()
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Found the value to be: False
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Evaluating %3 : NoneType = prim::Constant()
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Found the value to be: None
INFO: [Torch-TensorRT TorchScript Conversion Context] - Adding Layer %4 : Tensor = aten::to(%y.1, %1, %2, %2, %3) (ctx.AddLayer)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is an already converted tensor
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Node input is a result of a previously evaluated value
DEBUG: [Torch-TensorRT - Debug Build] - ITensor shape: [3]
DEBUG: [Torch-TensorRT - Debug Build] - ITensor type: Float32
DEBUG: [Torch-TensorRT - Debug Build] - [aten::to.dtype] Output tensor shape: [3]
DEBUG: [Torch-TensorRT - Debug Build] - One of the inputs named 4 to the network is marked as an output tensor. Applying an identity layer and marking this tensor as output
INFO: [Torch-TensorRT TorchScript Conversion Context] - Marking Output 4 named output_0 in engine (ctx.MarkOutput)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] Builder begin: CPU 3018 MiB, GPU 1632 MiB
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Applying generic optimizations to the graph for inference.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Original: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After dead-layer removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After Myelin optimization: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After scale fusion: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After vertical fusions: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After dupe layer removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After final dead-layer removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After tensor merging: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After concat removal: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Graph construction and optimization completed in 0.0130252 seconds.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT TorchScript Conversion Context] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +322, GPU +166, now: CPU 3340, GPU 1798 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuDNN: CPU +454, GPU +204, now: CPU 3794, GPU 2002 (MiB)
WARNING: [Torch-TensorRT TorchScript Conversion Context] - Detected invalid timing cache, setup a local cache instead
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Constructing optimization profile number 0 [1/1].
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - *************** Autotuning format combination: Float(1) -> Float(1) ***************
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - --------------- Timing Runner: (Unnamed Layer* 0) [Identity] (Cast)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Cast has no valid tactics for this config, skipping
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - --------------- Timing Runner: (Unnamed Layer* 0) [Identity] (Reformat)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Tactic: 1002 Time: 0.011776
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Tactic: 0 Time: 0.006272
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Fastest Tactic: 0 Time: 0.006272
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - >>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Formats and tactics selection completed in 0.00822353 seconds.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - After reformat layers: 1 layers
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Block size 1073741824
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Total Activation Memory: 1073741824
INFO: [Torch-TensorRT TorchScript Conversion Context] - Detected 1 inputs and 1 output network tensors.
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Layer: (Unnamed Layer* 0) [Identity] HostPersistent: 0 DevicePersistent: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Total Host Persistent Memory: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Total Device Persistent Memory: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - Total Scratch Memory: 0
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 4 MiB
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT TorchScript Conversion Context] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 3794, GPU 2010 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3794, GPU 2018 (MiB)
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 2002 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Engine generation completed in 1.7908 seconds.
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
DEBUG: [Torch-TensorRT TorchScript Conversion Context] - Engine Layer Information:
Layer(Reformat): (Unnamed Layer* 0) [Identity], Tactic: 0, 4[Float(3)] -> output_0[Float(3)]
INFO: [Torch-TensorRT TorchScript Conversion Context] - [MemUsageSnapshot] Builder end: CPU 3794 MiB, GPU 1984 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Running TRT version
DEBUG: [Torch-TensorRT - Debug Build] - Target Device: Device(ID: 0, Name: Tesla T4, SM Capability: 7.5, Type: GPU)
DEBUG: [Torch-TensorRT - Debug Build] - Setting Device(ID: 0, Name: Tesla T4, SM Capability: 7.5, Type: GPU) as active device
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
INFO: [Torch-TensorRT - Debug Build] - Loaded engine size: 0 MB
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] deserializeCudaEngine begin: CPU 3794 MiB, GPU 1984 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT - Debug Build] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 3794, GPU 1994 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3794, GPU 2002 (MiB)
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Deserialization required 25742 microseconds.
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] deserializeCudaEngine end: CPU 3794 MiB, GPU 1984 MiB
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] ExecutionContext creation begin: CPU 3794 MiB, GPU 1984 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Using cublasLt a tactic source
WARNING: [Torch-TensorRT - Debug Build] - TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 3794, GPU 1994 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Using cuDNN as a tactic source
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 3794, GPU 2002 (MiB)
DEBUG: [Torch-TensorRT - Debug Build] - Total per-runner device memory is 0
DEBUG: [Torch-TensorRT - Debug Build] - Total per-runner host memory is 0
DEBUG: [Torch-TensorRT - Debug Build] - Allocated activation device memory of size 0
INFO: [Torch-TensorRT - Debug Build] - [MemUsageSnapshot] ExecutionContext creation end: CPU 3794 MiB, GPU 2002 MiB
DEBUG: [Torch-TensorRT - Debug Build] - Binding name: 4
INFO: [Torch-TensorRT - Debug Build] - [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 3794, GPU 1984 (MiB)
unknown file: Failure
C++ exception with description "[Error thrown at core/runtime/TRTEngine.cpp:65] Expected delim != std::string::npos to be true but got false
Unable to determine binding index for input 4
Ensure module was compiled with Torch-TensorRT.ts or follows Torch-TensorRT Runtime conventions
" thrown in the test body.
[  FAILED  ] Converters.ATenToSingleConvertsCorrectly (7865 ms)
[----------] 1 test from Converters (7865 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (7865 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Converters.ATenToSingleConvertsCorrectly

 1 FAILED TEST
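The failure at the end of the log comes from the runtime's binding-name check (the `Expected delim != std::string::npos` assertion in core/runtime/TRTEngine.cpp). The sketch below is a hypothetical Python re-creation of that logic, not the actual C++ code: Torch-TensorRT engine bindings follow a `<kind>_<index>` convention, so the runtime splits on the delimiter to recover the index. A renamed tensor whose binding is the bare value name `4` has no delimiter, which is exactly the failure above.

```python
# Hypothetical re-creation of the binding-name parse behind
# "Unable to determine binding index for input 4".
def parse_binding_index(name: str) -> int:
    delim = name.rfind("_")
    if delim == -1:  # plays the role of std::string::npos in the C++ runtime
        raise RuntimeError(
            f"Unable to determine binding index for input {name}")
    return int(name[delim + 1:])

print(parse_binding_index("input_0"))   # 0
print(parse_binding_index("output_0"))  # 0
try:
    parse_binding_index("4")            # renamed tensor: no delimiter
except RuntimeError as e:
    print(e)
```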

@bowang007
Collaborator

Same error as #982.

@inocsin
Contributor Author

inocsin commented Jun 29, 2022

@bowang007 Deleting this line also solves the problem: https://github.com/pytorch/TensorRT/blob/master/core/conversion/conversionctx/ConversionCtx.cpp#L133

@bowang007
Collaborator

Yes, we discussed this WAR in the channel.
However, I'm not sure whether this deletion would trigger other issues.

Collaborator

@bowang007 bowang007 left a comment

LGTM

@ncomly-nvidia ncomly-nvidia added the release: v1.2 Tagged to be included in v1.2 label Jul 15, 2022
@github-actions github-actions bot requested a review from bowang007 July 15, 2022 01:20
@narendasan narendasan added the Story: Binding Names Issues related to binding names, format and uniqueness label Jul 15, 2022
Signed-off-by: inocsin <vcheungyi@163.com>
@inocsin
Contributor Author

inocsin commented Jul 22, 2022

@dheerajperi reverted change

@peri044 peri044 merged commit fc04d4a into pytorch:master Jul 22, 2022