
[Doc] Update FAQ for TensorRT (#96)
* update FAQ

* comment
q.yao authored Jan 26, 2022
1 parent 0556fee commit d522874
Showing 2 changed files with 40 additions and 0 deletions.
29 changes: 29 additions & 0 deletions docs/en/faq.md
@@ -1 +1,30 @@
## Frequently Asked Questions

### TensorRT

- "WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected."

  FP16 mode requires a device with full-rate FP16 support.
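
  One rough way to tell whether a GPU offers full-rate FP16 is its CUDA compute capability. The helper below is an illustrative sketch, not part of mmdeploy; the `has_fast_fp16` name and the capability table are assumptions (broadly: sm_60/sm_62 and sm_70+ ship native full-rate FP16, while sm_61, i.e. most GTX 10-series cards, runs FP16 at a small fraction of the FP32 rate). Consult NVIDIA's documentation for the authoritative table.

  ```python
  def has_fast_fp16(major: int, minor: int) -> bool:
      """Rough heuristic: does this CUDA compute capability offer
      native full-rate FP16? Illustrative only."""
      if major >= 7:  # Volta, Turing, Ampere and newer
          return True
      if major == 6:  # Pascal: P100 (6.0) and Jetson TX2 (6.2)
          return minor in (0, 2)  # 6.1 (GTX 10-series) has slow FP16
      return False

  # With PyTorch available, the capability can be queried via
  #   major, minor = torch.cuda.get_device_capability()
  print(has_fast_fp16(7, 5))  # Turing
  print(has_fast_fp16(6, 1))  # GTX 1080
  ```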

- "error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]"

  When building an `ICudaEngine` from an `INetworkDefinition` that has dynamically resizable inputs, you need to specify at least one optimization profile, which can be set in the deploy config:

```python
backend_config = dict(
common_config=dict(max_workspace_size=1 << 30),
model_inputs=[
dict(
input_shapes=dict(
input=dict(
min_shape=[1, 3, 320, 320],
opt_shape=[1, 3, 800, 1344],
max_shape=[1, 3, 1344, 1344])))
])
```

  The input tensor shape must lie between `min_shape` and `max_shape`.
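
  The bounds check that the backend performs can be sketched in plain Python; the `check_input_shape` helper name is ours, for illustration only:

  ```python
  def check_input_shape(shape, min_shape, max_shape):
      """Validate that each dimension of `shape` lies within the
      optimization profile bounds [min_shape, max_shape]."""
      if len(shape) != len(min_shape):
          raise ValueError('Input dim is different from engine profile.')
      for s_min, s, s_max in zip(min_shape, shape, max_shape):
          if not s_min <= s <= s_max:
              raise ValueError(
                  f'Input shape should be between {min_shape} and '
                  f'{max_shape} but got {tuple(shape)}.')

  # In range for the profile above: passes silently
  check_input_shape((1, 3, 640, 640), [1, 3, 320, 320], [1, 3, 1344, 1344])
  ```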

- "error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS"

  TensorRT 7.2.1 switched from cuBLAS to cuBLASLt, which is the default choice for SM version >= 7.0. You may need CUDA 10.2 Patch 1 (released Aug 26, 2020) to resolve some cuBLASLt issues. Alternatively, if you don't want to upgrade, you can use the new TacticSource API to disable the cuBLASLt tactics.
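
  TensorRT's TacticSource API takes a bitmask over `trt.TacticSource` enum values (set via `IBuilderConfig.set_tactic_sources` in the Python API). The sketch below shows only the bitmask logic, with stand-in integer values for the enum members, since the real values come from the `tensorrt` package:

  ```python
  # Stand-in enum values; in real code use int(trt.TacticSource.CUBLAS) etc.
  CUBLAS, CUBLAS_LT = 0, 1

  def default_tactic_sources() -> int:
      """Bitmask with both cuBLAS-related tactic sources enabled."""
      return (1 << CUBLAS) | (1 << CUBLAS_LT)

  def disable(sources: int, tactic: int) -> int:
      """Clear one tactic source bit, e.g. to work around cuBLASLt bugs."""
      return sources & ~(1 << tactic)

  mask = disable(default_tactic_sources(), CUBLAS_LT)
  # With TensorRT installed you would then pass the mask to the builder:
  #   builder_config.set_tactic_sources(mask)
  ```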
11 changes: 11 additions & 0 deletions mmdeploy/backend/tensorrt/wrapper.py
@@ -88,7 +88,18 @@ def forward(self, inputs: Dict[str,
        assert self._output_names is not None
        bindings = [None] * (len(self._input_names) + len(self._output_names))

        profile_id = 0
        for input_name, input_tensor in inputs.items():
            # check if input shape is valid
            profile = self.engine.get_profile_shape(profile_id, input_name)
            assert input_tensor.dim() == len(
                profile[0]), 'Input dim is different from engine profile.'
            for s_min, s_input, s_max in zip(profile[0], input_tensor.shape,
                                             profile[2]):
                assert s_min <= s_input <= s_max, \
                    'Input shape should be between ' \
                    + f'{profile[0]} and {profile[2]}' \
                    + f' but got {tuple(input_tensor.shape)}.'
            idx = self.engine.get_binding_index(input_name)

            # All input tensors must be gpu variables
