- FAQ
  - Model Conversion
    - Q: WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
  - Model Inference
    - Q: Inference takes a long time on a single image.
    - Q: Memory leak during inference.
    - Q: error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]
    - Q: FP16 model is slower than FP32 model
    - Q: error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS
This page provides some frequently asked questions and their solutions.
## Model Conversion
### Q: WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
The device does not have full-rate fp16 support.
## Model Inference

### Q: Inference takes a long time on a single image.
The model performs some initialization during its first forward pass. Please warm up the model before running inference on real data.
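For illustration, a minimal warm-up sketch in PyTorch; `trt_model` is a hypothetical handle to the converted model, and the input shape is assumed to lie inside its optimization profile:

```python
import torch

# `trt_model` is a hypothetical handle to the converted TensorRT model;
# (1, 3, 800, 1344) is assumed to lie inside its optimization profile.
dummy_input = torch.rand(1, 3, 800, 1344).cuda()

with torch.no_grad():
    for _ in range(3):  # a few passes trigger the lazy initialization
        _ = trt_model(dummy_input)
torch.cuda.synchronize()  # wait until the warm-up kernels have finished
```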
### Q: Memory leak during inference.

This is a bug in an old version of TensorRT; read this for details. Please update your TensorRT version.
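A quick way to confirm which TensorRT build is active before and after upgrading:

```python
import tensorrt as trt

print(trt.__version__)  # the installed TensorRT version
```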
### Q: error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]
The input tensor shape is outside the allowed range. Please enlarge the shape ranges (the `shape_ranges` argument, named `opt_shape_param` in older releases) when converting the model:
```python
shape_ranges=dict(
    x=dict(
        min=[1, 3, 320, 320],    # smallest accepted input shape
        opt=[1, 3, 800, 1344],   # shape the engine is tuned for
        max=[1, 3, 1344, 1344],  # largest accepted input shape
    )
)
```
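For context, a sketch of passing such a range at conversion time. The `mmdet2trt` entry point, the paths, and the keyword name here are assumptions that may differ across converter versions:

```python
from mmdet2trt import mmdet2trt  # assumed import path

# Hypothetical call; older releases take `opt_shape_param` instead of
# `shape_ranges`, so check the signature of the version you use.
trt_model = mmdet2trt(
    "path/to/model_config.py",       # placeholder mmdetection config
    "path/to/model_checkpoint.pth",  # placeholder checkpoint
    shape_ranges=dict(
        x=dict(
            min=[1, 3, 320, 320],
            opt=[1, 3, 800, 1344],
            max=[1, 3, 1344, 1344],
        )
    ),
)
```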
### Q: FP16 model is slower than FP32 model

Please check this to see whether your device has full-rate fp16 support.
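As a rough check, print the device's compute capability and compare it against NVIDIA's hardware tables; the thresholds in the comment are a heuristic, not a guarantee:

```python
import torch

# Full-rate fp16 generally needs compute capability >= 7.0 (or 6.0,
# e.g. P100); consumer Pascal cards (6.1) run fp16 at a small fraction
# of their fp32 rate.
major, minor = torch.cuda.get_device_capability(0)
print(torch.cuda.get_device_name(0), f"- compute capability {major}.{minor}")
```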
### Q: error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS

This is the answer from the Nvidia developer forums:

> TRT 7.2.1 switches to using cuBLASLt (previously it was cuBLAS). cuBLASLt is the default choice for SM version >= 7.0. However, you may need CUDA 10.2 Patch 1 (released Aug 26, 2020) to resolve some cuBLASLt issues. Another option is to use the new TacticSource API and disable cuBLASLt tactics if you don't want to upgrade.
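A sketch of the TacticSource workaround from the quote, assuming TensorRT >= 7.2 and its Python API; it keeps cuBLAS enabled as a tactic source while leaving cuBLASLt out:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Build the tactic-source bitmask with cuBLAS only, omitting cuBLASLt.
config.set_tactic_sources(1 << int(trt.TacticSource.CUBLAS))
```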