- FAQ
  - Model Conversion
    - Q: WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
  - Model Inference
    - Q: Inference takes a long time on a single image.
    - Q: Memory leak during inference.
    - Q: error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]
    - Q: FP16 model is slower than FP32 model
    - Q: error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS
This page provides some frequently asked questions and their solutions.
## Model Conversion
### Q: WARNING: Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
The device does not have full-rate fp16 support.
## Model Inference

### Q: Inference takes a long time on a single image.
The model performs some initialization during its first forward pass. Please warm up the model before running inference on real data.
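For illustration, a minimal warm-up sketch in PyTorch; `trt_model` is a hypothetical handle to the converted model, and the input shape is assumed to lie inside its optimization profile:

```python
import torch

# `trt_model` is a hypothetical handle to the converted TensorRT model;
# (1, 3, 800, 1344) is assumed to lie inside its optimization profile.
dummy_input = torch.rand(1, 3, 800, 1344).cuda()

with torch.no_grad():
    for _ in range(3):  # a few passes trigger the lazy initialization
        _ = trt_model(dummy_input)
torch.cuda.synchronize()  # wait until the warm-up kernels have finished
```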
### Q: Memory leak during inference.

This is a bug in an old version of TensorRT; read this for details. Please update your TensorRT version.
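A quick way to confirm which TensorRT build is active before and after upgrading:

```python
import tensorrt as trt

print(trt.__version__)  # the installed TensorRT version
```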
### Q: error: parameter check failed at: engine.cpp::setBindingDimensions::1046, condition: profileMinDims.d[i] <= dimensions.d[i]
The input tensor shape is outside the allowed range. Please enlarge the shape ranges (the `shape_ranges` argument, named `opt_shape_param` in older releases) when converting the model:
```python
shape_ranges=dict(
    x=dict(
        min=[1, 3, 320, 320],    # smallest accepted input shape
        opt=[1, 3, 800, 1344],   # shape the engine is tuned for
        max=[1, 3, 1344, 1344],  # largest accepted input shape
    )
)
```
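For context, a sketch of passing such a range at conversion time. The `mmdet2trt` entry point, the paths, and the keyword name here are assumptions that may differ across converter versions:

```python
from mmdet2trt import mmdet2trt  # assumed import path

# Hypothetical call; older releases take `opt_shape_param` instead of
# `shape_ranges`, so check the signature of the version you use.
trt_model = mmdet2trt(
    "path/to/model_config.py",       # placeholder mmdetection config
    "path/to/model_checkpoint.pth",  # placeholder checkpoint
    shape_ranges=dict(
        x=dict(
            min=[1, 3, 320, 320],
            opt=[1, 3, 800, 1344],
            max=[1, 3, 1344, 1344],
        )
    ),
)
```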
### Q: FP16 model is slower than FP32 model

Please check this to see whether your device has full-rate fp16 support.
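As a rough check, print the device's compute capability and compare it against NVIDIA's hardware tables; the thresholds in the comment are a heuristic, not a guarantee:

```python
import torch

# Full-rate fp16 generally needs compute capability >= 7.0 (or 6.0,
# e.g. P100); consumer Pascal cards (6.1) run fp16 at a small fraction
# of their fp32 rate.
major, minor = torch.cuda.get_device_capability(0)
print(torch.cuda.get_device_name(0), f"- compute capability {major}.{minor}")
```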
### Q: error: [TensorRT] INTERNAL ERROR: Assertion failed: cublasStatus == CUBLAS_STATUS_SUCCESS

This is the answer from the Nvidia developer forums:

> TRT 7.2.1 switches to using cuBLASLt (previously it was cuBLAS). cuBLASLt is the default choice for SM version >= 7.0. However, you may need CUDA 10.2 Patch 1 (released Aug 26, 2020) to resolve some cuBLASLt issues. Another option is to use the new TacticSource API and disable cuBLASLt tactics if you don't want to upgrade.
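A sketch of the TacticSource workaround from the quote, assuming TensorRT >= 7.2 and its Python API; it keeps cuBLAS enabled as a tactic source while leaving cuBLASLt out:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

# Build the tactic-source bitmask with cuBLAS only, omitting cuBLASLt.
config.set_tactic_sources(1 << int(trt.TacticSource.CUBLAS))
```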