Fix gather_op to avoid cudaErrorLaunchFailure for solov2, test=develop #34200
PR types
Bug fixes
PR changes
OPs
Describe
Fix gather_op to avoid cudaErrorLaunchFailure for solov2
icafe card: https://console.cloud.baidu-int.com/devops/icafe/issue/DLTP-32060/show
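A change on the framework's develop branch causes the solov2 model to fail during evaluation/prediction.
Investigation: in gather_op, when the cuda_kernel_loop runs its out-of-bounds check on index, the failure occurs at the comparison of index against the upper bound input_size (the size of input). The error message is as follows:
No suitable fix has been found yet, so this PR removes the upper-bound check for now. The check will be restored once a proper solution is found.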
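For reference, the index check discussed above can be sketched on the CPU in plain C++. This is a minimal illustration, not Paddle's gather kernel: the function name `gather_rows`, its parameters, and the row-major layout are assumptions made for the example. It walks the flattened output the way a kernel loop would and rejects any index outside `[0, input_size)`; the `g >= input_size` comparison corresponds to the upper-bound check that this PR temporarily removes.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical CPU reference for a row gather: output[r, :] = input[index[r], :].
// input holds input_size rows of slice_size elements each, in row-major order.
template <typename T>
std::vector<T> gather_rows(const std::vector<T>& input,
                           int64_t input_size,
                           int64_t slice_size,
                           const std::vector<int64_t>& index) {
  const int64_t n = static_cast<int64_t>(index.size()) * slice_size;
  std::vector<T> output(static_cast<size_t>(n));
  // One iteration per output element, mirroring a flattened kernel loop.
  for (int64_t i = 0; i < n; ++i) {
    const int64_t row = i / slice_size;  // which entry of index
    const int64_t col = i % slice_size;  // offset inside the gathered row
    const int64_t g = index[static_cast<size_t>(row)];
    // Lower-bound and upper-bound check on the gathered row index.
    if (g < 0 || g >= input_size) {
      throw std::out_of_range("gather index " + std::to_string(g) +
                              " is outside [0, " +
                              std::to_string(input_size) + ")");
    }
    output[static_cast<size_t>(i)] =
        input[static_cast<size_t>(g * slice_size + col)];
  }
  return output;
}
```

Without the upper-bound comparison, an index equal to or larger than `input_size` would read past the end of `input` instead of being reported, which is the trade-off the temporary workaround accepts.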
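PR link: #34096
Update: (a possible fix is to copy input_size from CPU to GPU inside GPUGather and then pass it into the CUDA kernel for the comparison, but this would hurt the performance of this basic op)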