add the yolov5 gpu preprocess #395
|
Codecov Report
@@ Coverage Diff @@
## main #395 +/- ##
=======================================
Coverage 98.58% 98.58%
=======================================
Files 11 11
Lines 778 778
=======================================
Hits 767 767
Misses 11 11
|
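About the CUDA implementation of warpAffine: when I first tested it, I thought its output was not aligned with bilinear resize plus letterbox. There are two issues in total concerning warpAffine and resize.
1. Padding. According to the references I looked up, the transformation matrix in this CUDA implementation already includes the padding, because it combines scaling and translation, and it also handles out-of-bounds coordinates. See this explanation of the [transformation matrix](https://blog.csdn.net/weixin_42398658/article/details/121019668): it is a matrix with an offset and a scale. Because bilinear interpolation is used (https://zhuanlan.zhihu.com/p/89684929 [2]), what is needed is the mapping from each dst image coordinate back to a value in the original src image, so the dst coordinates are required: 640 480, 0 0 --> 0 0, together with the border handling, as in OpenCV resize linear.
In the source code, warpAffine performs the transform with the warpAffine matrix.
Testing with 640 480 gives 0 0 --> 0 0.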
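To make the "matrix with an offset and a scale" concrete, here is a minimal sketch in plain Python. It assumes a centered letterbox and the pixel-center convention (the `scale*0.5-0.5` term that also appears in the test snippet in this comment). The function names `letterbox_matrix` and `dst_to_src` are hypothetical and are not taken from this PR's CUDA code.

```python
def letterbox_matrix(src_w, src_h, dst_w, dst_h):
    """Return a 2x3 src -> dst affine matrix for a centered letterbox.

    Assumption: pixel centers sit at integer + 0.5, which yields the
    `scale * 0.5 - 0.5` term in the translation.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    # Padding that centers the scaled image, plus the half-pixel correction.
    tx = (dst_w - scale * src_w) * 0.5 + scale * 0.5 - 0.5
    ty = (dst_h - scale * src_h) * 0.5 + scale * 0.5 - 0.5
    return [[scale, 0.0, tx], [0.0, scale, ty]]


def dst_to_src(matrix, x_dst, y_dst):
    """Invert the axis-aligned matrix: map a dst pixel back to src coordinates."""
    scale_x, _, tx = matrix[0]
    _, scale_y, ty = matrix[1]
    return (x_dst - tx) / scale_x, (y_dst - ty) / scale_y
```

For a 4x4 source and an 8x8 destination this gives `[[2, 0, 0.5], [0, 2, 0.5]]`, so dst pixel (0, 0) maps to src (-0.25, -0.25), which lies outside the image. That is why the border handling matters for the first row and column.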
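The CUDA implementation handles the border explicitly.
It takes the coordinates of the four neighboring corner points and applies the formula, for reference:
w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx; here w1 is the weight of the point v1[0], and v1 is the pixel value. The remaining part swaps the RGB channels and converts NHWC to NCHW.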
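The weight formula and the layout conversion above can be sketched per pixel as follows. This is an illustration in plain Python, not the kernel from this PR. It assumes that `lx` and `ly` are the fractional parts of the source coordinate, that v1..v4 are the top-left, top-right, bottom-left and bottom-right neighbors, and that the input is interleaved BGR while the output is planar RGB. The helper names are hypothetical.

```python
import math


def bilinear_weights(src_x, src_y):
    """Return (w1, w2, w3, w4) for the four neighbors of (src_x, src_y)."""
    lx = src_x - math.floor(src_x)  # distance from the left neighbor
    ly = src_y - math.floor(src_y)  # distance from the top neighbor
    hx, hy = 1 - lx, 1 - ly
    return hy * hx, hy * lx, ly * hx, ly * lx


def blend_to_planar(neighbors_bgr, weights, out, x_dst, y_dst, dst_w, dst_h):
    """Blend four (b, g, r) neighbors and store the result in a planar buffer.

    `out` is a flat list of length 3 * dst_w * dst_h laid out as
    [R plane | G plane | B plane], i.e. CHW for a single image.
    """
    b, g, r = (
        sum(w * v[c] for w, v in zip(weights, neighbors_bgr)) for c in range(3)
    )
    area = dst_w * dst_h
    idx = y_dst * dst_w + x_dst
    out[idx] = r             # channel 0: R
    out[area + idx] = g      # channel 1: G
    out[2 * area + idx] = b  # channel 2: B
```

The four weights always sum to 1, so a uniform neighborhood is reproduced unchanged. The channel swap costs nothing extra because it only changes which plane each blended value is written to.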
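2. The form of the bilinear implementation. The warpAffine implementation does use OpenCV-style bilinear interpolation, but border problems show up in practice; see the following articles: https://zhuanlan.zhihu.com/p/89684929 https://zhuanlan.zhihu.com/p/99626808
However, consider the calls made during the computation:
import numpy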
import cv2
outputsize = 8
scale = outputsize/4
d = numpy.array([[255, 200, 0, 50],
[200, 255, 50, 0],
[ 0, 50, 255, 200],
[ 50, 0, 200, 255]], numpy.uint8)
cv2.resize(d, (outputsize,outputsize))
cv2.warpAffine(d, numpy.array([[scale,0,scale*0.5-0.5],[0,scale,scale*0.5-0.5]]), (outputsize,outputsize))
# The two outputs are not aligned
# resize
array([[255, 241, 214, 150, 50, 12, 37, 50],
[241, 234, 221, 163, 63, 19, 31, 38],
[214, 221, 234, 190, 88, 31, 19, 13],
[150, 163, 190, 178, 127, 88, 63, 50],
[ 50, 63, 88, 127, 178, 190, 163, 150],
[ 13, 19, 31, 88, 190, 234, 221, 214],
[ 38, 31, 19, 63, 163, 221, 234, 241],
[ 50, 37, 12, 50, 150, 214, 241, 255]], dtype=uint8)
#warpaffine
array([[143, 181, 160, 113, 38, 9, 28, 28],
[181, 234, 221, 163, 63, 19, 31, 28],
[160, 221, 234, 190, 88, 31, 19, 9],
[113, 163, 190, 178, 127, 88, 63, 38],
[ 38, 63, 88, 127, 178, 190, 163, 113],
[ 9, 19, 31, 88, 190, 234, 221, 160],
[ 28, 31, 19, 63, 163, 221, 234, 181],
[ 28, 28, 9, 38, 113, 160, 181, 143]], dtype=uint8)
# After asking around: the mismatch is a borderMode issue.
# Setting warpAffine's borderMode to BORDER_REPLICATE makes the output match resize (a few pixels still differ by 1)
cv2.warpAffine(d, numpy.array([[scale,0,scale*0.5-0.5],[0,scale,scale*0.5-0.5]]), (outputsize,outputsize),borderMode=cv2.BORDER_REPLICATE)
array([[255, 241, 214, 150, 50, 13, 38, 50],
[241, 234, 221, 163, 63, 19, 31, 38],
[214, 221, 234, 190, 88, 31, 19, 13],
[150, 163, 190, 178, 127, 88, 63, 50],
[ 50, 63, 88, 127, 178, 190, 163, 150],
[ 13, 19, 31, 88, 190, 234, 221, 214],
[ 38, 31, 19, 63, 163, 221, 234, 241],
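[ 50, 38, 13, 50, 150, 214, 241, 255]], dtype=uint8)
As for what setting borderMode to BORDER_REPLICATE does, see the official OpenCV explanation of BORDER_REPLICATE. The CUDA implementation, however, already takes the border problem into account; the key point is still how the offset is set. Later I tried implementing bilinear resize directly in CUDA, but I did not know how to handle the padding, and it still felt slow.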
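The difference between the two warpAffine outputs above can be reproduced by hand. The sketch below is plain Python and does not call OpenCV. It assumes that out-of-range neighbors are either read as 0 (constant border) or clamped to the nearest edge pixel (replicate), and that the matrix is inverted as `src = (dst - offset) / scale`. The helper names are hypothetical.

```python
import math

D = [[255, 200, 0, 50],
     [200, 255, 50, 0],
     [0, 50, 255, 200],
     [50, 0, 200, 255]]


def fetch(img, x, y, border):
    """Read one pixel, applying the chosen border rule outside the image."""
    h, w = len(img), len(img[0])
    if 0 <= x < w and 0 <= y < h:
        return img[y][x]
    if border == "replicate":
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]
    return 0  # constant border with value 0


def sample(img, src_x, src_y, border):
    """Bilinear sample at a fractional source coordinate."""
    x_low, y_low = math.floor(src_x), math.floor(src_y)
    lx, ly = src_x - x_low, src_y - y_low
    hx, hy = 1 - lx, 1 - ly
    return (hy * hx * fetch(img, x_low, y_low, border)
            + hy * lx * fetch(img, x_low + 1, y_low, border)
            + ly * hx * fetch(img, x_low, y_low + 1, border)
            + ly * lx * fetch(img, x_low + 1, y_low + 1, border))


def warp_pixel(x_dst, y_dst, scale, border):
    """Value of one dst pixel for the matrix [[s, 0, s/2 - 0.5], [0, s, s/2 - 0.5]]."""
    offset = scale * 0.5 - 0.5
    return sample(D, (x_dst - offset) / scale, (y_dst - offset) / scale, border)
```

With `scale = 2`, dst pixel (0, 0) maps to src (-0.25, -0.25). Under the constant border only the in-image neighbor contributes, giving 0.75 * 0.75 * 255 ≈ 143, the corner value of the first warpAffine output. Under replicate all four neighbors read 255, so the corner stays 255 as in the resize output.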
|
Added the GPU preprocess and batch inference in C++, but the Python inference is not set up yet.