-
Notifications
You must be signed in to change notification settings - Fork 0
立体匹配算法原理
[[TOC]]
① 极线纠正后,同名点位于同一行内,行坐标相同,只存在列坐标偏差,叫做视差
② 通过任一视图的像素坐标以及视差,即可计算其在另一视图的同名点。
③ 存储每个像素视差值的图像叫做视差图(disparity map)。以像素为单位
④ 视差图可结合基线和相机内参计算深度图(depth map)。是空间单位
代价计算的目的就是获取一个有视差范围的三维空间,第三个维度表示这个视差范围内每个视差的代价值。常用的方法有:AD、SAD、BT、NCC、Census-Hamming、HMI、Daisy以及基于深度学习方法的匹配代价。
def ComputeCost(left_, right_, option):
    """AD (absolute difference) matching cost.

    Builds a cost volume indexed [row, col, disparity-index] where each entry
    is |left(i, j) - right(i, j - d)|.

    Args:
        left_, right_: rectified grayscale images, indexable as img[i][j].
        option: holds min_disparity, max_disparity, width, height.

    Returns:
        (height, width, disp_range) float32 cost volume.
    """
    min_disparity = option.min_disparity
    max_disparity = option.max_disparity
    disp_range = max_disparity - min_disparity
    # Rows (height) come first so that cost_init_[i, j, ...] is valid.
    cost_init_ = np.zeros([option.height, option.width, disp_range], dtype=np.float32)
    for i in range(option.height):
        for j in range(option.width):
            for d in range(min_disparity, max_disparity):
                # Disparity pushes the match outside the right image:
                # assign a neutral penalty cost.
                if (j - d < 0) or (j - d >= option.width):
                    cost_init_[i, j, d - min_disparity] = 255 / 2
                    continue
                # Cast to float so uint8 inputs cannot wrap on subtraction.
                val_l = float(left_[i][j])
                val_r = float(right_[i][j - d])
                # The third axis is the disparity *index*, offset by min_disparity.
                cost_init_[i, j, d - min_disparity] = abs(val_l - val_r)
    return cost_init_
AD代价只计算两个像素灰度值之差的绝对值,其相似度度量能力较弱。
def ComputeCost(left_, right_, option):
    """SAD matching cost over a 3x3 window.

    For each pixel and candidate disparity, sums the absolute differences of
    the 3x3 neighbourhoods around (i, j) in the left image and (i, j - d) in
    the right image.

    Args:
        left_, right_: rectified grayscale images, indexable as img[i][j].
        option: holds min_disparity, max_disparity, width, height.

    Returns:
        (height, width, disp_range) float32 cost volume.
    """
    min_disparity = option.min_disparity
    max_disparity = option.max_disparity
    disp_range = max_disparity - min_disparity
    height, width = option.height, option.width
    # Rows (height) come first so that cost_init_[i, j, ...] is valid.
    cost_init_ = np.zeros([height, width, disp_range], dtype=np.float32)
    for i in range(height):
        for j in range(width):
            for d in range(min_disparity, max_disparity):
                # Disparity pushes the match outside the right image.
                if (j - d < 0) or (j - d >= width):
                    cost_init_[i, j, d - min_disparity] = 255 / 2
                    continue
                sad = 0.0
                for s in range(-1, 2):
                    for k in range(-1, 2):
                        # Skip window cells that fall outside either image.
                        if (i + s < 0) or (i + s >= height) \
                                or (j + k < 0) or (j + k >= width) \
                                or (j + k - d < 0) or (j + k - d >= width):
                            continue
                        # Float cast guards against uint8 wrap-around.
                        sad += abs(float(left_[i + s][j + k]) - float(right_[i + s][j + k - d]))
                cost_init_[i, j, d - min_disparity] = sad
    return cost_init_
该算法是计算一个匹配窗口中的视差代价,并获取代价空间。
其中
其中
- 计算窗口census值
def Census(left, right, option):
    """Census-style matching cost over a 7x9 window (counting version).

    For each pixel and candidate disparity, counts how many pixels inside a
    7-row by 9-column window have a different ordering relative to the window
    centre in the left image versus the right image. This count is the
    Hamming-distance-like cost.

    Args:
        left, right: rectified grayscale images, indexable as img[i, j].
        option: holds min_disparity, max_disparity, width, height.

    Returns:
        (height, width, disp_range) float32 cost volume.
    """
    min_disparity = option.min_disparity
    max_disparity = option.max_disparity
    disp_range = max_disparity - min_disparity
    height, width = option.height, option.width
    cost_init = np.zeros(shape=(height, width, disp_range), dtype=np.float32)
    for i in range(height):
        for j in range(width):
            for d in range(min_disparity, max_disparity):
                # Disparity pushes the match outside the right image.
                if (j - d < 0) or (j - d >= width):
                    cost_init[i, j, d - min_disparity] = 255 / 2
                    continue
                count = 0.0
                # 7x9 window: rows i-3..i+3, columns j-4..j+4.
                for x in range(i - 3, i + 4):
                    for y in range(j - 4, j + 5):
                        # Rows compare against height, columns (and the
                        # shifted column y-d) against width.
                        if (x < 0) or (x >= height) or (y < 0) or (y >= width) \
                                or (y - d < 0) or (y - d >= width):
                            continue
                        # Bit differs when the two orderings disagree.
                        if (left[x, y] < left[i, j]) != (right[x, y - d] < right[i, j - d]):
                            count += 1.0
                cost_init[i, j, d - min_disparity] = count
    return cost_init
- 计算颜色AD值
def Computer_AD(left_color, right_color, option):
    """Colour AD cost: mean absolute per-channel difference.

    Entries where the shifted column falls outside the right image keep the
    default cost of 255.0.

    Args:
        left_color, right_color: (H, W, 3) colour images (often uint8).
        option: holds min_disparity, max_disparity, width, height.

    Returns:
        (height, width, disp_range) float32 cost volume.
    """
    min_disparity = option.min_disparity
    max_disparity = option.max_disparity
    disp_range = max_disparity - min_disparity
    # Cast once up front: subtracting uint8 arrays wraps modulo 256, which
    # silently corrupts the cost; float32 arithmetic is exact here.
    left_f = left_color.astype(np.float32)
    right_f = right_color.astype(np.float32)
    cost_init = np.full(shape=(option.height, option.width, disp_range),
                        fill_value=255.0, dtype=np.float32)
    for h in range(left_f.shape[0]):
        for w in range(left_f.shape[1]):
            for d in range(disp_range):
                # NOTE(review): the column shift is the disparity *index* d,
                # which assumes min_disparity == 0 — confirm against callers.
                if w - d >= 0:
                    cost_init[h, w, d] = np.sum(np.abs(left_f[h, w] - right_f[h, w - d])) / 3.0
    return cost_init
- 计算代价空间
def CostVolume(Cc, Ce, option, a1=30, a2=10):
    """Fuse two cost volumes with robust exponential weighting.

    Each cost is mapped through 1 - exp(-c / a), so large costs of either
    term saturate instead of dominating; the result is their sum,
    2 - exp(-Cc / a1) - exp(-Ce / a2).

    Args:
        Cc, Ce: the two input cost volumes (e.g. census and colour-AD).
        option: holds min_disparity, max_disparity, width, height.
        a1, a2: normalisation constants for Cc and Ce respectively.

    Returns:
        (height, width, disp_range) fused cost volume.
    """
    lo = option.min_disparity
    hi = option.max_disparity
    span = hi - lo
    fused = np.full((option.height, option.width, span), 2.0, dtype=np.float32)
    # Subtract both saturated similarity terms in one expression.
    fused = fused - np.exp(Cc * (-1 / a1)) - np.exp(Ce * (-1 / a2))
    return fused
Census变换是使用像素邻域内的局部灰度差异将像素灰度转换为比特串,思路非常简单,通过将邻域窗口(窗口大小为n × m,n和m都为奇数)内的像素灰度值与窗口中心像素的灰度值进行比较,将比较得到的布尔值映射到一个比特串中,最后用比特串的值作为中心像素的Census变换值 ,如公式1所示:
其中,
其实现代码为:
def census_transform_5x5(source, width, height):
    """Census transform with a 5x5 window.

    Each pixel's 24 neighbours (plus itself) are compared with the window
    centre; each comparison contributes one bit to a 25-bit string, which
    becomes the pixel's census value. The 2-pixel border is left at 0.

    Args:
        source: grayscale image, indexable as source[i][j], shape (height, width).
        width, height: image dimensions.

    Returns:
        (height, width) uint32 array of census bit strings. An integer dtype
        is required so the values support XOR in the Hamming-distance step.
    """
    # Rows (height) come first so that census[i][j] is valid below.
    census = np.zeros([height, width], dtype=np.uint32)
    for i in range(2, height - 2):
        for j in range(2, width - 2):
            gray_center = source[i][j]
            # Build the 25-bit string, most significant bit first
            # (top-left window cell), one bit per neighbour comparison.
            census_val = 0
            for r in range(-2, 3):
                for c in range(-2, 3):
                    census_val <<= 1
                    if source[i + r][j + c] < gray_center:
                        census_val += 1
            census[i][j] = census_val
    return census
基于Census变换的匹配代价计算方法是计算左右影像对应的两个像素的Census变换值的汉明(Hamming)距离,即
Hamming距离即两个比特串的对应位不相同的数量,计算方法为将两个比特串进行异或运算,再统计异或运算结果中为1的比特位的个数。
图1 汉明距离示意图
def Hamming(x, y):
    """Hamming distance between the bit patterns of two integers.

    XORs the operands so set bits mark differing positions, then counts the
    set bits with Kernighan's trick (each step clears the lowest 1-bit).
    """
    diff = x ^ y
    count = 0
    while diff:
        diff &= diff - 1  # drop the lowest set bit
        count += 1
    return count
视差代价计算
def ComputeCost(census_left_, census_right_, option):
    """Census matching cost: Hamming distance between census bit strings.

    Args:
        census_left_, census_right_: integer census images (e.g. from
            census_transform_5x5), indexable as img[i][j].
        option: holds min_disparity, max_disparity, width, height.

    Returns:
        (height, width, disp_range) float32 cost volume.
    """
    min_disparity = option.min_disparity
    max_disparity = option.max_disparity
    disp_range = max_disparity - min_disparity
    height, width = option.height, option.width
    # Rows (height) come first so that cost_init_[i, j, ...] is valid.
    cost_init_ = np.zeros([height, width, disp_range], dtype=np.float32)
    for i in range(height):
        for j in range(width):
            for d in range(min_disparity, max_disparity):
                # Disparity pushes the match outside the right image.
                if (j - d < 0) or (j - d >= width):
                    cost_init_[i, j, d - min_disparity] = 255 / 2
                    continue
                census_val_l = int(census_left_[i][j])
                census_val_r = int(census_right_[i][j - d])
                # Popcount of the XOR = number of differing bits
                # (self-contained Hamming distance).
                cost_init_[i, j, d - min_disparity] = bin(census_val_l ^ census_val_r).count("1")
    return cost_init_
BT的代价也是像素灰度值差值的绝对值,不同之处在于BT利用了亚像素的灰度信息。
如图所示,首先计算左图中像素
分别计算两个代价:
最终的代价为两个代价的最小值:
def ComputeCost(left_, right_, option):
    """Birchfield-Tomasi (BT) sampling-insensitive matching cost.

    For each pixel pair, compares the left intensity against the interval
    spanned by the right pixel and its half-pixel interpolated neighbours
    (and symmetrically the right against the left interval); the cost is the
    smaller of the two interval distances, making the measure robust to
    image sampling.

    Args:
        left_, right_: rectified grayscale images, indexable as img[i][j].
        option: holds min_disparity, max_disparity, width, height.

    Returns:
        (height, width, disp_range) float32 cost volume.
    """
    min_disparity = option.min_disparity
    max_disparity = option.max_disparity
    disp_range = max_disparity - min_disparity
    height, width = option.height, option.width
    cost_init_ = np.zeros([height, width, disp_range], dtype=np.float32)

    def _interp_bounds(img, i, j):
        # Min/max over the pixel and its two half-pixel interpolations,
        # clamping neighbour columns at the image border.
        center = float(img[i][j])
        left_half = (center + float(img[i][max(j - 1, 0)])) / 2.0
        right_half = (center + float(img[i][min(j + 1, width - 1)])) / 2.0
        return (min(center, left_half, right_half),
                max(center, left_half, right_half))

    for i in range(height):
        for j in range(width):
            for d in range(min_disparity, max_disparity):
                # Disparity pushes the match outside the right image.
                if (j - d < 0) or (j - d >= width):
                    cost_init_[i, j, d - min_disparity] = 255 / 2
                    continue
                p = float(left_[i][j])
                q = float(right_[i][j - d])
                q_min, q_max = _interp_bounds(right_, i, j - d)
                p_min, p_max = _interp_bounds(left_, i, j)
                # Distance from each pixel to the other's interpolated range;
                # builtin min/max, since np.min(a, b) would treat b as an axis.
                cost1 = max(0.0, p - q_max, q_min - p)
                cost2 = max(0.0, q - p_max, p_min - q)
                cost_init_[i, j, d - min_disparity] = min(cost1, cost2)
    return cost_init_
对于原始的图像内任意一个像素点(px,py)构建一个n×n的邻域作为匹配窗口。然后对于目标相素位置(px+d,py)同样构建一个n×n大小的匹配窗口,对两个窗口进行相似度度量,注意这里的d有一个取值范围。其计算公式为:
def ComputeCost(left_, right_, option):
    """NCC matching score over a 5x5 window, for both views.

    Computes, for each left pixel and disparity, the normalised correlation
    of the 5x5 neighbourhoods (intensities taken relative to the window
    centres). Note the stored value is a *similarity* in [-1, 1]
    (1 = perfect match); callers must maximise rather than minimise it.
    The right-view volume is filled by reprojection: the score at left
    column j also belongs to right column j - d.

    Args:
        left_, right_: rectified grayscale images, indexable as img[i][j].
        option: holds min_disparity, max_disparity, width, height.

    Returns:
        (cost_init_l, cost_init_r): (height, width, disp_range) volumes for
        the left and right views.
    """
    min_disparity = option.min_disparity
    max_disparity = option.max_disparity
    disp_range = max_disparity - min_disparity
    height, width = option.height, option.width
    cost_init_l = np.zeros([height, width, disp_range])
    cost_init_r = np.zeros([height, width, disp_range])
    for i in range(height):
        for j in range(width):
            for d in range(min_disparity, max_disparity):
                # Disparity pushes the match outside the right image.
                if (j - d < 0) or (j - d >= width):
                    cost_init_l[i, j, d - min_disparity] = 255 / 2
                    continue
                num = 0.0
                den_l = 0.0
                den_r = 0.0
                for s in range(-2, 3):
                    for k in range(-2, 3):
                        # Skip window cells outside either image; the
                        # j+k-d < 0 test prevents negative-index wrap-around.
                        if (i + s < 0) or (i + s >= height) \
                                or (j + k < 0) or (j + k >= width) \
                                or (j + k - d < 0) or (j + k - d >= width):
                            continue
                        dl = left_[i + s][j + k] - left_[i][j]
                        dr = right_[i + s][j + k - d] - right_[i][j - d]
                        num += dl * dr
                        den_l += dl ** 2
                        den_r += dr ** 2
                denom = np.sqrt(den_l * den_r)
                # Flat windows have zero variance; define their score as 0
                # instead of dividing by zero.
                ncc = num / denom if denom != 0 else 0.0
                cost_init_l[i, j, d - min_disparity] = ncc
                cost_init_r[i, j - d, d - min_disparity] = ncc
    return cost_init_l, cost_init_r
其计算过程是计算某视差维度上一个窗口的代价值的平均值,即相当于代价平滑。但是对跨物体边界不友好,但是速度快,一般不用。
自适应权重:颜色+空间距离。
双边滤波是结合图像的空间邻近度和像素值相似度的一种折衷处理,同时考虑空域信息和灰度相似性,达到保边去噪的目的。双边滤波器顾名思义比高斯滤波多了一个高斯方差sigma-d,它是基于空间分布的高斯滤波函数,所以在边缘附近,离的较远的像素不会太多影响到边缘上的像素值,这样就保证了边缘附近像素值的保存。如下图所示:
方向r上的路径代价
各个方向的总聚合代价
计算横向方向
def agglr(costVolume, color_left, color_right, maxDis, P1, P2, thres):
    """Horizontal SGM cost aggregation (left-to-right and right-to-left).

    Implements the per-path recurrence
    L(p, d) = C(p, d) + min_d'(L(p_prev, d') + penalty(d', d)) - min_d' L(p_prev, d'),
    where subtracting the previous pixel's minimum keeps values bounded.

    Args:
        costVolume: (H, W, maxDis) initial matching cost volume.
        color_left, color_right: colour images; per-pixel colour change
            steers the penalty. Assumes multi-channel pixels, since
            max(np.abs(...)) iterates over channels — TODO confirm.
        maxDis: number of disparity candidates.
        P1, P2, thres: smoothness parameters forwarded to `penalty`
            (defined elsewhere in this file).

    Returns:
        (aggtwo, aggfour): aggregated volumes for the left-to-right and
        right-to-left passes respectively.
    """
    H = costVolume.shape[0];
    W = costVolume.shape[1]
    imgL = color_left.astype(np.float32);
    imgR = color_right.astype(np.float32)
    # Rebound by penalty() on every step; the zeros are just a placeholder.
    penalties = np.zeros(shape=(maxDis), dtype=np.float32)
    aggtwo = np.zeros(shape=(H, W, maxDis), dtype=np.float32)
    aggfour = np.zeros(shape=(H, W, maxDis), dtype=np.float32)
    # Path starting points: first column for L->R, last column for R->L.
    aggtwo[:, 0, :] = costVolume[:, 0, :]
    aggfour[:, W - 1, :] = costVolume[:, W - 1, :]
    # Left-to-right pass.
    for w in range(1, W):
        for h in range(0, H):
            # Largest per-channel step between this pixel and its left
            # neighbour in the left image.
            val = max(np.abs(imgL[h, w] - imgL[h, w - 1]))
            for d in range(maxDis):
                if w - d - 1 >= 0:
                    # Corresponding step in the right image at disparity d.
                    val1 = max(np.abs(imgR[h, w - d - 1] - imgR[h, w - d]))
                else:
                    # Match outside the right image: force val1 > val.
                    val1 = val + 1
                penalties = penalty(P1, P2, thres, val, val1, maxDis, d)
                aggtwo[h, w, d] = costVolume[h, w, d] + np.min(aggtwo[h, w - 1, :] + penalties) - np.min(
                    aggtwo[h, w - 1, :])
    # Right-to-left pass (mirror of the loop above).
    for w in range(W - 2, -1, -1):
        for h in range(0, H):
            val = max(np.abs(imgL[h, w] - imgL[h, w + 1]))
            for d in range(maxDis):
                if w - d >= 0:
                    val1 = max(np.abs(imgR[h, w - d + 1] - imgR[h, w - d]))
                else:
                    val1 = val + 1
                penalties = penalty(P1, P2, thres, val, val1, maxDis, d)
                aggfour[h, w, d] = costVolume[h, w, d] + np.min(aggfour[h, w + 1, :] + penalties) - np.min(
                    aggfour[h, w + 1, :])
    return aggtwo, aggfour
# Vertical cost aggregation (top-to-bottom and bottom-to-top passes).
def aggtb(costVolume, color_left, color_right, maxDis, P1, P2, thres):
    """Vertical SGM cost aggregation (top-to-bottom and bottom-to-top).

    Same recurrence as agglr, but the paths run along image columns: the
    predecessor is the pixel in the previous/next row at the same column.

    Args:
        costVolume: (H, W, maxDis) initial matching cost volume.
        color_left, color_right: colour images; per-pixel colour change
            steers the penalty. Assumes multi-channel pixels, since
            max(np.abs(...)) iterates over channels — TODO confirm.
        maxDis: number of disparity candidates.
        P1, P2, thres: smoothness parameters forwarded to `penalty`
            (defined elsewhere in this file).

    Returns:
        (aggone, aggthree): aggregated volumes for the top-to-bottom and
        bottom-to-top passes respectively.
    """
    H = costVolume.shape[0];
    W = costVolume.shape[1]
    imgL = color_left.astype(np.float32);
    imgR = color_right.astype(np.float32)
    # Rebound by penalty() on every step; the zeros are just a placeholder.
    penalties = np.zeros(shape=(maxDis), dtype=np.float32)
    aggone = np.zeros(shape=(H, W, maxDis), dtype=np.float32)
    aggthree = np.zeros(shape=(H, W, maxDis), dtype=np.float32)
    # Path starting points: first row for T->B, last row for B->T.
    aggone[0, :, :] = costVolume[0, :, :]
    aggthree[H - 1, :, :] = costVolume[H - 1, :, :]
    # Top-to-bottom pass.
    for h in range(1, H):
        for w in range(0, W):
            # Largest per-channel step to the pixel one row above (left image).
            val = max(np.abs(imgL[h - 1, w] - imgL[h, w]))
            for d in range(maxDis):
                if w - d >= 0:
                    # Corresponding vertical step in the right image.
                    val1 = max(np.abs(imgR[h - 1, w - d] - imgR[h, w - d]))
                else:
                    # Match outside the right image: force val1 > val.
                    val1 = val + 1
                penalties = penalty(P1, P2, thres, val, val1, maxDis, d)
                aggone[h, w, d] = costVolume[h, w, d] + np.min(aggone[h - 1, w, :] + penalties) - np.min(
                    aggone[h - 1, w, :])
    # Bottom-to-top pass (mirror of the loop above).
    for h in range(H - 2, -1, -1):
        for w in range(0, W):
            val = max(np.abs(imgL[h + 1, w] - imgL[h, w]))
            for d in range(maxDis):
                if w - d >= 0:
                    val1 = max(np.abs(imgR[h + 1, w - d] - imgR[h, w - d]))
                else:
                    val1 = val + 1
                penalties = penalty(P1, P2, thres, val, val1, maxDis, d)
                aggthree[h, w, d] = costVolume[h, w, d] + np.min(aggthree[h + 1, w, :] + penalties) - np.min(
                    aggthree[h + 1, w, :])
    return aggone, aggthree
np.argmin(cost,axis=2).astype(np.int32)
- 左右一致性检测 (LRC)
左右一致性检测是用于遮挡检测的,具体做法:
根据左右两幅输入图像,分别得到左右两幅视差图。对于左图中的一个点p,求得的视差值是d1,那么p在右图里的对应点应该是(p-d1),(p-d1)的视差值记作d2。若|d1-d2|>threshold,p标记为遮挡点(occluded point)。
-
Speckle Filter
本算法主要应用于处理椒盐噪声上,本算法在处理椒盐噪声的同时也相应的模糊了图像的细节,而且迭代次数越多,图像细节也越来越模糊。 -
亚像素插值
输入获取的最优视差图和视差代价空间。
def Sub_pixel_Enhancement(disimg, costVolume, option):
    """Sub-pixel disparity refinement by parabola fitting.

    Fits a parabola through the costs at (d-1, d, d+1) around each pixel's
    winning disparity and shifts the disparity to the parabola's minimum:
    d* = d - (c(d+1) - c(d-1)) / (2 * (c(d+1) + c(d-1) - 2*c(d))).

    Args:
        disimg: integer-valued winner-take-all disparity map.
        costVolume: (H, W, D) cost volume the disparities were chosen from.
        option: kept for interface compatibility (max_disparity etc.).

    Returns:
        float32 disparity map with sub-pixel offsets applied.
    """
    disimg = disimg.astype(np.float32)
    # Number of disparity candidates is taken from the volume itself instead
    # of the former hard-coded 62/64, so any disparity range works.
    max_d = costVolume.shape[2]
    for h in range(0, disimg.shape[0]):
        for w in range(0, disimg.shape[1]):
            d0 = int(disimg[h, w])
            d1 = d0 - 1
            d2 = d0 + 1
            # No left neighbour: cannot fit a parabola.
            if (d0 < 0) | (d1 < 0) | (d2 < 0):
                continue
            # Clamp so that d2 stays inside the volume.
            if d0 > max_d - 2:
                d0 = max_d - 2
                d1 = d0 - 1
                d2 = d0 + 1
            denom = costVolume[h, w, d2] + costVolume[h, w, d1] - 2 * costVolume[h, w, d0]
            # denom == 0 means a degenerate (flat) parabola; skip.
            if denom != 0 and d1 >= 0 and d2 < max_d:
                disimg[h, w] = disimg[h, w] - ((costVolume[h, w, d2] - costVolume[h, w, d1]) / (2 * denom))
    return disimg
该算法目的是为了平滑视差中的边界不连续的视差。
-
中值滤波
-
空洞填充