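Line Feature Extraction
Problem Overview
A common approach to line feature extraction is least-squares line fitting, but the ordinary linear least-squares fit is sensitive to noise, which degrades the result. Starting from another form of the line equation, this article derives the optimal line step by step.
The line equation can be written as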
\[ r = \cos\theta*x + \sin\theta*y \]
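The geometry is shown in the figure below: \(r\) is the perpendicular distance from the origin to the line, and \(\theta\) is the angle between that perpendicular and the \(x\)-axis.
Suppose the data set is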
\[ S_k= \{ (x_0,y_0),...,(x_k,y_k)\} \]
If these points are distributed along a line, the optimal line \(L\) for the data \(S_k\) has the form
\[ r = \cos\theta*x + \sin\theta*y \tag{1} \]
The error of each data point with respect to the line \(L\) is
\[ e_i = \left| \cos\theta*x_i + \sin\theta*y_i - r \right| \tag{2} \]
and the sum of squared errors is
\[ E_2 =\sum_{S_k}e_i^2 \tag{3} \]
Finding the line \(L\) that best matches the data points \(S_k\) therefore reduces to minimizing the sum of squared errors \(E_2\).
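As a concrete illustration of equations (2) and (3), the following minimal Python sketch evaluates the errors and their sum of squares for a candidate line. The function names `line_errors` and `sum_squared_error` and the sample points are illustrative assumptions, not part of the original derivation.

```python
import math


def line_errors(points, theta, r):
    """Return e_i = |cos(theta)*x_i + sin(theta)*y_i - r| for every point."""
    c, s = math.cos(theta), math.sin(theta)
    return [abs(c * x + s * y - r) for x, y in points]


def sum_squared_error(points, theta, r):
    """Return E_2, the sum of the squared errors e_i**2."""
    return sum(e * e for e in line_errors(points, theta, r))


# Hypothetical sample: three points on the vertical line x = 2,
# which corresponds to theta = 0 and r = 2 in the normal form.
points = [(2.0, -1.0), (2.0, 0.0), (2.0, 3.0)]
print(sum_squared_error(points, 0.0, 2.0))  # 0.0
```

Any other choice of `theta` or `r` gives a larger value for these points, which is what the minimization below exploits.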
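Derivation of the Minimum
Function Definition
A minimum of a differentiable function can only occur where its derivatives vanish, so the minimization problem can be turned into finding the zeros of the derivative. Define the function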
\[ f(\theta,r) = \sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r)^2 \tag{4} \]
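By the chain rule, its derivative has the form
\[ \dot{f}(\theta,r) = 2\sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r)(\cos\theta*x_i + \sin\theta*y_i - r)' \tag{5} \]
The function depends on the two parameters \(\theta\) and \(r\). At a minimum both partial derivatives vanish, so their sum vanishes as well, and we write
\[ \dot{f}(\theta,r) = \frac{\partial f}{\partial \theta} + \frac{\partial f}{\partial r} \tag{6} \]
which gives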
\[ \dot{f}(\theta,r) = 2\sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r)(\cos\theta*y_i - \sin\theta*x_i -1) \tag{7} \]
Setting \(\dot{f}(\theta,r) = 0\) gives
\[ \sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r)(\cos\theta*y_i - \sin\theta*x_i -1) = 0 \tag{8} \]
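For reference, the two partial derivatives whose sum appears in equation (7) can be written out separately. This is only a worked restatement of the chain rule applied to equation (4):

\[ \frac{\partial f}{\partial \theta} = 2\sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r)(\cos\theta*y_i - \sin\theta*x_i) \]

\[ \frac{\partial f}{\partial r} = -2\sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r) \]

Both share the first factor. The second factors are \(\cos\theta*y_i - \sin\theta*x_i\) and \(-1\), and their sum is the second factor in equation (7).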
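Solution Assumption
Assume that the following holds. It is exactly the condition \(\frac{\partial f}{\partial r} = 0\), which must be satisfied at the minimum:
\[ \begin{array}{r} \sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r) = 0 \\ \cos\theta\sum_{i=0}^kx_i + \sin\theta\sum_{i=0}^ky_i - \sum_{i=0}^kr = 0 \tag{9} \end{array} \]
It follows that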
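\[ \begin{array}{rl} \sum_{i=0}^{k}r &= \cos\theta\sum_{i=0}^kx_i + \sin\theta\sum_{i=0}^ky_i \\ r &= \cos\theta\frac{\sum_{i=0}^kx_i}{k+1} + \sin\theta\frac{\sum_{i=0}^ky_i}{k+1} \tag{10} \end{array} \]
Let
\[ \begin{array}{l} V_x = \frac{\sum_{i=0}^kx_i}{k+1} \\ V_y = \frac{\sum_{i=0}^ky_i}{k+1} \tag{11} \end{array} \]
be the coordinate means, where the divisor is \(k+1\) because \(S_k\) contains the \(k+1\) points with indices \(i = 0,\ldots,k\). Equation (10) can then be written as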
\[ r = \cos\theta*V_x + \sin\theta*V_y \tag{12} \]
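Equation (12) says that the optimal line always passes through the centroid of the data. A minimal Python sketch of equations (11) and (12) follows; the names `centroid` and `r_from_theta` and the sample points are assumptions made for illustration.

```python
import math


def centroid(points):
    """Return (V_x, V_y), the coordinate means of equation (11)."""
    n = len(points)  # number of points, i.e. k + 1 in the text
    vx = sum(x for x, _ in points) / n
    vy = sum(y for _, y in points) / n
    return vx, vy


def r_from_theta(points, theta):
    """Return r for a given theta using equation (12)."""
    vx, vy = centroid(points)
    return math.cos(theta) * vx + math.sin(theta) * vy


# Hypothetical sample: points on the vertical line x = 2.
points = [(2.0, -1.0), (2.0, 0.0), (2.0, 3.0)]
print(r_from_theta(points, 0.0))  # 2.0
```

With \(r\) expressed through \(\theta\) in this way, only \(\theta\) remains to be determined.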
Simplifying the Derivative
Expanding the product in equation (8) gives
\[ \sum_{i=0}^{k}(\cos\theta*x_i + \sin\theta*y_i)(\cos\theta*y_i - \sin\theta*x_i) -\sum_{i=0}^{k}(\cos\theta*x_i + \sin\theta*y_i)- \sum_{i=0}^{k}r(\cos\theta*y_i - \sin\theta*x_i) + \sum_{i=0}^{k}r = 0 \tag{13} \]
Substitute equation (12) into equation (13) and split the left-hand side into the following four parts:
\[ \begin{array}{l} A = \sum_{i=0}^{k}\cos^2\theta*x_iy_i - \sin\theta\cos\theta*x_i^2 + \sin\theta\cos\theta*y_i^2 -\sin^2\theta*x_iy_i \\ B = \sum_{i=0}^{k}(\cos\theta*x_i + \sin\theta*y_i) \\ C = \sum_{i=0}^{k}(\cos\theta*V_x + \sin\theta*V_y)(\cos\theta*y_i - \sin\theta*x_i) \\ D = \sum_{i=0}^{k}(\cos\theta*V_x + \sin\theta*V_y) \tag{14} \end{array} \]
The four parts in equation (14) simplify as follows:
- A
\[ \begin{array}{l} A = \sum_{i=0}^{k}\cos^2\theta*x_iy_i - \sin\theta\cos\theta*x_i^2 + \sin\theta\cos\theta*y_i^2 -\sin^2\theta*x_iy_i\\ \quad = (\cos^2\theta - \sin^2\theta)\sum_{i=0}^{k}x_iy_i - \sin\theta\cos\theta(\sum_{i=0}^{k}x_i^2 - \sum_{i=0}^{k}y_i^2)\\ \quad = \cos2\theta\sum_{i=0}^{k}x_iy_i - 0.5\sin2\theta(\sum_{i=0}^{k}x_i^2 - \sum_{i=0}^{k}y_i^2) \tag{15} \end{array} \]
- B
\[ \begin{array}{l} B = \sum_{i=0}^{k}(\cos\theta*x_i + \sin\theta*y_i) \\ \quad = \cos\theta\sum_{i=0}^{k}x_i + \sin\theta\sum_{i=0}^{k}y_i \tag{16} \end{array} \]
- C
\[ \begin{array}{l} C = \sum_{i=0}^{k}(\cos\theta*V_x + \sin\theta*V_y)(\cos\theta*y_i - \sin\theta*x_i) \\ \quad =\cos^2\theta*V_x\sum_{i=0}^{k}y_i - \sin\theta\cos\theta*V_x\sum_{i=0}^{k}x_i+\sin\theta\cos\theta*V_y\sum_{i=0}^{k}y_i -\sin^2\theta*V_y\sum_{i=0}^{k}x_i\\ \quad =(\cos^2\theta - \sin^2\theta)\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} - \sin\theta\cos\theta(V_x\sum_{i=0}^{k}x_i - V_y\sum_{i=0}^{k}y_i)\\ \quad =\cos2\theta\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} - 0.5\sin2\theta(V_x\sum_{i=0}^{k}x_i - V_y\sum_{i=0}^{k}y_i)\tag{17} \end{array} \]
- D
\[ \begin{array}{l} D = \sum_{i=0}^{k}(\cos\theta*V_x + \sin\theta*V_y) \\ \quad =\cos\theta*V_x\sum_{i=0}^{k}1 + \sin\theta*V_y\sum_{i=0}^{k}1 \\ \quad =\cos\theta\sum_{i=0}^{k}x_i + \sin\theta\sum_{i=0}^{k}y_i \tag{18} \end{array} \]
Equation (13) can therefore be written as
\[ A - B - C + D = 0 \tag{19} \]
Equations (16) and (18) show that \(B\) and \(D\) are equal, so equation (19) reduces to
\[ A - C = 0 \tag{20} \]
Multiplying both sides of equation (20) by 2 gives
\[ 2\cos2\theta\sum_{i=0}^{k}x_iy_i - \sin2\theta(\sum_{i=0}^{k}x_i^2 - \sum_{i=0}^{k}y_i^2) -2\cos2\theta\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} + \sin2\theta(V_x\sum_{i=0}^{k}x_i - V_y\sum_{i=0}^{k}y_i) = 0\tag{21} \]
Collecting like terms gives
\[ \cos2\theta*2*(\sum_{i=0}^{k}x_iy_i - \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1}) + \sin2\theta(V_x\sum_{i=0}^{k}x_i - V_y\sum_{i=0}^{k}y_i - \sum_{i=0}^{k}x_i^2 + \sum_{i=0}^{k}y_i^2) = 0 \tag{22} \]
Let
- \(V_{xy}\)
\[ \begin{array}{l} V_{xy} = \sum_{i=0}^{k}x_iy_i - \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1}\\ \qquad = \sum_{i=0}^{k}x_iy_i - \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} -\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} + \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1}\\ \qquad = \sum_{i=0}^{k}x_iy_i - \sum_{i=0}^{k}x_iV_y - \sum_{i=0}^{k}y_iV_x + \sum_{i=0}^{k}V_xV_y\\ \qquad = \sum_{i=0}^{k}(x_i - V_x)(y_i - V_y)\tag{23} \end{array} \]
- \(V_{xx}\)
\[ \begin{array}{l} V_{xx} = \sum_{i=0}^{k}x_i^2 - V_x\sum_{i=0}^{k}x_i\\ \qquad = \sum_{i=0}^{k}x_i^2 - 2\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}x_i}{k+1} + \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}x_i}{k+1}\\ \qquad =\sum_{i=0}^{k}x_i^2 - 2V_x\sum_{i=0}^{k}x_i + \sum_{i=0}^{k}V_xV_x\\ \qquad =\sum_{i=0}^{k}(x_i - V_x)^2 \tag{24} \end{array} \]
- \(V_{yy}\)
\[ V_{yy} = \sum_{i=0}^{k}(y_i - V_y)^2 \tag{25} \]
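The three quantities in equations (23) to (25) are the centered second-order sums of the data. A minimal Python sketch, using the hypothetical helper name `scatter_sums` and made-up sample points, could compute them as:

```python
def scatter_sums(points):
    """Return (V_xx, V_yy, V_xy) as defined in equations (23)-(25)."""
    n = len(points)  # number of points, i.e. k + 1 in the text
    vx = sum(x for x, _ in points) / n
    vy = sum(y for _, y in points) / n
    vxx = sum((x - vx) ** 2 for x, _ in points)
    vyy = sum((y - vy) ** 2 for _, y in points)
    vxy = sum((x - vx) * (y - vy) for x, y in points)
    return vxx, vyy, vxy


# Hypothetical sample: three points on the line y = x.
print(scatter_sums([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]))  # (2.0, 2.0, 2.0)
```

The centered form in the last line of each equation is used here because it tends to lose less precision than the expanded form when the coordinates are large.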
Equation (22) can then be written as
\[ \cos2\theta*2*V_{xy} + \sin2\theta(V_{yy} - V_{xx}) = 0\tag{26} \]
Following the blog post on solving equations of the form Acos + Bsin = C, we obtain
\[ \theta = \arctan\frac{(V_{yy} - V_{xx}) - \sqrt{(V_{yy} - V_{xx})^2 + 4V_{xy}^2}}{2V_{xy}}\tag{27} \]
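The step from equation (26) to equation (27) can be sketched as follows, assuming \(V_{xy} \neq 0\) and \(\cos\theta \neq 0\). The auxiliary symbol \(t = \tan\theta\) is introduced here only for illustration. Substituting \(\cos2\theta = \frac{1 - t^2}{1 + t^2}\) and \(\sin2\theta = \frac{2t}{1 + t^2}\) into equation (26) and multiplying by \(\frac{1 + t^2}{2}\) gives

\[ (1 - t^2)*V_{xy} + t*(V_{yy} - V_{xx}) = 0 \]

which is a quadratic in \(t\):

\[ V_{xy}*t^2 + (V_{xx} - V_{yy})*t - V_{xy} = 0 \]

Its roots are

\[ t = \frac{(V_{yy} - V_{xx}) \pm \sqrt{(V_{yy} - V_{xx})^2 + 4V_{xy}^2}}{2V_{xy}} \]

The product of the two roots is \(-1\), so they describe two mutually perpendicular directions. The root with the minus sign, used in equation (27), is the one that minimizes \(f\); the other one maximizes it.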
Given the data set \(S_k\), equations (27) and (12) yield the optimal parameters \(\theta\) and \(r\).
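Putting the pieces together, the whole procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the name `fit_line` and the sample points are hypothetical, and the branch for \(V_{xy} = 0\) is an addition that is not part of the derivation, needed because equation (27) divides by \(V_{xy}\).

```python
import math


def fit_line(points):
    """Fit r = cos(theta)*x + sin(theta)*y to the points; return (theta, r)."""
    n = len(points)  # number of points, i.e. k + 1 in the text
    vx = sum(x for x, _ in points) / n
    vy = sum(y for _, y in points) / n
    vxx = sum((x - vx) ** 2 for x, _ in points)
    vyy = sum((y - vy) ** 2 for _, y in points)
    vxy = sum((x - vx) * (y - vy) for x, y in points)

    if vxy == 0.0:
        # Assumption, not in the derivation: equation (27) is undefined here.
        # The normal points along the axis with the smaller spread.
        theta = 0.0 if vxx <= vyy else math.pi / 2
    else:
        d = vyy - vxx
        # Equation (27): the root with the minus sign.
        theta = math.atan((d - math.sqrt(d * d + 4.0 * vxy * vxy)) / (2.0 * vxy))

    # Equation (12): the line passes through the centroid.
    r = math.cos(theta) * vx + math.sin(theta) * vy
    return theta, r


# Hypothetical sample: three points on the line y = x + 1.
theta, r = fit_line([(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)])
print(theta, r)  # approximately -0.785 (-pi/4) and -0.707
```

Because the arctangent returns an angle between \(-\pi/2\) and \(\pi/2\), the resulting \(r\) can be negative. The pair \((\theta + \pi, -r)\) describes the same line with a non-negative distance.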