Line Feature Extraction

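Problem Overview

A common approach to line feature extraction is least-squares line fitting, but an ordinary least-squares fit is sensitive to noise, which can degrade the result. Starting from an alternative form of the line equation, this article derives the optimal line in detail.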

Line Equation

A line can be written in normal form as

\[ r = \cos\theta*x + \sin\theta*y \]

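The geometric relationship is shown in the figure below: \(r\) is the perpendicular distance from the origin to the line, and \(\theta\) is the angle between the x-axis and that perpendicular.

[Figure: geometric relationship]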
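As a quick illustration of this form, the following sketch converts a line given as \(y = mx + b\) into \((r, \theta)\) and checks that points on the line satisfy the normal-form equation. The function name `slope_intercept_to_normal` and the sample values are assumptions chosen for illustration and do not come from the original article.

```python
import math

def slope_intercept_to_normal(m, b):
    """Convert y = m*x + b into (r, theta) with r = cos(theta)*x + sin(theta)*y."""
    # Rewrite the line as -m*x + y = b and divide by the length of the normal (-m, 1).
    norm = math.hypot(m, 1.0)
    theta = math.atan2(1.0 / norm, -m / norm)
    r = b / norm  # signed distance; negative when b is negative
    return r, theta

r, theta = slope_intercept_to_normal(1.0, 2.0)

# Every point on y = x + 2 should give a zero residual in the normal form.
residuals = [
    math.cos(theta) * x + math.sin(theta) * (1.0 * x + 2.0) - r
    for x in (-1.0, 0.0, 3.0)
]
print(r, theta, residuals)
```

Note that a vertical line has no slope-intercept representation but is perfectly ordinary in normal form (\(\theta = 0\)), which is one practical reason to prefer this parameterization.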

Suppose we are given the data set

\[ S_k= \{ (x_0,y_0),...,(x_k,y_k)\} \]

If these points are distributed along a line, the optimal line \(L\) for the data in \(S_k\) has the form

\[ r = \cos\theta*x + \sin\theta*y \tag{1} \]

The error of each data point with respect to the line \(L\) is

\[ e_i = \| \cos\theta*x_i + \sin\theta*y_i - r\| \tag{2} \]

The sum of squared errors is

\[ E_2 =\sum_{S_k}e_i^2 \tag{3} \]

Finding the line \(L\) that best matches the data points \(S_k\) therefore reduces to minimizing the sum of squared errors \(E_2\).
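To make equations (2) and (3) concrete, here is a minimal sketch that evaluates \(E_2\) for two candidate lines on a small noisy data set. The helper name `squared_error` and the sample points are assumptions made for this example.

```python
import math

def squared_error(points, theta, r):
    """E_2 from equation (3): the sum of the squared residuals of equation (2)."""
    c, s = math.cos(theta), math.sin(theta)
    return sum((c * x + s * y - r) ** 2 for x, y in points)

# Hypothetical points scattered around the line y = x.
points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]

good = squared_error(points, -math.pi / 4, 0.0)  # the line y = x
bad = squared_error(points, 0.0, 1.5)            # the vertical line x = 1.5
print(good, bad)
```

The line that follows the data gives a much smaller \(E_2\) than the vertical line, which is the quantity the rest of the derivation minimizes.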

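Derivation of the Minimum

Function Definition

By the extreme value principle, finding a minimum can be turned into finding where the derivative of the function is zero. Define the function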

\[ f(\theta,r) = \sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r)^2 \tag{4} \]

By the chain rule, its derivative is

\[ \dot{f}(\theta,r) = 2\sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r)(\cos\theta*x_i + \sin\theta*y_i - r)' \tag{5} \]

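Differentiating with respect to each of the parameters \(\theta\) and \(r\) and summing the two partial derivatives (both of which must vanish at a minimum) gives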

\[ \dot{f}(\theta,r) = \frac{\partial f}{\partial \theta} + \frac{\partial f}{\partial r} \tag{6} \]

Hence

\[ \dot{f}(\theta,r) = 2\sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r)(\cos\theta*y_i - \sin\theta*x_i -1) \tag{7} \]

Setting \(\dot{f}(\theta,r)= 0\) gives

\[ \sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r)(\cos\theta*y_i - \sin\theta*x_i -1) = 0 \tag{8} \]
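As a sanity check on equation (7), the sketch below compares its right-hand side with a central finite-difference estimate of \(\partial f/\partial\theta + \partial f/\partial r\). The function names, the step size `h`, and the sample points are assumptions for illustration only.

```python
import math

def f(points, theta, r):
    """Equation (4): sum of squared residuals."""
    c, s = math.cos(theta), math.sin(theta)
    return sum((c * x + s * y - r) ** 2 for x, y in points)

def f_dot(points, theta, r):
    """Right-hand side of equation (7)."""
    c, s = math.cos(theta), math.sin(theta)
    return 2.0 * sum((c * x + s * y - r) * (c * y - s * x - 1.0) for x, y in points)

def f_dot_numeric(points, theta, r, h=1e-6):
    """Central-difference estimate of df/dtheta + df/dr."""
    d_theta = (f(points, theta + h, r) - f(points, theta - h, r)) / (2.0 * h)
    d_r = (f(points, theta, r + h) - f(points, theta, r - h)) / (2.0 * h)
    return d_theta + d_r

points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
analytic = f_dot(points, 0.3, 0.5)
numeric = f_dot_numeric(points, 0.3, 0.5)
print(analytic, numeric)
```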

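Solving Assumption

Assume that the residuals sum to zero, which is exactly the condition \(\partial f/\partial r = 0\):

\[ \begin{array}{r} \sum_{S_k}(\cos\theta*x_i + \sin\theta*y_i - r) = 0 \\ \cos\theta\sum_{i=0}^kx_i + \sin\theta\sum_{i=0}^ky_i - \sum_{i=0}^kr = 0 \tag{9} \end{array} \]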

Since the sums run over the \(k+1\) points of \(S_k\), it follows that

\[ \begin{array}{rl} \sum_{i=0}^{k}r &= \cos\theta\sum_{i=0}^kx_i + \sin\theta\sum_{i=0}^ky_i \\ r &= \cos\theta\frac{\sum_{i=0}^kx_i}{k+1} + \sin\theta\frac{\sum_{i=0}^ky_i}{k+1} \tag{10} \end{array} \]

Define the coordinate means

\[ \begin{array}{l} V_x = \frac{\sum_{i=0}^kx_i}{k+1} \\ V_y = \frac{\sum_{i=0}^ky_i}{k+1} \tag{11} \end{array} \]

With \(V_x\) and \(V_y\), equation (10) can be written as

\[ r = \cos\theta*V_x + \sin\theta*V_y \tag{12} \]
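Equation (12) says that, for any fixed \(\theta\), the best \(r\) places the line through the centroid of the data. A minimal sketch, assuming the means are taken over all points in the set (the helper names and sample data are hypothetical):

```python
import math

def mean_point(points):
    """Coordinate means V_x and V_y over all points in the data set."""
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n

def r_from_theta(points, theta):
    """Equation (12): r = cos(theta)*V_x + sin(theta)*V_y."""
    vx, vy = mean_point(points)
    return math.cos(theta) * vx + math.sin(theta) * vy

points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
theta = 0.3  # arbitrary angle; the residual sum vanishes for any theta
r = r_from_theta(points, theta)

# With this r, the signed residuals sum to zero, as assumed in equation (9).
residual_sum = sum(math.cos(theta) * x + math.sin(theta) * y - r for x, y in points)
print(r, residual_sum)
```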

Simplifying the Derivative

Expanding equation (8) gives

\[ \sum_{i=0}^{k}(\cos\theta*x_i + \sin\theta*y_i)(\cos\theta*y_i - \sin\theta*x_i) -\sum_{i=0}^{k}(\cos\theta*x_i + \sin\theta*y_i)- \sum_{i=0}^{k}r(\cos\theta*y_i - \sin\theta*x_i) + \sum_{i=0}^{k}r = 0 \tag{13} \]

Substituting equation (12) into equation (13) and splitting the result into the following four parts gives

\[ \begin{array}{l} A = \sum_{i=0}^{k}(\cos^2\theta*x_iy_i - \sin\theta\cos\theta*x_i^2 + \sin\theta\cos\theta*y_i^2 -\sin^2\theta*x_iy_i) \\ B = \sum_{i=0}^{k}(\cos\theta*x_i + \sin\theta*y_i) \\ C = \sum_{i=0}^{k}(\cos\theta*V_x + \sin\theta*V_y)(\cos\theta*y_i - \sin\theta*x_i) \\ D = \sum_{i=0}^{k}(\cos\theta*V_x + \sin\theta*V_y) \tag{14} \end{array} \]

The four parts of equation (14) simplify as follows

  • A

\[ \begin{array}{l} A = \sum_{i=0}^{k}(\cos^2\theta*x_iy_i - \sin\theta\cos\theta*x_i^2 + \sin\theta\cos\theta*y_i^2 -\sin^2\theta*x_iy_i)\\ \quad = (\cos^2\theta - \sin^2\theta)\sum_{i=0}^{k}x_iy_i - \sin\theta\cos\theta(\sum_{i=0}^{k}x_i^2 - \sum_{i=0}^{k}y_i^2)\\ \quad = \cos2\theta\sum_{i=0}^{k}x_iy_i - 0.5\sin2\theta(\sum_{i=0}^{k}x_i^2 - \sum_{i=0}^{k}y_i^2) \tag{15} \end{array} \]

  • B

\[ \begin{array}{l} B = \sum_{i=0}^{k}(\cos\theta*x_i + \sin\theta*y_i) \\ \quad = \cos\theta\sum_{i=0}^{k}x_i + \sin\theta\sum_{i=0}^{k}y_i \tag{16} \end{array} \]

  • C

\[ \begin{array}{l} C = \sum_{i=0}^{k}(\cos\theta*V_x + \sin\theta*V_y)(\cos\theta*y_i - \sin\theta*x_i) \\ \quad =\cos^2\theta*V_x\sum_{i=0}^{k}y_i - \sin\theta\cos\theta*V_x\sum_{i=0}^{k}x_i+\sin\theta\cos\theta*V_y\sum_{i=0}^{k}y_i -\sin^2\theta*V_y\sum_{i=0}^{k}x_i\\ \quad =(\cos^2\theta - \sin^2\theta)\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} - \sin\theta\cos\theta(V_x\sum_{i=0}^{k}x_i - V_y\sum_{i=0}^{k}y_i)\\ \quad =\cos2\theta\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} - 0.5\sin2\theta(V_x\sum_{i=0}^{k}x_i - V_y\sum_{i=0}^{k}y_i)\tag{17} \end{array} \]

  • D

\[ \begin{array}{l} D = \sum_{i=0}^{k}(\cos\theta*V_x + \sin\theta*V_y) \\ \quad =\cos\theta*V_x\sum_{i=0}^{k}1 + \sin\theta*V_y\sum_{i=0}^{k}1 \\ \quad =\cos\theta\sum_{i=0}^{k}x_i + \sin\theta\sum_{i=0}^{k}y_i \tag{18} \end{array} \]

Equation (13) can therefore be written as

\[ A - B - C + D = 0 \tag{19} \]

From equations (16) and (18), B and D are equal, so equation (19) simplifies further to

\[ A - C = 0 \tag{20} \]

Multiplying both sides of equation (20) by 2 gives

\[ 2\cos2\theta\sum_{i=0}^{k}x_iy_i - \sin2\theta(\sum_{i=0}^{k}x_i^2 - \sum_{i=0}^{k}y_i^2) -2\cos2\theta\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} + \sin2\theta(V_x\sum_{i=0}^{k}x_i - V_y\sum_{i=0}^{k}y_i) = 0\tag{21} \]

Collecting like terms gives

\[ \cos2\theta*2*(\sum_{i=0}^{k}x_iy_i - \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1}) + \sin2\theta(V_x\sum_{i=0}^{k}x_i - V_y\sum_{i=0}^{k}y_i - \sum_{i=0}^{k}x_i^2 + \sum_{i=0}^{k}y_i^2) = 0 \tag{22} \]

  • \(V_{xy}\)

\[ \begin{array}{l} V_{xy} = \sum_{i=0}^{k}x_iy_i - \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1}\\ \qquad = \sum_{i=0}^{k}x_iy_i - \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} -\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1} + \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}y_i}{k+1}\\ \qquad = \sum_{i=0}^{k}x_iy_i - \sum_{i=0}^{k}x_iV_y - \sum_{i=0}^{k}y_iV_x + \sum_{i=0}^{k}V_xV_y\\ \qquad = \sum_{i=0}^{k}(x_i - V_x)(y_i - V_y)\tag{23} \end{array} \]

  • \(V_{xx}\)

\[ \begin{array}{l} V_{xx} = \sum_{i=0}^{k}x_i^2 - V_x\sum_{i=0}^{k}x_i\\ \qquad = \sum_{i=0}^{k}x_i^2 - 2\frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}x_i}{k+1} + \frac{\sum_{i=0}^{k}x_i\sum_{i=0}^{k}x_i}{k+1}\\ \qquad =\sum_{i=0}^{k}x_i^2 - 2V_x\sum_{i=0}^{k}x_i + \sum_{i=0}^{k}V_xV_x\\ \qquad =\sum_{i=0}^{k}(x_i - V_x)^2 \tag{24} \end{array} \]

  • \(V_{yy}\)

\[ V_{yy} = \sum_{i=0}^{k}(y_i - V_y)^2 \tag{25} \]
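The identities in equations (23) and (24) can be checked numerically. The sketch below computes the centered sums \(V_{xx}\), \(V_{yy}\), \(V_{xy}\) and compares two of them with their raw-moment forms. The function name `scatter_sums` and the sample points are assumptions for this example, and the means are taken over all points in the set.

```python
def scatter_sums(points):
    """Centered sums V_xx, V_yy, V_xy from equations (23) to (25)."""
    n = len(points)
    vx = sum(x for x, _ in points) / n
    vy = sum(y for _, y in points) / n
    vxx = sum((x - vx) ** 2 for x, _ in points)
    vyy = sum((y - vy) ** 2 for _, y in points)
    vxy = sum((x - vx) * (y - vy) for x, y in points)
    return vxx, vyy, vxy

points = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]
vxx, vyy, vxy = scatter_sums(points)

# Raw-moment forms: the first lines of equations (23) and (24).
n = len(points)
sum_x = sum(x for x, _ in points)
sum_y = sum(y for _, y in points)
vxy_raw = sum(x * y for x, y in points) - sum_x * sum_y / n
vxx_raw = sum(x * x for x, _ in points) - (sum_x / n) * sum_x
print(vxx, vyy, vxy, vxx_raw, vxy_raw)
```

The centered form is usually the better choice in floating point, since the raw-moment form subtracts two large, nearly equal numbers when the data lies far from the origin.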

Equation (22) can then be written as

\[ \cos2\theta*2*V_{xy} + \sin2\theta(V_{yy} - V_{xx}) = 0\tag{26} \]

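Equation (26) has the form \(A\cos2\theta + B\sin2\theta = 0\); see the blog post on solving the equation \(A\cos + B\sin = C\). Substituting \(t = \tan\theta\), with \(\cos2\theta = (1-t^2)/(1+t^2)\) and \(\sin2\theta = 2t/(1+t^2)\), turns (26) into the quadratic \(V_{xy}t^2 - (V_{yy} - V_{xx})t - V_{xy} = 0\). For \(V_{xy} \neq 0\), the root that minimizes \(E_2\) gives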

\[ \theta = \arctan\frac{(V_{yy} - V_{xx}) - \sqrt{(V_{yy} - V_{xx})^2 + 4V_{xy}^2}}{2V_{xy}}\tag{27} \]

Given the data set \(S_k\), equations (12) and (27) yield the optimal parameters \(\theta\) and \(r\).
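Putting the pieces together, the following is a minimal sketch of the whole fit, assuming the means are taken over all points in the set. The function name `fit_line`, the `eps` threshold, and the fallback for \(V_{xy} \approx 0\) (where equation (27) divides by zero) are assumptions added for this example and are not part of the original derivation.

```python
import math

def fit_line(points, eps=1e-12):
    """Fit r = cos(theta)*x + sin(theta)*y using equations (12) and (27)."""
    n = len(points)
    vx = sum(x for x, _ in points) / n
    vy = sum(y for _, y in points) / n
    vxx = sum((x - vx) ** 2 for x, _ in points)
    vyy = sum((y - vy) ** 2 for _, y in points)
    vxy = sum((x - vx) * (y - vy) for x, y in points)

    if abs(vxy) < eps:
        # Assumed fallback: with V_xy = 0 the best normal is axis-aligned.
        # theta = 0 is a vertical line, theta = pi/2 a horizontal one.
        theta = 0.0 if vxx <= vyy else math.pi / 2
    else:
        d = vyy - vxx
        theta = math.atan((d - math.hypot(d, 2.0 * vxy)) / (2.0 * vxy))  # eq. (27)

    r = math.cos(theta) * vx + math.sin(theta) * vy  # eq. (12)
    return theta, r

theta, r = fit_line([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])       # points on y = x
theta_v, r_v = fit_line([(2.0, 0.0), (2.0, 1.0), (2.0, 2.0)])   # vertical line x = 2
print(theta, r, theta_v, r_v)
```

The absolute threshold `eps` is not scale-invariant, so a real implementation would likely compare \(V_{xy}\) against the magnitude of \(V_{xx}\) and \(V_{yy}\) instead.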