卡尔曼滤波公式完全推导
24 Jun 2020整理自 GNSS 课程的 PPT.
从误差传播公式说起
线性情况
n 维向量 X 以及协方差表示为:
\[X=\left[x_{1}, x_{2}, x_{3}, \ldots, x_{n}\right]^{\top}\] \[\operatorname{cov}(X, X)=E\left((X-\bar{X})(X-\bar{X})^{\top}\right)=\left[\begin{array}{cccc} \sigma_{11} & \sigma_{11} & \dots & \sigma_{1 n} \\ \sigma_{21} & \sigma_{22} & \dots & \sigma_{2 n} \\ \sigma_{n1} & \sigma_{n 2} & \dots & \sigma_{n n} \end{array}\right]\]其中
\[\sigma_{i j}=\operatorname{cov}\left(x_{i}, x_{j}\right)=E\left(\left(x_{i}-\bar{x}_{i}\right)\left(x_{j}-\bar{x}_{j}\right)\right)\]如果 m 维向量 Y 满足关系 Y = Am×nX + b
\[Y_{m \times 1}=\left[\begin{array}{cccc} a_{11} & a_{12} & \dots & a_{1 n} \\ a_{21} & a_{22} & \dots & a_{2 n} \\ \vdots & \vdots & \cdots & \vdots \\ a_{m 1} & a_{m 2} & \dots & a_{m n} \end{array}\right]\left[\begin{array}{l} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array}\right]+b_{m \times 1}\]如果 {aij} 为常量,那么有
\[\mathrm{E}(\mathrm{Y})=\left[\begin{array}{llll} a_{11} & a_{12} & \dots & a_{1 n} \\ a_{21} & a_{22} & \dots & a_{2 n} \\ \vdots & \vdots & \cdots & \vdots \\ a_{m 1} & a_{m 2} & \dots & a_{m n} \end{array}\right]\left[\begin{array}{c} E\left(x_{1}\right) \\ E\left(x_{2}\right) \\ \vdots \\ E(x_n) \end{array}\right]+b_{\operatorname{mx1}}\]也就是说线性运算后的均值 = 均值的线性运算。对于 Y,已经可以根据 X (的均值)求出均值,E(Y) = AE(X) + b, Y 的协方差为:
\[\begin{aligned} \operatorname{cov}(Y, Y) &=E\left[(Y-\bar{Y})(Y-\bar{Y})^{T}\right] \\ &=E\left[(A X+b-E(A X+b))(A X+b-E(A X+b))^{T}\right] \\ &=E\left[(A(X-\bar{X}))(A(X-\bar{X}))^{T}\right] \\ &=E\left[A(X-\bar{X})(X-\bar{X})^{T} A^{T}\right] \\ &=AE[(X-\bar{X})(X-\bar{X})^{T}]A^{T} \\ &=A\operatorname{Cov}(X, X)A^{T} \end{aligned}\]对于变量 Z 满足 Z = AX + BY,ΣX=Cov(X,X), ΣY=Cov(Y,Y),有,并且变量 X 以及 Y 相互独立,Σ(X, Y) = 0:
\[E(Z) = AE(X) + BE(Y)\]以及
\[\operatorname{Cov}(Z, Z)=E\left[(A X+B Y-\bar{Z})(A X+B Y-\bar{Z})^{T}\right]\]其中
\[\begin{aligned} (A X+B Y-\bar{Z}) &=(A X-B Y-\overline{\mathrm{A} X+B Y}) \\ &=(A X-\overline{\mathrm{A} X}+B Y-\overline{B Y}) & \\ &=(A(X-\bar{X})+B(Y-\bar{Y})) \end{aligned}\] \[\begin{aligned} \operatorname{Cov}(Z, Z)=& E\left(A(X-\bar{X})(X-\bar{X})^{\top} A^{\top}\right) \\ &+E\left(A(X-\bar{X})(Y-\bar{Y})^{\top} B^{\top}\right) \\ &+E\left(B(Y-\bar{Y})(X-\bar{X})^{\top} A^{\top}\right) \\ &+E\left(B(Y-\bar{Y})(Y-\bar{Y})^{\top} B^{\top}\right) \\ =& A \Sigma_{X} A^{\top}+B \Sigma_{Y} B^{\top} \end{aligned}\]如果 X 和 Y 相互独立 ?
非线性情况
非线性情况下,设 Y = F(X) 为非线性关系,进行一阶泰勒展开进行线性化即可解决问题:
\[Y \approx Y(\bar{X})+\left[\begin{array}{llll} \frac{\partial y_{1}}{\partial x_{1}} & \frac{\partial y_{1}}{\partial x_{2}} & \cdots & \frac{\partial y_{1}}{\partial x_{n}} \\ \frac{\partial y_{2}}{\partial x_{1}} & \frac{\partial y_{2}}{\partial x_{2}} & \cdots & \frac{\partial y_{2}}{\partial x_{n}} \\ \frac{\partial y_{m}}{\partial x_{1}} & \frac{\partial y_{m}}{\partial x_{2}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}} \end{array}\right]\left[\begin{array}{c} x_{1}-\overline{x_{1}} \\ x_{2}-\overline{x_{2}} \\ x_{n}-\overline{x_{n}} \end{array}\right]=Y(\bar{X})+A\left[\begin{array}{c} x_{1}-\overline{x_{1}} \\ x_{2}-\overline{x_{2}} \\ \vdots\\ x_{n}-\overline{x_{n}} \end{array}\right]\]求均值:
\[\mathrm{E}(\mathrm{Y})=\mathrm{E}(\mathrm{Y}(\overline{\mathrm{X}}))+\left[\begin{array}{llll} \frac{\partial y_{1}}{\partial x_{1}} & \frac{\partial y_{1}}{\partial x_{2}} & \dots & \frac{\partial y_{1}}{\partial x_{\mathrm{n}}} \\ \frac{\partial y_{2}}{\partial x_{1}} & \frac{\partial y_{2}}{\partial x_{2}} & \dots & \frac{\partial y_{2}}{\partial x_{\mathrm{n}}} \\ \frac{\partial y_{m}}{\partial x_{1}} & \frac{\partial y_{m}}{\partial x_{2}} & \dots & \frac{\partial y_{m}}{\partial x_{\mathrm{n}}} \end{array}\right]\left[\begin{array}{c} E\left(x_{1}-\overline{x_{1}}\right) \\ E\left(x_{2}-\overline{x_{2}}\right) \\ E(x_n-\overline{x_{n}}) \end{array}\right]=Y(\overline{\mathrm{X}})\]第二项的列向量等于 0,等式成立。
求方差:
\[\begin{aligned} \operatorname{Cov}(Y, Y)&=E\left((Y-\bar{Y})(Y-\bar{Y})^{\top}\right) \\ & = \end{aligned}\] \[E\left[\left(\left[\begin{array}{llll} \frac{\partial y_{1}}{\partial x_{1}} & \frac{\partial y_{1}}{\partial x_{2}} & \cdots & \frac{\partial y_{1}}{\partial x_{n}} \\ \frac{\partial y_{2}}{\partial x_{1}} & \frac{\partial y_{2}}{\partial x_{2}} & \cdots & \frac{\partial y_{2}}{\partial x_{n}} \\ \frac{\partial y_{m}}{\partial x_{1}} & \frac{\partial y_{m}}{\partial x_{2}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}} \end{array}\right]\left[\begin{array}{c} X_{1}-\overline{X_{1}} \\ X_{2}-\overline{X_{2}} \\ X_{n}-\overline{X_{n}} \end{array}\right]\right) \left(\left[\begin{array}{llll} \frac{\partial y_{1}}{\partial x_{1}} & \frac{\partial y_{1}}{\partial x_{2}} & \cdots & \frac{\partial y_{1}}{\partial x_{n}} \\ \frac{\partial y_{2}}{\partial x_{1}} & \frac{\partial y_{2}}{\partial x_{2}} & \cdots & \frac{\partial y_{2}}{\partial x_{n}} \\ \frac{\partial y_{m}}{\partial x_{1}} & \frac{\partial y_{m}}{\partial x_{2}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}} \end{array}\right]\left[\begin{array}{c} X_{1}-\overline{X_{1}} \\ X_{2}-\overline{X_{2}} \\ X_{n}-\overline{X_{n}} \end{array}\right]\right)^{\top}\right]\] \[=A(\bar{X}) \operatorname{Cov}(X, X) A(\bar{X})^T\]其中
\[A(\bar{X})=\left.\frac{\partial Y}{\partial X^{T}}\right|_{X=\bar{X}}=\left.\left[\begin{array}{llll} \frac{\partial y_{1}}{\partial x_{1}} & \frac{\partial y_{1}}{\partial x_{2}} & \cdots & \frac{\partial y_{1}}{\partial x_{n}} \\ \frac{\partial y_{2}}{\partial x_{1}} & \frac{\partial y_{2}}{\partial x_{2}} & \cdots & \frac{\partial y_{2}}{\partial x_{n}} \\ \frac{\partial y_{m}}{\partial x_{1}} & \frac{\partial y_{m}}{\partial x_{2}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}} \end{array}\right]\right|_{X=\bar{x}}\]数据融合
Merging two measurement sources from Two data sources measured by different people using different instruments.
\[Z=\lambda X+(I-\lambda) Y\]根据误差传播公式:
\[\operatorname{Cov}(Z, Z) =\lambda \Sigma_{X} \lambda^{\top}+(I-\lambda) \Sigma_{Y}(I-\lambda)^{\top}\]要求
\[\lambda=\underset{\lambda}{\operatorname{argmin}}(\operatorname{Cov}(Z, Z))\]为了方便求导,实际上将优化函数记作:
\[\lambda=\underset{\lambda}{\operatorname{argmin}}(\operatorname{tr(Cov}(Z, Z)))\]利用
\[\frac{\partial}{\partial A}[\operatorname{tr}(A B A^{\top})]=2AB\]可以得到
\[\frac{\partial[\operatorname{tr}(\operatorname{cov}(Z, Z))]}{\partial \lambda}=2 \lambda \Sigma_{X}+2(\lambda-I) \Sigma_{Y}=0\]所以
\[\lambda=\Sigma_{Y}\left(\Sigma_{Y}-\Sigma_{X}\right)^{-1}\]卡尔曼滤波推导
对于运动方程和观测方程
\[\begin{array}{l} x_{k}=\Phi_{k, k-1} x_{k-1}+\Gamma_{k, k-1} w_{k-1} \\ z_{k}=H_{k} x_{k}+v_{k} \end{array}\]zk 是观测量 wk, vk 是高斯白噪声,一般来说 zk 的维数小于 xk
初值满足
\[E\left[x_{0}\right]=\bar{x}_{0}, \operatorname{var}\left(x_{0}\right)=P_{0}\]高斯白噪声满足
\[E\left[w_{k}\right]=0, \quad R_{w w}(k, j)=Q_{k} \delta_{k j} \\ E\left[v_{k}\right]=0, \quad R_{v v}(k, j)=R_{k} \delta_{k j} \\ R_{w v}(k, j)=0 \\ R_{x w}(k, j)=0 \\ R_{x y}(k, j)=0\]其中有克罗内克二元函数
\[\delta_{i j}=\left\{\begin{array}{ll} 1 & (i=j) \\ 0 & (i \neq j) \end{array}\right.\]卡尔曼滤波的目的是为了,从观测量中估计状态量这个隐变量,也就是说已知
\[z_{1}, z_{2}, \cdots, z_{k-1}, z_{k} \\ \hat{x}_{k-1 \mid k-1}\]如何求
\[\hat{x}_{k \mid k}\]首先写出预测部分(根据 k - 1 步的状态量进行 k 步的状态量预测,观测量预测):
\[\begin{aligned} E\left[w_{k-1}\right]=0 \Rightarrow \quad &\hat{x}_{k \mid k-1}=\Phi_{k, k-1} \hat{x}_{k-1 \mid k-1} \\ E\left[v_{k}\right]=0 \Rightarrow \quad &\hat{z}_{k \mid k-1}=H_{k} \hat{x}_{k \mid k-1} \end{aligned}\]然后用 k 步的观测量对预测部分得到的结果进行更新:
\[\left.\begin{array}{l} \hat{x}_{k \mid k-1} \\ \hat{z}_{k \mid k-1} \\ z_{k} \end{array}\right\} \Rightarrow \hat{x}_{k \mid k}=\hat{x}_{k \mid k-1}+\underline{K_{k}}\left(z_{k}-\hat{z}_{k \mid k-1}\right)\]Kk 称作待定校正增益矩阵。
如何得到待定校正增益矩阵?设置一个目标函数(估计误差方差):
\[\arg\min E\left[\tilde{x}_{k \mid k} \tilde{x}^{T}_{k \mid k}\right]\]其中估计误差记作
\[\tilde{x}_{k \mid k} = x_{k}-\hat{x}_{k \mid k}\]那么
\[\begin{aligned} \tilde{x}_{k \mid k} &=x_{k}-\hat{x}_{k \mid k} \\ &=x_{k}-\left(\hat{x}_{k \mid k-1}+K_{k}\left(z_{k}-\hat{z}_{k \mid k-1}\right)\right) \\ &=x_{k}-\left(\hat{x}_{k \mid k-1}+K_{k}\left(H_{k} x_{k}+v_{k}-H_{k} \hat{x}_{k \mid k-1}\right)\right) \\ &=\underline{\tilde{x}_{k \mid k-1}}-K_{k}\left(H_{k} \underline{\tilde{x}_{k \mid k-1}}+v_{k}\right) \\ &=\left(I-K_{k} H_{k}\right) \tilde{x}_{k \mid k-1}-K_{k} v_{k} \end{aligned}\]其中下划线为预测误差
\[\tilde{x}_{k \mid k-1} = x_{k}-\hat{x}_{k \mid k-1}\]代入估计误差方差的期望
\[\begin{aligned} E\left[ \tilde{x}_{k \mid k} \tilde{x}_{k \mid k}^{T} \right]&= E\left[\{\left.\left(I-K_{k} H_{k}\right) \tilde{x}_{k \mid k-1}-K_{k} v_{k}\right\}\{\tilde{x}_{k \mid k-1}^{T}\left(I-K_{k} H_{k}\right)^{T}-v_{k}^{T} K_{k}^{T}\}\right] \\ &= E\left[\left(I-K_{k} H_{k}\right) \tilde{x}_{k \mid k-1} \tilde{x}_{k \mid k-1}^{T}\left(I-K_{k} H_{k}\right)^{T}+K_{k} v_{k} v_{k}^{T} K_{k}^{T}\right.\\ &\quad\left.-K_{k} v_{k} \tilde{x}_{k \mid k-1}^{T}\left(I-K_{k} H_{k}\right)^{T}-\left(I-K_{k} H_{k}\right) \tilde{x}_{k \mid k-1} v_{k}^{T} K_{k}^{T}\right] \end{aligned}\]将
\[E \left[\tilde{X}_{k \mid k-1} v_{k}^{T}\right]=0, E\left[v_{k} \tilde{x}_{k \mid k-1}^{T}\right]=0\]代入,等式变成了
\[E \left[\tilde{x}_{k \mid k} \tilde{x}_{k \mid k}^{T}\right]= E \left[\left(I-K_{k} H_{k}\right) \underline{P_{k \mid k-1}}\left(I-K_{k} H_{k}\right)^{T}+K_{k} R_{k} K_{k}^{T}\right]\]将估计误差方差矩阵记作
\[P_{k \mid k}=\left(I-K_{k} H_{k}\right) P_{k \mid k-1}\left(I-K_{k} H_{k}\right)^{T}+K_{k} R_{k} K_{k}^{T}\]为了求得 Kk, 选择代价函数:
\[tr(P_{k \mid k})\]对 tr(Pk|k) 求导,设求导结果为 0
\[K_{k}=P_{k \mid k-1} H_{k}^{T}\left(H_{k} P_{k \mid k-1} H_{k}^{T}+R_{k}\right)^{-1}\]那么估计误差方差矩阵就可以表示为
\[\begin{aligned} P_{k \mid k} &=P_{k \mid k-1}-P_{k \mid k-1} H_{k}^{T}\left(H_{k} P_{k \mid k-1} H_{k}^{T}+R_{k}\right)^{-1} H_{k} P_{k \mid k-1} \\ &=\left(I-K_{k} H_{k}\right) P_{k \mid k-1} \end{aligned}\]下面是求解预测误差方差矩阵 Pk|k-1 的过程
预测误差
\[\begin{aligned} \tilde{x}_{k \mid k-1} &=x_{k}-\hat{x}_{k \mid k-1} \\ &=\Phi_{k, k-1} x_{k-1}+\Gamma_{k, k-1} w_{k-1}-\Phi_{k, k-1} \hat{x}_{k-1 \mid k-1} \\ &=\Phi_{k, k-1} \tilde{x}_{k-1 \mid k-1}+\Gamma_{k, k-1} w_{k-1} \end{aligned}\]预测误差方差的期望
\[\begin{aligned} E \left[\tilde{x}_{k \mid k-1} \tilde{x}_{k \mid k-1}^{T}\right]= & E\left[\left(\Phi_{k, k-1} \tilde{x}_{k-1 \mid k-1}+\Gamma_{k, k-1} w_{k-1}\right)\left(\Phi_{k, k-1} \tilde{x}_{k-1 \mid k-1}+\Gamma_{k, k-1} w_{k-1}\right)^{T}\right] \\ =& \Phi_{k, k-1} E\left[\tilde{x}_{k-1 \mid k-1} \tilde{x}_{k-1 \mid k-1}^{T}\right] \Phi_{k, k-1}^{T}+\Gamma_{k, k-1} E\left[w_{k-1} w_{k-1}^{T}\right] \Gamma_{k, k-1}^{T} \\ &+\Gamma_{k, k-1} E\left[w_{k-1} \tilde{x}_{k-1 \mid k-1}^{T}\right] \Phi_{k, k-1}^{T}+\Phi_{k, k-1} E\left[\tilde{x}_{k-1 \mid k-1} w_{k-1}^{T}\right] \Gamma_{k, k-1}^{T} \end{aligned}\]代入
\[E\left[w_{k-1} \tilde{X}_{k-1 \mid k-1}^{T}\right] = 0 \\ E\left[\tilde{X}_{k-1 \mid k-1} W_{k-1}^{T}\right] = 0\]得到
\[P_{k \mid k-1}=\Phi_{k, k-1} P_{k-1 \mid k-1} \Phi_{k, k-1}^{T}+\Gamma_{k, k-1} Q_{k-1} \Gamma_{k, k-1}^{T}\]到此为止,卡尔曼滤波有关的公式已经推导完成,将卡尔曼滤波分为两大部分
预测过程与校正过程
状态预测
\[\hat{x}_{k \mid k-1}=\Phi_{k, k-1} \hat{x}_{k-1 \mid k-1}\]预测误差方差
\[P_{k \mid k-1}=\Phi_{k, k-1} P_{k-1 \mid k-1} \Phi_{k, k-1}^{T}+\Gamma_{k, k-1} Q_{k-1} \Gamma_{k, k-1}^{T}\]校正增益矩阵
\[K_{k}=P_{k \mid k-1} H_{k}^{T}\left(H_{k} P_{k \mid k-1} H_{k}^{T}+R_{k}\right)^{-1}\]新息序列
\[\tilde{z}_{k \mid k-1}=\left(z_{k}-H_{k} \hat{x}_{k \mid k-1}\right)\]状态估计
\[\hat{x}_{k \mid k}=\hat{x}_{k \mid k-1}+K_{k} \tilde{z}_{k \mid k-1}\]估计误差方差
\[P_{k \mid k}=\left(I-K_{k} H_{k}\right) P_{k \mid k-1}\]