# Some notes on Kalman Filtering

State Space form

Measurement Equation

$\displaystyle \boxed{\mathbf{\underbrace{y_{t}}_{N \times 1}=\underbrace{Z_{t}}_{N \times m}\underbrace{\alpha_{t}}_{m \times 1}+d_{t}+\varepsilon_{t}}}$

$\displaystyle Var(\varepsilon_{t})= \mathbf{H_{t}}$

Transition Equation

$\displaystyle \boxed{\mathbf{\underbrace{\alpha_{t}}_{m \times 1} =\underbrace{T_{t}}_{m \times m} \alpha_{t-1}+c_{t}+\underbrace{R_{t}}_{m \times g} \underbrace{\eta_{t}}_{g \times 1}}}$

$\displaystyle Var(\eta_{t})=\mathbf{Q}_{t}$

$\displaystyle \mathbf{E(\alpha_{0})= a_{0}} \; \; \; \; \mathbf{Var(\alpha_{0})=P_{0}} \; \; \; \; \mathbf{E(\varepsilon_{t}\alpha_{0}^{\top})=0} \; \; \; \; \mathbf{E(\eta_{t}\alpha_{0}^{\top})=0}$

Future form

$\displaystyle \mathbf{\alpha_{t+1}=T_{t}\alpha_{t}+c_{t}+R_{t}\eta_{t}}$
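As a minimal sketch, the state space form above can be simulated directly. This assumes time-invariant system matrices; the function name, the local-level parameter values, and the seed are illustrative choices, not part of the notes.

```python
import numpy as np

def simulate_ssm(Z, d, H, T, c, R, Q, a0, n, rng):
    """Simulate y_t = Z a_t + d + eps_t, a_t = T a_{t-1} + c + R eta_t."""
    N = Z.shape[0]
    a = a0.copy()
    ys = np.empty((n, N))
    for t in range(n):
        eta = rng.multivariate_normal(np.zeros(Q.shape[0]), Q)
        a = T @ a + c + R @ eta                  # transition equation
        eps = rng.multivariate_normal(np.zeros(N), H)
        ys[t] = Z @ a + d + eps                  # measurement equation
    return ys

# Local level model: m = N = g = 1 and Z = T = R = 1 (illustrative values)
rng = np.random.default_rng(0)
y = simulate_ssm(Z=np.eye(1), d=np.zeros(1), H=np.eye(1) * 0.5,
                 T=np.eye(1), c=np.zeros(1), R=np.eye(1), Q=np.eye(1) * 0.1,
                 a0=np.zeros(1), n=200, rng=rng)
```

The local level model is the simplest concrete instance: a random-walk state observed with noise.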

1.2. Kalman Filter

Recursive procedure for computing the optimal estimator of the state vector at time $t$. When the model is Gaussian, the Kalman filter can be interpreted as updating the mean and covariance matrix of the conditional distribution of the state vector as new observations become available.

If

$\displaystyle \alpha_{t-1} \sim N(\mathbf{a_{t-1},P_{t-1}})$

then

$\displaystyle \alpha_{t} \sim N(\mathbf{a_{t|t-1}, P_{t|t-1}})$

where

$\mathbf{a_{t|t-1}}=\mathbf{T_{t}a_{t-1}+c_{t}}$
$\mathbf{P_{t|t-1}}=\mathbf{T_{t}P_{t-1}T_{t}^{\top}+R_{t}Q_{t}R_{t}^{\top}}$

Predictive distribution of $\mathbf{y_{t}}$

$\displaystyle \mathbf{\tilde{y}}_{t|t-1}=\mathbf{Z_{t}a_{t|t-1}+d_{t}}$

$\displaystyle \mathbf{F_{t}=Z_{t}P_{t|t-1}Z_{t}^{\top}+H_{t}}$

$\displaystyle \left[ \begin{array}{c} \mathbf{\alpha_{t}}\\ \mathbf{y_{t}} \end{array}\right] \sim N \left[ \left( \begin{array}{c} \mathbf{a_{t|t-1}}\\ \mathbf{Z_{t}a_{t|t-1}+d_{t}} \end{array} \right), \left( \begin{array}{cc} \mathbf{P_{t|t-1}} & \mathbf{P_{t|t-1}Z_{t}^{\top}}\\ \mathbf{Z_{t}P_{t|t-1}} & \mathbf{Z_{t}P_{t|t-1}Z_{t}^{\top}+H_{t}} \end{array}\right) \right]$
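The updating equations below follow from the standard conditioning lemma for a partitioned Gaussian vector, applied to this joint distribution with $x_{1}=\alpha_{t}$ and $x_{2}=y_{t}$; stated for reference:

$\displaystyle x_{1}|x_{2} \sim N\left(\mu_{1}+\Sigma_{12}\Sigma_{22}^{-1}(x_{2}-\mu_{2}), \; \Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right)$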

Updating equations

$\displaystyle \boxed{\mathbf{a_{t}=a_{t|t-1}+\underbrace{P_{t|t-1}Z_{t}^{\top}}_{\Sigma_{12}} \underbrace{F_{t}^{-1}}_{\Sigma_{22}^{-1}}(y_{t}\underbrace{-Z_{t}a_{t|t-1}-d_{t}}_{-\mu_{2}})}}$

and

$\displaystyle \boxed{\mathbf{P_{t}=P_{t|t-1}-P_{t|t-1}Z_{t}^{\top}F_{t}^{-1}Z_{t}P_{t|t-1}}}$
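One full prediction-plus-updating step can be sketched as follows, assuming time-invariant system matrices; the helper name and the local-level example values are illustrative, not from the notes.

```python
import numpy as np

def kf_step(a_prev, P_prev, y, Z, d, H, T, c, R, Q):
    """One Kalman filter step: (a_{t-1}, P_{t-1}) -> (a_t, P_t)."""
    # Prediction: a_{t|t-1}, P_{t|t-1}
    a_pred = T @ a_prev + c
    P_pred = T @ P_prev @ T.T + R @ Q @ R.T
    # Predictive distribution of y_t
    y_pred = Z @ a_pred + d
    F = Z @ P_pred @ Z.T + H
    # Updating equations
    v = y - y_pred                                  # innovation
    PZt = P_pred @ Z.T                              # Sigma_12
    Finv_v = np.linalg.solve(F, v)                  # F_t^{-1} v_t
    a_filt = a_pred + PZt @ Finv_v
    P_filt = P_pred - PZt @ np.linalg.solve(F, Z @ P_pred)
    return a_filt, P_filt, v, F

# Local level example (illustrative values); large P0 approximates a diffuse prior
Z = T = R = np.eye(1)
d = c = np.zeros(1)
H, Q = np.eye(1) * 0.5, np.eye(1) * 0.1
a, P = np.zeros(1), np.eye(1) * 1e7
for y in np.array([[0.3], [0.1], [-0.2]]):
    a, P, v, F = kf_step(a, P, y, Z, d, H, T, c, R, Q)
```

Using `np.linalg.solve` instead of forming $F_t^{-1}$ explicitly is the usual numerically safer choice.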

Contemporaneous filter: $\displaystyle \mathbf{a_{t-1}} \rightarrow \mathbf{a_{t}}$

Predictive filter: $\displaystyle \mathbf{a_{t|t-1}} \rightarrow \mathbf{a_{t+1|t}}$

In the latter case

$\displaystyle \mathbf{a_{t+1|t}=T_{t+1}a_{t|t-1}+c_{t+1}+K_{t}v_{t}} \; \; \; \; \mathbf{v_{t}=y_{t}-\tilde{y}_{t|t-1}}$

or, substituting the updating equation into $\mathbf{a_{t+1|t}=T_{t+1}a_{t}+c_{t+1}}$,

$\displaystyle \mathbf{a_{t+1|t}=(T_{t+1}-K_{t}Z_{t})a_{t|t-1}+K_{t}y_{t}+(c_{t+1}-K_{t}d_{t})}$

where the gain matrix ${\mathbf{K_{t}}}$ is given by

$\displaystyle \boxed{\mathbf{K_{t}=T_{t+1}P_{t|t-1}Z_{t}^{\top}F_{t}^{-1}}}$

and

$\displaystyle \boxed{\mathbf{P_{t+1|t}=T_{t+1}} \underbrace{\mathbf{(P_{t|t-1}-P_{t|t-1}Z_{t}^{\top}F_{t}^{-1}Z_{t}P_{t|t-1}})}_{{\mathbf{P_{t}}}} \mathbf{T_{t+1}^{\top}+R_{t+1}Q_{t+1}R_{t+1}^{\top}}}$
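The predictive filter above, mapping $\mathbf{a_{t|t-1}}$ directly to $\mathbf{a_{t+1|t}}$ through the gain matrix $\mathbf{K_{t}}$, can be sketched as below. Again this assumes time-invariant system matrices, and the names and example values are illustrative.

```python
import numpy as np

def kf_predictive_step(a_pred, P_pred, y, Z, d, H, T, c, R, Q):
    """One predictive-filter step: (a_{t|t-1}, P_{t|t-1}) -> (a_{t+1|t}, P_{t+1|t})."""
    v = y - Z @ a_pred - d                          # innovation v_t
    F = Z @ P_pred @ Z.T + H                        # innovation variance F_t
    K = T @ P_pred @ Z.T @ np.linalg.inv(F)         # gain matrix K_t
    a_next = T @ a_pred + c + K @ v                 # a_{t+1|t}
    P_filt = P_pred - P_pred @ Z.T @ np.linalg.solve(F, Z @ P_pred)  # P_t
    P_next = T @ P_filt @ T.T + R @ Q @ R.T         # P_{t+1|t}
    return a_next, P_next

# Local level model run over a few observations (illustrative values)
Z = T = R = np.eye(1)
d = c = np.zeros(1)
H, Q = np.eye(1) * 0.5, np.eye(1) * 0.1
a, P = np.zeros(1), np.eye(1) * 1e7
for y in np.array([[0.3], [0.1], [-0.2]]):
    a, P = kf_predictive_step(a, P, y, Z, d, H, T, c, R, Q)
```

This form is convenient for likelihood evaluation, since it produces the innovations $v_t$ and their variances $F_t$ as by-products.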

Initialization

Start the Kalman filter at ${t=0}$ with a diffuse prior

$\displaystyle \boxed{\mathbf{P_{0}=\kappa I} \; \; \; \; \kappa \rightarrow \infty}$

Prediction

$\displaystyle \boxed{\mathbf{a_{T+\ell |T}=T_{T+\ell}a_{T+\ell-1|T}+c_{T+\ell}}}$

$\displaystyle \boxed{\mathbf{P_{T+\ell |T}=T_{T+\ell}P_{T+\ell-1|T}T_{T+\ell}^{\top}+R_{T+\ell}Q_{T+\ell}R_{T+\ell}^{\top}}}$

Taking conditional expectations in the measurement equation for ${y_{T+\ell}}$

$\displaystyle \boxed{\mathbf{E[y_{T+\ell}|Y_{T}]=\tilde{y}_{T+\ell|T}=Z_{T+\ell}a_{T+\ell|T}+d_{T+\ell}}}$

with MSE matrix

$\displaystyle MSE(\tilde{y}_{T+\ell|T})=\mathbf{Z_{T+\ell}P_{T+\ell|T}Z_{T+\ell}^{\top}+H_{T+\ell}}$
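The two boxed prediction recursions can be iterated to produce $\ell$-step-ahead forecasts and their MSE matrices, as in this sketch (time-invariant matrices assumed; function name and values are illustrative):

```python
import numpy as np

def forecast(a_T, P_T, Z, d, H, Tm, c, R, Q, steps):
    """Iterate the prediction recursions to get (E[y_{T+l}|Y_T], MSE) for l = 1..steps."""
    a, P = a_T.copy(), P_T.copy()
    preds = []
    for _ in range(steps):
        a = Tm @ a + c                          # a_{T+l|T}
        P = Tm @ P @ Tm.T + R @ Q @ R.T         # P_{T+l|T}
        preds.append((Z @ a + d,                # E[y_{T+l}|Y_T]
                      Z @ P @ Z.T + H))         # MSE matrix
    return preds

# Local level: the point forecast stays flat and the MSE grows by Q each step
preds = forecast(np.array([1.0]), np.eye(1) * 0.2,
                 Z=np.eye(1), d=np.zeros(1), H=np.eye(1) * 0.5,
                 Tm=np.eye(1), c=np.zeros(1), R=np.eye(1), Q=np.eye(1) * 0.1,
                 steps=5)
```

For the local level model the forecast function of the level is constant, while forecast uncertainty increases linearly in the horizon.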

MLE and prediction error decomposition

$\displaystyle \mathbf{p(Y;\psi)=\prod_{t=1}^{T}p(y_{t}|Y_{t-1})}$

Prediction errors or innovations

$\displaystyle \boxed{\mathbf{v_{t}=y_{t}-\tilde{y}_{t|t-1} \sim NID(0,F_{t})}}$

Prediction error decomposition

$\displaystyle \boxed{\mathbf{\ell(\psi)= -\frac{NT}{2} \log 2\pi-\frac{1}{2} \sum_{t=1}^{T} \log|F_{t}|-\frac{1}{2} \sum_{t=1}^{T} v_{t}^{\top} F_{t}^{-1}v_{t}}}$
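The decomposition can be evaluated by accumulating the innovations $v_t$ and their variances $F_t$ while filtering, as in this sketch (time-invariant model; the function name and example values are illustrative assumptions):

```python
import numpy as np

def kalman_loglik(ys, Z, d, H, T, c, R, Q, a0, P0):
    """Gaussian log-likelihood via the prediction error decomposition."""
    N = Z.shape[0]
    a, P = a0.copy(), P0.copy()
    ll = 0.0
    for y in ys:
        a_pred = T @ a + c                       # a_{t|t-1}
        P_pred = T @ P @ T.T + R @ Q @ R.T       # P_{t|t-1}
        v = y - Z @ a_pred - d                   # innovation v_t
        F = Z @ P_pred @ Z.T + H                 # F_t
        Finv_v = np.linalg.solve(F, v)
        sign, logdet = np.linalg.slogdet(F)      # log|F_t|, computed stably
        ll += -0.5 * (N * np.log(2 * np.pi) + logdet + v @ Finv_v)
        # Updating equations
        a = a_pred + P_pred @ Z.T @ Finv_v
        P = P_pred - P_pred @ Z.T @ np.linalg.solve(F, Z @ P_pred)
    return ll

ll = kalman_loglik(np.array([[0.3], [0.1], [-0.2]]),
                   Z=np.eye(1), d=np.zeros(1), H=np.eye(1) * 0.5,
                   T=np.eye(1), c=np.zeros(1), R=np.eye(1), Q=np.eye(1) * 0.1,
                   a0=np.zeros(1), P0=np.eye(1))
```

Passing this function to a numerical optimizer over ${\psi}$ yields the MLE.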

Diagnostic tests can be based on the standardized innovations ${\mathbf{F_{t}^{-1/2} v_{t}}}$, which are serially independent if ${\psi}$ is known.

* ${L(\mathbf{\psi})}$ is maximized w.r.t. ${\mathbf{\psi}}$ numerically. Diffuse prior ${\Rightarrow}$ exact likelihood.
* Writing ${ \mathbf{\psi= \left[ \underbrace{\psi^{\top}_{*} }_{n-1} , \sigma^{2}_{*} \right] ^{\top}} }$, the scale parameter ${\sigma^{2}_{*}}$ can be concentrated out of the likelihood, which is then maximized over ${\mathbf{\psi_{*}}}$ alone.
