Whittle’s Approximate Likelihood

The Whittle likelihood is a frequency-domain approximation to the Gaussian likelihood that agrees with it asymptotically, up to an additive constant. The resulting Whittle estimate is asymptotically efficient and can be interpreted as a minimum distance estimate: it minimises a distance between the parametric spectral density and the (nonparametric) periodogram. It also minimises the asymptotic Kullback-Leibler divergence and, for autoregressive processes, is identical to the Yule-Walker estimate. The Whittle likelihood can be evaluated very quickly, because the periodogram can be computed via the FFT in only ${O(N \log N)}$ operations.

Suppose that a stationary, zero-mean Gaussian process ${\{X_t \}}$ is observed at times ${t=1,2,\ldots,T}$, and write ${\mathbf{X}_T = (X_1,\ldots,X_T)'}$ for the sample. Assume ${\{X_t \}}$ has spectral density ${f_{\theta}(\lambda)}$, ${\lambda \in \Pi := (-\pi, \pi]}$, depending on a vector of unknown parameters ${\theta \in \Theta \subset \mathbb{R}^p}$. A natural approach to estimating ${\theta}$ from the sample ${\mathbf{X}_T}$ is to maximise the likelihood function or, equivalently, to minimise ${-1/T}$ times the log-likelihood. The latter takes the form

$\displaystyle \mathcal{L}_T(\theta) := \frac{1}{2}\log(2\pi)+ \frac{1}{2T}\log \det \mathbf{\Gamma}_{\theta} + \frac{1}{2T} \mathbf{X}'_T \mathbf{\Gamma}_{\theta}^{-1} \mathbf{X}_T$

Observe that the variance-covariance matrix ${ \Gamma_{\theta}}$ can be expressed in terms of the spectral density as follows

$\displaystyle \Gamma_{\theta} := \left\lbrace \int\limits_{-\pi}^{\pi} f_{\theta} (\lambda)\exp(i \lambda (r-s) ) d \lambda \right\rbrace_{r,s=0,...,T-1}$

Hence ${\mathcal{L}_T(\theta)}$ can be rewritten as

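$\displaystyle \mathcal{L}_T(\theta) = \frac{1}{2}\log(2\pi)+ \frac{1}{2T}\log \det \mathbf{T_T}(2 \pi f_{\theta}) + \frac{1}{2T} \mathbf{X}'_T \mathbf{T_T}( 2 \pi f_{\theta})^{-1} \mathbf{X}_T$

where ${\mathbf{T_T}(f) := \left\lbrace \frac{1}{2 \pi} \int\limits_{-\pi}^{\pi} f(\lambda)\exp(i \lambda (r-s) ) d \lambda \right\rbrace_{r,s=0,...,T-1}}$ is the Toeplitz matrix of ${f}$, so that ${\mathbf{\Gamma}_{\theta} = \mathbf{T_T}(2 \pi f_{\theta})}$.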
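To make the exact likelihood concrete, here is a minimal sketch that builds ${\mathbf{\Gamma}_{\theta}}$ from a spectral density by numerical integration and then evaluates ${\mathcal{L}_T(\theta)}$ directly. The AR(1) model, the function names and the grid size are illustrative assumptions that do not come from the text above, and the integration step assumes an even spectral density (as is the case for a real-valued process).

```python
import numpy as np

def ar1_spectral_density(lam, phi, sigma2=1.0):
    """Spectral density of the (assumed) AR(1) model X_t = phi*X_{t-1} + eps_t."""
    return sigma2 / (2.0 * np.pi * (1.0 - 2.0 * phi * np.cos(lam) + phi ** 2))

def covariance_matrix(f, T, n_grid=4096):
    """Gamma[r, s] = integral over (-pi, pi] of f(lam) * exp(i*lam*(r-s))."""
    # Midpoint rule on a uniform grid; very accurate for smooth periodic f.
    lam = -np.pi + 2.0 * np.pi * (np.arange(n_grid) + 0.5) / n_grid
    # f is assumed even, so only the cosine part of exp(i*lam*k) contributes.
    gamma = np.cos(np.outer(np.arange(T), lam)) @ f(lam) * (2.0 * np.pi / n_grid)
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return gamma[lags]

def exact_neg_loglik(x, f):
    """-(1/T) times the Gaussian log-likelihood, i.e. L_T(theta)."""
    T = len(x)
    Gamma = covariance_matrix(f, T)
    _, logdet = np.linalg.slogdet(Gamma)      # stable log-determinant
    quad = x @ np.linalg.solve(Gamma, x)      # X' Gamma^{-1} X without forming the inverse
    return 0.5 * np.log(2.0 * np.pi) + logdet / (2.0 * T) + quad / (2.0 * T)

# Usage: evaluate L_T at phi = 0.5 for a short, made-up sample.
x = np.array([0.2, -0.7, 1.3, 0.4])
value = exact_neg_loglik(x, lambda lam: ar1_spectral_density(lam, 0.5))
print(value)
```

Both the determinant and the linear solve cost ${O(T^3)}$ operations in this naive form, which is the practical obstacle that the approximation developed next is meant to remove.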

Unfortunately, this function is expensive to evaluate, and the inverse of the Toeplitz matrix is the hardest part. However, we can approximate it using a famous result of Grenander and Szegö (1958).

Theorem (Szegö). Consider ${f: \Pi \rightarrow \mathbb{R}}$ with ${a(f) = \sum\limits_{k \in \mathbb{Z}}(1+|k|)|\hat{f}_k| < \infty}$, where ${\hat{f}_k}$ are the Fourier coefficients of ${f}$. Then there exists an interval ${[a,b]}$ such that ${a \leq f \leq b}$, the eigenvalues ${\tau_1,\ldots,\tau_T}$ of ${\mathbf{T_T}(f)}$ all belong to ${[a,b]}$, and for every continuous ${g:[a,b]\rightarrow \mathbb{R} }$

$\displaystyle \lim\limits_{T \rightarrow \infty} \frac{1}{T} \sum\limits_{i=1}^{T} g(\tau_i)= \frac{1}{2 \pi} \int\limits_{-\pi}^{\pi} g(f(\lambda)) d\lambda$

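Applying the theorem to ${2 \pi f_{\theta}}$ with ${g = \log}$ (which requires ${f_{\theta}}$ to be bounded away from zero), and using ${\log \det \mathbf{T_T} = \sum_{i} \log \tau_i}$, we obtain for large ${T}$

$\displaystyle \frac{1}{T}\log \det \mathbf{T_T}(2 \pi f_{\theta}) \approx \frac{1}{2 \pi} \int\limits_{-\pi}^{\pi} \log \left( 2 \pi f_{\theta}(\lambda) \right) d \lambda$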
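The quality of this approximation is easy to check numerically. The sketch below compares the two sides for an AR(1) process, whose autocovariance ${\gamma(k) = \sigma^2 \phi^{|k|}/(1-\phi^2)}$ is known in closed form. The model choice and the function names are assumptions made for illustration only.

```python
import numpy as np

def ar1_spectral_density(lam, phi, sigma2=1.0):
    """Spectral density of the (assumed) AR(1) model X_t = phi*X_{t-1} + eps_t."""
    return sigma2 / (2.0 * np.pi * (1.0 - 2.0 * phi * np.cos(lam) + phi ** 2))

def toeplitz_logdet_per_obs(phi, T, sigma2=1.0):
    """Left-hand side: (1/T) * log det T_T(2*pi*f) for the AR(1) covariance."""
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    Gamma = sigma2 * phi ** lags / (1.0 - phi ** 2)   # closed-form autocovariances
    _, logdet = np.linalg.slogdet(Gamma)
    return logdet / T

def szego_limit(phi, sigma2=1.0, n_grid=4096):
    """Right-hand side: (1/(2*pi)) * integral of log(2*pi*f) over (-pi, pi]."""
    lam = -np.pi + 2.0 * np.pi * (np.arange(n_grid) + 0.5) / n_grid
    # The mean over a uniform grid is the Riemann sum divided by 2*pi.
    return np.mean(np.log(2.0 * np.pi * ar1_spectral_density(lam, phi, sigma2)))

# Usage: the gap between the two sides shrinks as T grows.
for T in (10, 100, 400):
    print(T, toeplitz_logdet_per_obs(0.5, T), szego_limit(0.5))
```

For this particular model the gap equals ${-\log(1-\phi^2)/T}$, so it decays at rate ${1/T}$; other spectral densities need not behave exactly this way.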

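Note that for large ${T}$ the map ${f \mapsto \mathbf{T_T}(f)}$ is almost a homomorphism, that is, ${\mathbf{T_T}(f)\mathbf{T_T}(g) \approx \mathbf{T_T}(fg)}$. Since ${\mathbf{T_T}(1)}$ is the identity matrix, this suggests that ${\{\mathbf{T_T}(2 \pi f_{\theta})\}^{-1}}$ can be replaced by ${\mathbf{T_T}(\frac{1}{2 \pi f_{\theta}})}$. The third term of ${\mathcal{L}_T(\theta)}$ then becomes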

$\displaystyle \begin{array}{rcl} \frac{1}{2T} \mathbf{X}'_T \mathbf{T_T}( 2 \pi f_{\theta})^{-1} \mathbf{X}_T &\approx& \frac{1}{2T} \mathbf{X}'_T \mathbf{T_T}\left( \frac{1}{2 \pi f_{\theta}}\right) \mathbf{X}_T \\ &=& \frac{1}{2T} \mathbf{X}'_T \frac{1}{2 \pi} \left\lbrace \int\limits_{-\pi}^{\pi}\frac{1}{2 \pi f_{\theta}(\lambda)} \exp(i \lambda (r-s) ) d \lambda \right\rbrace_{r,s=0,...,T-1} \mathbf{X}_T \\ &=& \frac{1}{2 \pi T} \sum\limits_{r,s=1}^{T} X_r X_s \int\limits_{-\pi}^{\pi} \frac{1}{4 \pi f_{\theta}(\lambda)} \exp(i \lambda (r-s) ) d \lambda \\ &=& \int\limits_{-\pi}^{\pi} \left[ \frac{1}{2 \pi T} \sum\limits_{r,s=1}^{T} X_r X_s \exp(i \lambda (r-s) ) \right] \frac{1}{4 \pi f_{\theta}(\lambda)} d \lambda \\ &=& \frac{1}{4 \pi } \int\limits_{-\pi}^{\pi} \frac{I_T(\lambda)}{ f_{\theta}(\lambda)} d \lambda \end{array}$

where

$\displaystyle I_T(\lambda) = \frac{1}{2 \pi T} \left| \sum\limits_{t=1}^{T} X_t \exp(- i \lambda t) \right|^2$

is the periodogram ${I_T}$ of the process.
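The following sketch computes the periodogram with a zero-padded FFT and compares the exact quadratic term ${\frac{1}{2T} \mathbf{X}'_T \mathbf{\Gamma}_{\theta}^{-1} \mathbf{X}_T}$ with its approximation ${\frac{1}{4 \pi } \int I_T / f_{\theta}}$. As before, the AR(1) model, the simulated data and all function names are assumptions chosen for illustration. The FFT returns frequencies in ${[0, 2\pi)}$ rather than ${(-\pi, \pi]}$, which leaves the integral unchanged because the integrand is ${2\pi}$-periodic.

```python
import numpy as np

def ar1_spectral_density(lam, phi, sigma2=1.0):
    """Spectral density of the (assumed) AR(1) model X_t = phi*X_{t-1} + eps_t."""
    return sigma2 / (2.0 * np.pi * (1.0 - 2.0 * phi * np.cos(lam) + phi ** 2))

def periodogram(x, n_freq):
    """I_T at lam_j = 2*pi*j/n_freq, j = 0..n_freq-1 (requires n_freq >= len(x))."""
    T = len(x)
    dft = np.fft.fft(x, n=n_freq)             # zero-pads x up to length n_freq
    lam = 2.0 * np.pi * np.arange(n_freq) / n_freq
    return lam, np.abs(dft) ** 2 / (2.0 * np.pi * T)

def exact_quadratic_term(x, phi, sigma2=1.0):
    """(1/(2T)) * X' Gamma^{-1} X using the closed-form AR(1) covariance."""
    T = len(x)
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    Gamma = sigma2 * phi ** lags / (1.0 - phi ** 2)
    return x @ np.linalg.solve(Gamma, x) / (2.0 * T)

def whittle_quadratic_term(x, phi, sigma2=1.0):
    """Riemann sum for (1/(4*pi)) * integral of I_T / f over one period."""
    lam, I = periodogram(x, n_freq=4 * len(x))
    return 0.5 * np.mean(I / ar1_spectral_density(lam, phi, sigma2))

# Usage: simulate an AR(1) path with a stationary start and compare both terms.
rng = np.random.default_rng(0)
phi, T = 0.6, 500
x = np.empty(T)
x[0] = rng.standard_normal() / np.sqrt(1.0 - phi ** 2)
for t in range(1, T):
    x[t] = phi * x[t - 1] + rng.standard_normal()
print(exact_quadratic_term(x, phi), whittle_quadratic_term(x, phi))
```

In this AR(1) setting the two matrices differ only in their two corner entries, so the two printed values should differ by a term of order ${1/T}$.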

Collecting the terms and dropping the additive constant ${\log(2\pi)}$, which does not depend on ${\theta}$, finally leads to the likelihood approximation suggested by Whittle (1953):

$\displaystyle \boxed{\mathcal{L}^{(W)}_T(\theta) = \frac{1}{4 \pi} \int\limits_{- \pi}^{\pi} \left\lbrace \log f_{\theta}(\lambda)+\frac{I_T(\lambda)}{f_{\theta}(\lambda)} \right\rbrace d \lambda}$
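As an illustration of how the boxed formula can be used for estimation, here is a minimal sketch that replaces the integral by a Riemann sum over FFT frequencies and minimises it over a grid of candidate parameters. Everything specific in it is an assumption made for the example: the AR(1) model, an innovation variance held fixed at 1, the grid search and the function names. A real application would typically use a proper numerical optimiser and estimate the variance as well.

```python
import numpy as np

def ar1_spectral_density(lam, phi, sigma2=1.0):
    """Spectral density of the (assumed) AR(1) model X_t = phi*X_{t-1} + eps_t."""
    return sigma2 / (2.0 * np.pi * (1.0 - 2.0 * phi * np.cos(lam) + phi ** 2))

def periodogram(x, n_freq):
    """I_T at lam_j = 2*pi*j/n_freq, j = 0..n_freq-1 (requires n_freq >= len(x))."""
    T = len(x)
    dft = np.fft.fft(x, n=n_freq)             # O(n_freq * log(n_freq)) operations
    lam = 2.0 * np.pi * np.arange(n_freq) / n_freq
    return lam, np.abs(dft) ** 2 / (2.0 * np.pi * T)

def whittle_objective(phi, lam, I, sigma2=1.0):
    """Riemann-sum version of (1/(4*pi)) * integral of {log f + I_T / f}."""
    f = ar1_spectral_density(lam, phi, sigma2)
    return 0.5 * np.mean(np.log(f) + I / f)

def whittle_ar1(x, grid=None):
    """Grid-search minimiser of the discretised Whittle likelihood over phi."""
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 1981)  # step 0.001
    lam, I = periodogram(x, n_freq=2 * len(x)) # periodogram is computed once
    values = [whittle_objective(phi, lam, I) for phi in grid]
    return grid[int(np.argmin(values))]

# Usage: simulate an AR(1) path and compare with the Yule-Walker estimate.
rng = np.random.default_rng(1)
phi_true, T = 0.6, 400
x = np.empty(T)
x[0] = rng.standard_normal() / np.sqrt(1.0 - phi_true ** 2)
for t in range(1, T):
    x[t] = phi_true * x[t - 1] + rng.standard_normal()

phi_hat = whittle_ar1(x)
phi_yw = (x[1:] @ x[:-1]) / (x @ x)            # Yule-Walker for a zero-mean AR(1)
print(phi_hat, phi_yw)
```

The two printed estimates should agree up to the resolution of the grid, in line with the remark in the introduction that the Whittle and Yule-Walker estimates coincide for autoregressive processes.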

————————————

References

R. Azencott and D. Dacunha-Castelle (1986). Series of Irregular Observations. Springer, New York.
K. Dzhaparidze (1986). Parameter Estimation and Hypothesis Testing in Spectral Analysis of Stationary Time Series. Springer, New York.
U. Grenander and G. Szegö (1958). Toeplitz Forms and Their Applications. University of California Press, Berkeley.
P. Whittle (1953). Estimation and information in stationary time series. Ark. Mat. 2, 423-434.