

C.1 Probabilistic PCA and error measures

In probabilistic principal component analysis, the observed $d$-dimensional data $\{\mathbf{x}_i\}$ are assumed to originate from a probability density $p(\mathbf{x})$. This density can be written as

p(\mathbf{x}) = (2\pi)^{-d/2} \, (\det \mathbf{B})^{-1/2} \exp\!\left( -\frac{1}{2} (\mathbf{x} - \mathbf{c})^T \mathbf{B}^{-1} (\mathbf{x} - \mathbf{c}) \right) \, ,   (C.1)

with $\mathbf{B} = \sigma^2 \mathbf{I}_d + \mathbf{U}\mathbf{U}^T$ (Tipping and Bishop, 1997). $\mathbf{I}_d$ is the $d$-dimensional identity matrix, and $\sigma^2$ is the noise variance. The $d \times q$ matrix $\mathbf{U}$ is obtained by maximizing the likelihood of the data $\{\mathbf{x}_i\}$ under the density $p(\mathbf{x})$. Tipping and Bishop (1999) showed that the result is

\mathbf{U} = \mathbf{W} \left( \mathbf{\Lambda} - \sigma^2 \mathbf{I}_q \right)^{1/2} \mathbf{R} \, .   (C.2)

The columns of the $d \times q$ matrix $\mathbf{W}$ are the $q$ principal eigenvectors of the covariance matrix of $\{\mathbf{x}_i\}$, and the $q$ largest eigenvalues $\lambda_j$ of this covariance matrix are the entries of the diagonal matrix $\mathbf{\Lambda}$. $\mathbf{R}$ is an arbitrary $q \times q$ rotation matrix.
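
For illustration, the following NumPy sketch (not part of the original text; function and variable names are illustrative) constructs $\mathbf{W}$, $\mathbf{\Lambda}$, $\sigma^2$, and $\mathbf{U}$ from a data matrix, choosing $\mathbf{R} = \mathbf{I}_q$ in (C.2) and taking the maximum-likelihood noise variance as the mean of the discarded eigenvalues (Tipping and Bishop, 1999):

import numpy as np

def ppca_ml(X, q):
    # X: (n, d) data matrix; q: number of principal components (q < d).
    c = X.mean(axis=0)                          # center c
    evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
    evals, evecs = evals[::-1], evecs[:, ::-1]  # sort eigenvalues descending
    W = evecs[:, :q]                            # d x q principal eigenvectors
    lam = evals[:q]                             # q largest eigenvalues
    sigma2 = evals[q:].mean()                   # ML noise variance
    U = W @ np.diag(np.sqrt(lam - sigma2))      # (C.2) with R = I_q
    B = sigma2 * np.eye(X.shape[1]) + U @ U.T   # model covariance B
    return c, W, np.diag(lam), sigma2, U, B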

In the following, it is shown that twice the negative logarithm of (C.1) equals the normalized Mahalanobis distance plus reconstruction error (section 3.2.1) plus a constant. Using (C.2) and $\mathbf{R}\mathbf{R}^T = \mathbf{I}_q$ to rewrite the expression for $\mathbf{B}$ gives

\mathbf{B} = \sigma^2 \mathbf{I}_d + \mathbf{W} \left( \mathbf{\Lambda} - \sigma^2 \mathbf{I}_q \right) \mathbf{W}^T = \mathbf{W} \mathbf{\Lambda} \mathbf{W}^T + \sigma^2 \left( \mathbf{I}_d - \mathbf{W} \mathbf{W}^T \right) \, .   (C.3)

By showing that $\mathbf{B}\mathbf{B}^{-1} = \mathbf{I}_d$ and $\mathbf{B}^{-1}\mathbf{B} = \mathbf{I}_d$, we can verify that the inverse of $\mathbf{B}$ is

\mathbf{B}^{-1} = \mathbf{W} \mathbf{\Lambda}^{-1} \mathbf{W}^T + \frac{1}{\sigma^2} \left( \mathbf{I}_d - \mathbf{W} \mathbf{W}^T \right) \, .   (C.4)
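
As a brief check of (C.4) (a sketch not spelled out in the text, using the orthonormality of the eigenvectors, $\mathbf{W}^T \mathbf{W} = \mathbf{I}_q$):

\begin{align*}
\mathbf{B} \mathbf{B}^{-1}
 &= \left[ \mathbf{W} \mathbf{\Lambda} \mathbf{W}^T + \sigma^2 (\mathbf{I}_d - \mathbf{W}\mathbf{W}^T) \right]
    \left[ \mathbf{W} \mathbf{\Lambda}^{-1} \mathbf{W}^T + \frac{1}{\sigma^2} (\mathbf{I}_d - \mathbf{W}\mathbf{W}^T) \right] \\
 &= \mathbf{W} \mathbf{\Lambda} \mathbf{\Lambda}^{-1} \mathbf{W}^T
    + \frac{1}{\sigma^2} \, \mathbf{W} \mathbf{\Lambda} \left( \mathbf{W}^T - \mathbf{W}^T \mathbf{W} \mathbf{W}^T \right)
    + \sigma^2 (\mathbf{I}_d - \mathbf{W}\mathbf{W}^T) \mathbf{W} \mathbf{\Lambda}^{-1} \mathbf{W}^T
    + (\mathbf{I}_d - \mathbf{W}\mathbf{W}^T)^2 \\
 &= \mathbf{W}\mathbf{W}^T + \mathbf{0} + \mathbf{0} + \mathbf{I}_d - \mathbf{W}\mathbf{W}^T = \mathbf{I}_d \, ;
\end{align*}

the computation of $\mathbf{B}^{-1}\mathbf{B}$ proceeds analogously.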

The eigenvalues of $\mathbf{B}$ are $\lambda_1, \ldots, \lambda_q$ and $\sigma^2$; the latter occurs $(d - q)$ times. Thus, the determinant of $\mathbf{B}$ is

\det \mathbf{B} = \left( \sigma^2 \right)^{d-q} \prod_{j=1}^{q} \lambda_j \, .   (C.5)
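
To justify the eigenvalue claim (a brief sketch, again using $\mathbf{W}^T \mathbf{W} = \mathbf{I}_q$): for a column $\mathbf{w}_j$ of $\mathbf{W}$ and for any $\mathbf{v}$ orthogonal to all columns of $\mathbf{W}$, (C.3) gives

\begin{displaymath}
\mathbf{B} \mathbf{w}_j = \mathbf{W} \mathbf{\Lambda} \mathbf{W}^T \mathbf{w}_j + \sigma^2 (\mathbf{I}_d - \mathbf{W}\mathbf{W}^T) \mathbf{w}_j = \lambda_j \mathbf{w}_j \, ,
\qquad
\mathbf{B} \mathbf{v} = \sigma^2 \mathbf{v} \, ,
\end{displaymath}

and $\det \mathbf{B}$ is the product of all $d$ eigenvalues.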

Finally, we evaluate the logarithm of $p(\mathbf{x})$ using (C.1), (C.4), and (C.5):

\ln p(\mathbf{x}) = -\frac{d}{2} \ln(2\pi) - \frac{1}{2} E(\mathbf{x} - \mathbf{c})   (C.6)

with

E(\boldsymbol{\xi}) = \boldsymbol{\xi}^T \mathbf{W} \mathbf{\Lambda}^{-1} \mathbf{W}^T \boldsymbol{\xi} + \frac{1}{\sigma^2} \left( \boldsymbol{\xi}^T \boldsymbol{\xi} - \boldsymbol{\xi}^T \mathbf{W} \mathbf{W}^T \boldsymbol{\xi} \right) + \sum_{j=1}^{q} \ln \lambda_j + (d - q) \ln \sigma^2 \, ,   (C.7)

and $\boldsymbol{\xi} = \mathbf{x} - \mathbf{c}$. $E$ is a normalized Mahalanobis distance plus reconstruction error.
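
The identity (C.6), and thus the relation $-2 \ln p(\mathbf{x}) = E(\mathbf{x} - \mathbf{c}) + d \ln(2\pi)$, can also be checked numerically; the following standalone NumPy sketch (illustrative, not from the text) does so on random data:

import numpy as np

rng = np.random.default_rng(0)
d, q, n = 5, 2, 1000
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))    # correlated sample data

c = X.mean(axis=0)
evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
evals, evecs = evals[::-1], evecs[:, ::-1]               # descending order
W, lam = evecs[:, :q], evals[:q]
sigma2 = evals[q:].mean()                                # ML noise variance
B = W @ np.diag(lam) @ W.T + sigma2 * (np.eye(d) - W @ W.T)   # (C.3)

xi = X[0] - c
log_p = -0.5 * (d * np.log(2 * np.pi) + np.log(np.linalg.det(B))
                + xi @ np.linalg.solve(B, xi))           # logarithm of (C.1)
E = (xi @ W @ np.diag(1.0 / lam) @ W.T @ xi
     + (xi @ xi - xi @ W @ W.T @ xi) / sigma2
     + np.log(lam).sum() + (d - q) * np.log(sigma2))     # (C.7)
print(np.isclose(-2.0 * log_p - d * np.log(2 * np.pi), E))   # expected: True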

