C.1 Probabilistic PCA and error measures
In probabilistic principal component analysis, the observed d-dimensional data $\{\mathbf{x}_i\}$ are assumed to originate from a probability density $p(\mathbf{x})$. This density can be written as

$$p(\mathbf{x}) = (2\pi)^{-d/2} (\det \mathbf{C})^{-1/2} \exp\left[-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \mathbf{C}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right] \;, \quad (C.1)$$

with

$$\mathbf{C} = \sigma^2 \mathbf{I}_d + \mathbf{W} \mathbf{W}^T$$

(Tipping and Bishop, 1997).
$\mathbf{I}_d$ is the d-dimensional identity matrix, and $\sigma^2$ is the noise variance. The $d \times q$ matrix $\mathbf{W}$ is obtained by maximizing the likelihood of the data $\{\mathbf{x}_i\}$ given the density $p(\mathbf{x})$. Tipping and Bishop (1999) showed that the result is

$$\mathbf{W} = \mathbf{U} (\boldsymbol{\Lambda} - \sigma^2 \mathbf{I}_q)^{1/2} \mathbf{R} \;. \quad (C.2)$$

The columns of the $d \times q$ matrix $\mathbf{U}$ are the q principal eigenvectors of the covariance matrix of $\{\mathbf{x}_i\}$. The q largest eigenvalues $\lambda_1, \ldots, \lambda_q$ of the covariance matrix are the entries of the diagonal matrix $\boldsymbol{\Lambda}$. $\mathbf{R}$ is an arbitrary $q \times q$ rotation matrix.
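As a numerical sanity check (a sketch, not part of the original derivation), the snippet below builds $\mathbf{W}$ as in (C.2) from the eigendecomposition of a sample covariance matrix and confirms that $\mathbf{W}\mathbf{W}^T = \mathbf{U}(\boldsymbol{\Lambda} - \sigma^2\mathbf{I}_q)\mathbf{U}^T$, since $\mathbf{R}\mathbf{R}^T = \mathbf{I}_q$. NumPy and all concrete values (dimensions, data, the choice of $\mathbf{R}$) are assumptions of this sketch; the maximum-likelihood $\sigma^2$ is the mean of the discarded eigenvalues (Tipping and Bishop, 1999).

```python
import numpy as np

# Sketch: construct W = U (Lambda - sigma^2 I_q)^{1/2} R from sample data.
# All dimensions and data below are made up for this check.
rng = np.random.default_rng(0)
d, q, n = 5, 2, 1000
X = rng.standard_normal((n, d)) @ rng.standard_normal((d, d))

S = np.cov(X, rowvar=False)                  # sample covariance matrix
evals, evecs = np.linalg.eigh(S)             # eigenvalues in ascending order
order = np.argsort(evals)[::-1]              # sort descending
evals, evecs = evals[order], evecs[:, order]

U = evecs[:, :q]                             # q principal eigenvectors
Lam = np.diag(evals[:q])                     # q largest eigenvalues
sigma2 = evals[q:].mean()                    # ML noise variance: mean of the rest
R = np.linalg.qr(rng.standard_normal((q, q)))[0]  # arbitrary q x q rotation

W = U @ np.sqrt(Lam - sigma2 * np.eye(q)) @ R

# Since R R^T = I_q, we get W W^T = U (Lambda - sigma^2 I_q) U^T.
assert np.allclose(W @ W.T, U @ (Lam - sigma2 * np.eye(q)) @ U.T)
```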
In the following, it is shown that the double negative logarithm of (C.1) equals the normalized Mahalanobis distance plus reconstruction error (section 3.2.1) plus a constant. Using (C.2) to rewrite the expression for $\mathbf{C}$, and noting that $\mathbf{R}\mathbf{R}^T = \mathbf{I}_q$, gives

$$\mathbf{C} = \sigma^2 \mathbf{I}_d + \mathbf{W}\mathbf{W}^T = \sigma^2 \mathbf{I}_d + \mathbf{U} (\boldsymbol{\Lambda} - \sigma^2 \mathbf{I}_q) \mathbf{U}^T \;. \quad (C.3)$$
By showing that $\mathbf{C}\mathbf{C}^{-1} = \mathbf{I}_d$ and $\mathbf{C}^{-1}\mathbf{C} = \mathbf{I}_d$, we can verify that the inverse of $\mathbf{C}$ is

$$\mathbf{C}^{-1} = \frac{1}{\sigma^2} \left(\mathbf{I}_d - \mathbf{U}\mathbf{U}^T\right) + \mathbf{U} \boldsymbol{\Lambda}^{-1} \mathbf{U}^T \;. \quad (C.4)$$
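The claimed inverse (C.4) can also be checked numerically. The sketch below (with made-up dimensions, eigenvalues, and noise variance) verifies both products $\mathbf{C}\mathbf{C}^{-1}$ and $\mathbf{C}^{-1}\mathbf{C}$:

```python
import numpy as np

# Sketch: verify C^{-1} = (1/sigma^2)(I_d - U U^T) + U Lambda^{-1} U^T.
# The orthonormal basis, eigenvalues, and noise variance are made up.
rng = np.random.default_rng(1)
d, q = 6, 3
U = np.linalg.qr(rng.standard_normal((d, d)))[0][:, :q]  # orthonormal columns
lam = np.array([5.0, 4.0, 3.0])              # q largest eigenvalues
sigma2 = 0.5                                 # noise variance

C = sigma2 * np.eye(d) + U @ np.diag(lam - sigma2) @ U.T
C_inv = (np.eye(d) - U @ U.T) / sigma2 + U @ np.diag(1 / lam) @ U.T

assert np.allclose(C @ C_inv, np.eye(d))     # C C^{-1} = I_d
assert np.allclose(C_inv @ C, np.eye(d))     # C^{-1} C = I_d
```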
The eigenvalues of $\mathbf{C}$ are $\lambda_1, \ldots, \lambda_q$ and $\sigma^2$; the latter occurs $(d-q)$ times. Thus, the determinant of $\mathbf{C}$ is

$$\det \mathbf{C} = \sigma^{2(d-q)} \prod_{i=1}^q \lambda_i \;. \quad (C.5)$$
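The determinant formula (C.5) can likewise be confirmed numerically; all concrete values in this sketch are made up:

```python
import numpy as np

# Sketch: verify det C = sigma^(2(d-q)) * prod(lambda_i) with made-up values.
rng = np.random.default_rng(2)
d, q = 6, 3
U = np.linalg.qr(rng.standard_normal((d, d)))[0][:, :q]  # orthonormal columns
lam = np.array([5.0, 4.0, 3.0])              # q largest eigenvalues
sigma2 = 0.5                                 # noise variance

C = sigma2 * np.eye(d) + U @ np.diag(lam - sigma2) @ U.T
# Here det C = 0.5^3 * (5 * 4 * 3) = 7.5.
assert np.isclose(np.linalg.det(C), sigma2 ** (d - q) * lam.prod())
```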
Finally, we evaluate the logarithm of $p(\mathbf{x})$ using (C.1), (C.4), and (C.5):

$$\ln p(\mathbf{x}) = -\frac{d}{2} \ln(2\pi) - \frac{1}{2} E(\mathbf{x} - \boldsymbol{\mu}) \;, \quad (C.6)$$

with

$$E(\boldsymbol{\xi}) = \boldsymbol{\xi}^T \mathbf{U} \boldsymbol{\Lambda}^{-1} \mathbf{U}^T \boldsymbol{\xi} + \frac{1}{\sigma^2} \left(\boldsymbol{\xi}^T \boldsymbol{\xi} - \boldsymbol{\xi}^T \mathbf{U}\mathbf{U}^T \boldsymbol{\xi}\right) + \sum_{i=1}^q \ln \lambda_i + (d-q) \ln \sigma^2 \;, \quad (C.7)$$

and $\boldsymbol{\xi} = \mathbf{x} - \boldsymbol{\mu}$. $E$ is a normalized Mahalanobis distance plus reconstruction error.
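The whole chain can be tested at once: evaluating $\ln p(\mathbf{x})$ directly from (C.1) should satisfy $-2\ln p(\mathbf{x}) = d\ln(2\pi) + E(\mathbf{x} - \boldsymbol{\mu})$. The sketch below checks this for made-up values of the basis, eigenvalues, noise variance, and data point:

```python
import numpy as np

# Sketch: confirm -2 ln p(x) = d ln(2 pi) + E(x - mu), i.e. (C.6) with (C.7).
# All concrete values (basis, eigenvalues, points) are made up.
rng = np.random.default_rng(3)
d, q = 6, 3
U = np.linalg.qr(rng.standard_normal((d, d)))[0][:, :q]  # orthonormal columns
lam = np.array([5.0, 4.0, 3.0])                  # q largest eigenvalues
sigma2 = 0.5                                     # noise variance
mu, x = rng.standard_normal(d), rng.standard_normal(d)
xi = x - mu

C = sigma2 * np.eye(d) + U @ np.diag(lam - sigma2) @ U.T
log_p = (-0.5 * d * np.log(2 * np.pi)            # ln of (C.1), evaluated directly
         - 0.5 * np.log(np.linalg.det(C))
         - 0.5 * xi @ np.linalg.inv(C) @ xi)

E = (xi @ U @ np.diag(1 / lam) @ U.T @ xi        # Mahalanobis part
     + (xi @ xi - xi @ U @ U.T @ xi) / sigma2    # reconstruction error part
     + np.log(lam).sum() + (d - q) * np.log(sigma2))

assert np.isclose(-2 * log_p, d * np.log(2 * np.pi) + E)
```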
Heiko Hoffmann
2005-03-22