Next: C.2 The eigenvalue equation
Up: C. Proofs
Previous: C. Proofs
C.1 Probabilistic PCA and error measures
In probabilistic principal component analysis, the observed d-dimensional data
{x_i} are assumed to originate from a probability density
p(x). This density can be written as

p(x) = (2π)^{-d/2} (det C)^{-1/2} exp( -(1/2) (x − μ)^T C^{-1} (x − μ) ) ,    (C.1)
with
C = σ² I_d + W W^T (Tipping and Bishop, 1997). I_d is the d-dimensional identity matrix, and σ² is the noise variance. The d×q matrix W is obtained by maximizing the likelihood of the data
{x_i} under the density
p(x). Tipping and Bishop (1999) showed that the result is
W = U (Λ − σ² I_q)^{1/2} R .    (C.2)

The columns of the d×q matrix U are the q principal
eigenvectors of the covariance matrix of
{x_i}.
The q largest eigenvalues of the covariance matrix are the entries of the diagonal matrix
Λ. R is an arbitrary q×q rotation
matrix.
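As a concrete illustration of (C.2), the following numpy sketch (all variable names and the synthetic data are our own, not from the text) builds W from the sample covariance of a data set, choosing R = I_q and setting σ² to the maximum-likelihood value, the mean of the discarded eigenvalues (Tipping and Bishop, 1999):

```python
import numpy as np

# Illustrative sketch, not the author's code: construct the
# maximum-likelihood W of (C.2) with R = I_q.
rng = np.random.default_rng(0)
d, q, n = 5, 2, 2000
X = rng.standard_normal((n, d)) @ rng.standard_normal((d, d))
S = np.cov(X, rowvar=False)            # sample covariance (d x d)

eigval, eigvec = np.linalg.eigh(S)     # eigenvalues in ascending order
order = np.argsort(eigval)[::-1]
lam = eigval[order][:q]                # q largest eigenvalues (diag of Lambda)
U = eigvec[:, order[:q]]               # q principal eigenvectors

# ML noise variance: mean of the discarded eigenvalues
sigma2 = eigval[order][q:].mean()

W = U @ np.diag(np.sqrt(lam - sigma2)) # W = U (Lambda - sigma^2 I)^(1/2), R = I
C = sigma2 * np.eye(d) + W @ W.T       # model covariance of the density (C.1)
```

The eigenvalues of the resulting model covariance C are λ_1, ..., λ_q together with σ² repeated (d − q) times, which is what the determinant computation below relies on.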
In the following, it is shown that minus twice the logarithm of (C.1), −2 ln p(x), equals the normalized Mahalanobis distance plus reconstruction error (section 3.2.1) plus a constant. Using (C.2) to rewrite the expression for C gives

C = σ² I_d + U (Λ − σ² I_q)^{1/2} R R^T (Λ − σ² I_q)^{1/2} U^T = σ² I_d + U (Λ − σ² I_q) U^T ,    (C.3)

since R R^T = I_q.
By showing that
C C^{-1} = I_d and
C^{-1} C = I_d, we can verify that the inverse of C is

C^{-1} = (1/σ²) (I_d − U U^T) + U Λ^{-1} U^T .    (C.4)

The eigenvalues of C are
λ_1, ..., λ_q, and σ². The latter occurs (d − q) times. Thus, the determinant of C is

det C = σ^{2(d−q)} λ_1 λ_2 ⋯ λ_q .    (C.5)
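Both claims can be checked numerically. A small sketch (our own variable names and synthetic values, not from the text) builds C from an orthonormal U and compares the closed-form inverse and determinant against direct computation:

```python
import numpy as np

# Numerical check of the closed-form inverse (C.4) and determinant (C.5)
# of C = sigma^2 I + U (Lambda - sigma^2 I) U^T.
rng = np.random.default_rng(1)
d, q = 6, 2
sigma2 = 0.3
U = np.linalg.qr(rng.standard_normal((d, d)))[0][:, :q]  # orthonormal columns
lam = np.array([4.0, 2.5])                               # eigenvalues > sigma^2

C = sigma2 * np.eye(d) + U @ np.diag(lam - sigma2) @ U.T

# Claimed inverse: (1/sigma^2)(I - U U^T) + U Lambda^(-1) U^T
C_inv = (np.eye(d) - U @ U.T) / sigma2 + U @ np.diag(1.0 / lam) @ U.T

# Claimed determinant: sigma^(2(d-q)) * lam_1 * ... * lam_q
det_C = sigma2 ** (d - q) * np.prod(lam)
```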
Finally, we evaluate the logarithm of
p(x) using (C.1), (C.4), and (C.5):

ln p(x) = −(d/2) ln(2π) − (1/2) E(x − μ)    (C.6)

with

E(ξ) = y^T Λ^{-1} y + (1/σ²) (ξ^T ξ − y^T y) + ln det Λ + (d − q) ln σ² ,    (C.7)

and
y = U^T ξ, with ξ = x − μ. E is a normalized Mahalanobis distance plus reconstruction error.
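As a sanity check, the following sketch (our own names and synthetic values, not from the text) verifies numerically that −2 ln p(x) equals d ln(2π) plus the error measure E(x − μ):

```python
import numpy as np

# Check that -2 ln p(x) = d ln(2 pi) + E(x - mu), up to floating-point error.
rng = np.random.default_rng(2)
d, q = 5, 2
sigma2 = 0.5
U = np.linalg.qr(rng.standard_normal((d, d)))[0][:, :q]  # orthonormal columns
lam = np.array([3.0, 1.5])                               # eigenvalues > sigma^2
mu = rng.standard_normal(d)
x = rng.standard_normal(d)

C = sigma2 * np.eye(d) + U @ np.diag(lam - sigma2) @ U.T

# Exact Gaussian log-density from (C.1)
xi = x - mu
log_p = -0.5 * (d * np.log(2 * np.pi) + np.log(np.linalg.det(C))
                + xi @ np.linalg.solve(C, xi))

# Error measure (C.7): normalized Mahalanobis distance plus reconstruction error
y = U.T @ xi
E = (y @ (y / lam) + (xi @ xi - y @ y) / sigma2
     + np.log(np.prod(lam)) + (d - q) * np.log(sigma2))
```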
Heiko Hoffmann
2005-03-22