

2.4.2 Centering in feature space

So far, we have assumed that the mapped points {$\varphi(\mathbf{x}_i)$} have zero mean, an assumption that is usually not fulfilled. Therefore, the formalism needs to be adjusted (Schölkopf et al., 1998b). We center the points as follows:

$\tilde{\boldsymbol{\varphi}}(\mathbf{x}_i) = \varphi(\mathbf{x}_i) - \frac{1}{n}\sum_{r=1}^{n}\varphi(\mathbf{x}_r) \;. \qquad (2.31)$

The above analysis holds if the covariance matrix is computed from $\tilde{\boldsymbol{\varphi}}(\mathbf{x}_i)$. Thus, the kernel matrix $K_{ij} = \varphi(\mathbf{x}_i)^T \varphi(\mathbf{x}_j)$ needs to be replaced by $\tilde{K}_{ij} = \tilde{\boldsymbol{\varphi}}(\mathbf{x}_i)^T \tilde{\boldsymbol{\varphi}}(\mathbf{x}_j)$. Using (2.31), $\tilde{K}$ can be written as


$\tilde{K}_{ij} = \varphi(\mathbf{x}_i)^T\varphi(\mathbf{x}_j) - \frac{1}{n}\sum_{r=1}^{n}\varphi(\mathbf{x}_i)^T\varphi(\mathbf{x}_r) - \frac{1}{n}\sum_{r=1}^{n}\varphi(\mathbf{x}_r)^T\varphi(\mathbf{x}_j) + \frac{1}{n^2}\sum_{r,s=1}^{n}\varphi(\mathbf{x}_r)^T\varphi(\mathbf{x}_s)$

$\phantom{\tilde{K}_{ij}} = K_{ij} - \frac{1}{n}\sum_{r=1}^{n}K_{ir} - \frac{1}{n}\sum_{r=1}^{n}K_{rj} + \frac{1}{n^2}\sum_{r,s=1}^{n}K_{rs} \;. \qquad (2.32)$
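In matrix form, (2.32) is the double centering $\tilde{\mathbf{K}} = \mathbf{K} - \mathbf{1}_n\mathbf{K} - \mathbf{K}\mathbf{1}_n + \mathbf{1}_n\mathbf{K}\mathbf{1}_n$, where $\mathbf{1}_n$ denotes the $n \times n$ matrix whose entries are all $1/n$. The following NumPy sketch illustrates this (the Gaussian kernel and the random data are purely illustrative, not taken from the thesis); since the centered features have zero mean, every row and column of $\tilde{\mathbf{K}}$ must sum to zero, which the last line checks.

    import numpy as np

    def center_kernel_matrix(K):
        # Double centering, Eq. (2.32): K~ = K - 1n K - K 1n + 1n K 1n,
        # where 1n is the n x n matrix with all entries equal to 1/n.
        n = K.shape[0]
        one_n = np.full((n, n), 1.0 / n)
        return K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # Illustrative data and kernel (Gaussian kernel on random points)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 3))
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / 2.0)

    K_tilde = center_kernel_matrix(K)
    # Centered features have zero mean, so each row/column of K~ sums to zero
    print(np.allclose(K_tilde.sum(axis=0), 0.0))  # True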

Therefore, we can evaluate the kernel matrix for the centered data using the known matrix $\mathbf{K}$. For the remainder of this thesis, I denote by $\alpha$ the eigenvectors of $\tilde{\mathbf{K}}$ instead of $\mathbf{K}$; they are normalized according to (2.29) using the eigenvalues of $\tilde{\mathbf{K}}$. The principal components are $\tilde{\mathbf{w}} = \sum_{i=1}^{n} \alpha_i \tilde{\boldsymbol{\varphi}}(\mathbf{x}_i)$.
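A minimal sketch of the resulting kernel-PCA computation follows. It assumes that the normalization (2.29), which is not reproduced in this section, is the usual kernel-PCA condition $\lambda \, (\alpha \cdot \alpha) = 1$ for each eigenvector $\alpha$ of $\tilde{\mathbf{K}}$; the linear kernel and the toy data are illustrative only.

    import numpy as np

    def kernel_pca_projections(K, k):
        # Center the kernel matrix as in Eq. (2.32)
        n = K.shape[0]
        one_n = np.full((n, n), 1.0 / n)
        K_tilde = K - one_n @ K - K @ one_n + one_n @ K @ one_n

        # Eigenvectors alpha of K~; eigh returns eigenvalues in ascending
        # order, so reverse to get the leading components first
        eigvals, eigvecs = np.linalg.eigh(K_tilde)
        eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

        # Assumed normalization (2.29): lambda * (alpha . alpha) = 1
        alphas = eigvecs[:, :k] / np.sqrt(eigvals[:k])

        # Projections of the training points onto w~ are the rows of K~ alpha
        return K_tilde @ alphas

    # Illustrative: linear kernel on four 2-D points, two components kept
    X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 1.0]])
    print(kernel_pca_projections(X @ X.T, k=2))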

