next up previous contents
Next: 2.2.2 Soft-clustering Up: 2.2 Vector quantization Previous: 2.2 Vector quantization

2.2.1 K-means

The most used hard-clustering algorithm is k-means (Moody and Darken, 1989; Lloyd, 1982). Here, in each iteration step, first, based on the Voronoi tessellation, for all j, P(j|$ \bf x_{i}^{}$) is calculated for one $ \bf x_{i}^{}$ (on-line version) or all patterns (batch version). Then, in the on-line version, the code-book assigned to $ \bf x_{i}^{}$ is moved closer to $ \bf x_{i}^{}$ (using (2.13) with $ \varepsilon$(t) = $ \varepsilon$). In the batch version, each $ \bf c_{j}^{}$ is moved to the center of its assigned patterns (which minimizes (2.12) given the assignment P(j|$ \bf x_{i}^{}$)). These steps are repeated until convergence is reached. The disadvantage of k-means is that it is prone to end at a local minimum, and therefore, its success depends on the initial choice of {$ \bf c_{j}^{}$}.



Heiko Hoffmann
2005-03-22