2.2.1 K-means

Next: 2.2.2 Soft-clustering Up: 2.2 Vector quantization Previous: 2.2 Vector quantization

2.2.1 K-means

The most used hard-clustering algorithm is k-means (Moody and Darken, 1989; Lloyd, 1982). Here, in each iteration step, first, based on the Voronoi tessellation, for all j, P(j| $\bf x_{i}^{}$ ) is calculated for one $\bf x_{i}^{}$ (on-line version) or all patterns (batch version). Then, in the on-line version, the code-book assigned to $\bf x_{i}^{}$ is moved closer to $\bf x_{i}^{}$ (using (2.13) with $\varepsilon$ (t) = $\varepsilon$ ). In the batch version, each $\bf c_{j}^{}$ is moved to the center of its assigned patterns (which minimizes (2.12) given the assignment P(j| $\bf x_{i}^{}$ )). These steps are repeated until convergence is reached. The disadvantage of k-means is that it is prone to end at a local minimum, and therefore, its success depends on the initial choice of { $\bf c_{j}^{}$ }.

Heiko Hoffmann
2005-03-22