
2.2.2 Soft-clustering

In soft-clustering, many code-book vectors compete for each pattern $\mathbf{x}_i$. A common assignment $P(j|\mathbf{x}_i)$ is the normalized Gaussian function (see, for example, Yair et al. (1992)),

$P(j|\mathbf{x}_i) = \frac{\exp\left(-\beta \Vert\mathbf{x}_i - \mathbf{c}_j\Vert^2\right)}{\sum_{j'}\exp\left(-\beta \Vert\mathbf{x}_i - \mathbf{c}_{j'}\Vert^2\right)} \;. \qquad (2.14)$

The parameter $\beta$ controls the range of influence of a pattern $\mathbf{x}_i$ or code-book vector. In the limit $\beta \rightarrow \infty$, the algorithm reduces to hard-clustering. $P(j|\mathbf{x}_i)$ is normalized such that $\sum_j P(j|\mathbf{x}_i) = 1$, which allows a probabilistic interpretation. Equation (2.14) can be derived from the maximum entropy principle: among all possible sets $\{P(j|\mathbf{x}_i)\}$ that result in a given total cost $E$ according to (2.12), the Gibbs distribution (2.14) maximizes the entropy (Rose et al., 1990). Thus, the assignment (2.14) does not rely on any assumptions about the distribution of patterns. In this physical perspective, $\beta$ plays the role of an inverse temperature. The soft-clustering approach can be extended to an annealing process, which helps to avoid local minima (Rose et al., 1990).
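
As an illustration (not part of the original text), a minimal NumPy sketch of the assignment (2.14) follows; the function name soft_assignments and the example data are hypothetical. Subtracting the row-wise minimum squared distance before exponentiating leaves (2.14) unchanged, since the shift cancels in the normalization, but avoids numerical overflow at large $\beta$. The short demonstration shows the assignments hardening toward the winner-take-all rule of k-means as $\beta \rightarrow \infty$.

import numpy as np

def soft_assignments(X, C, beta):
    """Return P[i, j] = P(j | x_i) for patterns X (n, d) and code-book C (k, d)."""
    # Squared Euclidean distances ||x_i - c_j||^2, shape (n, k).
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    # Shift by the row-wise minimum for numerical stability at large beta;
    # the shift cancels in the normalization, so (2.14) is unchanged.
    logits = -beta * (d2 - d2.min(axis=1, keepdims=True))
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

# For small beta the assignments are nearly uniform; as beta grows they
# approach the hard assignment of k-means (hypothetical example data).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))   # five 2-D patterns
C = rng.normal(size=(3, 2))   # three code-book vectors
for beta in (0.1, 1.0, 100.0):
    print(beta, np.round(soft_assignments(X, C, beta)[0], 3))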

