Parent Topic: FUZCLUS

ALGORITHM

The fuzzy K-means algorithm is based on minimisation of the following objective function, with respect to U, a fuzzy K-partition of the data set, and to V, a set of K prototypes:

    J_q(U,V) = sum(j=1,N)sum(i=1,K) u_{ij}^q d^2(Xj,Vi)
where Xj is the j-th feature vector, Vi is the centroid of the i-th cluster, u_{ij} is the degree of membership of Xj in the i-th cluster, d(Xj,Vi) is some distance metric between Xj and Vi (here we use the Euclidean distance), N is the number of data points, K is the number of clusters. The parameter q is the weighting exponent for u_{ij} and controls the "fuzziness" of the resulting cluster. In this implementation, a fixed value of q=2 is chosen which appears to be reasonable for most applications and it also allows a fast implementation.

Fuzzy partition is carried out through an iterative optimization:

(1) Choose initial cluster centroids (seeds) Vi (2) Compute the degree of membership of all feature vectors in all the clusters:

                    (1/d^2(Xj,Vi))^(1/(q-1))
     u_{ij} = -----------------------------------
              sum(k=1,K) (1/d^2(Xj,Vk))^(1/(q-1))
(3) Compute new centroids Vi_new:

                 sum(j=1,N)(u_{ij})^q Xj
     Vi_new = -----------------------------
                  sum(j=1,N)(u_{ij})^q
(4) When the movement of centroids (relative changes) is less than a predetermined threshold MOVETHRS, stop the iteration. Otherwise go to step 2. The algorithm will also terminate when a maximum number of iterations is reached.

(5) Finally, a data point Xj is assigned to cluster i if the fuzzy membership u_{ij} >= u_{kj} for all k clusters.

The cluster centroids Vi will be saved in a signature segment, if requested.

More details about the Fuzzy K-means method can be found in the following publication:

 J.C. Bezdek, "Guzzy mathematics in pattern classification",
 Ph.D. dissertation, Cornell Univ., Itheca, NY, 1973.

Parent Topic: FUZCLUS
About PCI Help Gateway