Parent Topic: FUZCLUS
J_q(U,V) = sum(j=1,N)sum(i=1,K) u_{ij}^q d^2(Xj,Vi)
where Xj is the j-th feature vector, Vi is the centroid of the i-th
cluster, u_{ij} is the degree of membership of Xj in the i-th
cluster, d(Xj,Vi) is some distance metric between Xj and Vi (here we
use the Euclidean distance), N is the number of data points, K is
the number of clusters. The parameter q is the weighting exponent
for u_{ij} and controls the "fuzziness" of the resulting cluster. In
this implementation, a fixed value of q=2 is chosen which appears to
be reasonable for most applications and it also allows a fast
implementation.Fuzzy partition is carried out through an iterative optimization:
(1) Choose initial cluster centroids (seeds) Vi (2) Compute the degree of membership of all feature vectors in all the clusters:
(1/d^2(Xj,Vi))^(1/(q-1))
u_{ij} = -----------------------------------
sum(k=1,K) (1/d^2(Xj,Vk))^(1/(q-1))
(3) Compute new centroids Vi_new:
sum(j=1,N)(u_{ij})^q Xj
Vi_new = -----------------------------
sum(j=1,N)(u_{ij})^q
(4) When the movement of centroids (relative changes) is less than a
predetermined threshold MOVETHRS, stop the iteration. Otherwise go
to step 2. The algorithm will also terminate when a maximum number
of iterations is reached.(5) Finally, a data point Xj is assigned to cluster i if the fuzzy membership u_{ij} >= u_{kj} for all k clusters.
The cluster centroids Vi will be saved in a signature segment, if requested.
More details about the Fuzzy K-means method can be found in the following publication:
J.C. Bezdek, "Guzzy mathematics in pattern classification", Ph.D. dissertation, Cornell Univ., Itheca, NY, 1973.