Parent Topic: KCLUS

DETAILS

This program uses the K-means method to classify image data into clusters. Up to 16 image channels can be analyzed, and up to 255 clusters (classes) can be found.

The program reads in image data from the specified channels in the specified file.

Because processing every pixel would require a large amount of memory, KCLUS samples a subset of the image data to generate the histogram used for calculating cluster means. The sampling rate depends on the amount of image data.

For example, for 1024x1024 image data, KCLUS will sample every other pixel during calculation of cluster means. However, when writing results to an output channel, all pixels will be classified.
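The difference between the sampled pass and the full classification pass can be sketched in Python as follows. This is only an illustration: the text does not say how KCLUS chooses its sample, so the fixed stride and the function name sample_pixels are assumptions.

```python
def sample_pixels(image, step=2):
    """Return every step-th pixel of a 2-D image, scanning row by row.

    The fixed stride is an assumption made for illustration; the actual
    sampling scheme used by KCLUS is not described in this topic.
    """
    flat = [value for row in image for value in row]
    return flat[::step]


image = [[10, 20, 30, 40],
         [50, 60, 70, 80]]

# Only the sample contributes to the cluster means.
sampled = sample_pixels(image, step=2)

# Every pixel is still classified when the output channel is written.
all_pixels = sample_pixels(image, step=1)
```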

The MASK parameter specifies the area within the input channel which will be processed. Only the area under the mask will be classified; the rest of the image will not be processed. If a single value is specified, it refers to a bitmap segment that defines the area to be classified. If four values are specified, they define the x,y offsets and x,y dimensions of a rectangular window within the image to be classified.
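As a rough sketch of the four-value form, the Python fragment below turns an x offset, y offset, x size and y size into a per-pixel mask. The function name window_mask is hypothetical, and zero-based offsets measured from the top-left corner are an assumption, not something this topic states.

```python
def window_mask(width, height, mask):
    """Build a boolean mask from four values: x offset, y offset, x size, y size.

    Assumes zero-based offsets counted from the top-left corner of the image.
    """
    xoff, yoff, xsize, ysize = mask
    return [[xoff <= x < xoff + xsize and yoff <= y < yoff + ysize
             for x in range(width)]
            for y in range(height)]


# A 4x3 image with a window 2 pixels wide and 1 pixel high at offset (1, 1).
mask = window_mask(4, 3, (1, 1, 2, 1))
for row in mask:
    print("".join("#" if inside else "." for inside in row))
```

Pixels marked True would be classified; all others would be left unprocessed.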

It is quite common for satellite images to have large black-filled areas (grey level zero) which should not be included in the classification. To exclude them, the user can first run the program THR with the minimum and maximum values of TVAL set to 1 and 255, respectively. This creates a bitmap mask covering only the image area. The user then specifies this bitmap as the MASK parameter in this program.
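The effect of thresholding between 1 and 255 can be illustrated with a few lines of Python. This is not the THR program itself, only a sketch of the same idea on a small single-channel image; the function name threshold_mask is hypothetical.

```python
def threshold_mask(channel, tmin=1, tmax=255):
    """Mark pixels whose grey level lies in [tmin, tmax].

    With tmin=1 and tmax=255, black-filled pixels (grey level 0) are excluded.
    """
    return [[tmin <= value <= tmax for value in row] for row in channel]


channel = [[0, 0, 12, 40],
           [0, 7, 255, 0]]
bitmap = threshold_mask(channel)
```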

The user can specify the number of clusters desired through the NUMCLUS parameter. This can be any value between 1 and 255. The initial seed values can be entered in a text file and specified by the SEEDFILE parameter. If no filename is given in the SEEDFILE parameter, seeds will be generated diagonally along the n-dimensional histogram.
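One possible reading of "diagonally along the n-dimensional histogram" is sketched below in Python: seeds are spaced evenly along the line on which every channel has the same grey level. The exact rule KCLUS applies is not documented here, so the spacing, the 0 to 255 range and the function name diagonal_seeds are all assumptions.

```python
def diagonal_seeds(num_clusters, num_channels, lo=0, hi=255):
    """Place seeds evenly along the diagonal of the grey-level space.

    Each seed sits at the centre of one of num_clusters equal segments
    between lo and hi. This spacing is an illustrative assumption.
    """
    seeds = []
    for k in range(num_clusters):
        level = lo + (hi - lo) * (k + 0.5) / num_clusters
        seeds.append([level] * num_channels)
    return seeds


seeds = diagonal_seeds(num_clusters=2, num_channels=3)
```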

The text file containing the initial seeds for 4 channels and 6 clusters would have the following format:

      1   1   1   1            | 1st seed, channels 1,2,3,4
      5   3   5   9            | 2nd seed, channels 1,2,3,4
     40  43  20  10            | 3rd seed, channels 1,2,3,4
    100 101 140  50            | 4th seed, channels 1,2,3,4
    150 155 200 175            | 5th seed, channels 1,2,3,4
    240 200 195 140            | 6th seed, channels 1,2,3,4

In the above example, the numbers are grey level values. For instance, the values 5, 3, 5, 9 are the grey levels of the second seed in channels 1, 2, 3 and 4, respectively.
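A seed file laid out this way could be read with a short Python routine like the one below. The function name parse_seeds is hypothetical, and treating everything after a "|" as an annotation is an assumption made so that the example can be parsed exactly as printed.

```python
def parse_seeds(text, num_channels):
    """Parse seed rows: one seed per line, one grey level per channel.

    Anything after a '|' is ignored (assumed to be an annotation).
    """
    seeds = []
    for line in text.splitlines():
        values = line.split("|", 1)[0].split()
        if not values:
            continue  # skip blank lines
        if len(values) != num_channels:
            raise ValueError(f"expected {num_channels} values, got {len(values)}")
        seeds.append([int(v) for v in values])
    return seeds


example = """\
      1   1   1   1            | 1st seed, channels 1,2,3,4
      5   3   5   9            | 2nd seed, channels 1,2,3,4
     40  43  20  10            | 3rd seed, channels 1,2,3,4
    100 101 140  50            | 4th seed, channels 1,2,3,4
    150 155 200 175            | 5th seed, channels 1,2,3,4
    240 200 195 140            | 6th seed, channels 1,2,3,4
"""
seeds = parse_seeds(example, num_channels=4)
```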

The user should define the maximum number of iterations allowed in the program through the parameter MAXITER and the movement threshold through the parameter MOVETHRS.
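The role of these two parameters can be seen in a minimal K-means loop, sketched here in Python. It is a textbook version of the method, not the KCLUS implementation. In particular, this topic does not define how the movement threshold is measured, so stopping when the largest shift of any cluster mean falls below the threshold is an assumption, as are the names kmeans, max_iter and move_thrs.

```python
import math


def kmeans(pixels, seeds, max_iter=20, move_thrs=0.01):
    """Iterate until the means stop moving or max_iter is reached.

    Returns the final means, the sample count per cluster, and the
    number of iterations performed.
    """
    means = [[float(v) for v in seed] for seed in seeds]
    counts = [0] * len(means)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        sums = [[0.0] * len(means[0]) for _ in means]
        counts = [0] * len(means)
        # Assign each pixel to its nearest mean (Euclidean distance).
        for pixel in pixels:
            k = min(range(len(means)), key=lambda i: math.dist(pixel, means[i]))
            counts[k] += 1
            for channel, value in enumerate(pixel):
                sums[k][channel] += value
        # Recompute the means; an empty cluster keeps its previous mean.
        new_means = [[s / counts[k] for s in sums[k]] if counts[k] else means[k]
                     for k in range(len(means))]
        moved = max(math.dist(old, new) for old, new in zip(means, new_means))
        means = new_means
        if moved < move_thrs:
            break
    return means, counts, iteration


pixels = [(0, 0), (2, 0), (10, 10), (12, 10)]
means, counts, iterations = kmeans(pixels, seeds=[(0, 0), (10, 10)])
```

In this small example the means move on the first pass and stay put on the second, so the loop ends well before the iteration limit.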

The result of the clustering is a theme map directed to a specified database image channel (DBOC). If a DBOC value is not specified, results will not be saved into a channel. A theme map encodes each cluster with a unique grey level equal to its cluster number. For example, cluster 1 is assigned grey level 1, and cluster 2 is assigned grey level 2. Grey level 0 represents unclassified pixels. If the theme map is later directed to the display, a pseudo-colour table should be loaded so that each cluster is represented by a different colour.
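The encoding can be illustrated with the following Python sketch, which builds a small theme map and one possible pseudo-colour table. The function names and the choice of evenly spaced hues are assumptions for illustration; any table that gives each grey level a distinct colour would serve.

```python
import colorsys


def theme_map(labels):
    """Encode cluster k (counted from 0) as grey level k + 1; None becomes 0."""
    return [[0 if k is None else k + 1 for k in row] for row in labels]


def pseudo_colour_table(num_clusters):
    """Map grey level 0 to black and each cluster to an evenly spaced hue."""
    table = {0: (0, 0, 0)}
    for level in range(1, num_clusters + 1):
        hue = (level - 1) / num_clusters
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        table[level] = (round(r * 255), round(g * 255), round(b * 255))
    return table


labels = [[0, 0, 1],
          [None, 2, 1]]   # None marks an unclassified pixel
theme = theme_map(labels)
table = pseudo_colour_table(3)
```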

KCLUS allows the user to specify a background grey level value (BACKVAL) to be ignored during classification. If this value is specified, pixels with the background grey level value will be assigned to class 0 (null class).
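A minimal Python sketch of this behaviour is shown below. For multi-channel data this topic does not say whether one or all channels must equal the background value; the sketch assumes all channels must match. The function name classify is hypothetical.

```python
import math


def classify(pixel, means, backval=None):
    """Return the class of a pixel: 0 for background, otherwise 1..N.

    A pixel counts as background only if every channel equals backval
    (an assumption; the rule for multi-channel data is not documented here).
    """
    if backval is not None and all(value == backval for value in pixel):
        return 0
    nearest = min(range(len(means)), key=lambda k: math.dist(pixel, means[k]))
    return nearest + 1


means = [(10, 10), (200, 200)]
background_class = classify((0, 0), means, backval=0)
without_backval = classify((0, 0), means)
bright_class = classify((190, 210), means, backval=0)
```

Without BACKVAL, the black pixel is simply assigned to the nearest cluster, which is why excluding it can matter for images with black-filled borders.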

KCLUS generates a report of the current cluster mean values and sample counts after each iteration.

After the execution of this program, the user can run the AGGREG program to view clusters or aggregate clusters. See AGGREG documentation for more details.

More details about the K-means method can be found in the following publication:

 Julius T. Tou and Rafael C. Gonzalez.  1974.  Pattern Recognition 
 Principles.  Addison-Wesley Publishing Co.
