Parent Topic: Hints and Tips

Improving Performance

Image registration is traditionally considered to be a computationally expensive operation. Advances in CPU speeds, Input/Output (I/O) throughput and memory sizes have greatly alleviated this problem, making an all-in-one interactive product like GCPWorks possible. However, final disk-to-disk registration may still take hours, so it can be worth trying to improve registration performance.

Reducing CPU Usage
When considering which resampling technique to use, it is important to realize that Cubic Convolution can take substantially more computational effort than Bilinear or Nearest Neighbour resampling. On today's fast workstations, this is less of a factor than it once was; however, if you believe that your registration time is being limited by the required amount of computation (CPU usage), switching from Cubic Convolution to another resampling method will speed things up.

Another factor affecting CPU usage is the order of the polynomial used to perform the geometric warping. The polynomials have to be evaluated for each pixel in the output image. High-order polynomials require substantially more operations to compute than low-order polynomials.
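The growth in work with polynomial order can be illustrated with a minimal sketch. It assumes a full two-variable polynomial containing every term x^i * y^j with i + j no greater than the order; the exact form used by GCPWorks is not stated here, and the names term_count and eval_poly are hypothetical.

```python
def term_count(order):
    """Number of coefficients in a full 2-D polynomial of the given order
    (assumption: every term x**i * y**j with i + j <= order is present)."""
    return (order + 1) * (order + 2) // 2


def eval_poly(coeffs, x, y, order):
    """Evaluate the sum of coeffs[k] * x**i * y**j over all i + j <= order.

    Coefficients are ordered with i as the outer index and j as the inner
    index. One such evaluation is needed per output pixel for each of the
    two input coordinates.
    """
    total = 0.0
    k = 0
    for i in range(order + 1):
        for j in range(order + 1 - i):
            total += coeffs[k] * (x ** i) * (y ** j)
            k += 1
    return total


if __name__ == "__main__":
    for order in (1, 2, 3, 5):
        print(order, term_count(order))  # prints 3, 6, 10 and 21 terms
```

Under this assumption a fifth order polynomial has seven times as many terms as a first order one, and that cost is paid for every output pixel.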

Input Caching and Disk Thrashing
It is much more likely that the limiting factor in a registration will be the time spent reading and writing data to disk. In particular, input imagery may have to be reread many times during a registration. To understand why this occurs, and what can be done to avoid unnecessary I/O, it is necessary to discuss how the registration works.

GCPWorks maintains a number of input scanlines in memory. These are called the "Cache", and the size of the cache determines the number of input scanlines that can be "remembered" at a time.

The registration process involves moving over the output image one scanline at a time. As each pixel along an output scanline is processed, the location is transformed according to the polynomial model to find a sampling position on the input image. The pixel value at this point in the input image is then fetched from the scanlines in the cache. If the transformed point does not fall on one of the scanlines already in the cache, the scanline is read in from disk, and the least recently used scanline in the cache is replaced.
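The process described above can be sketched as follows. This is an illustration only, not the GCPWorks implementation: the class ScanlineCache, the function register and the callback read_scanline are hypothetical names, the transform is assumed to return in-bounds integer coordinates, and sampling is simple nearest neighbour.

```python
from collections import OrderedDict


class ScanlineCache:
    """Least-recently-used cache of input scanlines (illustrative only)."""

    def __init__(self, capacity, read_scanline):
        self.capacity = capacity            # number of scanlines held in memory
        self.read_scanline = read_scanline  # hypothetical "read from disk" callback
        self.lines = OrderedDict()          # row index -> scanline, oldest first
        self.disk_reads = 0

    def get(self, row):
        if row in self.lines:
            self.lines.move_to_end(row)          # mark as most recently used
        else:
            if len(self.lines) >= self.capacity:
                self.lines.popitem(last=False)   # evict least recently used
            self.lines[row] = self.read_scanline(row)
            self.disk_reads += 1
        return self.lines[row]


def register(out_width, out_height, transform, cache):
    """Walk the output image one scanline at a time, sampling the input."""
    output = []
    for oy in range(out_height):
        out_line = []
        for ox in range(out_width):
            ix, iy = transform(ox, oy)           # output pixel -> input position
            out_line.append(cache.get(iy)[ix])   # fetch sample via the cache
        output.append(out_line)
    return output


if __name__ == "__main__":
    image = [[10 * r + c for c in range(4)] for r in range(4)]
    cache = ScanlineCache(capacity=2, read_scanline=lambda r: image[r])
    result = register(4, 4, lambda ox, oy: (ox, oy), cache)
    print(cache.disk_reads)  # identity transform: each scanline is read once
```

With an identity transform, each input scanline is read from "disk" exactly once even though the cache holds only two scanlines.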

Ideally each input scanline would be read only once, and kept in the cache until it is no longer needed. This is the case where either no rotation or a 180 degree rotation takes place: as one scanline in the output image is processed, the transformation requests samples from the same input scanline each time.

Unfortunately this is not a typical case. In cases with significant rotation and with a small cache relative to the size of the input image, it is likely that requested scanlines will not be in the cache already. In the worst case, it might be necessary to read an entire input scanline for each output pixel processed. To register a 6000x6000 output image from a 6000x6000 input image, the worst possible case is having to read each input scanline 6000 times, which means that the entire input file would be read 6000 times. This could make the registration as much as 6000 times slower than if everything remained in the cache! This state is known as "Thrashing".
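A small simulation makes the worst case concrete. This is a sketch under simplifying assumptions: a square image, a least recently used cache, and a rotation of exactly ninety degrees modelled by letting the input row depend on the output column. The function name count_reads is hypothetical.

```python
from collections import OrderedDict


def count_reads(size, cache_lines, rotate90):
    """Count scanline reads for a size x size registration.

    rotate90=False: each output scanline samples a single input scanline.
    rotate90=True:  each output scanline runs down an input column, so it
                    touches every input scanline once (simplified model).
    """
    cache = OrderedDict()   # input row -> present, oldest first
    reads = 0
    for oy in range(size):
        for ox in range(size):
            iy = ox if rotate90 else oy      # input scanline that is needed
            if iy in cache:
                cache.move_to_end(iy)        # cache hit
            else:
                if len(cache) >= cache_lines:
                    cache.popitem(last=False)  # evict least recently used
                cache[iy] = True
                reads += 1                   # cache miss: read from disk
    return reads


if __name__ == "__main__":
    print(count_reads(60, 10, rotate90=False))  # 60: one read per scanline
    print(count_reads(60, 10, rotate90=True))   # 3600: one read per output pixel
    print(count_reads(60, 60, rotate90=True))   # 60: whole image fits in cache
```

Scaled up to the 6000x6000 example, the same pattern gives 6000 x 6000 = 36,000,000 scanline reads, which is the entire input file 6000 times over.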

Fortunately, the registration algorithm includes extra logic to detect thrashing and to take helpful measures. When GCPWorks detects that thrashing is taking place, it will abort the registration and start over again. When it starts over, it will only attempt a thin vertical strip of the output file at a time.

This will, hopefully, result in reuse of most of the input scanlines in the cache over a number of partial output scanlines, thus reducing the number of times each input scanline has to be read. This is most helpful in the case of a ninety degree rotation, and most destructive in the case of no rotation.
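The effect of processing the output in vertical strips can be sketched with a simplified model: a square image, a least recently used cache, and a ninety degree rotation in which the input row depends on the output column. The strip width and the function name count_reads_in_strips are assumptions made for illustration; how GCPWorks actually chooses strip sizes is not described here.

```python
from collections import OrderedDict


def count_reads_in_strips(size, cache_lines, strip_width, rotate90):
    """Count scanline reads when the output is processed strip by strip."""
    cache = OrderedDict()   # input row -> present, oldest first
    reads = 0
    for x0 in range(0, size, strip_width):           # one vertical strip at a time
        for oy in range(size):                       # partial output scanlines
            for ox in range(x0, min(x0 + strip_width, size)):
                iy = ox if rotate90 else oy          # input scanline that is needed
                if iy in cache:
                    cache.move_to_end(iy)            # cache hit
                else:
                    if len(cache) >= cache_lines:
                        cache.popitem(last=False)    # evict least recently used
                    cache[iy] = True
                    reads += 1                       # cache miss: read from disk
    return reads


if __name__ == "__main__":
    # Ninety degree rotation, cache of 10 scanlines.
    print(count_reads_in_strips(60, 10, 60, rotate90=True))   # 3600, full width
    print(count_reads_in_strips(60, 10, 10, rotate90=True))   # 60, strips of 10
    # No rotation: strips force every scanline to be reread per strip.
    print(count_reads_in_strips(60, 10, 60, rotate90=False))  # 60, full width
    print(count_reads_in_strips(60, 10, 10, rotate90=False))  # 360, strips of 10
```

In this model the strips remove the thrashing entirely for the rotated case, while for the unrotated case they multiply the reads by the number of strips.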

If GCPWorks detects thrashing, the progress monitor for the registration will drop back to zero as the algorithm starts again on smaller strips.

Reducing Disk Thrashing
One way to improve the effectiveness of the input cache is to increase its size. By default the input cache is 4MB (about four million bytes). It can be increased using the "Memory" control on the "Disk to Disk Registration" panel. For instance, setting the memory size to "10.0" would start GCPWorks with a 10MB cache. Increasing the size of the cache is often useful, but only if there is enough memory (RAM) on the computer to satisfy the request. On a system with 16MB of memory and only one user, it is likely that 10MB or so is already used by the computer, so increasing the cache size above 6MB is likely to adversely affect performance. However, on a workstation with 64MB of memory, it may be useful to specify a cache as large as 40MB.

If increasing the size of the cache does not help enough, it can also be useful to reduce the number of channels registered at once. For instance, if seven bands of a full Landsat TM scene (6000x6000) are registered at once with a 4MB cache, then only about 100 scanlines of the input image can be held at once (6000x100x7 = 4200000); however, if only one band is registered, then about 700 scanlines can be held (6000x700x1 = 4200000). If only one band at a time is registered, the polynomial transform and resampling calculations have to be performed once for each band, so this is a tradeoff of I/O against CPU usage. It is also wise to ensure that the input file is stored in band interleaved form (rather than pixel interleaved) if it is registered one band at a time, so that reading a scanline of one band does not also pull the other bands from disk.
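The arithmetic above can be reproduced with a short sketch. It assumes 8-bit data (one byte per pixel per band) and takes the 4MB cache as 4,200,000 bytes, as the figures in the text do; the function name scanlines_in_cache is hypothetical.

```python
def scanlines_in_cache(cache_bytes, width, bands, bytes_per_sample=1):
    """Number of whole input scanlines (all registered bands) that fit."""
    return cache_bytes // (width * bands * bytes_per_sample)


if __name__ == "__main__":
    cache_bytes = 4_200_000   # "about 4MB", matching the figures in the text
    print(scanlines_in_cache(cache_bytes, width=6000, bands=7))  # 100
    print(scanlines_in_cache(cache_bytes, width=6000, bands=1))  # 700
```

With data wider than 8 bits, the number of scanlines held drops in proportion to the bytes per sample.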

Since the input image caching tends to break down when rotating by a large angle, it is helpful to scan input images in an orientation that needs little rotation (north up). This is only an option when you scan the imagery yourself; it is not possible with data received in digital form.

See Also: Memory Cache

